There are a few Cython implementations of DTW floating around; I think
mblondel has one. Maybe you could adapt one of these and see whether it
yields a useful speed-up?
https://github.com/SnippyHolloW/DTW_Cython
http://www.mblondel.org/journal/2009/08/31/dynamic-time-warping-theory/
https://github.com/mdeklerk/DTW/blob/master/_dtw.pyx
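
For reference, here is a minimal pure-Python sketch of the textbook DTW
recurrence that those Cython versions speed up. It is not taken from any
of the links above. The function name dtw_distance and the choice of
absolute difference as the local cost are my own assumptions, and mlpy
may use a different local cost, so the numbers need not match
mlpy.dtw.dtw_std exactly. It is mainly useful as a slow but easy-to-read
baseline for checking a faster port.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW between two 1-D sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] is the accumulated cost of aligning the first i-1 items
    # of a with the first j items of b; only prev[0] starts at zero.
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The double loop is the part worth moving to Cython; keeping only two rows
of the cost table also keeps memory at O(len(b)) rather than
O(len(a) * len(b)).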
On 10 January 2014 13:05, Mark Regan <m...@thinkvein.com> wrote:
> Am I correct to assume the only algorithm that will work with a custom
> distance metric is "brute"? DTW with 1NN is performing pretty slowly with
> just 10,000 observations.
>
> I'm new to Python; perhaps I could write the distance metric function
> more efficiently?
>
> import mlpy
> import numpy as np
>
>
> # Compute the dynamic time warp distance between two arrays that each
> # contain several time series (each series represents a different
> # data attribute, e.g. Gmail 30DA)
> def dtw_2d(u, v, num_ts=80):
>     """
>     Compute the Dynamic Time Warp distance between two arrays of
>     shape m x n.
>     m: number of time series attributes
>     n: number of observations in each time series attribute
>
>     num_ts: number of elements in each time series attribute
>
>     Due to sklearn constraints, the input array passed to the
>     k-nearest-neighbours estimator is converted from a 3D array to a
>     2D array, so u and v arrive here as flat 1D rows. This function
>     reshapes each row back to 2D (m x num_ts) before computing DTW
>     distances.
>     """
>
>     # Number of time series attributes packed into each flat row.
>     # Integer division keeps the result usable as a shape.
>     u_obs = np.shape(u)[0] // num_ts
>     v_obs = np.shape(v)[0] // num_ts
>
>     # Reshape u & v to (attributes, num_ts)
>     new_u = u.reshape(u_obs, num_ts)
>     new_v = v.reshape(v_obs, num_ts)
>
>     # Compute the DTW distance for each pair of matching attributes
>     dtw_distance = []
>     for a, b in zip(new_u, new_v):
>         dtw_distance.append(mlpy.dtw.dtw_std(a, b))
>
>     # Return the average of all distances
>     # ToDo: Improve this aggregation metric
>     return np.average(dtw_distance)
>
> On Fri Jan 10 2014 at 9:42:01 AM, Gael Varoquaux <
> gael.varoqu...@normalesup.org> wrote:
>
>> Fully agreed with Lars.
>>
>> On Fri, Jan 10, 2014 at 02:44:40PM +0100, Lars Buitinck wrote:
>> > 2014/1/10 Robert Layton <robertlay...@gmail.com>:
>> > > I wonder if that check could be removed -- as long as the input is
>> > > fancy-indexable, the code should otherwise not have an issue (until
>> > > it hits the distance metric, in which case you have that covered).
>>
>> > -1. Since high-d data is usually a mistake and NumPy offers easy
>> > reshaping for the advanced use cases, I think we should leave the code
>> > as is. It fits the existing convention that an array has shape
>> > (n_samples, n_features) and raises a very clear exception. Passing
>> > higher-d data on would raise an exception deep down in the k-NN code,
>> > making debugging of easy mistakes harder.
>>
>> > ------------------------------------------------------------
>> ------------------
>> > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> > Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> > Critical Workloads, Development Environments & Everything In Between.
>> > Get a Quote or Start a Free Trial Today.
>> > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&
>> iu=/4140/ostg.clktrk
>> > _______________________________________________
>> > Scikit-learn-general mailing list
>> > Scikit-learn-general@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>> --
>> Gael Varoquaux
>> Researcher, INRIA Parietal
>> Laboratoire de Neuro-Imagerie Assistee par Ordinateur
>> NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>> Phone: ++ 33-1-69-08-79-68
>> http://gael-varoquaux.info http://twitter.com/GaelVaroquaux