There are a few Cython implementations of DTW floating around; I think
mblondel has one. Maybe you could tweak one of these and see whether it
yields a useful speed-up?

https://github.com/SnippyHolloW/DTW_Cython
http://www.mblondel.org/journal/2009/08/31/dynamic-time-warping-theory/
https://github.com/mdeklerk/DTW/blob/master/_dtw.pyx
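
For reference, the core recurrence is short. Below is a minimal pure-Python
sketch of it, not taken from any of the links above. The name dtw_distance
and the absolute-difference local cost are assumptions made for
illustration, and mlpy's dtw_std may differ in details such as the local
cost. It could serve as a slow but simple baseline to check a Cython port
against.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two 1D sequences.

    Hypothetical helper: uses |x - y| as the local cost and no
    windowing constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")

    # prev[j] holds the minimal accumulated cost of aligning the first
    # (i - 1) elements of a with the first j elements of b.
    prev = [inf] * (m + 1)
    prev[0] = 0.0

    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            local_cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = local_cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    return prev[m]
```

The two nested Python loops are exactly the part that Cython would turn
into C loops, which is where any speed-up would come from.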



On 10 January 2014 13:05, Mark Regan <m...@thinkvein.com> wrote:

> Am I correct to assume the only algorithm that will work with a custom
> distance metric is "brute"? DTW with 1NN is pretty slow with
> just 10,000 observations.
>
> I'm new to Python; perhaps I could write the distance metric function more
> efficiently?
>
> import mlpy
> import numpy as np
>
> # Define function to compute dynamic time warp distance between
> # two arrays containing multiple time series arrays (each array
> # represents a different data attribute, e.g. Gmail 30DA)
> def dtw_2d(u, v, num_ts=80):
>     """
>     Compute the Dynamic Time Warp distance between two flattened
>     arrays that each hold m time series of n elements.
>     m: number of time series attributes
>     n: number of observations in each time series attribute
>
>     num_ts: Number of elements in each time series attribute (n)
>     Due to sklearn constraints, the input array into KNearestNeighbour
>     is converted from a 3D array to a 2D array, so u and v arrive as
>     1D rows. This function reshapes each row back to 2D (m x n)
>     before computing DTW distances
>     """
>
>     # Number of time series attributes in u & v
>     # (integer division, so the result is a valid reshape dimension)
>     u_obs = np.shape(u)[0] // num_ts
>     v_obs = np.shape(v)[0] // num_ts
>
>     # Reshape u & v
>     new_u = u.reshape(u_obs, num_ts)
>     new_v = v.reshape(v_obs, num_ts)
>
>     # Compute DTW distances between u & v, one attribute at a time
>     dtw_distance = []
>     for a, b in zip(new_u, new_v):
>         dtw_distance.append(mlpy.dtw.dtw_std(a, b))
>
>     # Return the average of all distances
>     # ToDo: Improve this aggregation metric
>     return np.average(dtw_distance)
>
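
To see why this gets slow, here is a rough sketch of what a brute-force
1-NN search has to do when the metric is a Python callable. This is not
scikit-learn's actual code; nearest_neighbor_brute and mean_abs_diff are
hypothetical names, and mean_abs_diff is only a cheap stand-in for a DTW
function such as dtw_2d. The point is that the metric is called once per
(query, training) pair, so the cost grows with the product of the two set
sizes.

```python
import numpy as np


def nearest_neighbor_brute(X_train, y_train, X_query, metric):
    """Label each query row with the label of its closest training row.

    Returns the predicted labels and the number of metric calls made.
    """
    X_train = np.asarray(X_train)
    X_query = np.asarray(X_query)
    predictions = []
    n_calls = 0
    for q in X_query:
        # One Python-level metric call per training row.
        dists = np.array([metric(q, x) for x in X_train])
        n_calls += len(X_train)
        predictions.append(y_train[int(np.argmin(dists))])
    return np.array(predictions), n_calls


def mean_abs_diff(u, v):
    """Stand-in metric (assumption): mean absolute difference."""
    return float(np.mean(np.abs(u - v)))


X_train = np.array([[0.0, 0.0, 0.0, 0.0], [5.0, 5.0, 5.0, 5.0]])
y_train = np.array(["low", "high"])
X_query = np.array([[1.0, 0.0, 1.0, 0.0], [4.0, 5.0, 6.0, 5.0]])

labels, n_calls = nearest_neighbor_brute(
    X_train, y_train, X_query, mean_abs_diff
)
```

If all 10,000 observations were compared against each other this way, that
would be on the order of 10^8 metric calls, so the per-call cost of the
DTW function dominates the run time.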
> On Fri Jan 10 2014 at 9:42:01 AM, Gael Varoquaux <
> gael.varoqu...@normalesup.org> wrote:
>
>> Fully agreed with Lars.
>>
>> On Fri, Jan 10, 2014 at 02:44:40PM +0100, Lars Buitinck wrote:
>> > 2014/1/10 Robert Layton <robertlay...@gmail.com>:
>> > > I wonder if that check could be removed -- as long as the input is
>> > > fancy-indexable, the code should otherwise not have an issue (until it hits
>> > > the distance metric, in which case you have that covered).
>>
>> > -1. Since high-d data is usually a mistake and NumPy offers easy
>> > reshaping for the advanced use cases, I think we should leave the code
>> > as is. It fits the existing convention that an array has shape
>> > (n_samples, n_features) and raises a very clear exception. Passing
>> > higher-d data on would raise an exception deep down in the k-NN code,
>> > making debugging of easy mistakes harder.
>>
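
The reshaping mentioned above can be sketched as follows. The array sizes
and the helper name unflatten are made up for illustration; the only
assumption is that every sample holds the same number of time series
attributes, each of the same length.

```python
import numpy as np

# Hypothetical sizes: 4 samples, 3 attributes, 5 time steps each.
n_samples, n_attributes, n_timesteps = 4, 3, 5
X_3d = np.arange(
    n_samples * n_attributes * n_timesteps, dtype=float
).reshape(n_samples, n_attributes, n_timesteps)

# Flatten to the (n_samples, n_features) shape the estimators expect.
X_2d = X_3d.reshape(n_samples, -1)


def unflatten(row, n_timesteps):
    """Recover the (n_attributes, n_timesteps) view of one flattened row."""
    return row.reshape(-1, n_timesteps)


# Inside a custom metric, each row can be restored before comparing.
first_sample = unflatten(X_2d[0], n_timesteps)
```

Because reshape only changes the view of the data, the round trip is
lossless, and the custom metric can undo the flattening for each pair of
rows it receives.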
>> > ------------------------------------------------------------------------------
>> > CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> > Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> > Critical Workloads, Development Environments & Everything In Between.
>> > Get a Quote or Start a Free Trial Today.
>> > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> > _______________________________________________
>> > Scikit-learn-general mailing list
>> > Scikit-learn-general@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>> --
>>     Gael Varoquaux
>>     Researcher, INRIA Parietal
>>     Laboratoire de Neuro-Imagerie Assistee par Ordinateur
>>     NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>>     Phone:  ++ 33-1-69-08-79-68
>>     http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
>>
