Thanks Fred and all. Appreciate the help.

On Fri Jan 10 2014 at 1:26:13 PM, Fred Mailhot <fred.mail...@gmail.com>
wrote:

There are a few implementations of DTW in Cython floating around...I think
mblondel has one. Maybe you could tweak one of these and see whether it
yields a useful speed-up?

https://github.com/SnippyHolloW/DTW_Cython
http://www.mblondel.org/journal/2009/08/31/dynamic-time-warping-theory/
https://github.com/mdeklerk/DTW/blob/master/_dtw.pyx
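
For reference, the recurrence that a Cython version would speed up can be sketched in plain Python. This is a minimal, unoptimised sketch and not the code from any of the projects linked above: the name dtw_distance and the absolute-difference local cost are assumptions made for illustration, and mlpy's dtw_std may use a different local cost or normalisation, so the numbers need not match.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two 1-D sequences (hypothetical helper).

    Assumes an absolute-difference local cost and the three standard
    steps: match, insertion, deletion.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

The two nested loops are the part a Cython port would type and compile; the work per pair of series grows with len(a) * len(b).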



On 10 January 2014 13:05, Mark Regan <m...@thinkvein.com> wrote:

Am I correct to assume the only algorithm that will work with a custom
distance metric is "brute"? DTW with 1NN is pretty slow with
just 10,000 observations.

I'm new to Python; perhaps I could write the distance metric function more
efficiently?

# Imports the function below relies on
import mlpy
import numpy as np

# Define function to compute dynamic time warp distance between
# two arrays containing multiple time series arrays (each array
# represents a different data attribute, e.g. Gmail 30DA)
def dtw_2d(u, v, num_ts=80):
    """
    Compute the Dynamic Time Warp distance between two flattened
    arrays that each represent an m x n array.
    m: number of time series attributes
    n: number of observations in each time series attribute

    num_ts: number of elements in each time series attribute (n)
    Due to sklearn constraints, the input array passed to the
    k-nearest-neighbours estimator is converted from a 3D array to a
    2D array. This function converts each row back to its 2D (m x n)
    form before computing DTW distances.
    """

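    # Reshape the flat vectors u and v into 2D arrays
    u_dim = np.shape(u)
    v_dim = np.shape(v)

    # Number of time series attributes in u & v; integer division
    # keeps the result an int, which reshape requires
    u_obs = u_dim[0] // num_ts
    v_obs = v_dim[0] // num_ts

    # Reshape u & v to (number of attributes, num_ts)
    new_u = u.reshape(u_obs, num_ts)
    new_v = v.reshape(v_obs, num_ts)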

    # Compute DTW distances between u & v
    dtw_distance = []
    for a, b in zip(new_u, new_v):
        dtw_distance.append(mlpy.dtw.dtw_std(a, b))

    # Return the average of all distances
    # ToDo: Improve this aggregation metric
    return np.average(dtw_distance)
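
To see where the time goes, here is a rough sketch of what a brute-force 1-NN search with a callable metric amounts to: every query is compared against every training row, so each prediction costs one metric call per training observation. All names here (l1_distance, split_series, mean_chunk_distance, nearest_neighbor) are hypothetical, l1_distance is only a cheap stand-in for the DTW call, and none of this is scikit-learn's own implementation.

```python
def l1_distance(a, b):
    """Stand-in for a DTW call: sum of absolute differences (equal lengths)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def split_series(flat, num_ts):
    """Split a flat sequence into consecutive chunks of length num_ts."""
    if len(flat) % num_ts != 0:
        raise ValueError("length of flat must be a multiple of num_ts")
    return [flat[i:i + num_ts] for i in range(0, len(flat), num_ts)]


def mean_chunk_distance(u, v, num_ts, dist):
    """Average dist over corresponding chunks of two flattened rows."""
    pairs = list(zip(split_series(u, num_ts), split_series(v, num_ts)))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def nearest_neighbor(query, train_rows, train_labels, metric):
    """Brute-force 1-NN: one metric call per training row."""
    best = min(range(len(train_rows)),
               key=lambda i: metric(query, train_rows[i]))
    return train_labels[best]


# Toy data: two attributes of two observations each, flattened per row.
train_rows = [[0, 0, 0, 0], [5, 5, 5, 5]]
train_labels = ["low", "high"]
metric = lambda u, v: mean_chunk_distance(u, v, 2, l1_distance)
predicted = nearest_neighbor([1, 0, 1, 1], train_rows, train_labels, metric)
```

With 10,000 training rows, each prediction therefore triggers 10,000 calls into the Python-level metric, so any per-call overhead in the distance function is multiplied accordingly.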

On Fri Jan 10 2014 at 9:42:01 AM, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

Fully agreed with Lars.

On Fri, Jan 10, 2014 at 02:44:40PM +0100, Lars Buitinck wrote:
> 2014/1/10 Robert Layton <robertlay...@gmail.com>:
> > I wonder if that check could be removed -- as long as the input is
> > fancy-indexable, the code should otherwise not have an issue (until it
> > hits the distance metric, in which case you have that covered).

> -1. Since high-d data is usually a mistake and NumPy offers easy
> reshaping for the advanced use cases, I think we should leave the code
> as is. It fits the existing convention that an array has shape
> (n_samples, n_features) and raises a very clear exception. Passing
> higher-d data on would raise an exception deep down in the k-NN code,
> making debugging of easy mistakes harder.

> ------------------------------------------------------------------------------
> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
> Learn Why More Businesses Are Choosing CenturyLink Cloud For
> Critical Workloads, Development Environments & Everything In Between.
> Get a Quote or Start a Free Trial Today.
> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

--
    Gael Varoquaux
    Researcher, INRIA Parietal
    Laboratoire de Neuro-Imagerie Assistee par Ordinateur
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

