If the data series is large, it may be worth splitting the job further along the time axis using overlap-add or overlap-save, or even a suitably partitioned FFT.
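
To make the overlap-add idea concrete, here is a minimal sketch in Python with NumPy. It is not Mahout or Hadoop code; the function name `overlap_add` and the `block_len` parameter are just names picked for illustration. It assumes the filter is short relative to the series (for example, a moving-average window) and that the series can be cut into consecutive blocks.

```python
import numpy as np

def overlap_add(x, h, block_len=1024):
    """Convolve a long series x with a short filter h, one block at a time.

    Illustrative sketch only: each block is filtered independently via FFT,
    and the tails that spill past a block boundary are added into the output.
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    m = len(h)
    # A block convolution produces block_len + m - 1 samples, so pick the
    # next power of two that can hold it without circular wrap-around.
    n_fft = 1 << (block_len + m - 2).bit_length()
    H = np.fft.rfft(h, n_fft)            # filter spectrum, computed once
    y = np.zeros(len(x) + m - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        seg = np.fft.irfft(np.fft.rfft(block, n_fft) * H, n_fft)
        out_len = len(block) + m - 1
        # The last m - 1 samples overlap the next block's output: add them.
        y[start:start + out_len] += seg[:out_len]
    return y

# Usage: a 5-point moving average over a long series, block by block.
series = np.random.default_rng(0).standard_normal(10_000)
window = np.ones(5) / 5
smoothed = overlap_add(series, window, block_len=512)
```

Since each block is filtered without looking at its neighbours, the per-block work could in principle be farmed out to separate tasks, with only the overlapping tails (m - 1 samples per boundary) needing to be summed afterwards.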
On Dec 6, 2011, at 1:48 PM, Josh Patterson <[email protected]> wrote:

> Mahout currently does not have, afaik, much/any time series specific
> code for it. If I were to point someone at some good resources I'd
> start with:
>
> - Box and Jenkins book
> - Dr Keogh's line of research on time series pattern matching
>
> And then beyond that it begins to become "what are you specifically
> looking for?". R is typically the "go to" resource for a lot of time
> series work, but there has been some very successful work with Hadoop
> and large scale time series data. Below I link to a few articles where
> time series techniques are demonstrated with Hadoop. Specifically here
> is a blog article on general time series processing with Hadoop:
>
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
> http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
>
> Beyond that you could take a look at how we applied these concepts to
> the US powergrid PMU / smartgrid data back in 2009:
>
> http://openpdc.codeplex.com
> http://www.slideshare.net/jpatanooga/oscon-data-2011-lumberyard
>
> Hope that gets you going,
>
> Josh
>
> 2011/12/4 myn <[email protected]>:
>> does mahout contain this method?
>> or is there any other open source project about this?
>
> --
> Twitter: @jpatanooga
> Solution Architect @ Cloudera
> hadoop: http://www.cloudera.com
