There's some not-so-public work we are doing at Twitter (vote for the Hadoop Summit talk!), and there's also Ted Dunning's Mahout integration: https://github.com/tdunning/pig-vector

On Mon, Mar 12, 2012 at 1:02 PM, Dmitriy Lyubimov <[email protected]> wrote:
> No good public attempts known to me exist to put ML kind of
> stuff on top of Pig (well, almost none). There are some statistical
> packages written at Yahoo, but afaik they don't directly do what you
> need.
>
> Pig is a somewhat excellent data prep pipeline, but IMO it is not as
> excellent as something like R-Hadoop.
>
> Also, depending on the # of your predictors and the training latency
> required, you may not need MapReduce at all to train something like
> stochastic gradient descent-based schemes. They converge way too fast
> to really take advantage of MR-based methods (again, in most typical
> settings of # of predictors). If you do have a virtually unbounded
> number of predictors, you will probably need some techniques to reduce
> it anyway (such as the feature hashing found in Mahout). So perhaps
> there's an easier way to do the actual training than using Pig.
>
> -d
>
> On Mon, Mar 12, 2012 at 12:21 AM, chethan <[email protected]> wrote:
>> Hi,
>>
>> We want to do linear regression analysis to achieve interpolation for
>> a set of values, using Pig scripts.
>> Do we have any built-in functions to achieve this? If not, how can we
>> achieve it?
>>
>> Thanks & Regards
>> Chethan Prakash.
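
For reference, here is a minimal single-machine sketch of the stochastic gradient descent idea mentioned in the quoted reply, applied to the one-predictor linear regression from the original question. It is plain Python rather than Pig or Mahout, and the function names, learning rate, epoch count, and sample data are all assumptions made up for illustration, not anything taken from the thread or from either project.

```python
import random


def sgd_linear_regression(points, learning_rate=0.02, epochs=2000, seed=0):
    """Fit y ~ slope * x + intercept with plain stochastic gradient descent.

    All parameter defaults are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    slope, intercept = 0.0, 0.0
    data = list(points)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a fresh random order each epoch
        for x, y in data:
            error = (slope * x + intercept) - y
            # Gradient of 0.5 * error**2 with respect to slope and intercept.
            slope -= learning_rate * error * x
            intercept -= learning_rate * error
    return slope, intercept


def interpolate(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical sample data lying exactly on y = 2x + 1.
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0, 4.0)]
model = sgd_linear_regression(samples)
estimate = interpolate(model, 2.5)
```

With a handful of predictors, a loop like this runs over the prepared data in memory, which is the case where MapReduce adds little. Pig would then only be needed to prepare the (x, y) pairs.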
