I'm not aware of any good public attempts to put ML-type functionality on top of Pig (well, almost none). There are some statistical packages written at Yahoo, but AFAIK they don't directly do what you need.
Pig is a pretty good data prep pipeline, but IMO it is not as good as something like R-Hadoop. Also, depending on the number of predictors and the training latency you need, you may not need MapReduce at all to train something like stochastic gradient descent-based schemes. They converge way too fast to really take advantage of MR-based methods (again, in most typical settings for the number of predictors). If you do have a virtually unbounded number of predictors, you will probably need some technique to reduce them anyway (such as the feature hashing found in Mahout).

So perhaps there's an easier way to do the actual training than using Pig.

-d

On Mon, Mar 12, 2012 at 12:21 AM, chethan <[email protected]> wrote:
> Hi,
>
> We want to do Linear regression analysis to achieve Interpolation for
> a set of values, using PIG Scripts.
> Do we have any in-built functions to achieve this, if not how to achieve.
>
> Thanks & Regards
> Chethan Prakash.
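
P.S. To make the SGD point a bit more concrete, here is a minimal single-machine sketch in plain Python (no Pig, no MapReduce). It is only an illustration under my own assumptions: the function names (sgd_linear_regression, predict), the toy data, the learning rate and the epoch count are all made up for this example and are not taken from Pig, Mahout or any other library. The idea is that Pig (or anything else) prepares the (predictors, target) rows, and a loop like this does the actual fitting and interpolation.

```python
import random


def sgd_linear_regression(rows, n_features, learning_rate=0.01, epochs=50, seed=0):
    """Fit y ~ bias + weights . x by plain SGD on squared error.

    Hypothetical helper for illustration only; `rows` is a list of
    (feature_list, target) pairs, e.g. exported from a data prep job.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    bias = 0.0
    rows = list(rows)  # copy so shuffling does not touch the caller's list
    for _ in range(epochs):
        rng.shuffle(rows)
        for x, y in rows:
            prediction = bias + sum(w * xi for w, xi in zip(weights, x))
            error = prediction - y
            # Gradient of 0.5 * error**2 is error * x_i per weight, error for the bias.
            for i in range(n_features):
                weights[i] -= learning_rate * error * x[i]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Evaluate the fitted line (or hyperplane) at a new point."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Toy, noise-free data on the line y = 2x + 1 (an assumption for the demo).
data = [([i / 10.0], 2.0 * (i / 10.0) + 1.0) for i in range(11)]

weights, bias = sgd_linear_regression(data, n_features=1, learning_rate=0.1, epochs=500)

# Interpolate at a point that is not in the training set.
interpolated = predict(weights, bias, [0.35])
print(weights, bias, interpolated)
```

On this toy data the fit should land very close to slope 2 and intercept 1. With real data you would want to scale the predictors and tune the learning rate, and with a very large predictor space you would hash the features down to a fixed size first, as mentioned above.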
