I'm all for Pig, especially once we are a TLP. I haven't had time to properly review the PLSI implementation, but it looks useful. I agree on the other points, though: it would be nice to have consistent formats based on Vector so that things are more portable.

On Feb 22, 2010, at 2:41 AM, Ankur C. Goel wrote:

> Hi Folks,
> I would like to know how mahout community feels about having
> some of the Mahout algorithms implemented in pig -
> http://hadoop.apache.org/pig. The benefits of using Pig are many including.
>
> 1. Small learning curve, people with a bit of SQL knowledge will find it
> very easy.
> 2. Operations like grouping, aggregations, join need just few lines of pig
> code.
> 3. Insulation against Hadoop complexity - Job chains and JobConf.
> 4. Quick prototyping and hence increased programmer productivity.
>
> I had Sean's opinion on this and he was not too comfortable with the Idea of
> having things in different languages in Mahout. However, given the benefits
> of PIG, I feel otherwise. I may be biased here due to my own experience of
> being able to do more in lesser time in Pig then in M/R, so I thought let me
> ask how folks feel.
>
> Ted, I believe you have some PIG experience yourself so any thoughts on this ?
>
> Regards
> -...@nkur