YES!!! :) Thanks for sharing,
Anze

On Tuesday 16 November 2010, Olga Natkovich wrote:
> Functions or rather inline macros are coming in Pig 0.9:
> http://wiki.apache.org/pig/TuringCompletePig
>
> Olga
>
> -----Original Message-----
> From: Anze [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 9:34 AM
> To: [email protected]
> Subject: Re: pig needed?
>
>
> My 0.02 EUR:
> I think it's not the learning curve that makes Pig a better tool for some
> applications. In my experience the learning curve is even *steeper* for Pig
> than for raw MR. MR can very easily be learned from Tom White's book while
> Pig - well, Pig is in there too, but it's quite a short chapter and lacks
> good examples. The online tutorials however are almost non-existent, or at
> least I couldn't find any.
>
> Where Pig excels is the power with which you can manipulate data. You can
> write complex queries in just a few lines whereas with MR you end up
> writing hundreds of lines of code.
>
> The major drawback of Pig however (in my limited experience) is its lack of
> functions (or objects :), making any larger piece of code spaghetti-like.
> Also, it is still very much evolving so if you are dealing with anything
> other than raw HDFS files... well, good luck. :)
>
> While we are at it, I am curious how other users use Pig? I am writing in
> the PigPen Eclipse plugin and then copy+pasting to the Pig shell (I wasn't
> able to make PigPen work with the cluster directly), which is pretty
> cumbersome. So this is another downside for me.
>
> But I still love Pig as it lets me control the data much more easily and
> it makes writing ad-hoc queries much easier. And it will only get better
> with time.
>
> But if your code works in MR, why rewrite it? Let it be, unless you have
> problems with the code and it needs to be rewritten anyway.
>
> Enjoy!
>
> Anze
>
> On Tuesday 16 November 2010, Renato Marroquín Mogrovejo wrote:
> > Pig has some clear advantages over raw mapreduce code, but IMHO the most
> > important is the learning curve. But if you are just loading, you
> > probably don't want to just translate it into Pig, well, maybe just for
> > the fun of it (: but if you are planning to do some other operations
> > like joining or grouping, it would be a lot simpler to do it from
> > Pig.
> >
> > Give this a look, it will help you better understand the bigger picture.
> > http://www.slideshare.net/hadoop/practical-problem-solving-with-apache-hadoop-pig
> >
> > Renato M.
> >
> >
> > If you already have it as a Hadoop job, why would you want to move it to
> > Pig?
> >
> > 2010/11/15 Gerrit van Vuuren <[email protected]>
> >
> > > Is this a bot?
> > >
> > > And if not, yes, you can use Pig, although I advise you to reuse what
> > > has already been developed and not rewrite UDFs that already exist :)
> > >
> > >
> > > ----- Original Message -----
> > > From: Cornelio Iñigo <[email protected]>
> > > To: [email protected] <[email protected]>
> > > Sent: Mon Nov 15 20:48:35 2010
> > > Subject: pig needed?
> > >
> > > Hi
> > >
> > > My name is Cornelio Iñigo and I'm a developer just beginning with
> > > Hadoop and Pig.
> > > I have a question about developing an application in Pig. I already have
> > > my program in Hadoop; this program takes just one column from a dataset
> > > (CSV file)
> > > and processes this data with some functions (like language analysis,
> > > analysis of the content).
> > >
> > > Note that in processing the file I don't use FILTERs, COUNTs or any
> > > built-in function of Pig; I think that all the functions would have to
> > > be User Defined Functions.
> > >
> > > So is it a good idea (does it make sense) to develop this program in Pig?
> > >
> > > Thanks in advance
> > > --
> > > *Cornelio*
