Re: pig needed?

Anze Skerlavaj Wed, 17 Nov 2010 08:02:25 -0800

YES!!! :)

Thanks for sharing,


Anze


On Tuesday 16 November 2010, Olga Natkovich wrote:
> Functions or rather inline macros are coming in Pig 0.9:
> http://wiki.apache.org/pig/TuringCompletePig
> 
> Olga
> 
> -----Original Message-----
> From: Anze [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 9:34 AM
> To: [email protected]
> Subject: Re: pig needed?
> 
> 
> My 0.02 EUR:
> I think it's not the learning curve that makes Pig a better tool for some
> applications. In my experience the learning curve is even *steeper* for Pig
> than for raw MR. MR can very easily be learned from Tom White's book while
> Pig - well, Pig is in there too, but it's quite a short chapter and lacks
> good examples. The online tutorials however are almost non-existent, or at
> least I couldn't find any.
> 
> Where Pig excels is the power with which you can manipulate data. You can
> write complex queries in just a few lines whereas with MR you end up
> writing hundreds of lines of code.
> 
> The major drawback of Pig however (in my limited experience) is its lack of
> functions (or objects :), making any larger piece of code spaghetti-like.
> Also, it is still very much evolving, so if you are dealing with anything
> other than raw HDFS files... well, good luck. :)
> 
> While we are at it, I am curious how other users use Pig. I write in the
> PigPen Eclipse plugin and then copy and paste into the Pig shell (I wasn't
> able to make PigPen work with the cluster directly), which is pretty
> cumbersome. So this is another downside for me.
> 
> But I still love Pig as it lets me control the data much more easily and
> it makes writing ad-hoc queries much easier. And it will only get better
> with time.
> 
> But if your code works in MR, why rewrite it? Let it be, unless you have
> problems with the code and it needs to be rewritten anyway.
> 
> Enjoy!
> 
> Anze
> 
> On Tuesday 16 November 2010, Renato Marroquín Mogrovejo wrote:
> > Pig has some clear advantages over raw MapReduce code, but IMHO the most
> > important is the learning curve. But if you are just loading data, you
> > probably don't want to translate it into Pig, well, maybe just for the
> > fun of it (: But if you are planning to do other operations
> > like joining or grouping, it would be a lot simpler to do them in
> > Pig.
> > 
> > Give this a look, it will help you understand better the bigger picture.
> > http://www.slideshare.net/hadoop/practical-problem-solving-with-apache-hadoop-pig
> > 
> > Renato M.
> > 
> > 
> > If you already have it as a Hadoop job, why would you want to move it to
> > Pig?
> > 
> > 2010/11/15 Gerrit van Vuuren <[email protected]>
> > 
> > > Is this a bot?
> > > 
> > > And if not: yes, you can use Pig, although I'd advise you to reuse what
> > > has already been developed and not rewrite UDFs if they already exist :)
> > > 
> > > 
> > > ----- Original Message -----
> > > From: Cornelio Iñigo <[email protected]>
> > > To: [email protected] <[email protected]>
> > > Sent: Mon Nov 15 20:48:35 2010
> > > Subject: pig needed?
> > > 
> > > Hi
> > > 
> > > My name is Cornelio Iñigo and I'm a developer just getting started with
> > > Hadoop and Pig.
> > > I have a question about developing an application in Pig. I already have
> > > my program in Hadoop; it reads just one column from a dataset
> > > (a CSV file)
> > > and processes this data with some functions (like language analysis,
> > > analysis of the content).
> > > 
> > > Note that in processing the file I don't use FILTERs, COUNTs or any
> > > built-in function of Pig; I think all the functions would have to be
> > > User Defined Functions.
> > > 
> > > So is it a good idea (does it make sense) to develop this program in Pig?
> > > 
> > > Thanks in advance
> > > --
> > > *Cornelio*
