On Mon, Aug 06, 2007 at 05:57:21AM -0400, hank williams wrote:
> Hi shane,
> 
> I just heard about pig latin myself, and am curious about it without
> having much time to explore it right now.
> 
> I am wondering if you (or anyone else who knows) could tell me a
> little about it. As I understand it, it is relational, but I am not
> clear whether it could effectively do CRUD type operations. In other
> words, would it actually be useful for replacing mysql with a highly
> scalable hadoop based database?
> 

No, not really. Pig is not going to replace MySQL as the backend for a 
Slashdot/Flickr/Amazon website. Pig could replace MySQL for some data
analysis tasks. 

Pig makes it easy to load big datasets and run queries against them. In
a sense, this is also what Map/Reduce does, but Pig makes certain tasks
even easier. Many Map/Reduce programs could have been formulated as
something like a database query, so Pig provides an infrastructure to
write these sorts of programs in a query language and efficiently run
them on Hadoop.  Pig users only have to write a few lines of PigLatin for
each different query they want to run, and they don't have to reinvent
the whole infrastructure for each program.
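
To make the contrast concrete, here is a rough Python sketch (plain
Python, not PigLatin and not Hadoop code) of one count-per-key job
written twice: once as hand-rolled map and reduce steps, and once as a
single query-style expression. The data and the names records,
map_phase and reduce_phase are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical input: (user, url) page-view records.
records = [("alice", "/a"), ("bob", "/b"), ("alice", "/c")]

def map_phase(rows):
    # Emit (key, 1) for every record, as a hand-written mapper would.
    for user, _url in rows:
        yield user, 1

def reduce_phase(pairs):
    # Group by key and sum the values, as a hand-written reducer would.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# The job spelled out step by step.
manual = reduce_phase(map_phase(records))

# The same job stated as a query: "group by user, count".
query_style = dict(Counter(user for user, _url in records))
```

Both versions compute the same per-user counts; the second only states
what is wanted and leaves the grouping machinery to the library, which
is roughly the convenience a query language gives you.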

Pig is "relational", but what that really means is that Pig lets you
perform relational algebra operations on sets. It's not SQL, but it is
based on the same ideas as SQL.
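
As a small illustration of what "relational algebra operations on sets"
means, here is a Python sketch of selection, projection and join over
sets of tuples. This is not PigLatin syntax; the relations users and
visits and their columns are hypothetical.

```python
# Hypothetical relations, each a set of tuples.
users = {(1, "alice"), (2, "bob")}            # (user_id, name)
visits = {(1, "/a"), (1, "/c"), (2, "/b")}    # (user_id, url)

# Selection: keep only the tuples that satisfy a predicate.
alice_visits = {v for v in visits if v[0] == 1}

# Projection: keep only some of the columns.
urls = {url for _uid, url in visits}

# Join: pair up tuples from both relations that share a user_id.
joined = {(name, url)
          for uid, name in users
          for vid, url in visits
          if uid == vid}
```

None of these operations updates a row in place, which is part of why
this model suits analysis better than CRUD-style workloads.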

If you're looking for a big hadoop-based database, HBase might be closer
to what you're looking for.

-Erik

> Thanks
> Hank
> 
> On 8/3/07, Shane Butler <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Page 3 of the "Hacking Pig" documentation suggests it is possible to
> > LOAD a file, do stuff with it and then STORE it out... Is this correct
> > or does it have to be done in Java?
> >
> > Regards,
> > Shane
> >
> > PS. Apologies if this is the wrong list.
> >
