On Mon, Aug 06, 2007 at 05:57:21AM -0400, hank williams wrote:
> Hi shane,
>
> I just heard about pig latin myself, and am curious about it without
> having much time to explore it right now.
>
> I am wondering if you (or anyone else who knows) could tell me a
> little about it. As I understand it, it is relational, but I am not
> clear whether it could effectively do CRUD type operations. In other
> words, would it actually be useful for replacing mysql with a highly
> scalable hadoop based database?
>
No, not really. Pig is not going to replace MySQL as the backend for a
Slashdot/Flickr/Amazon website.

Pig could replace MySQL for some data analysis tasks. Pig makes it easy
to load big datasets and run queries against them. In a sense, this is
also what Map/Reduce does, but Pig makes certain tasks even easier. Many
Map/Reduce programs could have been formulated as something like a
database query, so Pig provides an infrastructure to write these sorts
of programs in a query language and efficiently run them on Hadoop. Pig
users only have to write a few lines of Pig Latin for each different
query they want to run, and they don't have to reinvent the whole
infrastructure for each program.

Pig is "relational", but what that really means is that Pig allows you
to perform relational algebra operations on sets. It's not SQL, but it
is based on the same ideas as SQL.

If you're looking for a big Hadoop-based database, HBase might be closer
to what you're looking for.

-Erik

> Thanks
> Hank
>
> On 8/3/07, Shane Butler <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Page 3 of the "Hacking Pig" documentation suggests it is possible to
> > LOAD a file, do stuff with it and then STORE it out... Is this correct
> > or does it have to be done in Java?
> >
> > Regards,
> > Shane
> >
> > PS. Apologies if this is the wrong list.
> >
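
To make "relational algebra operations on sets" a bit more concrete, here
is a rough sketch of a query-shaped analysis job (filter, group,
aggregate) written in plain Python. It does not use Pig, Pig Latin, or
Hadoop, and it runs in memory on a single machine. The sample data and
the helper names (VISITS, select, group_by, count_per_group) are made up
for illustration only.

```python
from collections import defaultdict

# Hypothetical sample data: (user, url) page-view records.
VISITS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/about"),
    ("carol", "/home"),
    ("bob", "/contact"),
    ("alice", "/home"),
]


def select(rows, predicate):
    """Relational selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def group_by(rows, key):
    """Group rows into a dict mapping each key value to its list of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)


def count_per_group(groups):
    """Aggregate: replace each group of rows with its row count."""
    return {k: len(v) for k, v in groups.items()}


# "Load" the data, filter it, group it, and aggregate it.
home_visits = select(VISITS, lambda r: r[1] == "/home")
visits_per_user = count_per_group(group_by(home_visits, key=lambda r: r[0]))

print(visits_per_user)  # {'alice': 2, 'bob': 1, 'carol': 1}
```

Each step takes a whole collection of records and returns a new one,
which is the sense in which such a job resembles a database query rather
than a CRUD workload that updates individual rows.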
