I tried to use it, but it is currently tied to the 0.13.1 release of Hadoop, specially patched by the Pig group with the HadoopExe class.
I hear that they are working on a new release and also working towards having a real open source version with nightly builds RSN, but there hasn't been any externally visible progress lately.

On 9/17/07 11:14 AM, "Ashish Thusoo" <[EMAIL PROTECTED]> wrote:

> Thanks for the pointer.
>
> We did take a look at pig and did find that it has some of the constructs
> that we have been talking about. How stable is the pig software? Has
> anyone on this list used it?
>
> Thanks,
> Ashish
>
> -----Original Message-----
> From: Ted Dunning [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 13, 2007 11:10 AM
> To: [email protected]
> Subject: Re: JOIN-type operations with Hadoop...
>
> See pig.
>
> This one: http://research.yahoo.com/project/pig
>
> Not this one: http://en.wikipedia.org/wiki/Pig
>
> On 9/13/07 10:45 AM, "Ashish Thusoo" <[EMAIL PROTECTED]> wrote:
>
>> On a related note - has anyone seen proposals or ideas for languages on
>> top of hadoop map/reduce (could even be languages for some sort of code
>> generators) to make writing the joins easy. It is quite a nightmare to
>> write these joins especially when it involves multiple data sources. We
>> are thinking of doing something similar. I wanted to find out if someone
>> else has some ideas to share.
>>
>> Thanks,
>> Ashish
>>
>> -----Original Message-----
>> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, September 13, 2007 7:43 AM
>> To: [email protected]
>> Subject: RE: JOIN-type operations with Hadoop...
>>
>> We use the directory namespace to distinguish different types of files.
>> Wrote a simple wrapper around TextInputFormat/SequenceFileInputFormat -
>> such that the key returned is the pathname (or some component of the
>> pathname). That way u can look at the key - and then decide what kind of
>> record structure the value encodes and take the proper action.
>>
>> Ping me if u want an example and will be happy to share.
>>
>> -----Original Message-----
>> From: C G [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, September 13, 2007 7:11 AM
>> To: [email protected]
>> Subject: JOIN-type operations with Hadoop...
>>
>> Consider two row based files. The first has fields:
>>
>> A B C
>>
>> the second has fields:
>>
>> B D E
>>
>> I want to join these files on the key B, to create records of the
>> form:
>>
>> A B C D E
>>
>> So B can be thought of as a primary key, and the second file will only
>> have distinct values of B...i.e. no repeats.
>>
>> I'm trying to reason through how to do this type of join operation in
>> Hadoop but am unsure how to proceed with different "types" of files.
>>
>> Does the community have any wisdom to share?
>>
>> Thanks,
>> C G
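
For reference, the approach discussed in this thread (use the input path to decide which record layout a line has, key every record on B, then combine the two sides in the reducer) can be sketched in plain Python without Hadoop. This is only an illustration under assumptions, not code from anyone on the list: the "left/" and "right/" directory prefixes, the function names, and whitespace-separated fields are all hypothetical, and an in-memory dictionary stands in for Hadoop's shuffle. It produces an inner join, so rows of the first file with no matching B are dropped.

```python
from collections import defaultdict

def map_record(path, line):
    """Map step: use the (hypothetical) input path to pick the record layout,
    then emit (B, (tag, remaining_fields))."""
    fields = line.split()
    if path.startswith("left/"):       # assumed directory holding "A B C" rows
        a, b, c = fields
        return b, ("L", (a, c))
    if path.startswith("right/"):      # assumed directory holding "B D E" rows
        b, d, e = fields
        return b, ("R", (d, e))
    raise ValueError(f"unrecognized input path: {path}")

def reduce_join(key, tagged_values):
    """Reduce step: pair every left row with every right row sharing key B."""
    lefts = [v for tag, v in tagged_values if tag == "L"]
    rights = [v for tag, v in tagged_values if tag == "R"]
    for a, c in lefts:
        for d, e in rights:
            yield (a, key, c, d, e)

def run_join(inputs):
    """Stand-in for the shuffle: group mapped records by key, then reduce."""
    groups = defaultdict(list)
    for path, line in inputs:
        key, value = map_record(path, line)
        groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined

if __name__ == "__main__":
    sample = [
        ("left/part-0", "a1 b1 c1"),
        ("left/part-0", "a2 b2 c2"),
        ("right/part-0", "b1 d1 e1"),
    ]
    for row in run_join(sample):
        print(" ".join(row))
```

Because the second file has at most one row per value of B, each reduce call sees at most one "R" value, so the nested loop emits one output row per matching left row.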
