+1 to releasing Pig 0.11.1 when this is addressed. I should be able to help with the release again.
On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <prash1...@gmail.com> wrote:

> Hey Guys,
>
> I wanted to start a conversation on this again. If Kai is not looking at
> PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If
> everyone agrees, we should roll out 0.11.1 sooner than usual and I
> volunteer to help with it in any way possible.
>
> Any objections to getting 0.11.1 out soon after 3194 is fixed?
>
> -Prashant
>
> On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>
> > I stand corrected. Cool, 0.11 is good!
> >
> > On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <jar...@apache.org> wrote:
> >
> > > Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.
> > >
> > > Jarcec
> > >
> > > On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
> > > > I agree -- this is a good release. The bugs Kai pointed out should be
> > > > fixed, but as they are not critical regressions, we can fix them in
> > > > 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are
> > > > committed, I won't mind and will dutifully vote for the release).
> > > >
> > > > I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is
> > > > fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
> > > >
> > > > FWIW Twitter's running CDH3 and this release works in our environment.
> > > >
> > > > At this point things that block a release are critical regressions in
> > > > performance or correctness.
> > > >
> > > > D
> > > >
> > > > On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <ga...@hortonworks.com> wrote:
> > > >
> > > > > No. Bugs like these are supposed to be found and fixed after we branch
> > > > > from trunk (which happened several months ago in the case of 0.11). The
> > > > > point of RCs is to check that it's a good build, licenses are right, etc.
> > > > > Any bugs found this late in the game have to be seen as failures of
> > > > > earlier testing.
> > > > >
> > > > > Alan.
> > > > >
> > > > > On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
> > > > >
> > > > > > Isn't the point of an RC to find and fix bugs like these?
> > > > > >
> > > > > > On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <billgra...@gmail.com> wrote:
> > > > > >
> > > > > >> Regarding Pig 11 rc2, I propose we continue with the current vote as is
> > > > > >> (which closes today EOD). Patches for 0.20.2 issues can be rolled into a
> > > > > >> Pig 0.11.1 release whenever they're available and tested.
> > > > > >>
> > > > > >> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> > > > > >>
> > > > > >>> I agree that supporting as much as we can is a good goal. The issue is who
> > > > > >>> is going to be testing against all these versions? We found the issues
> > > > > >>> under discussion because of a customer report, not because we consistently
> > > > > >>> test against all versions. Perhaps when we decide which versions to support
> > > > > >>> for next release we need also to agree who is going to be testing and
> > > > > >>> maintaining compatibility with a particular version.
> > > > > >>>
> > > > > >>> For instance since Hadoop 23 compatibility is important for us at Yahoo we
> > > > > >>> have been maintaining compatibility with this version for 0.9, 0.10 and
> > > > > >>> will do the same for 0.11 and going forward. I think we would need others
> > > > > >>> to step in and claim the versions of their interest.
> > > > > >>>
> > > > > >>> Olga
> > > > > >>>
> > > > > >>> ________________________________
> > > > > >>> From: Kai Londenberg <kai.londenb...@googlemail.com>
> > > > > >>> To: dev@pig.apache.org
> > > > > >>> Sent: Wednesday, February 20, 2013 1:51 AM
> > > > > >>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> > > > > >>>
> > > > > >>> Hi,
> > > > > >>>
> > > > > >>> I strongly agree with Jonathan here. If there are good reasons why you
> > > > > >>> can't support an older version of Hadoop any more, that's one thing.
> > > > > >>> But having to change 2 lines of code doesn't really qualify as such in
> > > > > >>> my point of view ;)
> > > > > >>>
> > > > > >>> At least for me, pig support for 0.20.2 is essential - without it, I
> > > > > >>> can't use it. If it doesn't support it, I'll have to branch pig and
> > > > > >>> hack it myself, or stop using it.
> > > > > >>>
> > > > > >>> I guess, there are a lot of people still running 0.20.2 Clusters. If
> > > > > >>> you really have lots of data stored on HDFS and a continuously busy
> > > > > >>> cluster, an upgrade is nothing you do "just because".
> > > > > >>>
> > > > > >>> 2013/2/20 Jonathan Coveney <jcove...@gmail.com>:
> > > > > >>>> I agree that we shouldn't have to support old versions forever. That said,
> > > > > >>>> I also don't think we should be too blase about supporting older versions
> > > > > >>>> where it is not odious to do so. We have a lot of competition in the
> > > > > >>>> language space and the broader the versions we can support, the better
> > > > > >>>> (assuming it isn't too odious to do so).
> > > > > >>>> In this case, I don't think it should be too hard to change
> > > > > >>>> ObjectSerializer so that the commons-codec code used is compatible with
> > > > > >>>> both versions...we could just in-line some of the Base64 code, and comment
> > > > > >>>> accordingly.
> > > > > >>>>
> > > > > >>>> That said, we also should be clear about what versions we support, but 6-12
> > > > > >>>> months seems short. The upgrade cycles on Hadoop are really, really long.
> > > > > >>>>
> > > > > >>>> 2013/2/20 Prashant Kommireddi <prash1...@gmail.com>
> > > > > >>>>
> > > > > >>>>> Agreed, that makes sense. Probably supporting older Hadoop versions for 1
> > > > > >>>>> or 2 Pig releases before moving to a newer/stable version?
> > > > > >>>>>
> > > > > >>>>> Having said that, should we use the 0.11 period to communicate the same to
> > > > > >>>>> the community and start moving on from 0.12 onwards? I know we are way past
> > > > > >>>>> the 6-12 month (1-2 release) time frame with 0.20.2, but we also need to
> > > > > >>>>> make sure users are aware and plan accordingly.
> > > > > >>>>>
> > > > > >>>>> I'd also be interested to hear how other projects (Hive, Oozie) are
> > > > > >>>>> handling this.
> > > > > >>>>>
> > > > > >>>>> -Prashant
> > > > > >>>>>
> > > > > >>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <onatkov...@yahoo.com> wrote:
> > > > > >>>>>
> > > > > >>>>>> It seems that for each Pig release we need to agree and clearly state
> > > > > >>>>>> which Hadoop versions it will support. I guess the main question is how we
> > > > > >>>>>> decide on this.
> > > > > >>>>>> Perhaps we should say that Pig no longer supports older
> > > > > >>>>>> Hadoop versions once the newer one is out for at least 6-12 months to make
> > > > > >>>>>> sure it is stable. I don't think we can support old versions indefinitely.
> > > > > >>>>>> It is in everybody's interest to keep moving forward.
> > > > > >>>>>>
> > > > > >>>>>> Olga
> > > > > >>>>>>
> > > > > >>>>>> ________________________________
> > > > > >>>>>> From: Prashant Kommireddi <prash1...@gmail.com>
> > > > > >>>>>> To: dev@pig.apache.org
> > > > > >>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
> > > > > >>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
> > > > > >>>>>>
> > > > > >>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility
> > > > > >>>>>> (PIG-3194)? I am interested in discussing the strategy around backward
> > > > > >>>>>> compatibility as this is something that would haunt us each time we move to
> > > > > >>>>>> the next Hadoop version. For example, we might be in a similar situation
> > > > > >>>>>> while moving to Hadoop 2.0, when some of the stuff might break for 1.0.
> > > > > >>>>>>
> > > > > >>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users
> > > > > >>>>>> might be caught unaware. Of course, I must admit there is selfish interest
> > > > > >>>>>> here and it's probably easier for us to have a workaround on Pig rather
> > > > > >>>>>> than upgrade Hadoop in all our production DCs.
> > > > > >>>>>>
> > > > > >>>>>> -Prashant
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> I think someone should step up and fix the easy ones, if possible.
> > > > > >>>>>>>
> > > > > >>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <billgra...@gmail.com> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Thanks Kai for reporting these.
> > > > > >>>>>>>>
> > > > > >>>>>>>> What do people think about the severity of these issues w.r.t. Pig 11? I
> > > > > >>>>>>>> see a few possible options:
> > > > > >>>>>>>>
> > > > > >>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc. We'd want to
> > > > > >>>>>>>> make sure that they don't destabilize the current branch. This approach
> > > > > >>>>>>>> makes sense if we think Pig 11 wouldn't be a good release without one or
> > > > > >>>>>>>> more of these included.
> > > > > >>>>>>>>
> > > > > >>>>>>>> 2. We continue with the Pig 11 release without these, but then include one
> > > > > >>>>>>>> or more in a 0.11.1 release.
> > > > > >>>>>>>>
> > > > > >>>>>>>> 3. We continue with the Pig 11 release without these, but then include them
> > > > > >>>>>>>> in a 0.12 release.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Jon has a patch for the MAP issue
> > > > > >>>>>>>> (PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>)
> > > > > >>>>>>>> ready, which seems like the most pressing of the three to me.
> > > > > >>>>>>>>
> > > > > >>>>>>>> thanks,
> > > > > >>>>>>>> Bill
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <kai.londenb...@googlemail.com> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> Hi,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> I just subscribed to the dev mailing list in order to give you some
> > > > > >>>>>>>>> feedback on pig 0.11 candidate 2.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> The following three issues are currently present in 0.11 candidate 2:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
> > > > > >>>>>>>>> alias resolution leading to "Duplicate schema alias" errors'
> > > > > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to
> > > > > >>>>>>>>> ObjectSerializer.java break compatibility with Hadoop 0.20.2
> > > > > >>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
> > > > > >>>>>>>>> PhysicalOperator leads to ExecException "Error while trying to get
> > > > > >>>>>>>>> next result in POStream"
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> The last two of these are easily solvable (see the tickets for
> > > > > >>>>>>>>> details on that). The first one is a bit trickier I think, but at
> > > > > >>>>>>>>> least there is a workaround for it (pass Map fields through a UDF).
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> In my personal opinion, each of these problems is pretty severe, but
> > > > > >>>>>>>>> opinions about the importance of the MAP Datatype and STREAM Operator,
> > > > > >>>>>>>>> as well as Hadoop 0.20.2 compatibility might differ.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> so far ..
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Kai Londenberg
> > > > > >>>>>>>>
> > > > > >>>>>>>> --
> > > > > >>>>>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at
> > > > > >>>>>>>> billgra...@gmail.com going forward.*
> > > > > >>>>>>>
> > > > > >>>>>>> --
> > > > > >>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
> > > > > >>>>>>> datasyndrome.com
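
Regarding the suggestion quoted above to in-line some of the Base64 code in ObjectSerializer: below is a minimal sketch of what a dependency-free Base64 helper might look like. The class name InlineBase64 and its encode/decode methods are hypothetical; they are not taken from the Pig source or from the PIG-3194 patch. The thread does not say which commons-codec calls differ between versions, so this only illustrates the general idea of not relying on whichever commons-codec jar happens to be on the Hadoop classpath.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only; not Pig's actual ObjectSerializer code.
final class InlineBase64 {
    // Standard Base64 alphabet (RFC 4648, with '+' and '/').
    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    // Reverse lookup table: ASCII code -> 6-bit value, or -1 if not in the alphabet.
    private static final int[] REVERSE = new int[128];
    static {
        Arrays.fill(REVERSE, -1);
        for (int i = 0; i < ALPHABET.length; i++) {
            REVERSE[ALPHABET[i]] = i;
        }
    }

    private InlineBase64() {}

    // Encodes bytes as a padded Base64 string, three input bytes at a time.
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder(((data.length + 2) / 3) * 4);
        for (int i = 0; i < data.length; i += 3) {
            int b0 = data[i] & 0xFF;
            int b1 = (i + 1 < data.length) ? data[i + 1] & 0xFF : 0;
            int b2 = (i + 2 < data.length) ? data[i + 2] & 0xFF : 0;
            int chunk = (b0 << 16) | (b1 << 8) | b2;
            out.append(ALPHABET[(chunk >> 18) & 0x3F]);
            out.append(ALPHABET[(chunk >> 12) & 0x3F]);
            // Pad with '=' when the final group has fewer than three bytes.
            out.append(i + 1 < data.length ? ALPHABET[(chunk >> 6) & 0x3F] : '=');
            out.append(i + 2 < data.length ? ALPHABET[chunk & 0x3F] : '=');
        }
        return out.toString();
    }

    // Decodes a Base64 string; stops at the first '=' padding character.
    static byte[] decode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0;
        int bits = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '=') {
                break;
            }
            if (c >= 128 || REVERSE[c] < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + c);
            }
            buffer = (buffer << 6) | REVERSE[c];
            bits += 6;
            if (bits >= 8) {
                bits -= 8;
                out.write((buffer >> bits) & 0xFF);
            }
        }
        return out.toByteArray();
    }
}
```

A serializer could call InlineBase64.encode on the bytes it produces and InlineBase64.decode when reading them back, so the result would be the same regardless of the commons-codec version a given Hadoop release ships. Whether this matches the fix eventually applied for PIG-3194 is not stated in this thread.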