Dmitriy, are the gc fixes all in for 0.11.1? PIG-3148 and PIG-3212 are the 2 JIRAs I know were fixed. Any others?
I have a patch up for PIG-3194; I think we should be good for a release once that makes it in.

-Prashant

On Sat, Mar 2, 2013 at 11:16 AM, Prashant Kommireddi <[email protected]> wrote:

> Great.
>
> I have commented regarding a possible approach for PIG-3194 http://goo.gl/UQ3zs. Please take a look when you folks have a chance.
>
> On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> I'd like to get the gc fix in as well, but looks like Rohini is about to commit it so we are good there.
>>
>> On Mar 1, 2013, at 11:33 AM, Bill Graham <[email protected]> wrote:
>>
>>> +1 to releasing Pig 0.11.1 when this is addressed. I should be able to help with the release again.
>>>
>>> On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi <[email protected]> wrote:
>>>
>>>> Hey Guys,
>>>>
>>>> I wanted to start a conversation on this again. If Kai is not looking at PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If everyone agrees, we should roll out 0.11.1 sooner than usual and I volunteer to help with it in any way possible.
>>>>
>>>> Any objections to getting 0.11.1 out soon after 3194 is fixed?
>>>>
>>>> -Prashant
>>>>
>>>> On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney <[email protected]> wrote:
>>>>
>>>>> I stand corrected. Cool, 0.11 is good!
>>>>>
>>>>> On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho <[email protected]> wrote:
>>>>>
>>>>>> Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20.
>>>>>>
>>>>>> Jarcec
>>>>>>
>>>>>> On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
>>>>>>
>>>>>>> I agree -- this is a good release.
>>>>>>> The bugs Kai pointed out should be fixed, but as they are not critical regressions, we can fix them in 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are committed, I won't mind and will dutifully vote for the release).
>>>>>>>
>>>>>>> I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?)
>>>>>>>
>>>>>>> FWIW Twitter's running CDH3 and this release works in our environment.
>>>>>>>
>>>>>>> At this point things that block a release are critical regressions in performance or correctness.
>>>>>>>
>>>>>>> D
>>>>>>>
>>>>>>> On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates <[email protected]> wrote:
>>>>>>>
>>>>>>>> No. Bugs like these are supposed to be found and fixed after we branch from trunk (which happened several months ago in the case of 0.11). The point of RCs is to check that it's a good build, licenses are right, etc. Any bugs found this late in the game have to be seen as failures of earlier testing.
>>>>>>>>
>>>>>>>> Alan.
>>>>>>>>
>>>>>>>> On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:
>>>>>>>>
>>>>>>>>> Isn't the point of an RC to find and fix bugs like these?
>>>>>>>>>
>>>>>>>>> On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Regarding Pig 11 rc2, I propose we continue with the current vote as is (which closes today EOD). Patches for 0.20.2 issues can be rolled into a Pig 0.11.1 release whenever they're available and tested.
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I agree that supporting as much as we can is a good goal. The issue is who is going to be testing against all these versions? We found the issues under discussion because of a customer report, not because we consistently test against all versions. Perhaps when we decide which versions to support for next release we need also to agree who is going to be testing and maintaining compatibility with a particular version.
>>>>>>>>>>>
>>>>>>>>>>> For instance since Hadoop 23 compatibility is important for us at Yahoo we have been maintaining compatibility with this version for 0.9, 0.10 and will do the same for 0.11 and going forward. I think we would need others to step in and claim the versions of their interest.
>>>>>>>>>>>
>>>>>>>>>>> Olga
>>>>>>>>>>>
>>>>>>>>>>> ________________________________
>>>>>>>>>>> From: Kai Londenberg <[email protected]>
>>>>>>>>>>> To: [email protected]
>>>>>>>>>>> Sent: Wednesday, February 20, 2013 1:51 AM
>>>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I strongly agree with Jonathan here. If there are good reasons why you can't support an older version of Hadoop any more, that's one thing. But having to change 2 lines of code doesn't really qualify as such in my point of view ;)
>>>>>>>>>>>
>>>>>>>>>>> At least for me, pig support for 0.20.2 is essential - without it, I can't use it.
>>>>>>>>>>> If it doesn't support it, I'll have to branch pig and hack it myself, or stop using it.
>>>>>>>>>>>
>>>>>>>>>>> I guess, there are a lot of people still running 0.20.2 Clusters. If you really have lots of data stored on HDFS and a continuously busy cluster, an upgrade is nothing you do "just because".
>>>>>>>>>>>
>>>>>>>>>>> 2013/2/20 Jonathan Coveney <[email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> I agree that we shouldn't have to support old versions forever. That said, I also don't think we should be too blase about supporting older versions where it is not odious to do so. We have a lot of competition in the language space and the broader the versions we can support, the better (assuming it isn't too odious to do so). In this case, I don't think it should be too hard to change ObjectSerializer so that the commons-codec code used is compatible with both versions...we could just in-line some of the Base64 code, and comment accordingly.
>>>>>>>>>>>>
>>>>>>>>>>>> That said, we also should be clear about what versions we support, but 6-12 months seems short. The upgrade cycles on Hadoop are really, really long.
>>>>>>>>>>>>
>>>>>>>>>>>> 2013/2/20 Prashant Kommireddi <[email protected]>
>>>>>>>>>>>>
>>>>>>>>>>>>> Agreed, that makes sense. Probably supporting older hadoop version for 1 or 2 pig releases before moving to a newer/stable version?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Having said that, should we use 0.11 period to communicate the same to the community and start moving on 0.12 onwards?
>>>>>>>>>>>>> I know we are way past 6-12 months (1-2 releases) time frame with 0.20.2, but we also need to make sure users are aware and plan accordingly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd also be interested to hear how other projects (Hive, Oozie) are handling this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Prashant
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems that for each Pig release we need to agree and clearly state which Hadoop versions it will support. I guess the main question is how we decide on this. Perhaps we should say that Pig no longer supports older Hadoop versions once the newer one is out for at least 6-12 months to make sure it is stable. I don't think we can support old versions indefinitely. It is in everybody's interest to keep moving forward.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Olga
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ________________________________
>>>>>>>>>>>>>> From: Prashant Kommireddi <[email protected]>
>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>> Sent: Tuesday, February 19, 2013 10:57 AM
>>>>>>>>>>>>>> Subject: Re: pig 0.11 candidate 2 feedback: Several problems
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do you guys feel about the JIRA to do with 0.20.2 compatibility (PIG-3194)? I am interested in discussing the strategy around backward compatibility as this is something that would haunt us each time we move to the next hadoop version.
>>>>>>>>>>>>>> For example, we might be in a similar situation while moving to Hadoop 2.0, when some of the stuff might break for 1.0.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users might be caught unaware. Of course, I must admit there is selfish interest here and it's probably easier for us to have a workaround on Pig rather than upgrade hadoop in all our production DCs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Prashant
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think someone should step up and fix the easy ones, if possible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks Kai for reporting these.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do people think about the severity of these issues w.r.t. Pig 11? I see a few possible options:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. We include some or all of these patches in a new Pig 11 rc. We'd want to make sure that they don't destabilize the current branch. This approach makes sense if we think Pig 11 wouldn't be a good release without one or more of these included.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. We continue with the Pig 11 release without these, but then include one or more in a 0.11.1 release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. We continue with the Pig 11 release without these, but then include them in a 0.12 release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jon has a patch for the MAP issue (PIG-3144 <https://issues.apache.org/jira/browse/PIG-3144>) ready, which seems like the most pressing of the three to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>> Bill
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I just subscribed to the dev mailing list in order to give you some feedback on pig 0.11 candidate 2.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The following three issues are currently present in 0.11 candidate 2:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry alias resolution leading to "Duplicate schema alias" errors'
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3194 - Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in PhysicalOperator leads to ExecException "Error while trying to get next result in POStream"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The last two of these are easily solvable (see the tickets for details on that).
>>>>>>>>>>>>>>>>> The first one is a bit trickier I think, but at least there is a workaround for it (pass Map fields through a UDF).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In my personal opinion, each of these problems is pretty severe, but opinions about the importance of the MAP Datatype and STREAM Operator, as well as Hadoop 0.20.2 compatibility might differ.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> so far ..
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kai Londenberg
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at [email protected] going forward.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
