* For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be incompatible with this strategy? It should be OK to remove/add optional fields (removing the field with id 12, and adding the field with id 13) - Ahhhh I misunderstood. I was thinking that the field was overwritten in branch-2.8 as well. Yea I think that approach will be fine.
On Wed, Sep 25, 2019 at 2:31 AM Robert Kanter <rkan...@apache.org> wrote: > > > > * For YARN-6050, there's a bit here: > > https://developers.google.com/protocol-buffers/docs/proto that says > > "optional is compatible with repeated", so I think we should be OK there. > > - Optional is compatible with repeatable over the wire such that > > protobuf won't blow up, but does that actually mean that it's compatible > in > > this case? If it's expecting an optional and gets a repeated, it's going > to > > drop everything except for the last value. I don't know enough about > > YARN-6050 to say if this will be ok or not. > > > It's been a while since I looked into this, but I think it should be okay. > If an older client (using optional) sends the message to a newer server > (using repeated), then there will never be more than one value for the > field. The server puts these into a list, so the list would simply have a > single value in it. The server's logic should be able to handle a single > valued list here because (a) IIRC we wanted to make sure compatibility > wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so > this was important) and (b) sending a single resource request, even in a > newer client, is a still a valid thing to do. > If a newer client (using repeated) sends the message to an older server > (using optional), I'm not sure what will happen. My guess is that it will > drop the extra values (though I wonder if it will keep the first or last > value...). In any case, I believe most clients will only send the one > value - in order for a client to send multiple values, you'd have to > specify some additional MR configs (see MAPREDUCE-6871). IIRC, there's > also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now. > > - Robert > > On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jyhung2...@gmail.com> > wrote: > > > - I've created YARN-9855 and uploaded patches to fix YARN-6616 in > > branch-2.8 and branch-2.7. > > - For YARN-6050, not sure either. Robert/Wangda, can you comment on > > YARN-6050 compatibility? > > - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be > > incompatible with this strategy? It should be OK to remove/add optional > > fields (removing the field with id 12, and adding the field with id 13). > > The difficulties I see here are, we would have to leave id 12 blank in > > 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5 > > would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade > > would still work IIUC. > > > > Jonathan Hung > > > > > > On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <ebad...@verizonmedia.com> > > wrote: > > > >> * For YARN-6616, for branch-2.8 and below, it was only committed to > >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we > can > >> revert YARN-6616 from branch-2.7 and branch-2.8. > >> - This seems reasonable. Since we haven't released anything, it should > >> be no issue to change the 2.7/2.8 protobuf field to have the same value > as > >> 2.9+ > >> > >> * For YARN-6050, there's a bit here: > >> https://developers.google.com/protocol-buffers/docs/proto that says > >> "optional is compatible with repeated", so I think we should be OK > there. > >> - Optional is compatible with repeatable over the wire such that > >> protobuf won't blow up, but does that actually mean that it's > compatible in > >> this case? If it's expecting an optional and gets a repeated, it's > going to > >> drop everything except for the last value. I don't know enough about > >> YARN-6050 to say if this will be ok or not. > >> > >> * For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or > >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move > the > >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, > then > >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to > >> release this), then upgrade from 2.8.6 to 2.9+. > >> - I'm ok with this, but it should be noted that the upgrade from > >> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling > >> upgrade. So this would cause some pain to anybody with clusters on those > >> versions. > >> > >> Eric > >> > >> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jyhung2...@gmail.com> > >> wrote: > >> > >>> Sorry, let me edit my first point. We can just create addendums for > >>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to > the > >>> correct id 28. We don’t need to revert YARN-6616 from these branches > >>> completely. > >>> > >>> Jonathan > >>> > >>> ________________________________ > >>> From: Jonathan Hung <jyhung2...@gmail.com> > >>> Sent: Tuesday, September 24, 2019 11:38 AM > >>> To: Eric Badger > >>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev > >>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9 > >>> > >>> Hi Eric, thanks for the investigation. > >>> > >>> * For YARN-6616, for branch-2.8 and below, it was only committed to > >>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we > can > >>> revert YARN-6616 from branch-2.7 and branch-2.8. > >>> * For YARN-6050, there's a bit here: > >>> https://developers.google.com/protocol-buffers/docs/proto that says > >>> "optional is compatible with repeated", so I think we should be OK > there. > >>> * For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or > >>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move > the > >>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, > then > >>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to > >>> release this), then upgrade from 2.8.6 to 2.9+. > >>> > >>> Jonathan Hung > >>> > >>> > >>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <ebad...@verizonmedia.com > .invalid> > >>> wrote: > >>> We (Verizon Media) are currently moving towards upgrading our clusters > >>> from > >>> our internal fork of branch-2.8 to an internal fork of branch-2. During > >>> this process, we have found multiple incompatible changes in protobufs > >>> between branch-2.8 and branch-2. These incompatibilities were all > >>> introduced between branch-2.8 and branch-2.9. I did a git diff over all > >>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances > >>> of > >>> incompatibilities from 3 separate commits. All of the incompatibilities > >>> are > >>> in yarn_protos.proto > >>> > >>> > >>> I would like to discuss how to fix these incompatible changes. > Otherwise, > >>> rolling upgrades will not be supported between branch-2.8 (or below) > and > >>> branch-2.9 (or beyond). We could revert the incompatible changes, but > >>> then > >>> the new releases would be incompatible with the releases that have > these > >>> incompatible changes. If we do nothing, then rolling upgrades won't > work > >>> between 2.8- and 2.9+. > >>> > >>> > >>> Thanks, > >>> > >>> > >>> Eric > >>> > >>> > >>> ------------------------------------------------------------------- > >>> > >>> > >>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto') > >>> > >>> > >>> https://issues.apache.org/jira/browse/YARN-6616 > >>> > >>> - Trunk patch (applied through branch-2.9) differs from branch-2.8 > >>> patch > >>> > >>> @@ -211,7 +245,20 @@ message ApplicationReportProto { > >>> > >>> optional PriorityProto priority = 23; > >>> > >>> optional string appNodeLabelExpression = 24; > >>> > >>> optional string amNodeLabelExpression = 25; > >>> > >>> - optional int64 submitTime = 26; > >>> > >>> + repeated AppTimeoutsMapProto appTimeouts = 26; > >>> > >>> + optional int64 launchTime = 27; > >>> > >>> + optional int64 submitTime = 28; > >>> > >>> > >>> https://issues.apache.org/jira/browse/YARN-6050 > >>> > >>> - Trunk and branch-2 patches both change the protobuf type in the > same > >>> way. > >>> > >>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto { > >>> > >>> optional LogAggregationContextProto log_aggregation_context = 14; > >>> > >>> optional ReservationIdProto reservation_id = 15; > >>> > >>> optional string node_label_expression = 16; > >>> > >>> - optional ResourceRequestProto am_container_resource_request = 17; > >>> > >>> + repeated ResourceRequestProto am_container_resource_request = 17; > >>> > >>> + repeated ApplicationTimeoutMapProto application_timeouts = 18; > >>> > >>> > >>> https://issues.apache.org/jira/browse/YARN-7813 > >>> > >>> - Trunk (applied through branch-3.1) and branch-3.0 (applied through > >>> branch-2.9) patches differ from branch-2.8 patch > >>> > >>> @@ -425,7 +501,21 @@ message QueueInfoProto { > >>> > >>> optional string defaultNodeLabelExpression = 9; > >>> > >>> optional QueueStatisticsProto queueStatistics = 10; > >>> > >>> optional bool preemptionDisabled = 11; > >>> > >>> - optional bool intraQueuePreemptionDisabled = 12; > >>> > >>> + repeated QueueConfigurationsMapProto queueConfigurationsMap = 12; > >>> > >>> + optional bool intraQueuePreemptionDisabled = 13; > >>> > >> >