> On Dec. 14, 2018, 9:39 p.m., Chun-Hung Hsiao wrote: > > src/common/protobuf_utils.cpp > > Lines 470 (patched) > > <https://reviews.apache.org/r/69569/diff/1/?file=2114077#file2114077line470> > > > > Since we set the agent ID and RP ID here in SLRP, how about doing the > > following changes: > > > > 1) Change > > https://github.com/apache/mesos/blob/b6fc9706c03c93239128f4a27051997c9647d8ff/src/slave/slave.cpp#L7859-L7869 > > to: > > ``` > > // Inject the agent ID into the status update if the resource > > // provider does not provide it (because the resource provider > > // may not know it). > > ... > > if (!update.has_slave_id()) { > > update.mutable_slave_id()->CopyFrom(...); > > } > > if (update.has_latest_status() && > > !update.latest_status().has_slave_id()) { > > update.mutable_latest_status()->mutable_slave_id()->CopyFrom(...); > > } > > ``` > > > > 2) Change > > https://github.com/apache/mesos/blob/b6fc9706c03c93239128f4a27051997c9647d8ff/src/resource_provider/manager.cpp#L886-L893 > > to: > > ``` > > CHECK_EQ(update.status().resource_provider_id(), > > resourceProvider->info().id()); > > > > ... > > if (update.has_latest_status()) { > > CHECK_EQ(update.latest_status().resource_provider_id(), > > resourceProvider->info().id()); > > > > body.update.mutable_latest_status()->CopyFrom(...); > > } > > ``` > > Then a followup work would be introduce another event to surface these > > errors and shutdown the RP instead of crash the RP manager. Could you > > create a ticket for this? > > > > > > Or alternatively, make the agent ID injection code into CHECKs since > > `LocalResourceProviderDaemon` provides the agent ID when creating LRPs. You > > might argue that it leaks informations to RP, but my counter argument would > > be that an LRP needs to learn about the agent lifecycle so it can know when > > to subscribe with no ID and when to resubscribe with a new ID, so it either > > needs to learn about the current agent ID, or we need provide feedback > > through the HTTP response of the `SUBSCRIBE` call. > > > > We could pick either approach for now. > > Benjamin Bannier wrote: > Good suggestions. > > 1) Not sure what making the modifications conditional would gain us. It > would seem to increase the complexity and number of possible combinations > while just setting unconditionally shouldn't introduce additional cases. > Suggest to always set. > > 2) Since it isn't directly related, I added these changes in > https://reviews.apache.org/r/69572/. I filed > https://issues.apache.org/jira/browse/MESOS-9482 for making the RP manage > code safer.
Then how about checking the `slave_id` and print a warning if it does not match the current agent ID? - Chun-Hung ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69569/#review211342 ----------------------------------------------------------- On Dec. 17, 2018, 9:57 a.m., Benjamin Bannier wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69569/ > ----------------------------------------------------------- > > (Updated Dec. 17, 2018, 9:57 a.m.) > > > Review request for mesos, Chun-Hung Hsiao and Greg Mann. > > > Bugs: MESOS-9479 > https://issues.apache.org/jira/browse/MESOS-9479 > > > Repository: mesos > > > Description > ------- > > The helper function to set up the message was set up to receive > arguments, but we missed implementing the code making use of the values. > This went undiscovered as previously only used a mock resource provider. > > > Diffs > ----- > > src/common/protobuf_utils.cpp 583f0ab0dd874602bbd20731f1f32116a425ebc4 > src/tests/master_tests.cpp 50b905b5ab508378eddf070eabfe744b68fd4cea > src/tests/storage_local_resource_provider_tests.cpp > 7887bd5ec300be3cc093886525d71b3e09bbf085 > > > Diff: https://reviews.apache.org/r/69569/diff/2/ > > > Testing > ------- > > * `make check` > * ran added test in repeatition, both with and without additional stress on > the system > > > Thanks, > > Benjamin Bannier > >
