Please can the "reason" be the reason for the failure, and NOT the reason the message was sent (e.g. "RECONCILIATION")?
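
To make the distinction concrete, here is a rough Python sketch of what I mean. It is not Mesos code: `UpdateSource`, `FailureReason`, `OperationStatus`, `should_retry` and all the enum values are made-up names for illustration only. The point is just that "why the operation failed" and "why this update was sent" live in separate fields, so a reconciliation-triggered update can still carry the real failure reason.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UpdateSource(Enum):
    """Why the update message was sent (hypothetical values)."""
    OPERATION_FINISHED = "OPERATION_FINISHED"
    RECONCILIATION = "RECONCILIATION"


class FailureReason(Enum):
    """Why the operation itself failed (hypothetical values)."""
    PLUGIN_ERROR_RETRYABLE = "PLUGIN_ERROR_RETRYABLE"
    PLUGIN_ERROR_PERMANENT = "PLUGIN_ERROR_PERMANENT"


@dataclass(frozen=True)
class OperationStatus:
    """Hypothetical status update for an offer operation."""
    operation_id: str
    succeeded: bool
    source: UpdateSource
    failure_reason: Optional[FailureReason] = None

    def __post_init__(self) -> None:
        # A failure reason is required exactly when the operation failed.
        if self.succeeded and self.failure_reason is not None:
            raise ValueError("successful operation must not carry a failure reason")
        if not self.succeeded and self.failure_reason is None:
            raise ValueError("failed operation must carry a failure reason")


def should_retry(status: OperationStatus) -> bool:
    """Decide the next step from the failure reason, ignoring the update source."""
    return (
        not status.succeeded
        and status.failure_reason is FailureReason.PLUGIN_ERROR_RETRYABLE
    )
```

With this shape, a scheduler that receives a failed CREATE_VOLUME status during reconciliation still sees the retryable/permanent distinction, instead of only seeing that the update came from reconciliation.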

On Wed, Aug 23, 2017 at 1:58 PM Yan Xu <xuj...@apple.com> wrote:

> Yeah a reason for failed operations is probably useful for all resource
> operations. It looks like the task-style status update is still the best
> approach.
>
> ---
> @xujyan <https://twitter.com/xujyan>
>
> On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <yujie....@gmail.com> wrote:
>
>> We should continue the discussion here:
>>
>> I think I forgot to mention one important reason that I went for the
>> operation based reconciliation API proposal. For new operations like
>> CREATE_VOLUME/CREATE_BLOCK, not only we need to know the end result (the
>> resources) if it's successful, we also need to know the failure reason if
>> it fails. For instance, imagine you're creating an EBS volume by talking to
>> a CSI EBS plugin. Surfacing the creation error (e.g., retryable or not from
>> the CSI plugin) will be useful for scheduler to determine the next step.
>>
>> I don't think a resources based reconciliation API can address this.
>> Maybe we can add both if we feel both are useful?
>>
>> Thoughts?
>> - Jie
>>
>> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <yujie....@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We had a discussion on some very early proposal (see the attached
>>> slides) on providing feedback for offer operations (e.g., CREATE/DESTROY,
>>> RESERVE/UNRESERVE, etc.) with a bunch of folks from the community. Here are
>>> the notes I captured in the meeting:
>>>
>>>    - One alternative approach discussed was to have best effort
>>>    feedback, and a resources based reconciliation API allowing framework to
>>>    query the resources on a given resource provider or agent. That way, we
>>>    don't necessarily need the status update mechanism for offer operations,
>>>    which causes complexity in the frameworks.
>>>    - In the current proposal, do we need agent_id (or resource provider
>>>    id) when performing reconciliation for that operation? The reason we
>>>    require that in the task reconciliation case is because agent might not
>>>    re-register yet during master failover.
>>>    - We need to mock up the operator API for this work.
>>>    - What's the order guarantee for the operations specified in one API
>>>    call?
>>>    - Wish list
>>>       - Reservation tie to framework instead of role.
>>>       - When a framework teardown, auto release resources reserved for
>>>       that framework
>>>
>>> If I miss anything, please reply to this thread! Thanks!
>>>
>>> https://docs.google.com/presentation/d/1Mef8K3aLIuzcFVc3MnAo64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>>>
>>> - Jie