James DeFelice created MESOS-9293:
-------------------------------------

             Summary: OperationStatus messages sent to framework should include 
both agent ID and resource provider ID
                 Key: MESOS-9293
                 URL: https://issues.apache.org/jira/browse/MESOS-9293
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.7.0
            Reporter: James DeFelice


Normally, frameworks are expected to checkpoint the agent ID and resource 
provider ID before accepting an offer with an OfferOperation. From this 
expectation comes the requirement in the v1 scheduler API that a framework must 
provide the agent ID and resource provider ID when acknowledging an offer 
operation status update. However, this expectation breaks down in at least two 
cases:

1. The framework might lose its checkpointed data, in which case it no longer 
remembers the agent ID or the resource provider ID.

2. Even if the framework checkpoints data, it could be sent a stale update: 
maybe the original ACK it sent to Mesos was lost, and it needs to re-ACK. If 
the framework deleted its checkpointed data after sending the ACK that was then 
dropped, then upon replay of the status update it no longer has the agent ID or 
resource provider ID for the operation.
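
The second case can be sketched as follows. This is a simplified model and not 
Mesos code: CheckpointStore, handle_update, and all field names are 
hypothetical, and the status update is assumed to carry only an operation ID 
and a status UUID.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class OperationStatus:
    # Hypothetical, simplified status update. It carries no agent ID and
    # no resource provider ID, which is the gap this issue describes.
    operation_id: str
    uuid: str


class CheckpointStore:
    """Hypothetical framework-side store: operation ID -> (agent ID, provider ID)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, str]] = {}

    def save(self, operation_id: str, agent_id: str, provider_id: str) -> None:
        self._data[operation_id] = (agent_id, provider_id)

    def lookup(self, operation_id: str) -> Optional[Tuple[str, str]]:
        return self._data.get(operation_id)

    def delete(self, operation_id: str) -> None:
        self._data.pop(operation_id, None)


def handle_update(store: CheckpointStore, status: OperationStatus) -> Optional[dict]:
    """Build an ACK for the update, or return None if the IDs are unknown."""
    ids = store.lookup(status.operation_id)
    if ids is None:
        return None  # Cannot build the ACK: the required IDs are gone.
    agent_id, provider_id = ids
    ack = {
        "agent_id": agent_id,
        "resource_provider_id": provider_id,
        "operation_id": status.operation_id,
        "uuid": status.uuid,
    }
    # The framework cleans up its checkpoint once the ACK has been sent.
    store.delete(status.operation_id)
    return ack


store = CheckpointStore()
store.save("op-1", "agent-1", "rp-1")
status = OperationStatus(operation_id="op-1", uuid="uuid-1")

first_ack = handle_update(store, status)   # This ACK is assumed lost in transit.
second_ack = handle_update(store, status)  # Replay of the same update.
```

In this model the first call produces a complete ACK, while the replayed update 
yields None because the checkpoint was already deleted.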

An easy remedy would be to add the agent ID and resource provider ID to the 
OperationStatus message received by the scheduler so that a framework can build 
a proper ACK for the update, even if it doesn't have access to its previously 
checkpointed information.
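
A minimal sketch of the proposed remedy, under stated assumptions: the classes 
below are illustrative stand-ins rather than the Mesos protobuf definitions, 
and the names agent_id, resource_provider_id, and build_ack are hypothetical. 
The point is only that, once the update itself carries both IDs, the ACK can be 
built without any checkpointed state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationStatus:
    # Illustrative stand-in for the status update received by the scheduler.
    operation_id: str
    uuid: str
    # Proposed additions (field names are assumptions, not the real schema).
    agent_id: str
    resource_provider_id: str


@dataclass(frozen=True)
class AcknowledgeOperationStatus:
    # Illustrative stand-in for the ACK the framework sends back.
    agent_id: str
    resource_provider_id: str
    operation_id: str
    uuid: str


def build_ack(status: OperationStatus) -> AcknowledgeOperationStatus:
    """Build an ACK purely from the contents of the status update."""
    return AcknowledgeOperationStatus(
        agent_id=status.agent_id,
        resource_provider_id=status.resource_provider_id,
        operation_id=status.operation_id,
        uuid=status.uuid,
    )


status = OperationStatus(
    operation_id="op-1",
    uuid="uuid-1",
    agent_id="agent-1",
    resource_provider_id="rp-1",
)
ack = build_ack(status)
```

Because build_ack reads nothing but the update, a replayed update can be 
re-ACKed the same way every time, regardless of what the framework has 
checkpointed or deleted.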

I'm filing this as a bug because there's no way to reliably use the offer 
operation status API until this is fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
