Joep Rottinghuis created YARN-6357:
--------------------------------------

             Summary: Implement TimelineCollector#putEntitiesAsync
                 Key: YARN-6357
                 URL: https://issues.apache.org/jira/browse/YARN-6357
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: ATSv2, timelineserver
    Affects Versions: YARN-2928
            Reporter: Joep Rottinghuis
            Assignee: Haibo Chen


As discovered and discussed in YARN-5269, the TimelineCollector#putEntitiesAsync 
method is currently not implemented, and TimelineCollector#putEntities is 
asynchronous.

TimelineV2ClientImpl#putEntities and TimelineV2ClientImpl#putEntitiesAsync 
each call TimelineEntityDispatcher#dispatchEntities(boolean sync,... with 
the correct argument. This argument does seem to make it into the params, and 
on the server side TimelineCollectorWebService#putEntities correctly pulls the 
async parameter from the REST call. See line 156:
{code}
    boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
{code}
However, this is where the problem starts. The web service simply calls 
TimelineCollector#putEntities and ignores the value of isAsync. When isAsync 
is true it should instead call TimelineCollector#putEntitiesAsync, which is 
currently not implemented.
putEntities should call putEntitiesAsync and then call writer.flush().
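A minimal sketch of the proposed split, under stated assumptions: SketchWriter, InMemoryWriter, SketchCollector and SketchWebService are hypothetical stand-ins invented for illustration, not the real TimelineWriter, TimelineCollector or TimelineCollectorWebService classes, and entities are reduced to plain strings. It only illustrates the idea that the async path writes without flushing, the sync path is the async path plus a flush, and the web service picks between them based on isAsync.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the timeline writer; not the real writer interface. */
interface SketchWriter {
  void write(String entity); // may only buffer the entity
  void flush();              // returns once buffered entities are persisted
}

/** In-memory writer used only to make the sketch runnable. */
class InMemoryWriter implements SketchWriter {
  final List<String> buffered = new ArrayList<>();
  final List<String> persisted = new ArrayList<>();

  @Override
  public synchronized void write(String entity) {
    buffered.add(entity);
  }

  @Override
  public synchronized void flush() {
    persisted.addAll(buffered);
    buffered.clear();
  }
}

/** Hypothetical collector showing the proposed relationship between the two methods. */
class SketchCollector {
  private final SketchWriter writer;

  SketchCollector(SketchWriter writer) {
    this.writer = writer;
  }

  /** Async path: hand the entities to the writer and return without flushing. */
  void putEntitiesAsync(List<String> entities) {
    for (String entity : entities) {
      writer.write(entity);
    }
  }

  /** Sync path: the async write followed by a flush. */
  void putEntities(List<String> entities) {
    putEntitiesAsync(entities);
    writer.flush();
  }
}

/** Hypothetical web-service layer that honors the async parameter. */
class SketchWebService {
  private final SketchCollector collector;

  SketchWebService(SketchCollector collector) {
    this.collector = collector;
  }

  void putEntities(String async, List<String> entities) {
    // Same parsing as the line quoted from TimelineCollectorWebService.
    boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
    if (isAsync) {
      collector.putEntitiesAsync(entities);
    } else {
      collector.putEntities(entities);
    }
  }
}
```

In this sketch a call with async=true leaves the entity in the writer's buffer, while a call with no async parameter flushes everything buffered so far, including entities from earlier async calls.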
The flush on close and the periodic flush are primarily about avoiding data 
loss: the flush on close covers the case where sync is never called, and the 
periodic flush guards against data from slow writers sitting in buffers for a 
long time, which exposes us to loss if the collector crashes with data in its 
buffers. The size-based flush is a different concern: it keeps the memory 
footprint from blowing up.
The spooling behavior is also somewhat separate.
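To make the three flush triggers concrete, here is a rough sketch, assuming a hypothetical BufferingWriterSketch class with made-up thresholds. It is not the actual writer code. Time is passed in explicitly so the periodic check is deterministic, and in a real collector a timer thread would presumably drive it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffering writer; names and thresholds are illustrative only. */
class BufferingWriterSketch implements AutoCloseable {
  private final int maxBuffered;    // size-based trigger: bounds the memory footprint
  private final long maxAgeMillis;  // periodic trigger: bounds how long data sits unflushed
  private final List<String> buffer = new ArrayList<>();
  final List<String> persisted = new ArrayList<>();
  private long oldestBufferedAt = -1;

  BufferingWriterSketch(int maxBuffered, long maxAgeMillis) {
    this.maxBuffered = maxBuffered;
    this.maxAgeMillis = maxAgeMillis;
  }

  /** Buffers one entity and flushes if the buffer has reached its size limit. */
  synchronized void write(String entity, long nowMillis) {
    if (buffer.isEmpty()) {
      oldestBufferedAt = nowMillis;
    }
    buffer.add(entity);
    if (buffer.size() >= maxBuffered) {
      flush();
    }
  }

  /** Periodic trigger: flushes if the oldest buffered entity is too old. */
  synchronized void periodicCheck(long nowMillis) {
    if (!buffer.isEmpty() && nowMillis - oldestBufferedAt >= maxAgeMillis) {
      flush();
    }
  }

  /** Explicit flush: what a synchronous put would call. */
  synchronized void flush() {
    persisted.addAll(buffer);
    buffer.clear();
    oldestBufferedAt = -1;
  }

  /** Flush on close: covers the case where nobody ever called flush. */
  @Override
  public synchronized void close() {
    flush();
  }
}
```

None of these triggers tells a caller that its own entities were persisted, which is why the sketch keeps the explicit flush() as a separate entry point.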
We have two separate methods on our API, putEntities and putEntitiesAsync, and 
they should have different behavior beyond waiting for the request to be sent. 
I can file a separate bug from this one, dealing with exception handling, to 
tackle the sync vs async nature. During the meeting today I was thinking about 
the HBase writer, which has a flush that definitely blocks until data is 
flushed to HBase (ignoring the spooling for the moment).



