[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939031#comment-15939031 ] Hadoop QA commented on YARN-6357: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 13s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6357 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12860203/YARN-6357.03.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a47bbf8a88ad 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2e30aa7 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15366/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15366/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch,
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938910#comment-15938910 ] Haibo Chen commented on YARN-6357: -- bq. We can merely implement the async API here. Sure. Will update the patch based on your previous comments. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937850#comment-15937850 ] Varun Saxena commented on YARN-6357: bq. I propose, we replace TimelineResponse with void as the return type, effectively making TimelineWriter API very much similar to that of BufferedMutator. I think this would largely depend on the conclusion for the discussion taking place on YARN-5269. The proposal above is fine if we decide to propagate only an exception from the collector and not send errors per entity. So lets do this in YARN-5269. We can merely implement the async API here. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937110#comment-15937110 ] Haibo Chen commented on YARN-6357: -- Thanks for the pointer [~varun_impala_149e]. Reading through the discussion there, it now makes sense to me why a TimelineWriter.writesync() is never added (https://issues.apache.org/jira/browse/YARN-3949?focusedCommentId=14640959=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14640959). However, I think there is still some confusion in the current TimelienWriter API. Given the way TimelineCollector uses TimelineWriter, i.e. call flush() right after write() for synchronous putEntities requests, TimelineCollector expects TimelineWriter.write() to be asynchronous. In the meantime, it also expects a TimelineResponse from this asynchronous method, which I think is awkward and can cause confusion for alternative TimelineWriter implementations. I propose, we replace TimelineResponse with void as the return type, effectively making TimelineWriter API very much similar to that of BufferedMutator. This way, for truly asynchronous implementations of TimelineWriter.write(), such as HBaseTimelineWriter, no bogus TimelineResponse is created any more. For synchronous implementations, response can be given back in terms of IOException, that is, if the underlying synchronous store succeeds, no exception is thrown, otherwise, the failure is wrapped in a IOException and given back to TimelineCollector. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936804#comment-15936804 ] Varun Saxena commented on YARN-6357: Seems exact same thing has been discussed in YARN-3949 before. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936770#comment-15936770 ] Varun Saxena commented on YARN-6357: The way current code is structured, the responsibility of the writer is merely to buffer data and write it out. The control of when to flush is still with the collector i.e. the timer to flush. So the reasoning from my side is that the control of flush from sync API can also be with the collector. However, this is not something which I feel strongly about so let us do whatever most of the people think. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936763#comment-15936763 ] Varun Saxena commented on YARN-6357: bq. Also, I am wondering if there is any way for the TimelineWriteResponse to indicate that a sync call was invoked or if an aync call was invoked. If I am not wrong, async and sync call would lead to 2 separate puts so client should be knowing already which one is what so is it necessary to indicate so in response? In TimelineV2ClientImpl we return directly for an async call and for sync, we wait. Some place you think this will be useful? bq. In the write APIs, I think it will be good to have an explicitly named async API and a sync API (rather than persistXX). I think the API doc should explain that the sync one means syncing to call HBase flush and that aysnc means it will be translated to hbase#flush after a time interval or at the end of the application life. [~vrushalic], are you referring to TimelineWriter interface when you talk about persistXXX APIs'? Actually the comments above are for code in TimelineCollector. Anyways coming to APIs' in TimelineWriter which Haibo referred to in his comment above, TimelineWriter is an interface so we cannot really talk about HBase flush functionality in the javadoc here. Probably can be mentioned as an example. Also, we can split the APIs' or even pass a flag to write API and explain in javadoc that you should call flush mandatorily after write for sync calls and that should also flush buffered async calls if there is any buffering. But if this is what we want writer (which is a pluggable interface) to do mandatorily and essentially writer implementation would also need to call write followed by flush for sync, can't we simply ensure the sequence from TimelineCollector itself? Is there some scenario where you think an explicit separate API will be useful? Thoughts? > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936731#comment-15936731 ] Vrushali C commented on YARN-6357: -- Also, I am wondering if there is any way for the TimelineWriteResponse to indicate that a sync call was invoked or if an aync call was invoked. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936727#comment-15936727 ] Vrushali C commented on YARN-6357: -- In the write APIs, I think it will be good to have an explicitly named async API and a sync API (rather than persistXX). I think the API doc should explain that the sync one means syncing to call HBase flush and that aysnc means it will be translated to hbase#flush after a time interval or at the end of the application life. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935524#comment-15935524 ] Varun Saxena commented on YARN-6357: By the way, coming to the patch, although almost all the writer implementations will have asynchronous writes and some buffering but there is no guarantee (unlikely but writer implementations can choose to persist data to backend immediately and choose to implement flush as a no-op), so lets name the method writeTimelineEntities instead of writeTimelineEntitiesAsync. Also persistOutstandingTimelineEntities can be renamed to persistBufferedTimelineEntities. Or simply name the method flushEntities. Other than this patch looks fine to me. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933594#comment-15933594 ] Hadoop QA commented on YARN-6357: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6357 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859632/YARN-6357.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 23176df02cfb 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6c399a8 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15338/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15338/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch,
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933545#comment-15933545 ] Haibo Chen commented on YARN-6357: -- One thing I noticed, is that TimelineWriter.write() is effectively an async method, i.e. writeAsync() for the moment. If we need to ensure entities are written to backend, we need to call TimelineWriter.flush(() after TimelineWriter.write(). In addition, both TimelineWriter implementations return new TimelineWriteResponse() in all situations, so response carries no value. I wonder if the TimelineWriter.write() should be renamed to writeAsync() and a new writeSync() should be added to make it cleaner. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933517#comment-15933517 ] Haibo Chen commented on YARN-6357: -- Thanks [~varun_saxena] for the review! I updated the patch accordingly with your comments. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933141#comment-15933141 ] Varun Saxena commented on YARN-6357: Thanks [~haibochen] for the patch. # TODO comment in TimelineCollector#putEntitiesAsync should be removed. # Not related but in TimelineCollector#putEntities we can probably remove the log {{LOG.debug("SUCCESS - TIMELINE V2 PROTOTYPE");}} # Duplicate code in putEntities and putEntitiesAsync, can probably be moved to a private method and we can then call this code from respective methods? # Checkstyle and javadoc issues seem fixable. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930863#comment-15930863 ] Hadoop QA commented on YARN-6357: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 10s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice: The patch generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 18m 12s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6357 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859365/YARN-6357.01.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 32d1abca7430 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4a8e304 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/15322/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice.txt | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/15322/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15322/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice | | Console output |
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929254#comment-15929254 ] Joep Rottinghuis commented on YARN-6357: bq. I can file a separate bug from this one dealing with exception handling to tackle the sync vs async nature. Sorry, copy paste from previous bug YARN-5269. This _is_ the separate jira. [~varun_saxena] you were right in the previous call. I was thinking about the writer side where flush works correctly, you were thinking one level up where flush wasn't appropriately called. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929048#comment-15929048 ] Varun Saxena commented on YARN-6357: bq. We have two separate methods on our API putEntities and putEntitiesAsync and they should have different behavior beyond waiting for the request to be sent. I can file a separate bug from this one dealing with exception handling to tackle the sync vs async nature. Sorry couldn't get this. Which exception handling part will be handled in this JIRA? IIUC, this JIRA is intended to check isAsync flag and call flush immediately for a sync call. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be > sent. I can file a separate bug from this one dealing with exception handling > to tackle the sync vs async nature. During the meeting today I was thinking > about the HBase writer that has a flush, which definitely blocks until data > is flushed to HBase (ignoring the spooling for the moment). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929037#comment-15929037 ] Varun Saxena commented on YARN-6357: Correct. This is what I was suspecting in the previous call. In YARN-3367 sync/async changes were only made on the client side. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be > sent. I can file a separate bug from this one dealing with exception handling > to tackle the sync vs async nature. During the meeting today I was thinking > about the HBase writer that has a flush, which definitely blocks until data > is flushed to HBase (ignoring the spooling for the moment). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org