[ 
https://issues.apache.org/jira/browse/YARN-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954358#comment-15954358
 ] 

Haibo Chen edited comment on YARN-6382 at 4/3/17 11:55 PM:
-----------------------------------------------------------

Thanks for the nice summary [~jrottinghuis]! 
bq. This write causes the buffer to be full, or perhaps thread B calls flush, 
or a timer calls flush.
The latter two cases have been fixed by YARN-6357, so we only need to concern 
ourselves with the case where the buffer becomes full.

I believe that what I was mostly concerned about, namely losing data due to 
intermittent connection issues combined with this race condition, is only an 
issue if there is no spooling support. 
Assuming most data/entities are not problematic, that is, a flush will not fail 
because of the data itself and subsequent retries will eventually write the 
data successfully to HBase, we can provide a sufficient guarantee that all good 
entities will eventually be persisted in HBase. 
Given that most of what b) solves will go away when we have the spooling 
writer, I agree that we could just document the issue for now. Once we get the 
spooling writer, we can come back and revisit this to address what we want to 
do with malformed/problematic entities if they fail to be persisted.


> Address race condition on TimelineWriter.flush() caused by buffer-sized flush
> -----------------------------------------------------------------------------
>
>                 Key: YARN-6382
>                 URL: https://issues.apache.org/jira/browse/YARN-6382
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>              Labels: yarn-5355-merge-blocker
>
> YARN-6376 fixes the race condition between putEntities() and periodical 
> flush() by WriterFlushThread in TimelineCollectorManager, or between 
> putEntities() in different threads.
> However, BufferedMutator can have internal size-based flush as well. We need 
> to address the resulting race condition.
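
To make the interleaving concrete, here is a minimal, self-contained sketch of how a size-triggered flush can hide a failure from the caller that owns the data. It does not use the HBase API: TinyBuffer, its capacity of 2, the setBackendUp() switch and the entity names are all hypothetical, and the sketch assumes a failed flush drops the drained batch, which may not match the real BufferedMutator behavior. The interleaving is replayed sequentially so the outcome is deterministic.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SizeFlushRaceSketch {

    /** Hypothetical stand-in for a buffered writer; NOT the HBase BufferedMutator API. */
    static class TinyBuffer {
        private final int capacity;
        private final List<String> pending = new ArrayList<>();
        private final List<String> persisted = new ArrayList<>();
        private boolean backendUp = true;

        TinyBuffer(int capacity) {
            this.capacity = capacity;
        }

        synchronized void setBackendUp(boolean up) {
            backendUp = up;
        }

        /** Buffers one record and flushes implicitly once the buffer is full. */
        synchronized void write(String record) throws IOException {
            pending.add(record);
            if (pending.size() >= capacity) {
                flush();
            }
        }

        /**
         * Drains the buffer. Assumption of this sketch: on failure the drained
         * batch is dropped and only the current caller sees the exception.
         */
        synchronized void flush() throws IOException {
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            if (batch.isEmpty()) {
                return;
            }
            if (!backendUp) {
                throw new IOException("flush failed, dropped " + batch);
            }
            persisted.addAll(batch);
        }

        synchronized List<String> persisted() {
            return new ArrayList<>(persisted);
        }
    }

    /**
     * Replays the interleaving: A buffers a record, B's write fills the buffer
     * and triggers a failing flush, then A's explicit flush finds nothing to do.
     * Returns true when A's flush reported success although A's record was lost.
     */
    static boolean replayRace() {
        TinyBuffer buffer = new TinyBuffer(2);
        boolean callerBSawFailure = false;
        boolean callerAFlushOk = false;

        try {
            buffer.write("entity-from-A"); // caller A: buffered, not yet flushed
        } catch (IOException e) {
            throw new IllegalStateException("unexpected: buffer is not full yet", e);
        }

        buffer.setBackendUp(false); // intermittent connection issue
        try {
            buffer.write("entity-from-B"); // caller B: buffer full, implicit flush fails
        } catch (IOException e) {
            callerBSawFailure = true; // the failure for A's record surfaces in B
        }
        buffer.setBackendUp(true);

        try {
            buffer.flush(); // caller A: buffer is already empty, so this succeeds
            callerAFlushOk = true;
        } catch (IOException e) {
            callerAFlushOk = false;
        }

        return callerBSawFailure
            && callerAFlushOk
            && !buffer.persisted().contains("entity-from-A");
    }

    public static void main(String[] args) {
        System.out.println("caller A misled: " + replayRace());
    }
}
```

In this sketch, caller A has no way to learn that its entity was lost, which is the situation the comment above proposes to document until a spooling writer can retry failed batches.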


