[ 
https://issues.apache.org/jira/browse/DL-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755813#comment-15755813
 ] 

Sijie Guo commented on DL-145:
------------------------------

[~xieliang007] It would be good to fix the root cause here. Based on your 
comment, it seems that closing the writer doesn't abort pending writes (because 
the writes are still waiting for a new log segment to be rolled). If that is the 
case, we should probably address it in AsyncLogWriter to make sure all pending 
writes are aborted when the writer is closed.
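
To make the suggestion concrete, here is a rough sketch of the behavior I have 
in mind. It is not the actual AsyncLogWriter code: SketchWriter, segmentStarted() 
and the use of CompletableFuture are all hypothetical stand-ins chosen for 
illustration. The point is only that writes queued while a segment is still 
being started should be failed on close rather than left hanging.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a writer; NOT the real AsyncLogWriter API.
class SketchWriter {
    // Writes that arrived while the new log segment was still being started.
    private final Queue<CompletableFuture<Long>> pending = new ArrayDeque<>();
    private boolean rolling = true;   // assumption: a segment start is in flight
    private boolean closed = false;
    private long nextId = 0;

    synchronized CompletableFuture<Long> write(byte[] data) {
        CompletableFuture<Long> result = new CompletableFuture<>();
        if (closed) {
            result.completeExceptionally(new IllegalStateException("writer closed"));
        } else if (rolling) {
            pending.add(result);          // park the write until the segment is ready
        } else {
            result.complete(nextId++);    // pretend the record was written
        }
        return result;
    }

    // Called when the asynchronous segment start finishes.
    synchronized void segmentStarted() {
        if (closed) {
            return;
        }
        rolling = false;
        CompletableFuture<Long> f;
        while ((f = pending.poll()) != null) {
            f.complete(nextId++);
        }
    }

    // Closing fails every parked write so that no caller waits forever.
    synchronized void close() {
        closed = true;
        CompletableFuture<Long> f;
        while ((f = pending.poll()) != null) {
            f.completeExceptionally(
                new IllegalStateException("write cancelled: writer closed"));
        }
    }
}
```

With something like this, a write issued before the first segment exists would 
see a cancel-style exception as soon as the writer is closed, regardless of how 
the timeout is tuned. For simplicity the sketch completes futures while holding 
the lock; a real implementation would probably avoid that.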

> Fix the flaky testServiceTimeout
> --------------------------------
>
>                 Key: DL-145
>                 URL: https://issues.apache.org/jira/browse/DL-145
>             Project: DistributedLog
>          Issue Type: Test
>          Components: distributedlog-service
>    Affects Versions: 0.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> The TestDistributedLogService#testServiceTimeout case is not stable, e.g. 
> https://builds.apache.org/job/distributedlog-precommit-pullrequest/22/com.twitter$distributedlog-service/testReport/com.twitter.distributedlog.service/TestDistributedLogService/testServiceTimeout/
> It can be reproduced on my box occasionally. The failures became consistent 
> when I tuned ServiceTimeoutMs from 200 down to 150, and the test always 
> passed when tuned to a larger value, e.g. 1000 (btw, my disk is an SSD).
> After digging into it, the failure turns out to be related to a corner case 
> in starting a new log segment.
> In the good case, once the service timeout occurs, the stream status goes 
> ERROR -> CLOSING -> CLOSED, and calling Abortables.asyncAbort aborts the 
> cached log segment; the writeOp then fails with an exception, e.g. a write 
> cancel exception.
> In the bad case, no log records have been written before, so a new log 
> segment is being started asynchronously. When the timeout occurs, the 
> segment start has not finished yet, so nothing is cached and asyncAbort has 
> no chance to abort that segment.
> I think changing the test timeout to a larger value should be fine for this 
> special test corner case.
> I will attach a minor patch later. Any suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
