[ 
https://issues.apache.org/jira/browse/HBASE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935217#comment-14935217
 ] 

Matteo Bertozzi commented on HBASE-14362:
-----------------------------------------

That way we just lose data. 
The exception in syncSlots(Stream, slots, offset, count) is already caught in 
syncSlots().
If we are not able to write/sync, we try to close the current WAL and 
reopen a new one.
We try N times; if we still can't, we give up, because maybe that machine is no 
longer able to talk to HDFS or something like that, and we let the backup master try.
For the test we just need to increase the number of retries, to accommodate the 
slow test machines.
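
To make the retry behaviour above concrete, here is a rough sketch of the idea, not the actual WALProcedureStore code. The names SyncRetrySketch, Log, syncWithRoll and maxRetries are made up for illustration; the only thing taken from the description is the shape of the loop: on a failed sync, close the current log, open a new one, and give up after N attempts.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Illustrative only: hypothetical names, not the HBase procedure store API.
class SyncRetrySketch {
    /** Hypothetical stand-in for a single WAL file. */
    interface Log {
        void sync() throws IOException;
        void close();
    }

    /**
     * Tries to sync, rolling to a fresh log after each failure.
     * Returns the attempt number that succeeded, or throws once
     * maxRetries attempts have failed so the caller can abort and
     * let another master take over.
     */
    static int syncWithRoll(Supplier<Log> opener, int maxRetries) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            Log log = opener.get();   // current log on attempt 1, a new one afterwards
            try {
                log.sync();
                return attempt;
            } catch (IOException e) {
                last = e;             // remember the failure
                log.close();          // close the broken log before rolling
            }
        }
        throw new IOException("giving up after " + maxRetries + " attempts", last);
    }
}
```

With a loop like this, a slow test machine that needs a few extra rolls only passes if maxRetries is large enough, which is why bumping the retry count in the test should be sufficient.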

> org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS is super 
> duper flaky
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-14362
>                 URL: https://issues.apache.org/jira/browse/HBASE-14362
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.0.0
>            Reporter: Dima Spivak
>            Priority: Critical
>         Attachments: HBASE-14362.patch, HBASE-14362.patch
>
>
> [As seen in 
> Jenkins|https://builds.apache.org/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master.procedure/TestWALProcedureStoreOnHDFS/history/],
>  this test has been super flaky and we should probably address it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
