[ https://issues.apache.org/jira/browse/ACCUMULO-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15444116#comment-15444116 ]

Josh Elser edited comment on ACCUMULO-4425 at 8/28/16 9:03 PM:
---------------------------------------------------------------

bq. The label was the easiest first pass I thought of. I could simplify it with 
a boolean conditional on the outer loop. I'm more concerned about the strategy 
of handling this in the test itself than the specific implementation.

Yup, I understand and agree with you completely.
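
For anyone following along, the two loop shapes being compared (a labeled `continue` versus a boolean condition on the outer loop) can be sketched roughly as below. This is a hypothetical illustration only, not the code from the PR: the class name, the method names, the list-of-snapshots input, and the string-prefix check are all assumptions made up for the example.

```java
import java.util.List;

public class RetryLoopShapes {

  // Variant 1: labeled continue. As soon as one path still points at the
  // old volume (hypothetical check), skip to the next snapshot.
  static int attemptsWithLabel(List<List<String>> snapshots, String oldVolume) {
    int attempts = 0;
    retry:
    for (List<String> snapshot : snapshots) {
      attempts++;
      for (String path : snapshot) {
        if (path.startsWith(oldVolume)) {
          continue retry; // stale entry seen, try the next snapshot
        }
      }
      return attempts; // this snapshot had no stale entries
    }
    return -1; // no snapshot was ever clean
  }

  // Variant 2: same behavior, but the outer loop is controlled by a
  // boolean flag instead of a label.
  static int attemptsWithFlag(List<List<String>> snapshots, String oldVolume) {
    int attempts = 0;
    boolean clean = false;
    for (int i = 0; i < snapshots.size() && !clean; i++) {
      attempts++;
      clean = true;
      for (String path : snapshots.get(i)) {
        if (path.startsWith(oldVolume)) {
          clean = false; // stale entry seen, outer loop keeps going
          break;
        }
      }
    }
    return clean ? attempts : -1;
  }
}
```

Both variants return the number of snapshots examined before a clean one was found, or -1 if none was clean; the difference is purely stylistic, which is why the strategy question matters more than the loop shape.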

bq. I agree with the concern about runtime issues. That's why I put it up for 
review as a PR. I'm concerned we're not properly handling this internally in 
the WalStateManager. But, I'm also wondering if this is something that can only 
happen in the test. The thing is... in the dirty shutdown case, I'm not 
actually sure why these states persist. Perhaps it's just because the ephemeral 
ZK nodes haven't timed out yet? Maybe it's not something to be concerned about 
in a real system and is only an artifact of the test. At the very least, it's 
clear from the workaround that they will eventually resolve themselves, and 
maybe that's sufficient for a running system? This part of our code is hard to 
reason about... because there aren't a lot of comments explaining how the 
design is supposed to work.

I'm sure [~kturner] will be able to weigh in when he returns. The ephemeral 
node timing out certainly seems like a plausible explanation (did you confirm 
that these are ephemeral nodes, though?). I would assume that changing volumes 
is a rare scenario, so our test here is stressing things well beyond normal 
usage.
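
If the lingering states do turn out to be a test-only artifact that clears on its own, the test-side workaround boils down to polling until the check passes or a deadline expires. A minimal sketch of that idea follows, under stated assumptions: `WaitUntil`, `waitUntil`, and the caller-supplied condition are hypothetical names invented for this example and are not part of Accumulo or the PR.

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {

  // Polls the (hypothetical) condition until it returns true or the
  // timeout elapses. Returns true if the condition was met in time.
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: condition never held before the deadline
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        return false;
      }
    }
  }
}
```

A test could wrap its "no WALs on the old volume" check in the condition and fail only when `waitUntil` returns false. That keeps the workaround in the test, which is exactly the trade-off under discussion: it hides the delay rather than explaining it.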


> VolumeIT.testDirtyReplaceVolumes fails
> --------------------------------------
>
>                 Key: ACCUMULO-4425
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4425
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Christopher Tubbs
>            Assignee: Christopher Tubbs
>            Priority: Blocker
>             Fix For: 1.8.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Error Message*
> {code}
> Unexpected volume file:/var/lib/jenkins/workspace/Accumulo-1.8-ITs-failures/test/target/mini-tests/org.apache.accumulo.test.VolumeIT_testDirtyReplaceVolumes/volumes/v1/wal/jenkins.revelc.net+38766/3eb39803-c014-4195-943a-7a12efa2f515
> {code}
> *Stacktrace*
> {code}
> java.lang.AssertionError: Unexpected volume file:/var/lib/jenkins/workspace/Accumulo-1.8-ITs-failures/test/target/mini-tests/org.apache.accumulo.test.VolumeIT_testDirtyReplaceVolumes/volumes/v1/wal/jenkins.revelc.net+38766/3eb39803-c014-4195-943a-7a12efa2f515
>       at org.apache.accumulo.test.VolumeIT.verifyVolumesUsed(VolumeIT.java:441)
>       at org.apache.accumulo.test.VolumeIT.testReplaceVolume(VolumeIT.java:533)
>       at org.apache.accumulo.test.VolumeIT.testDirtyReplaceVolumes(VolumeIT.java:566)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
