[ 
https://issues.apache.org/jira/browse/GEODE-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Smith updated GEODE-7085:
-----------------------------
    Description: 
We hit an issue where a member failed to recover due to an 
IndexOutOfBoundsException while recording a version during recovery.

Looking closer, the issue appears to be that a RegionVersionHolder cannot 
record a version greater than Integer.MAX_VALUE if it was just constructed.

When we are recovering from disk, the first files we read are the .drf 
files. The first thing in those .drf files is RVV information. We read the RVV 
records and call recordRecoveredGCVersion.

When that call gets down inside RegionVersionHolder.recordVersion, there is 
some logic that is supposed to flush out the bitSet and advance the 
bitSetVersion. Unfortunately it looks like flushBitSetDuringRecording is not 
actually doing that. So if the version we read from disk is greater than 
Integer.MAX_VALUE, we wrap around and try to set a negative index in the bitset.
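
The wrap-around can be illustrated outside of Geode with a plain 
java.util.BitSet. The following is only a minimal sketch, not the actual 
RegionVersionHolder code: the class name, the method names, and the idea that 
bit i stands for version (base + i) are assumptions made for illustration. It 
shows that narrowing a long offset above Integer.MAX_VALUE to an int yields a 
negative index, which BitSet.set rejects, and that the same version can be 
recorded once the base has been advanced close to it.

```java
import java.util.BitSet;

// Hypothetical stand-in for illustration only; this is not Geode code.
class BitSetOverflowSketch {

  // Assumed layout: bit i of the bit set represents version (base + i).
  // The narrowing cast silently wraps when the offset exceeds Integer.MAX_VALUE.
  static int naiveIndex(long version, long base) {
    return (int) (version - base);
  }

  // Tries to record the version without advancing the base first.
  // Returns false if BitSet rejects the (negative) index.
  static boolean recordNaively(long version, long base) {
    BitSet bitSet = new BitSet();
    try {
      bitSet.set(naiveIndex(version, base));
      return true;
    } catch (IndexOutOfBoundsException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    long version = Integer.MAX_VALUE + 10L;

    // Freshly constructed state: the base is still small, so the index wraps.
    System.out.println("index with base 1: " + naiveIndex(version, 1L));
    System.out.println("recorded with base 1: " + recordNaively(version, 1L));

    // If the base had been advanced close to the version, the index would fit.
    long advancedBase = Integer.MAX_VALUE;
    System.out.println("recorded with advanced base: "
        + recordNaively(version, advancedBase));
  }
}
```

In this sketch the failure goes away only when the base is moved forward 
before the index is computed, which is the step the description says 
flushBitSetDuringRecording is expected to perform.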

I can reproduce this with a unit test of RegionVersionVector that records a 
version greater than Integer.MAX_VALUE.

  was:
We hit an issue where a member failed to recover due to an 
IndexOutOfBoundsException while recording a version during recovery.

Looking closer, the issue appears to be that a RegionVersionHolder cannot 
record a version greater than Integer.MAX_VALUE if it was just constructed.

When we are recovering from disk, the first thing we read from is the .drf 
files. The first thing in those drf files is RVV information. We read the RVV 
records and call recordRecoveredGCVersion.

When that call gets down inside RegionVersionHolder.recordVersion, there is 
some logic that is supposed to flush out the bitSet and advance the 
bitSetVersion. Unfortunately it looks like flushBitSetDuringRecording is not 
actually doing that. So if the version we read from disk is greater than 
Integer.MAX_VALUE, we wrap around and try to set a negative index in the bitset.

I can reproduce this with a unit test of RegionVersionVector that records a 
version greater than Integer.MAX_VALUE. I’m looking into how to fix the 
flushBitSetDuringRecording method.


> Cannot recover from disk store if region version is greater than 
> Integer.MAX_VALUE
> ----------------------------------------------------------------------------------
>
>                 Key: GEODE-7085
>                 URL: https://issues.apache.org/jira/browse/GEODE-7085
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, persistence
>            Reporter: Dan Smith
>            Assignee: Dan Smith
>            Priority: Major
>             Fix For: 1.11.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
