[
https://issues.apache.org/jira/browse/CASSANDRA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246786#comment-14246786
]
Benedict edited comment on CASSANDRA-8458 at 12/15/14 4:30 PM:
---------------------------------------------------------------
I'm pretty sure this should throw an AssertionError given our new checks on
first/last, instead of simply continuing, and it simplifies the code a little.
Also, technically not related to this ticket, but doesn't it strike you as a
bug to use Operation.GT, instead of Operation.GE? If so, I'm not sure how we
don't have tests to catch it.
{code}
RowIndexEntry idxLeft = getPosition(leftBound, Operator.GT);
long left = idxLeft == null ? -1 : idxLeft.position;
if (left == -1)
// left is past the end of the file
continue;
{code}
Couldn't we also simplify correcting "right" to a similar approach as you've
taken for left, and use the fact that we know the positions we're looking for
exist in both cases? I think the following snippet for the guts of the method
should work but is (perhaps?) a little clearer:
{code}
// range is never wrap around, unless right is the minimum token
assert !range.isWrapAround() || range.right.isMinimum();
// truncate the range so it at most covers the sstable
AbstractBounds<RowPosition> bounds = range.toRowBounds();
RowPosition leftBound = bounds.left.compareTo(first) > 0 ?
bounds.left : first.getToken().minKeyBound();
RowPosition rightBound = bounds.right.isMinimum() ?
last.getToken().maxKeyBound() : bounds.right;
// skip non-overlapping ranges
if (leftBound.compareTo(last) > 0 || rightBound.compareTo(first) <
0)
continue;
// transform into positions known to exist in the file
long left = getPosition(leftBound, Operator.GE).position;
long right = (rightBound.compareTo(last) >= 0)
? (openReason == OpenReason.EARLY
// if opened early, we overlap with the old
sstables by one key, so we know that the last
// (and further) key(s) will be streamed from these
if necessary
? getPosition(last.getToken().maxKeyBound(),
Operator.GE).position
: uncompressedLength())
: getPosition(rightBound, Operator.GT).position;
{code}
was (Author: benedict):
There would be a bug if ranges could wrap with the first "continue" potentially
filtering files we actually need to visit. However Range.normalize() _claims_
to remove all wrapped ranges, in which case we can actually simplify the second
check to remove !isWrapRound(), making it a little clearer.
I'm pretty sure this should throw an AssertionError given our new checks on
first/last, and again it simplifies the code a little. Also, technically not
related to this ticket, but doesn't it strike you as a bug to use Operation.GT,
instead of Operation.GE? If so, I'm not sure how we don't have tests to catch
it.
{code}
RowIndexEntry idxLeft = getPosition(leftBound, Operator.GT);
long left = idxLeft == null ? -1 : idxLeft.position;
if (left == -1)
// left is past the end of the file
continue;
{code}
Couldn't we also simplify correcting "right" to a similar approach as you've
taken for left, and use the fact that we know the positions we're looking for
exist in both cases? I think the following snippet for the guts of the method
should work but is (perhaps?) a little clearer:
{code}
// range is never wrap around, unless right is the minimum token
assert !range.isWrapAround() || range.right.isMinimum();
// truncate the range so it at most covers the sstable
AbstractBounds<RowPosition> bounds = range.toRowBounds();
RowPosition leftBound = bounds.left.compareTo(first) > 0 ?
bounds.left : first.getToken().minKeyBound();
RowPosition rightBound = bounds.right.isMinimum() ?
last.getToken().maxKeyBound() : bounds.right;
// skip non-overlapping ranges
if (leftBound.compareTo(last) > 0 || rightBound.compareTo(first) <
0)
continue;
// transform into positions known to exist in the file
long left = getPosition(leftBound, Operator.GE).position;
long right = (rightBound.compareTo(last) >= 0)
? (openReason == OpenReason.EARLY
// if opened early, we overlap with the old
sstables by one key, so we know that the last
// (and further) key(s) will be streamed from these
if necessary
? getPosition(last.getToken().maxKeyBound(),
Operator.GE).position
: uncompressedLength())
: getPosition(rightBound, Operator.GT).position;
{code}
> Don't give out positions in an sstable beyond its first/last tokens
> -------------------------------------------------------------------
>
> Key: CASSANDRA-8458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8458
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
> Attachments:
> 0001-Make-sure-we-don-t-give-out-positions-from-an-sstabl.patch
>
>
> Looks like we include tmplink sstables in streams in 2.1+, and when we do,
> sometimes we get this error message on the receiving side:
> {{java.io.IOException: Corrupt input data, block did not start with 2 byte
> signature ('ZV') followed by type byte, 2-byte length)}}. I've only seen this
> happen when a tmplink sstable is included in the stream.
> We can not just exclude the tmplink files when starting the stream - we need
> to include the original file, which we might miss since we check if the
> requested stream range intersects the sstable range.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)