[ 
https://issues.apache.org/jira/browse/CASSANDRA-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194540#comment-16194540
 ] 

Robert Stupp commented on CASSANDRA-13939:
------------------------------------------

+1

> Mishandling of cells for removed/dropped columns when reading legacy files
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13939
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13939
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.x, 3.11.x
>
>
> The tl;dr is that there is a bug in reading legacy files that can manifests 
> itself with a trace looking like this:
> {noformat}
> Exception (java.lang.IllegalStateException) encountered during startup: One 
> row required, 2 found
> java.lang.IllegalStateException: One row required, 2 found
>     at 
> org.apache.cassandra.cql3.UntypedResultSet$FromResultSet.one(UntypedResultSet.java:84)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableTimestamp(LegacySchemaMigrator.java:254)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:238)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator$$Lambda$126/591203139.accept(Unknown
>  Source)
>     at java.util.ArrayList.forEach(ArrayList.java:1249)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:238)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:187)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:178)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator$$Lambda$123/1612073393.accept(Unknown
>  Source)
>     at java.util.ArrayList.forEach(ArrayList.java:1249)
>     at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:178)
> {noformat}
> The reason this can happen has to do with the handling of legacy files. 
> Legacy files are cell based while the 3.0 storage engine is primarily row 
> based, so we group those cells into rows early in the deserialization process 
> (in {{UnfilteredDeserializer.OldFormatDeserializer}}), but in doing so, we 
> can only consider a row finished when we've either reach the end of the 
> partition/file, or when we've read a cell that doesn't belong to that row.  
> That second case means that when the deserializer returns a given row, the 
> underlying file pointer may actually not positioned at the end of that row, 
> but rather it may be past the first cell of the next row (which the 
> deserializer remembers for future use). Long story short, when we try to 
> detect if we're logically past our current index block in 
> {{AbstractIterator.IndexState#isPastCurrentBlock(}}), we can't simply rely on 
> the file pointer, which again may be a bit more advanced that we logically 
> are, and that's the reason for the "correction" in that method. That 
> correction is really just the amount of bytes remembered but not yet used in 
> the deserializer.
> That "correction" is sometimes wrong however and that's due to the fact that 
> in {{LegacyLayout#readLegacyAtom}}, if we get a cell for an dropped or 
> removed cell, we ignore that cell (which, in itself, is fine). Problem is 
> that this skipping is done within the {{LegacyLayout#readLegacyAtom}} method 
> but without {{UnfilteredDeserializer.OldFormatDeserializer}} knowing about 
> it. As such, the size of the skipped cell ends up being accounted in the 
> "correction" bytes for the next cell we read. Lo and behold, if such cell for 
> a removed/dropped column is both the last cell of a CQL row and just before 
> an index boundary (pretty unlikely in general btw, but definitively 
> possible), then the deserializer will count its size with the first cell of 
> the next row, which happens to also be the first cell of the next index 
> block.  And when the code then tries to figure out if it crossed an index 
> boundary, it will over-correct. That is, the {{indexState.updateBlock()}} 
> call at the start of {{SSTableIterator.ForwardIndexedReader#computeNext}} 
> will not work correctly.  This can then make the code return a row that is 
> after the requested slice end (and should thus not be returned) because it 
> doesn't compare that row to said requested end due to thinking it's not on 
> the last index block to read, even though it genuinely is.
> Anyway, the whole explanation is a tad complex, but the fix isn't: we need to 
> move the skipping of cells for removed/dropped column a level up so the 
> deserializer knows about it and don't silently count their size in the next 
> atom size.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to