[ https://issues.apache.org/jira/browse/CASSANDRA-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064110#comment-14064110 ]

Matt Byrd commented on CASSANDRA-7543:
--------------------------------------

Thanks for looking at this.
With the attached patch, my repro script no longer reproduces the problem.
It might also be nice to include the value of openedMarkerSize in the debug 
log line for the dataSize, if only to avoid confusion when debugging; it's 
not strictly necessary, though.


> Assertion error when compacting large row with map/list field or range 
> tombstone
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux
>            Reporter: Matt Byrd
>            Assignee: Yuki Morishita
>              Labels: compaction, map
>             Fix For: 1.2.19
>
>         Attachments: 0001-add-rangetombstone-test.patch, 
> 0002-fix-rangetomebstone-not-included-in-LCR-size-calc.patch
>
>
> Hi,
> So in a couple of clusters we're hitting this problem when compacting large 
> rows with a schema which contains the map data-type.
> Here is an example of the error:
> {code}
> java.lang.AssertionError: incorrect row data size 87776427 written to 
> /cassandra/X/Y/X-Y-tmp-ic-2381-Data.db; correct is 87845952
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:163)
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
> {code}
> I have a python script which reproduces the problem, by just writing lots of 
> data to a single partition key with a schema that contains the map data-type.
> I added some debug logging and found that the difference in bytes seen in the 
> reproduction (255) was due to the following pieces of data being written:
> {code}
> DEBUG [CompactionExecutor:3] 2014-07-13 00:38:42,891 ColumnIndex.java (line 
> 168) DATASIZE writeOpenedMarker columnIndex: 
> org.apache.cassandra.db.ColumnIndex$Builder@6678a9d0 firstColumn: 
> [java.nio.HeapByteBuffer[pos=0 lim=34 cap=34], java.nio.HeapByteBuffer[pos=0 
> lim=34 cap=34]](deletedAt=1405237116014999, localDeletion=1405237116) 
> startPosition: 262476 endPosition: 262561 diff: 85 
> DEBUG [CompactionExecutor:3] 2014-07-13 00:38:43,007 ColumnIndex.java (line 
> 168) DATASIZE writeOpenedMarker columnIndex: 
> org.apache.cassandra.db.ColumnIndex$Builder@6678a9d0 firstColumn: 
> org.apache.cassandra.db.Column@3e5b5939 startPosition: 328157 endPosition: 
> 328242 diff: 85 
> DEBUG [CompactionExecutor:3] 2014-07-13 00:38:44,159 ColumnIndex.java (line 
> 168) DATASIZE writeOpenedMarker columnIndex: 
> org.apache.cassandra.db.ColumnIndex$Builder@6678a9d0 firstColumn: 
> org.apache.cassandra.db.Column@fc3299b startPosition: 984105 endPosition: 
> 984190 diff: 85
> {code}
> So looking at the code, you can see that extra range tombstone markers are 
> written at column index borders (in ColumnIndex, where 
> tombstoneTracker.writeOpenedMarker is called) which aren't accounted for in 
> LazilyCompactedRow.columnSerializedSize. The three opened markers of 85 
> bytes each account exactly for the 255-byte discrepancy seen above.
> This is where the difference in the assertion error comes from, so the 
> solution is just to account for this data.
> I have a patch which does just this, by keeping track of the extra data 
> written out via tombstoneTracker.writeOpenedMarker in ColumnIndex and adding 
> it back to the dataSize in LazilyCompactedRow.java, where it serialises out 
> the row size.
> After applying the patch, the reproduction no longer produces the 
> AssertionError.
> I know this is not a problem in 2.0+ because of single-pass compaction, 
> however there are lots of 1.2 clusters out there still which might run into 
> this.
> Please let me know if you've any questions.
> Thanks,
> Matt
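
As an editorial illustration of the bookkeeping the patch describes, here is a minimal, self-contained Java sketch. The class and method names below are hypothetical simplifications; the real fix lives in ColumnIndex and LazilyCompactedRow on the 1.2 branch, and the actual marker serialization is more involved than a fixed byte count.

```java
// Hypothetical, simplified model of the size accounting described in the
// issue; real names and serialization logic in Cassandra 1.2 differ.
public class RowSizeAccounting {
    long columnSerializedSize = 0; // what LazilyCompactedRow already counted
    long openedMarkerSize = 0;     // bytes of re-opened range tombstone markers

    void writeColumn(long bytes) {
        columnSerializedSize += bytes;
    }

    // Called when a range tombstone marker is re-opened at an index border
    // (the tombstoneTracker.writeOpenedMarker case from the debug log).
    void writeOpenedMarker(long bytes) {
        openedMarkerSize += bytes; // the fix: track these extra bytes too
    }

    long dataSize() {
        // Before the patch, only columnSerializedSize was reported, so the
        // declared row size undercounted and SSTableWriter's assertion fired;
        // adding the marker bytes back makes it match what hit the disk.
        return columnSerializedSize + openedMarkerSize;
    }

    public static void main(String[] args) {
        RowSizeAccounting row = new RowSizeAccounting();
        row.writeColumn(1000);
        // Three opened markers of 85 bytes each, as in the debug log above.
        for (int i = 0; i < 3; i++)
            row.writeOpenedMarker(85);
        long bytesActuallyWritten = 1000 + 3 * 85;
        System.out.println("dataSize=" + row.dataSize()
                + " written=" + bytesActuallyWritten);
    }
}
```

With the marker bytes included, the declared size and the bytes written agree (1255 in this toy run), which is exactly the invariant SSTableWriter.append asserts.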



--
This message was sent by Atlassian JIRA
(v6.2#6252)
