[ 
https://issues.apache.org/jira/browse/CASSANDRA-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Yeksigian updated CASSANDRA-8978:
--------------------------------------
    Attachment: 8978-2.1.txt

This is caused because of a race condition in the CQLSSTableWriter. If we are 
writing to a single partition, we create a new column family to write into and 
send the previous one to the writer thread, but continue to use the old column 
family until we are done with the current partition.

This adds a call to check whether the row needs to be added, and if needed will 
sync at that point which is after an insert has completed is finished.

> CQLSSTableWriter causes ArrayIndexOutOfBoundsException
> ------------------------------------------------------
>
>                 Key: CASSANDRA-8978
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8978
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 3.8.0-42-generic #62~precise1-Ubuntu SMP Wed Jun 4 
> 22:04:18 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> java version "1.8.0_20"
> Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
> Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
>            Reporter: Thomas Borg Salling
>            Assignee: Carl Yeksigian
>             Fix For: 2.1.4
>
>         Attachments: 8978-2.1.txt
>
>
> On long-running jobs with CQLSSTableWriter preparing sstables for later bulk 
> load via sstableloader, occassionally I get the sporadic error shown below.
> I can run the exact same job again - and it will succeed or fail with the 
> same error at another location in the input stream. The error is appears to 
> occur "randomly" - with the same input it may occur never, early or late in 
> the run with no apparent logic or system.
> I use five instances of CQLSSTableWriter in the application (to write 
> redundantly to five different tables). But these instances do not exist at 
> the same time; and thus never used concurrently.
> {code}
> 09:26:33.582 [main] INFO  d.dma.ais.store.FileSSTableConverter - Finished 
> processing directory, 369582175 packets was converted from /nas1/
> Exception in thread "main" java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at dk.dma.commons.app.CliCommandList$1.execute(CliCommandList.java:50)
>         at dk.dma.commons.app.CliCommandList.invoke(CliCommandList.java:80)
>         at dk.dma.ais.store.Main.main(Main.java:34)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 297868
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.append(ArrayBackedSortedColumns.java:196)
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.appendOrReconcile(ArrayBackedSortedColumns.java:191)
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.sortCells(ArrayBackedSortedColumns.java:176)
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.maybeSortCells(ArrayBackedSortedColumns.java:125)
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns.access$1100(ArrayBackedSortedColumns.java:44)
>         at 
> org.apache.cassandra.db.ArrayBackedSortedColumns$CellCollection.iterator(ArrayBackedSortedColumns.java:622)
>         at 
> org.apache.cassandra.db.ColumnFamily.iterator(ColumnFamily.java:476)
>         at 
> org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:129)
>         at 
> org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:233)
>         at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:218)
>         at 
> org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:215){code}
> So far I overcome this problem by simply retrying with another run of the 
> application in attempt to generate the sstables. But this is a rather time 
> consuming and shaky approach - and I feel a bit uneasy relying on the 
> produced sstables, though their contents appear to be correct when I sample 
> them with cqlsh 'select' after load into Cassandra.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to