DOAN DuyHai created CASSANDRA-11525:
---------------------------------------

             Summary: SASI index corruption
                 Key: CASSANDRA-11525
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11525
             Project: Cassandra
          Issue Type: Bug
          Components: sasi
         Environment: Cassandra 3.5-SNAPSHOT
            Reporter: DOAN DuyHai


Bug reproduced in *Cassandra 3.5-SNAPSHOT* (after the OOM fix)

{noformat}
create table if not exists test.resource_bench (
 dsr_id uuid,
 rel_seq bigint,
 seq bigint,
 dsp_code varchar,
 model_code varchar,
 media_code varchar,
 transfer_code varchar,
 commercial_offer_code varchar,
 territory_code varchar,
 period_end_month_int int,
 authorized_societies_txt text,
 rel_type text,
 status text,
 dsp_release_code text,
 title text,
 contributors_name list<text>,
 unic_work text,
 paying_net_qty bigint,
 PRIMARY KEY ((dsr_id, rel_seq), seq)
) WITH CLUSTERING ORDER BY (seq ASC);

CREATE CUSTOM INDEX resource_period_end_month_int_idx ON test.resource_bench (period_end_month_int)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'PREFIX'};
{noformat}

So the index is a {{DENSE}} numerical index.

When running the request {{SELECT dsp_code, unic_work, paying_net_qty FROM test.resource_bench WHERE period_end_month_int = 201401}} using server-side paging,
I bumped into this stack trace:

{noformat}
WARN  [SharedPool-Worker-1] 2016-04-06 00:00:30,825 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
java.lang.ArrayIndexOutOfBoundsException: -55
	at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:268) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:128) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:120) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:148) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.io.sstable.format.SSTableReader.keyAt(SSTableReader.java:1823) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:168) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:518) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:504) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.utils.AbstractIterator.tryToComputeNext(AbstractIterator.java:116) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.utils.AbstractIterator.hasNext(AbstractIterator.java:110) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:106) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:71) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
	at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
{noformat}

There are 2 possible root causes:

1. Index corrupted
2. Raw SSTable is corrupted

To rule out *scenario 1*, I dropped and rebuilt the index *many times*, but the exception was still there. I then modified the method {{SSTableReader.keyAt(long indexPosition)}} to log the impacted partition:

{noformat}
try
{
    if (isKeyCacheSetup())
        cacheKey(key, rowIndexEntrySerializer.deserialize(in));
}
catch (IndexOutOfBoundsException ex)
{
    // Log the token and index position of the entry that failed to deserialize
    logger.error(String.format("Error when reading index entry for token '%s' at indexPosition %s ",
                               key.getToken().getTokenValue(), indexPosition));
}
{noformat}

Below is the output in the log after the code modification:

{noformat}
system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,843 SSTableReader.java:1830 - Error when reading index entry for token '-7005474773654630139' at indexPosition 2147457128
system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,917 SSTableReader.java:1830 - Error when reading index entry for token '-5016711186446865616' at indexPosition 2147458268
system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,918 SSTableReader.java:1830 - Error when reading index entry for token '1027994831942941747' at indexPosition 2147459218
{noformat}

I double-checked the original C* data using {{cqlsh}}, but it seems that there is no data for those tokens:

{noformat}
SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-7005474773654630139;

 dsr_id | rel_seq
--------+---------

(0 rows)

SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-5016711186446865616;

 dsr_id | rel_seq
--------+---------

(0 rows)

SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=1027994831942941747;

 dsr_id | rel_seq
--------+---------

(0 rows)
{noformat}

/cc [~xedin] [~beobal]
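
For anyone repeating this investigation on a larger log, here is a minimal sketch (plain Java, no Cassandra dependency) that extracts the token and index position from the error lines produced by the modified {{keyAt}} method. The class name {{IndexEntryErrorParser}} and its helper methods are hypothetical and not part of Cassandra. The sketch also reports how far each position is from {{Integer.MAX_VALUE}}, simply because the three positions logged above all fall just below that value. Whether that is related to the failure is not established by this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: parses the error lines logged by
// the modified SSTableReader.keyAt shown in this report.
public class IndexEntryErrorParser {
    // Mirrors the String.format message used in the logger.error call.
    private static final Pattern ENTRY = Pattern.compile(
        "Error when reading index entry for token '(-?\\d+)' at indexPosition (\\d+)");

    /** Token and index position extracted from one log line. */
    public static final class Entry {
        public final long token;
        public final long indexPosition;

        Entry(long token, long indexPosition) {
            this.token = token;
            this.indexPosition = indexPosition;
        }

        /** Distance between the position and the largest signed 32-bit value. */
        public long distanceToIntMax() {
            return Integer.MAX_VALUE - indexPosition;
        }
    }

    /** Returns one Entry per matching line; non-matching lines are skipped. */
    public static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find()) {
                entries.add(new Entry(Long.parseLong(m.group(1)),
                                      Long.parseLong(m.group(2))));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Message parts copied from the three log lines in this report.
        List<String> lines = Arrays.asList(
            "Error when reading index entry for token '-7005474773654630139' at indexPosition 2147457128",
            "Error when reading index entry for token '-5016711186446865616' at indexPosition 2147458268",
            "Error when reading index entry for token '1027994831942941747' at indexPosition 2147459218");
        for (Entry e : parse(lines)) {
            System.out.printf("token=%d indexPosition=%d distanceToIntMax=%d%n",
                              e.token, e.indexPosition, e.distanceToIntMax());
        }
    }
}
```

The parser only relies on the message format, so it works whether or not the log file prefix and timestamp are present on each line.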