[ 
https://issues.apache.org/jira/browse/CASSANDRA-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231340#comment-15231340
 ] 

DOAN DuyHai commented on CASSANDRA-11525:
-----------------------------------------

Ok I'm uploading to 
https://drive.google.com/folderview?id=0B6wR2aj4Cb6wYWdfcl9Pb05CYW8&usp=sharing

There are 9 SSTables and 9 index files

The SSTable size sums up to more than 20Gb

It will take 2h at least for the upload to finish

I will rebuild the index with 3.4 tomorrow and re-test.


> SASI index corruption
> ---------------------
>
>                 Key: CASSANDRA-11525
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11525
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>         Environment: Cassandra 3.5-SNAPSHOT
>            Reporter: DOAN DuyHai
>
> Bug reproduced in *Cassandra 3.5-SNAPSHOT* (after the fix of OOM)
> {noformat}
> create table if not exists test.resource_bench ( 
>  dsr_id uuid,
>  rel_seq bigint,
>  seq bigint,
>  dsp_code varchar,
>  model_code varchar,
>  media_code varchar,
>  transfer_code varchar,
>  commercial_offer_code varchar,
>  territory_code varchar,
>  period_end_month_int int,
>  authorized_societies_txt text,
>  rel_type text,
>  status text,
>  dsp_release_code text,
>  title text,
>  contributors_name list<text>,
>  unic_work text,
>  paying_net_qty bigint,
> PRIMARY KEY ((dsr_id, rel_seq), seq)
> ) WITH CLUSTERING ORDER BY (seq ASC); 
> CREATE CUSTOM INDEX resource_period_end_month_int_idx ON test.resource_bench 
> (period_end_month_int) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH 
> OPTIONS = {'mode': 'PREFIX'};
> {noformat}
> So the index is a {{DENSE}} numerical index.
> When doing the request {{SELECT dsp_code, unic_work, paying_net_qty FROM 
> test.resource_bench WHERE period_end_month_int = 201401}} using server-side 
> paging.
> I bumped into this stack trace:
> {noformat}
> WARN  [SharedPool-Worker-1] 2016-04-06 00:00:30,825 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.ArrayIndexOutOfBoundsException: -55
>       at 
> org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:268)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:128) 
> ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:120) 
> ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:148)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.io.sstable.format.SSTableReader.keyAt(SSTableReader.java:1823)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:168)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:155)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:518)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:504)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.utils.AbstractIterator.tryToComputeNext(AbstractIterator.java:116)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.utils.AbstractIterator.hasNext(AbstractIterator.java:110)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:106)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:71)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
>       at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289)
>  ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
> {noformat}
> There are 2 possible root cause:
> 1. Index corrupted
> 2. Raw SSTable is corrupted
> To rule out *scenario 1*, I just drop and rebuild the index *many times* but 
> the exception was still there, so I modified the method 
> {{SSTableReader.keyAt(long indexPosition)}} to log the impacted partition:
> {noformat}
>             try
>             {
>                 if (isKeyCacheSetup())
>                     cacheKey(key, rowIndexEntrySerializer.deserialize(in));
>             } catch (IndexOutOfBoundsException ex)
>             {
>                 logger.error(String.format(
>                 "Error when reading index entry for token '%s' at 
> indexPosition %s ",
>                 key.getToken().getTokenValue(), indexPosition));
>             }
> {noformat}
> Below are the output in the log after code modification:
> {noformat}
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 
> 17:08:28,843 SSTableReader.java:1830 - Error when reading index entry for 
> token '-7005474773654630139' at indexPosition 2147457128
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 
> 17:08:28,917 SSTableReader.java:1830 - Error when reading index entry for 
> token '-5016711186446865616' at indexPosition 2147458268
> system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 
> 17:08:28,918 SSTableReader.java:1830 - Error when reading index entry for 
> token '1027994831942941747' at indexPosition 2147459218
> {noformat}
> I double check the original C* data using {{cqlsh}} but it seems that there 
> is no data for those tokens:
> {noformat}
> SELECT dsr_id,rel_seq FROM resource_bench WHERE 
> token(dsr_id,rel_seq)=-7005474773654630139;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
>  SELECT dsr_id,rel_seq FROM resource_bench WHERE 
> token(dsr_id,rel_seq)=-5016711186446865616;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
> SELECT dsr_id,rel_seq FROM resource_bench WHERE 
> token(dsr_id,rel_seq)=1027994831942941747;
>  dsr_id | rel_seq
> --------+---------
> (0 rows)
> {noformat}
> /cc [~xedin] [~beobal]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to