[
https://issues.apache.org/jira/browse/CASSANDRA-12962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Corentin Chary updated CASSANDRA-12962:
---------------------------------------
Description:
Apparently, when Cassandra restarts, any index that does not index a value in *every* live
SSTable gets rebuilt. The offending code can be found in the constructor of
SASIIndex.
You can easily reproduce it:
{code}
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
'replication_factor': '1'} AND durable_writes = true;
CREATE TABLE test.test (
a text PRIMARY KEY,
b text,
c text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX test_c_idx ON test.test (c) USING
'org.apache.cassandra.index.sasi.SASIIndex';
CREATE CUSTOM INDEX test_b_idx ON test.test (b) USING
'org.apache.cassandra.index.sasi.SASIIndex';
{code}
Log (I added additional traces):
{code}
INFO [main] 2016-11-28 15:32:21,191 ColumnFamilyStore.java:406 - Initializing
test.test
DEBUG [SSTableBatchOpen:1] 2016-11-28 15:32:21,192 SSTableReader.java:505 -
Opening
/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big
(0.034KiB)
DEBUG [main] 2016-11-28 15:32:21,194 SASIIndex.java:118 - index:
org.apache.cassandra.schema.IndexMetadata@2f661b1a[id=6b00489b-7010-396e-9348-9f32f5167f88,name=test_b_idx,kind=CUSTOM,options={class_name=org.a\
pache.cassandra.index.sasi.SASIIndex, target=b}], base CFS(Keyspace='test',
ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
INFO [main] 2016-11-28 15:32:21,194 DataTracker.java:152 -
SSTableIndex.open(column: b, minTerm: value, maxTerm: value, minKey: key,
maxKey: key, sstable: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test\
-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'))
DEBUG [main] 2016-11-28 15:32:21,195 SASIIndex.java:129 - Rebuilding SASI
Indexes: {}
DEBUG [main] 2016-11-28 15:32:21,195 ColumnFamilyStore.java:895 - Enqueuing
flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204
Memtable.java:465 - Writing Memtable-IndexInfo@748981977(0.054KiB serialized
bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
372036854775808), max(9223372036854775807)]
DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204
Memtable.java:494 - Completed flushing
/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db
(0.035KiB) for\
commitlog position CommitLogPosition(segmentId=1480343535479, position=15652)
DEBUG [MemtableFlushWriter:1] 2016-11-28 15:32:21,224
ColumnFamilyStore.java:1200 - Flushed to
[BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db\
')] (1 sstables, 4.838KiB), biggest 4.838KiB, smallest 4.838KiB
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:118 - index:
org.apache.cassandra.schema.IndexMetadata@12f3d291[id=45fcb286-b87a-3d18-a04b-b899a9880c91,name=test_c_idx,kind=CUSTOM,options={class_name=org.a\
pache.cassandra.index.sasi.SASIIndex, target=c}], base CFS(Keyspace='test',
ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:121 - to rebuild: index:
BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'),
sstable: org.apache.cassa\
ndra.index.sasi.conf.ColumnIndex@6cbb6b0e
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:129 - Rebuilding SASI
Indexes:
{BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db')={c=org.apache.cassa\
ndra.index.sasi.conf.ColumnIndex@6cbb6b0e}}
DEBUG [main] 2016-11-28 15:32:21,225 ColumnFamilyStore.java:895 - Enqueuing
flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235
Memtable.java:465 - Writing Memtable-IndexInfo@951411443(0.054KiB serialized
bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
372036854775808), max(9223372036854775807)]
DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235
Memtable.java:494 - Completed flushing
/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db
(0.035KiB) for\
commitlog position CommitLogPosition(segmentId=1480343535479, position=15720)
DEBUG [MemtableFlushWriter:2] 2016-11-28 15:32:21,254
ColumnFamilyStore.java:1200 - Flushed to
[BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db\
')] (1 sstables, 4.836KiB), biggest 4.836KiB, smallest 4.836KiB
{code}
I think a better behavior would be to ask users to explicitly rebuild indexes
if they remove the files; that's fine as long as we correctly handle the case
of new indexes.
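To make the reported behavior concrete, here is a minimal Python model of the startup check as this report describes it. It is only a sketch under stated assumptions, not the actual SASIIndex code: the function name `sstables_to_rebuild` and the `indexed` mapping (index column to the set of SSTables that have an on-disk index for it) are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Set


def sstables_to_rebuild(
    live_sstables: List[str],
    indexed: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Hypothetical model of the startup check described in this report.

    live_sstables: names of the live SSTables of the base table.
    indexed: for each indexed column, the SSTables that have an on-disk
             index for it (assumed absent when the SSTable holds no value
             for that column).
    Returns, per SSTable, the columns whose index would be rebuilt.
    """
    to_rebuild: Dict[str, List[str]] = {}
    for sstable in live_sstables:
        # A column is scheduled for rebuild when this SSTable has no index for it.
        missing = sorted(
            column for column, covered in indexed.items() if sstable not in covered
        )
        if missing:
            to_rebuild[sstable] = missing
    return to_rebuild


# Mirrors the log above: mc-1-big has an index for b but none for c.
live = ["mc-1-big"]
indexed = {"b": {"mc-1-big"}, "c": set()}
print(sstables_to_rebuild(live, indexed))
```

In this model the index on c is scheduled for rebuild on every start because no index for it ever exists for that SSTable, which matches the "Rebuilding SASI Indexes" entries in the log for test_c_idx versus test_b_idx.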
> SASI: Index are rebuilt on restart
> ----------------------------------
>
> Key: CASSANDRA-12962
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12962
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Reporter: Corentin Chary