Hi,
We are having an issue with TTL on secondary index columns. Queries on
indexed columns that have a TTL return 0 rows.
Everything works fine with small amounts of data, but once we get over
a certain threshold, older rows appear to disappear from the index.
In the example below we create 70 rows with 45k columns each, plus one
indexed column whose value is just the row key, so there is one row per
indexed value. When the script has finished, the index contains only
rows 66-69. Rows 0-65 are gone from the index.
Writing 'indexedColumn' without a TTL fixes the problem.
------------- SCHEMA START -----------------
create keyspace ks123
with placement_strategy = 'NetworkTopologyStrategy'
and strategy_options = {datacenter1 : 1}
and durable_writes = true;
use ks123;
create column family cf1
with column_type = 'Standard'
and comparator = 'AsciiType'
and default_validation_class = 'AsciiType'
and key_validation_class = 'AsciiType'
and read_repair_chance = 0.1
and dclocal_read_repair_chance = 0.0
and gc_grace = 864000
and min_compaction_threshold = 4
and max_compaction_threshold = 32
and replicate_on_write = true
and compaction_strategy =
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
and caching = 'KEYS_ONLY'
and column_metadata = [
{column_name : 'indexedColumn',
validation_class : AsciiType,
index_name : 'INDEX1',
index_type : 0}]
and compression_options = {'sstable_compression' :
'org.apache.cassandra.io.compress.SnappyCompressor'};
------------- SCHEMA FINISH -----------------
------------- POPULATE START -----------------
import pycassa
from pycassa.batch import Mutator

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')
for rowKey in xrange(70):
    # One batch per row: 45000 data columns plus the indexed column.
    b = Mutator(pool)
    for datapoint in xrange(1, 45001):
        b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
    # The indexed column holds the row key and has a TTL one hour longer.
    b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
    print 'row %d' % rowKey
    b.send()
pool.dispose()
------------- POPULATE FINISH -----------------
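For a sense of scale, here is a quick back-of-the-envelope calculation using only the numbers from the script above. The variable names are just for illustration.

```python
# Figures taken from the populate script.
ROWS = 70
DATA_COLUMNS_PER_ROW = 45000
DATA_TTL = 7884000    # seconds, TTL on the data columns
INDEX_TTL = 7887600   # seconds, TTL on 'indexedColumn'

SECONDS_PER_DAY = 86400.0

# Each row holds the data columns plus one indexed column.
total_columns = ROWS * (DATA_COLUMNS_PER_ROW + 1)
data_ttl_days = DATA_TTL / SECONDS_PER_DAY
index_ttl_days = INDEX_TTL / SECONDS_PER_DAY
ttl_gap_seconds = INDEX_TTL - DATA_TTL

print('total columns written: %d' % total_columns)
print('data TTL: %.2f days, index TTL: %.2f days' % (data_ttl_days, index_ttl_days))
print('index TTL exceeds data TTL by %d seconds' % ttl_gap_seconds)
```

So both TTLs are roughly three months, far longer than the time the script takes to run, and the indexed column is set to expire one hour after the data columns.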
------------- QUERY START -----------------
[default@ks123] get cf1 where 'indexedColumn'='65';
0 Row Returned.
Elapsed time: 2.38 msec(s).
[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
...
=> (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
=> (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
1 Row Returned.
Elapsed time: 31 msec(s).
------------- QUERY FINISH -----------------
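To check every indexed value instead of querying them one at a time in the CLI, something along these lines could be used. This is only a sketch: the helper names (missing_index_values, pycassa_lookup) are made up for this example, and it assumes pycassa's create_index_expression, create_index_clause and ColumnFamily.get_indexed_slices behave as documented.

```python
def missing_index_values(lookup, values):
    """Return the values for which the index lookup yields no rows.

    `lookup` is any callable that takes an indexed value and returns an
    iterable of matching rows.
    """
    missing = []
    for value in values:
        # any() stops at the first row, so we never read a full result set.
        if not any(True for _ in lookup(value)):
            missing.append(value)
    return missing


def pycassa_lookup(cf, column='indexedColumn'):
    """Build a lookup that queries the secondary index through pycassa."""
    # Imported lazily so missing_index_values() is usable without pycassa.
    from pycassa.index import create_index_clause, create_index_expression

    def lookup(value):
        expr = create_index_expression(column, value)
        clause = create_index_clause([expr], count=1)
        return cf.get_indexed_slices(clause)

    return lookup


# Hypothetical usage against the keyspace and column family from the schema:
#   pool = pycassa.ConnectionPool('ks123')
#   cf = pycassa.ColumnFamily(pool, 'cf1')
#   keys = [str(k) for k in range(70)]
#   print(missing_index_values(pycassa_lookup(cf), keys))
```

If the behaviour matches what is described above, this should report '0' through '65' as missing and nothing for 66-69.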
This is all using Cassandra 1.1.7 with default settings.
Best regards,
Alexei Bakanov