[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301074#comment-14301074 ]
mlowicki commented on CASSANDRA-8712:
-------------------------------------
1. Drop keyspace
{code}
cqlsh> use sync;
cqlsh:sync> drop keyspace sync;
cqlsh:sync>
{code}
2. Create keyspace from scratch (I'm using sync_cassandra from
django-cassandra-engine)
{code}
./bin/django sync_cassandra
Creating keyspace sync..
Syncing sync.api.models.Entity
Syncing sync.api.models.UserStore
{code}
3. Populate database using Django's shell
{code}
>>> from sync.api.models import Entity, UserStore
>>> user = UserStore.objects.create(user_id='foo')
>>> root = Entity.objects.create(user_id='foo', data_type_id=0, version=0,
...     id='-1')
{code}
4. Run {{check_parent_index_consistency}} script:
{code}
./bin/django check_parent_index_consistency
{
"folder": 1,
"user": 1
}
{code}
5. Add entities to root folder
{code}
>>> for i in range(10000):
...     Entity.objects.create(user_id='foo', data_type_id=0, version=0,
...         id='a' + str(i), parent_id='-1', folder=False)
{code}
6. While the inserts are running, run the {{check_parent_index_consistency}} script again:
{code}
./bin/django check_parent_index_consistency
{
"folder": 1,
"inconsistent_folder": 1,
"user": 1
}
{code}
While the insert was running, querying {{entity}} directly returned 8918
entities, but the index returned only 372.
It seems to be related to the number of entities I'm adding: with fewer than
10000 I couldn't reproduce the issue. When I ran the
{{check_parent_index_consistency}} script again a couple of minutes later, it
was completely fine, with no inconsistencies. I'm not sure whether this is the
same issue, since the number of inconsistencies drops to zero after some time,
but maybe it'll help.
{{check_parent_index_consistency}} is available at https://cpaste.org/p7zht9rli
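For readers who cannot open the paste, the kind of comparison described above can be sketched roughly as follows. This is not the actual {{check_parent_index_consistency}} script: the function name {{find_inconsistent_folders}}, its arguments, and the sample data are hypothetical, and in-memory stand-ins replace the real Cassandra queries. The idea is only to count children per {{parent_id}} from a direct scan and compare that with what an index lookup returns.

```python
from collections import Counter


def find_inconsistent_folders(rows, index_lookup):
    """Return {parent_id: (direct_count, index_count)} where the counts differ.

    rows         -- iterable of (entity_id, parent_id) pairs from a direct scan
    index_lookup -- callable mapping a parent_id to the entity ids the index returns
    """
    # Count children per parent from the direct scan; skip rows without a parent.
    direct = Counter(parent_id for _, parent_id in rows if parent_id)
    mismatches = {}
    for parent_id, direct_count in direct.items():
        index_count = len(set(index_lookup(parent_id)))
        if index_count != direct_count:
            mismatches[parent_id] = (direct_count, index_count)
    return mismatches


# Hypothetical stand-in data: five children written, index has seen only two.
rows = [("-1", None)] + [("a%d" % i, "-1") for i in range(5)]
stale_index = {"-1": ["a0", "a1"]}

report = find_inconsistent_folders(rows, lambda pid: stale_index.get(pid, []))
print(report)
```

In a real check, {{rows}} would come from reading the user's partition directly and {{index_lookup}} from a query that filters on {{parent_id}}; both are assumptions about how such a script might be wired up.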
> Out-of-sync secondary index
> ---------------------------
>
> Key: CASSANDRA-8712
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8712
> Project: Cassandra
> Issue Type: Bug
> Environment: 2.1.2
> Reporter: mlowicki
> Fix For: 2.1.3
>
>
> I have the following table with an index:
> {code}
> CREATE TABLE entity (
> user_id text,
> data_type_id int,
> version bigint,
> id text,
> cache_guid text,
> client_defined_unique_tag text,
> ctime timestamp,
> deleted boolean,
> folder boolean,
> mtime timestamp,
> name text,
> originator_client_item_id text,
> parent_id text,
> position blob,
> server_defined_unique_tag text,
> specifics blob,
> PRIMARY KEY (user_id, data_type_id, version, id)
> ) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32'}
> AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> CREATE INDEX index_entity_parent_id ON entity (parent_id);
> {code}
> It turned out that the index went out of sync:
> {code}
> >>> Entity.objects.filter(user_id='255824802',
> ...     parent_id=parent_id).consistency(6).count()
> 16
>
> >>> counter = 0
> >>> for e in Entity.objects.filter(user_id='255824802'):
> ... if e.parent_id and e.parent_id == parent_id:
> ... counter += 1
> ...
> >>> counter
> 10
> {code}
> After a couple of hours (at night) it was fine, but then, probably when the
> user started to interact with the DB again, we got the same problem. As a
> temporary solution we'll try to rebuild the indexes from time to time, as
> suggested in
> http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/
> I launched a simple script to check for this anomaly: before rebuilding the
> index, 10378 of 4024856 folders had this issue.