[
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788238#comment-13788238
]
Constance Eustace commented on CASSANDRA-6137:
----------------------------------------------
The error is reappearing.
Nodetool repairs/compactions/etc triggering didn't do anything.
Perhaps this is yet another caching error, this time in the key caches. We're
going to try some restarts of the node or schemes to clean the caches to see if
this is an actual internal data representation problem.
> CQL3 SELECT IN CLAUSE inconsistent
> ----------------------------------
>
> Key: CASSANDRA-6137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE
> Reporter: Constance Eustace
> Fix For: 2.0.1
>
>
> We are encountering inconsistent results from CQL3 queries with column keys
> using IN clause in WHERE. This has been reproduced in cqlsh.
> Rowkey is e_entid
> Column key is p_prop
> This returns roughly 21 rows for 21 column keys that match p_prop.
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
> These three queries each return one row for the requested single column key
> in the IN clause:
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:complete:count');
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:all:count');
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:fail:count');
> This query returns ONLY ONE ROW (one column key), not three as I would expect
> from the three-column-key IN clause:
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
> This query does return two rows however for the requested two column keys:
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in (
>
> 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
> cqlsh> describe table internal_submission.entity_job;
> CREATE TABLE entity_job (
> e_entid text,
> p_prop text,
> describes text,
> dndcondition text,
> e_entlinks text,
> e_entname text,
> e_enttype text,
> ingeststatus text,
> ingeststatusdetail text,
> p_flags text,
> p_propid text,
> p_proplinks text,
> p_storage text,
> p_subents text,
> p_val text,
> p_vallang text,
> p_vallinks text,
> p_valtype text,
> p_valunit text,
> p_vars text,
> partnerid text,
> referenceid text,
> size int,
> sourceip text,
> submitdate bigint,
> submitevent text,
> userid text,
> version text,
> PRIMARY KEY (e_entid, p_prop)
> ) WITH
> bloom_filter_fp_chance=0.010000 AND
> caching='KEYS_ONLY' AND
> comment='' AND
> dclocal_read_repair_chance=0.000000 AND
> gc_grace_seconds=864000 AND
> index_interval=128 AND
> read_repair_chance=0.100000 AND
> replicate_on_write='true' AND
> populate_io_cache_on_flush='false' AND
> default_time_to_live=0 AND
> speculative_retry='NONE' AND
> memtable_flush_period_in_ms=0 AND
> compaction={'class': 'SizeTieredCompactionStrategy'} AND
> compression={'sstable_compression': 'LZ4Compressor'};
> CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
> CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job
> (dndcondition);
> CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job
> (ingeststatus);
> CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job
> (ingeststatusdetail);
> CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job
> (referenceid);
> CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
> CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version);
> -------------------------------
> My suspicion is that the three-column-key IN Clause is translated (improperly
> or not) to a two-column key range with the assumption that the third column
> key is present in that range, but it isn't...
--
This message was sent by Atlassian JIRA
(v6.1#6144)