[
https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795676#comment-13795676
]
Constance Eustace commented on CASSANDRA-6137:
----------------------------------------------
Some debugging of the following query:
SELECT * FROM wayfair_submission.entity_job WHERE e_entid =
'924d6742-31fd-11e3-97f7-001c42000009-CJOB' AND p_prop IN
('__CPSYS_name','urn:bby:pcm:ingest:status','subPropA:filttest:sdf','urn@bby@pcm@job@ingest@content@complete@count')
SelectRawStatement[name=wayfair_submission.entity_job, selectClause=[],
whereClause=[e_entid EQ '924d6742-31fd-11e3-97f7-001c42000009-CJOB', p_prop IN
['__CPSYS_name', 'urn:bby:pcm:ingest:status', 'subPropA:filttest:sdf',
'urn@bby@pcm@job@ingest@content@complete@count']], isCount=false,
--> Is the CF metadata properly returned for all the columns in the parsed
statement?
--> The query is handled as a Range/Slice of columns (SelectStatement:217).
--> A range must have a high and a low bound; are the right ones being selected?
[SliceFromReadCommand(table='wayfair_submission',
key='39323464363734322d333166642d313165332d393766372d3030316334323030303030392d434a4f42',
column_parent='QueryPath(columnFamilyName='entity_job',
superColumnName='null', columnName='null')', filter='SliceQueryFilter
[reversed=false, slices=[[000c5f5f43505359535f6e616d6500,
000c5f5f43505359535f6e616d6501],
[001573756250726f70413a66696c74746573743a73646600,
001573756250726f70413a66696c74746573743a73646601],
[001975726e3a6262793a70636d3a696e676573743a73746174757300,
001975726e3a6262793a70636d3a696e676573743a73746174757301],
[002d75726e406262794070636d406a6f6240696e6765737440636f6e74656e7440636f6d706c65746540636f756e7400,
002d75726e406262794070636d406a6f6240696e6765737440636f6e74656e7440636f6d706c65746540636f756e7401]],
count=10000, toGroup = 1]')]
SliceFromReadCommand
(table='wayfair_submission',
key='39323464363734322d333166642d313165332d393766372d3030316334323030303030392d434a4f42',
column_parent='QueryPath(columnFamilyName='entity_job',
superColumnName='null', columnName='null')',
filter='
SliceQueryFilter [
reversed=false,
slices=[
[000c5f5f43505359535f6e616d6500, 000c5f5f43505359535f6e616d6501],
[001573756250726f70413a66696c74746573743a73646600,
001573756250726f70413a66696c74746573743a73646601],
[001975726e3a6262793a70636d3a696e676573743a73746174757300,
001975726e3a6262793a70636d3a696e676573743a73746174757301],
[002d75726e406262794070636d406a6f6240696e6765737440636f6e74656e7440636f6d706c65746540636f756e7400,
002d75726e406262794070636d406a6f6240696e6765737440636f6e74656e7440636f6d706c65746540636f756e7401]
],
count=10000,
toGroup = 1]'
)
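Judging by the lengths of the four hex pairs in the slices, they are probably
the four requested column names: each pair looks like a start and an end bound
for one p_prop value, and the leading two bytes of each bound match the length
of the corresponding name (000c = 12, 0015 = 21, 0019 = 25, 002d = 45).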
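
To check that guess, here is a small sketch that decodes the slice bounds and the row key from the dump above. The byte layout it assumes (a 2-byte big-endian length, the UTF-8 name, then one trailing byte) is inferred from the hex itself, not taken from the Cassandra source, and decode_bound is just a helper name made up for this example.

```python
# Sketch: decode the slice start bounds and the row key from the dump above.
# Assumed layout (inferred from the hex, not from the Cassandra source):
#   2-byte big-endian length | UTF-8 name bytes | 1 trailing byte

def decode_bound(hex_bound):
    """Return (name, trailing byte as hex) for one slice bound."""
    raw = bytes.fromhex(hex_bound)
    length = int.from_bytes(raw[:2], "big")
    name = raw[2:2 + length].decode("utf-8")
    trailer = raw[2 + length:]
    return name, trailer.hex()

# Start bounds copied from the SliceQueryFilter dump.
SLICE_STARTS = [
    "000c5f5f43505359535f6e616d6500",
    "001573756250726f70413a66696c74746573743a73646600",
    "001975726e3a6262793a70636d3a696e676573743a73746174757300",
    "002d75726e406262794070636d406a6f6240696e6765737440636f6e74656e74"
    "40636f6d706c65746540636f756e7400",
]

# Row key copied from the same dump; it is plain hex-encoded text.
ROW_KEY_HEX = (
    "39323464363734322d333166642d313165332d393766372d"
    "3030316334323030303030392d434a4f42"
)

decoded = [decode_bound(h) for h in SLICE_STARTS]
row_key = bytes.fromhex(ROW_KEY_HEX).decode("utf-8")

for name, trailer in decoded:
    print(name, trailer)
print(row_key)
```

The decoded names are the four values from the IN clause, in sorted order, and the key is the e_entid from the query, so the slices do seem to target the right columns.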
Row(
key=
DecoratedKey(8705314879532960628,
39323464363734322d333166642d313165332d393766372d3030316334323030303030392d434a4f42),
cf=ColumnFamily(
entity_job [
__CPSYS_name::false:0@1381445090517000,
__CPSYS_name:e_entname:false:10@1381445090517000,
subPropA\:filttest\:sdf::false:0@1381445090517000,
subPropA\:filttest\:sdf:p_flags:false:3@1381445090517000,
subPropA\:filttest\:sdf:p_propid:false:36@1381445090517000,
subPropA\:filttest\:sdf:p_proplinks:false:2@1381445090517000,
subPropA\:filttest\:sdf:p_subents:false:2@1381445090517000,
subPropA\:filttest\:sdf:p_val:false:22@1381445090517000,
subPropA\:filttest\:sdf:p_vallang:false:5@1381445090517000,
subPropA\:filttest\:sdf:p_vallinks:false:2@1381445090517000,
subPropA\:filttest\:sdf:p_valtype:false:4@1381445090517000,
subPropA\:filttest\:sdf:p_valunit:false:1@1381445090517000,
subPropA\:filttest\:sdf:p_vars:false:84@1381445090517000,
urn\:bby\:pcm\:ingest\:status:ingeststatus:false:4@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_flags:false:3@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_propid:false:36@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_proplinks:false:2@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_subents:false:2@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_vallinks:false:2@1381445090517000,
urn\:bby\:pcm\:ingest\:status:p_vars:false:70@1381445090517000,
]
)
)
... Does that mean the CF metadata doesn't have the columns for the last one
(the fourth requested p_prop, which does not appear in the row above), or is
that just the data value?
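
A rough way to make that comparison explicit is to pull the first name component out of each dumped column and diff it against the IN list. This is only a sketch: it assumes the dump separates name components with ':' and escapes a literal ':' as '\:', which is how the entries above look, and it uses one sample entry per prefix rather than the whole dump. It shows what is absent from the returned row, not whether the data exists on disk.

```python
import re

# The four p_prop values from the IN clause.
REQUESTED = [
    "__CPSYS_name",
    "urn:bby:pcm:ingest:status",
    "subPropA:filttest:sdf",
    "urn@bby@pcm@job@ingest@content@complete@count",
]

# One sample entry per distinct prefix, copied from the Row dump above.
DUMPED = [
    r"__CPSYS_name::false:0@1381445090517000",
    r"subPropA\:filttest\:sdf:p_val:false:22@1381445090517000",
    r"urn\:bby\:pcm\:ingest\:status:ingeststatus:false:4@1381445090517000",
]

def clustering_value(entry):
    """Return the first name component, with escaped colons restored."""
    # Split on the first ':' that is not preceded by a backslash.
    first = re.split(r"(?<!\\):", entry, maxsplit=1)[0]
    return first.replace("\\:", ":")

found = {clustering_value(e) for e in DUMPED}
missing = [p for p in REQUESTED if p not in found]
print(missing)  # ['urn@bby@pcm@job@ingest@content@complete@count']
```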
> CQL3 SELECT IN CLAUSE inconsistent
> ----------------------------------
>
> Key: CASSANDRA-6137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6137
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE
> Reporter: Constance Eustace
> Fix For: 2.0.1
>
>
> We are encountering inconsistent results from CQL3 queries that use an IN
> clause on column keys in the WHERE clause. This has been reproduced in cqlsh
> and the JDBC driver.
> Rowkey is e_entid
> Column key is p_prop
> This returns roughly 21 rows for 21 column keys that match p_prop.
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB';
> These three queries each return one row for the requested single column key
> in the IN clause:
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:complete:count');
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:all:count');
> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:fail:count');
> This query returns ONLY ONE ROW (one column key), not three as I would expect
> from the three-column-key IN clause:
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
> This query, however, does return two rows for the two requested column keys:
> cqlsh> SELECT
> e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars
> FROM internal_submission.Entity_Job WHERE e_entid =
> '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in
> ('urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count');
> cqlsh> describe table internal_submission.entity_job;
> CREATE TABLE entity_job (
> e_entid text,
> p_prop text,
> describes text,
> dndcondition text,
> e_entlinks text,
> e_entname text,
> e_enttype text,
> ingeststatus text,
> ingeststatusdetail text,
> p_flags text,
> p_propid text,
> p_proplinks text,
> p_storage text,
> p_subents text,
> p_val text,
> p_vallang text,
> p_vallinks text,
> p_valtype text,
> p_valunit text,
> p_vars text,
> partnerid text,
> referenceid text,
> size int,
> sourceip text,
> submitdate bigint,
> submitevent text,
> userid text,
> version text,
> PRIMARY KEY (e_entid, p_prop)
> ) WITH
> bloom_filter_fp_chance=0.010000 AND
> caching='KEYS_ONLY' AND
> comment='' AND
> dclocal_read_repair_chance=0.000000 AND
> gc_grace_seconds=864000 AND
> index_interval=128 AND
> read_repair_chance=0.100000 AND
> replicate_on_write='true' AND
> populate_io_cache_on_flush='false' AND
> default_time_to_live=0 AND
> speculative_retry='NONE' AND
> memtable_flush_period_in_ms=0 AND
> compaction={'class': 'SizeTieredCompactionStrategy'} AND
> compression={'sstable_compression': 'LZ4Compressor'};
> CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes);
> CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job
> (dndcondition);
> CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job
> (ingeststatus);
> CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job
> (ingeststatusdetail);
> CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job
> (referenceid);
> CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid);
> CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version);
> -------------------------------
> My suspicion is that the three-column-key IN clause is translated (improperly
> or not) into a two-column-key range, on the assumption that the third column
> key falls within that range, but it doesn't...