[jira] [Resolved] (CASSANDRA-7854) Unable to select partition keys directly using IN keyword (no replacement for multi row multiget in thrift)

2015-01-29 Thread Sylvain Lebresne (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-7854.
-
Resolution: Duplicate

bq. The issue is not really a duplicate of 7855.

You're right, it's more of a duplicate of CASSANDRA-6875. At least the intent 
expressed by Todd is.

What Todd was really asking for here was to be able to use an IN on the
partition keys and an IN on the clustering columns (which was allowed by
CASSANDRA-6875). It is true that the made-up syntax in the description uses a
single IN spanning both the partition key and the clustering columns, which is
not strictly equivalent to the former, but Todd clearly didn't intend that
generality: the ticket is in the context of replacing thrift multiget, and
thrift multiget has never supported the more general form implied by an IN on
the full primary key.

So I'm closing again as duplicate to avoid the confusion of reusing it for 
something it wasn't intended for.

We can create a separate issue for IN on the full primary key, *but* I'm not so
sure it's a good idea: it's yet another form of multi-partition query, and we
discourage those (doing multiple one-partition queries concurrently is a
better idea in practice). We kind of had to support as much as multiget was
giving us with thrift for political reasons, but adding a new form that we'll
spend our time discouraging, when that form was never supported and never
asked for (to the best of my knowledge), doesn't sound too compelling.
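As a rough illustration of "multiple one-partition queries concurrently", the
sketch below fans out one read per partition key from the client and collects
the results. It is only a sketch: `fetch_one`, `fetch_many` and `FAKE_TABLE`
are hypothetical stand-ins, and the in-memory dict replaces a real
single-partition SELECT issued through whatever driver is in use.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a real table: full partition key -> timestamp (made-up data).
FAKE_TABLE = {
    ("scope-1", "organization", "node-a", "test"): 1409529600,
    ("scope-1", "organization", "node-b", "test"): 1409529601,
}

def fetch_one(key):
    """Hypothetical single-partition read. A real client would run one
    SELECT restricted by equality on every partition key column here."""
    return key, FAKE_TABLE.get(key)

def fetch_many(keys, max_workers=8):
    """Issue one single-partition read per key concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch_one, keys))

keys = list(FAKE_TABLE) + [("scope-1", "organization", "node-c", "test")]
results = fetch_many(keys)
for key, timestamp in results.items():
    print(key, timestamp)  # a missing partition shows up as None
```

Each request touches a single partition, and the client decides how much
concurrency to apply via `max_workers`.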


 Unable to select partition keys directly using IN keyword (no replacement for 
 multi row multiget in thrift)
 ---

 Key: CASSANDRA-7854
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7854
 Project: Cassandra
 Issue Type: Bug
 Reporter: Todd Nine
 Assignee: Benjamin Lerer

 We're converting some old thrift CF's to CQL.  We aren't looking to change 
 the underlying physical structure, since this has proven effective in 
 production.  In order to migrate, we need full select via multi equivalent.  
 In thrift, the format was as follows.
 (scopeId, scopeType, nodeId, nodeType){ 0x00, timestamp }
 Where we have deliberately designed only 1 column per row.  To translate this 
 to CQL, I have defined the following table.
 {code}
 CREATE TABLE Graph_Marked_Nodes ( 
  scopeId uuid,
  scopeType varchar,
  nodeId uuid,
  nodeType varchar,
  timestamp bigint,
  PRIMARY KEY(scopeId, scopeType, nodeId, nodeType)
 )
 {code}
 I then try to select using the IN keyword.
 {code}
 select timestamp from Graph_Marked_Nodes
 WHERE (scopeId, scopeType, nodeId, nodeType) IN (
   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a3a2708-3181-11e4-a87e-600308a690e2, 'test'),
   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a3a2709-3181-11e4-a87e-600308a690e2, 'test'),
   (5a391596-3181-11e4-a87e-600308a690e2, 'organization', 5a39fff7-3181-11e4-a87e-600308a690e2, 'test')
 )
 {code}
 Which results in the following stack trace
 {code}
 Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Multi-column relations can only be applied to clustering columns: scopeid
   at com.datastax.driver.core.Responses$Error.asException(Responses.java:97)
   at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:110)
   at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
   at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:367)
   at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:584)
 {code}
 This is still possible via the thrift API.  Apologies in advance if I've 
 filed this erroneously.  I can't find any examples of this type of query 
 anywhere.
 Note that our size grows far too large to fit in a single physical partition 
 (row) if we use only scopeId and scopeType, so we need all 4 data elements to 
 be part of our partition key to ensure we have the distribution we need.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7854) Unable to select partition keys directly using IN keyword (no replacement for multi row multiget in thrift)

2014-09-01 Thread Sylvain Lebresne (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-7854.
-
Resolution: Duplicate

My bad, I missed that you had opened this and I've opened CASSANDRA-7855. I'm 
closing this one as duplicate then because the description of CASSANDRA-7855 is 
slightly more terse (hope you don't mind).

bq. This is still possible via the thrift API.

To be entirely fair, the thrift API has no native notion of a composite row 
key. You can use a blob for your partition key in CQL and shove the composite 
information in there, in which case you can do multi-get and that's completely 
equivalent (including as inconvenient) to what you'd do in thrift.
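For illustration, a minimal sketch of the "shove composite information in a
blob" approach, done client-side in Python. The length-prefixed layout and the
`pack_key` / `unpack_key` helpers are assumptions made up for this example;
they are not Cassandra's own composite encoding. The resulting bytes would be
used as the value of a single blob partition key column, on which a plain IN
can then be applied.

```python
import struct
import uuid

def pack_key(scope_id, scope_type, node_id, node_type):
    """Pack four components into one blob, each prefixed with a 2-byte
    big-endian length. Illustrative layout only."""
    parts = [scope_id.bytes, scope_type.encode("utf-8"),
             node_id.bytes, node_type.encode("utf-8")]
    return b"".join(struct.pack(">H", len(p)) + p for p in parts)

def unpack_key(blob):
    """Reverse pack_key: split the blob back into its four components."""
    parts, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        parts.append(blob[offset:offset + length])
        offset += length
    return (uuid.UUID(bytes=parts[0]), parts[1].decode("utf-8"),
            uuid.UUID(bytes=parts[2]), parts[3].decode("utf-8"))

scope = uuid.UUID("5a391596-3181-11e4-a87e-600308a690e2")
node = uuid.UUID("5a3a2708-3181-11e4-a87e-600308a690e2")
blob = pack_key(scope, "organization", node, "test")
print(len(blob))          # 56 bytes: four 2-byte prefixes plus 16 + 12 + 16 + 4
print(unpack_key(blob))
```

The inconvenience mentioned above is visible here: the application, not the
database, owns the encoding and has to pack and unpack every key itself.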




