What Happened To Alternate Storage And Rocksandra?
Hi,

I remember that a couple of years ago there was some noise about Rocksandra (Cassandra using RocksDB for storage) and about opening up Cassandra to alternate storage mechanisms. I haven't seen anything about it for a while now, though. The last commit to Rocksandra on GitHub was in Nov 2019. The associated JIRA items (CASSANDRA-13474 and CASSANDRA-13476) haven't had any activity since 2019 either.

I was wondering whether anyone knew anything about it. Was it decided that this wasn't a good idea after all (the alleged performance differences weren't worth it...or were exaggerated)? Or is it just that it may still be a good idea, but there are no resources available to make it happen (e.g. perhaps the original sponsor moved on to other things)?

I ask because I was looking at RocksDB/Kafka Streams for another project (which may replace some functionality that currently uses Cassandra)...and was wondering if there could be some important info about RocksDB I may be missing.

thanks in advance,
Gareth Collins
Re: Stumped By Cassandra delays
Hi Shalom,

Thanks very much for the response!

We are only using batches on one Cassandra partition to improve performance. Batches are NEVER used in this app across Cassandra partitions. And if you look at the trace messages I showed, there is only one statement per batch anyway.

In fact, what I see in the trace is that the responses to the writes may be being held up by the reads. Here is a more complete example which is consistent across nodes. We are using datastax client 3.1.2. Note that all the requests appear to be processed on nio-worker-5, which suggests that this may all be on the one connection (even though I can see two connections to each C* server from each client):

2018-07-20 05:32:43,185 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 9322 ms: [2 bound values] select a, b, c, d from where token(a)>? and token(a)<=?; << slow read
2018-07-20 05:32:43,185 [luster1-nio-worker-5] [ ] [] [ ] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 5950 ms: [1 statements, 6 bound values] BEGIN BATCH INSERT INTO (a, b, c, d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write response received immediately after the read
2018-07-20 05:32:43,185 [luster1-nio-worker-5] [ ] [] [ ] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 511 ms: [1 statements, 6 bound values] BEGIN BATCH INSERT INTO (a, b, c, d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write response received immediately after the read
2018-07-20 05:32:43,607 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.NORMAL) DEBUG - [cluster1] [/10.123.4.52:9042] Query completed normally, took 33 ms: [2 bound values] select CustomerID, ds_, data_, AudienceList from data.customer_b01be157931bcbfa32b7f240a638129d where token(CustomerID)>? and token(CustomerID)<=?; << normal read
2018-07-20 05:32:45,938 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 1701 ms: [2 bound values] select a, b, c, d from where token(a)>? and token(a)<=?; << slow read
2018-07-20 05:32:46,257 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.NORMAL) DEBUG - [cluster1] [/10.123.4.52:9042] Query completed normally, took 0 ms: [1 statements, 6 bound values] BEGIN BATCH INSERT INTO (a, b, c, d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << normal write – no overlap with the read
2018-07-20 05:32:46,336 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.NORMAL) DEBUG - [cluster1] [/10.123.4.52:9042] Query completed normally, took 30 ms: [2 bound values] select a, b, c, d from where token(a)>? and token(a)<=?; << normal read
2018-07-20 05:32:48,622 [luster1-nio-worker-5] [ ] [ ] [] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 1626 ms: [2 bound values] select a, b, c, d from where token(a)>? and token(a)<=?; << slow read
2018-07-20 05:32:48,622 [luster1-nio-worker-5] [ ] [] [ ] ( core.QueryLogger.SLOW) DEBUG - [cluster1] [/10.123.4.52:9042] Query too slow, took 425 ms: [1 statements, 6 bound values] BEGIN BATCH INSERT INTO (a, b, c, d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write appears immediately after the read

I would suspect some sort of bug on the client holding up the thread...but I don't know why I would only have a problem on one C* node at any one time (the clients process reads and writes to other nodes at the same time without delays).

thanks in advance,
Gareth

On Sun, Jul 22, 2018 at 4:12 AM, shalom sagges wrote:
> Hi Gareth,
>
> If you're using batches for multiple partitions, this may be the root
> cause you've been looking for.
> > https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/ > > If batches are optimally used and only one node is misbehaving, check if > NTP on the node is properly synced. > > Hope this helps! > > > On Sat, Jul 21, 2018 at 9:31 PM, Gareth Collins < > gareth.o.coll...@gmail.com> wrote: > >> Hello, >> >> We are running Cassandra 2.1.14 in AWS, with c5.4xlarge machines >> (initially these were m4.xlarge) for our cassandra servers and >> m4.xlarge for our application servers
Stumped By Cassandra delays
the past before the C* server upgrade and we still had problems, but I could always try again). Any ideas/suggestions are greatly appreciated. thanks in advance, Gareth Collins - To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For additional commands, e-mail: user-h...@cassandra.apache.org
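The reads in this thread are token-range scans of the form `token(a) > ? and token(a) <= ?`. A minimal sketch of how a client might split the token ring into such half-open sub-ranges is shown below. It assumes the Murmur3Partitioner token space (signed 64-bit); the thread does not say which partitioner is in use, and `split_token_ring` is a hypothetical helper name, not part of any driver.

```python
# Assumed Murmur3Partitioner token bounds (signed 64-bit); not stated in the thread.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_token_ring(num_ranges):
    """Return num_ranges contiguous (start, end] pairs covering the token space.

    Each pair is meant to be bound to a query of the form
    "... where token(a) > ? and token(a) <= ?" (start exclusive, end inclusive).
    """
    if num_ranges < 1:
        raise ValueError("num_ranges must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the boundaries exact for 64-bit values.
    bounds = [MIN_TOKEN + (span * i) // num_ranges for i in range(num_ranges + 1)]
    return list(zip(bounds[:-1], bounds[1:]))
```

Smaller ranges mean each individual read touches less data, which may help when correlating slow reads with delayed writes, though whether it changes the behaviour described here is untested.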
Re: Performance Of IN Queries On Wide Rows
Thanks for the response! I could understand that being the case if the Cassandra cluster is not loaded. Splitting the work across multiple nodes would obviously make the query faster. But if this was just a single node, shouldn't one IN query be faster than multiple due to the fact that, if I understand correctly, Cassandra should need to do less work? thanks in advance, Gareth On Wed, Feb 21, 2018 at 7:27 AM, Rahul Singh wrote: > That depends on the driver you use but separate queries asynchronously > around the cluster would be faster. > > > -- > Rahul Singh > rahul.si...@anant.us > > Anant Corporation > > On Feb 20, 2018, 6:48 PM -0500, Eric Stevens , wrote: > > Someone can correct me if I'm wrong, but I believe if you do a large IN() on > a single partition's cluster keys, all the reads are going to be served from > a single replica. Compared to many concurrent individual equal statements > you can get the performance gain of leaning on several replicas for > parallelism. > > On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins > wrote: >> >> Hello, >> >> When querying large wide rows for multiple specific values is it >> better to do separate queries for each value...or do it with one query >> and an "IN"? I am using Cassandra 2.1.14 >> >> I am asking because I had changed my app to use 'IN' queries and it >> **appears** to be slower rather than faster. I had assumed that the >> "IN" query should be faster...as I assumed it only needs to go down >> the read path once (i.e. row cache -> memtable -> key cache -> bloom >> filter -> index summary -> index -> compaction -> sstable) rather than >> once for each entry? Or are there some additional caveats that I >> should be aware of for 'IN' query performance (e.g. ordering of 'IN' >> query entries, closeness of 'IN' query values in the SSTable etc.)? 
>> >> thanks in advance, >> Gareth Collins
Performance Of IN Queries On Wide Rows
Hello,

When querying large wide rows for multiple specific values, is it better to do separate queries for each value...or do it with one query and an "IN"? I am using Cassandra 2.1.14.

I am asking because I had changed my app to use 'IN' queries and it **appears** to be slower rather than faster. I had assumed that the "IN" query should be faster...as I assumed it only needs to go down the read path once (i.e. row cache -> memtable -> key cache -> bloom filter -> index summary -> index -> compaction -> sstable) rather than once for each entry? Or are there some additional caveats that I should be aware of for 'IN' query performance (e.g. ordering of 'IN' query entries, closeness of 'IN' query values in the SSTable etc.)?

thanks in advance,
Gareth Collins
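For comparison with a single "IN" query, the "separate queries for each value" alternative can be sketched as follows. This is only an illustration: it assumes `session` behaves like a DataStax Python driver session, whose `execute_async(statement, parameters)` returns a future with a `result()` method, and the function name and statement text are hypothetical. It makes no claim about which approach is faster on a given cluster.

```python
def fetch_individually(session, statement, partition_key, clustering_values):
    """Issue one query per clustering value and gather all rows.

    `statement` is assumed to take two bind values, e.g. (hypothetical table):
        SELECT * FROM wide_rows WHERE pk = ? AND ck = ?
    All requests are started before any result is awaited, so they can be
    in flight at the same time.
    """
    futures = [
        session.execute_async(statement, (partition_key, value))
        for value in clustering_values
    ]
    rows = []
    for future in futures:
        # result() blocks until that query completes (or raises on failure).
        rows.extend(future.result())
    return rows
```

Timing this against the equivalent "IN" query on the same data is probably the most direct way to see which behaves better for a particular row width and cluster load.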
Weird Bootstrapping Issue
Hi,

We are running Cassandra 2.1.14 on an IBM AIX cluster using IBM Java 7 (1.7.1.64). I am having problems adding new nodes to the cluster. I am seeing the following exception. It appears like the new node is getting stuck trying to send the magic number on the first streaming socket...whilst the receiving node never receives it and times out after 10 seconds.

New Node:

INFO [StreamConnectionEstablisher:1] 2017-04-28 17:39:20,196 StreamSession.java:220 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Starting streaming to /1.2.3.4
INFO [StreamConnectionEstablisher:2] 2017-04-28 17:39:20,197 StreamSession.java:220 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Starting streaming to /5.6.7.8
INFO [StreamConnectionEstablisher:1] 2017-04-28 17:39:20,209 StreamCoordinator.java:209 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92, ID#0] Beginning stream session with /1.2.3.4
INFO [STREAM-IN-/1.2.3.4] 2017-04-28 17:39:20,276 StreamResultFuture.java:166 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92 ID#0] Prepare completed. Receiving 2 files(43103 bytes), sending 0 files(0 bytes)
INFO [StreamReceiveTask:2] 2017-04-28 17:39:20,410 StreamResultFuture.java:180 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Session with /1.2.3.4 is complete
ERROR [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,207 StreamSession.java:505 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Streaming error occurred
java.nio.channels.AsynchronousCloseException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:224) ~[na:1.7.0]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:538) ~[na:1.7.0]
    at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.sendInitMessage(ConnectionHandler.java:191) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:81) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:223) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:208) [apache-cassandra-2.1.14.jar:2.1.14]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157) [na:1.7.0]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627) [na:1.7.0]
    at java.lang.Thread.run(Thread.java:809) [na:1.7.0]
INFO [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,208 StreamResultFuture.java:180 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Session with /5.6.7.8 is complete
WARN [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,211 StreamResultFuture.java:207 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Stream failed
INFO [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,212 StreamCoordinator.java:209 - [Stream #22c10290-2c5b-11e7-a33c-8f9ab3a4bd92, ID#0] Beginning stream session with /5.6.7.8
ERROR [main] 2017-04-28 17:39:30,213 CassandraDaemon.java:581 - Exception encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1166) ~[apache-cassandra-2.1.14.jar:2.1.14]

Existing node:

DEBUG [ACCEPT-/5.6.7.8] 2017-04-28 17:39:29,914 MessagingService.java:1014 - Error reading the socket Socket[addr=/9.0.1.2,port=55848,localport=7000]
java.net.SocketTimeoutException: null
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:242) ~[na:1.7.0]
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:116) ~[na:1.7.0]
    at java.io.DataInputStream.readFully(DataInputStream.java:207) ~[na:1.7.0]
    at java.io.DataInputStream.readInt(DataInputStream.java:399) ~[na:1.7.0]
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:988) ~[apache-cassandra-2.1.14.jar:2.1.14]
TRACE [MessagingService-Incoming-/9.0.1.2] 2017-04-28 17:39:29,989 IncomingTcpConnection.java:92 - eof reading from socket; closing
java.io.EOFException: null
    at java.io.DataInputStream.readFully(DataInputStream.java:209) ~[na:1.7.0]
    at java.io.DataInputStream.readInt(DataInputStream.java:399) ~[na:1.7.0]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:171) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88) ~[apache-cassandra-2.1.14.jar:2.1.14]
TRACE [MessagingService-Incoming-/9.0.1.2] 2017-04-28 17:39:29,990 IncomingTcpConnection.
Cassandra Memory Question
Hello,

I have a question about CQL memory usage. I am currently using 1.2.9. If I have a Cassandra table like this (created using the Astyanax API):

CREATE TABLE table_name (
    key text,
    column1 text,
    value blob,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;

and I run a query like this:

select key from table_name;

Will Cassandra filter the "key" from each row as it goes...or will it get all the rows first (i.e. requiring the whole table in memory), then filter out the "key"?

I ask because I am researching an OOM on our Cassandra system. I believe there must be a query "select * from table_name" (each value blob is very large - I see the value blobs in the Cassandra hprof), which would explain the OOM. However, I am told the query is "select key from table_name". If it needs to read the whole table into memory anyway, this would explain the OOM (BTW - I know that this type of query is usually a bad idea without some type of paging).

As a supplementary question, is there any way to actually trace the CQL query text? I turned on the tracing described here:

http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2

Whilst I found the bad query (I was able to match it to the thread name from the OOM Exception), the trace did not appear to be storing the original query text. The only CQL text I saw in the trace was from those queries done from cqlsh.

thanks in advance,
Gareth
Re: Secondary Indexes On Partitioned Time Series Data Question
OK, thanks for the information. Gareth On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli wrote: > On Thu, Aug 1, 2013 at 12:49 PM, Gareth Collins > wrote: >> >> Would this be correct? Just making sure I understand how to best use >> secondary indexes in Cassandra with time series data. > > > In general unless you ABSOLUTELY NEED the one unique feature of built-in > Secondary Indexes (atomic update of base row and index) you should just use > a normal column family for secondary index cases. > > =Rob
Secondary Indexes On Partitioned Time Series Data Question
Hello,

Say I have time series data for a table like this:

CREATE TABLE mytimeseries (
    pk_part1 text,
    partition bigint,      -- e.g. partition per day or per hour
    pk_part2 text,         -- this is part of the partition key so I can split write load
    message_id timeuuid,
    secondary_key1 text,
    secondary_key2 text,
    ...
    PRIMARY KEY ((pk_part1, partition, pk_part2), message_id));

Most of the time I will need to do queries with a pk_part1/partition/pk_part2/message_id range. So this is what I optimize for.

Sometimes, however, I will need to do queries with a pk_part1/partition/message_id range and some combination of secondary_key1 (95% of the time there is a one-to-one relationship with pk_part1) or secondary_key2 (for each secondary_key2 there will be many pk_part2 values).

In this time series scenario, to efficiently make use of secondary_key1/secondary_key2 as Cassandra secondary indexes for these queries, I assume that secondary_key1/secondary_key2 would really need to be composites combined into one column (in SQL I would create multi-column indexes)? i.e.:

secondary_key_1 - pk_part1 + partition_key + real_secondary_key_1
secondary_key_2 - pk_part2 + partition_key + real_secondary_key_2

Would this be correct? Just making sure I understand how to best use secondary indexes in Cassandra with time series data.

thanks in advance,
Gareth
Re: Coprosessors/Triggers in C*
Edward, Michal, Thanks very much for the answers. I hadn't really thought before about how Cassandra would implement the TTL feature. I had foolishly assumed that it would be like a delete (which I would eventually be able to trigger on to execute another action) but it makes sense how it is really implemented. I will need to find another way outside of Cassandra to implement my "do something if not deleted before TTL requirement" (ugh). Anyway, thanks again for the clarification. Gareth On Thu, Jun 13, 2013 at 2:19 AM, Michal Michalski wrote: > I understood it as a "run trigger when column gets deleted due to TTL", so > - as you said - it doesn't sound like something that can be done. > > Gareth, TTL'd columns in Cassandra are not really removed after TTL - they > are just ignored from that time (so they're not returned by queries), but > they still exist as long as they're not tombstoned and then removed after > grace period. Cassandra doesn't know about the exact moment they become > "outdated" due to TTL. It could be doable to do something when they get > converted to tombstone, but I don't think it's the use case you're looking > for. > > M. > > > I do not understand what feature you suggesting. Columns can already have >> a >> ttl. Are you speaking of a ttl column that could delete something beside >> itself. >> > > That does not sound easy because a ttl comment is dorment until read or >> compacted. >> >> On Tuesday, June 11, 2013, Gareth Collins >> wrote: >> >>> Hello Edward, >>> I am curious - What about triggering on a TTL timeout delete (something I >>> >> am most interested in doing - perhaps it doesn't make sense?)? Would you >> say that is something the user should implement themselves? Would you see >> intravert being able to do something with this at some later point >> (somehow?)? >> >>> thanks, >>> Gareth >>> On Tue, Jun 11, 2013 at 2:34 PM, Edward Capriolo >>> >> wrote: >> >>> >>>> This is arguably something you should do yourself. 
I have been investigating integrating vertx and cassandra together for a while to accomplish this type of work, mainly to move processing close to data and eliminate large batches that can be computed from a single map of data.
>>>>
>>>> https://github.com/zznate/intravert-ug/wiki/Service-Processor-for-trigger-like-functionality
>>>>
>>>> On Tue, Jun 11, 2013 at 5:06 AM, Tanya Malik wrote:
>>>>> Thanks Romain.
>>>>>
>>>>> On Tue, Jun 11, 2013 at 1:44 AM, Romain HARDOUIN <romain.hardo...@urssaf.fr> wrote:
>>>>>> Not yet but Cassandra 2.0 will provide experimental triggers:
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-1311
>>>>>>
>>>>>> Tanya Malik wrote on 11/06/2013 04:12:44:
>>>>>>> From: Tanya Malik
>>>>>>> To: user@cassandra.apache.org,
>>>>>>> Date: 11/06/2013 04:13
>>>>>>> Subject: Coprosessors/Triggers in C*
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does C* support something like co-processor functionality/triggers to
>>>>>>> run client-supplied code in the address space of the server?
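Since Cassandra does not signal the moment a TTL'd column expires, the "do something if not deleted before TTL" requirement has to be tracked by the application. A minimal in-memory sketch of one way to do that is below. Everything in it (the `ExpiryTracker` class and its methods) is hypothetical, it is not durable across restarts, and it assumes a single process that is told about every write and delete.

```python
import heapq


class ExpiryTracker:
    """Track keys written with a TTL and report those never deleted in time."""

    def __init__(self):
        self._heap = []        # (deadline, key) entries, earliest deadline first
        self._deadlines = {}   # key -> deadline currently in force

    def track(self, key, ttl_seconds, now):
        """Record that `key` was written at time `now` with the given TTL."""
        deadline = now + ttl_seconds
        self._deadlines[key] = deadline
        heapq.heappush(self._heap, (deadline, key))

    def cancel(self, key):
        """Record that `key` was explicitly deleted before its TTL elapsed."""
        self._deadlines.pop(key, None)

    def pop_expired(self, now):
        """Return keys whose TTL elapsed by `now` without being cancelled."""
        expired = []
        while self._heap and self._heap[0][0] <= now:
            deadline, key = heapq.heappop(self._heap)
            # Skip stale heap entries left behind by cancel() or a re-track().
            if self._deadlines.get(key) == deadline:
                del self._deadlines[key]
                expired.append(key)
        return expired
```

A caller would poll `pop_expired` periodically with the current time and run the follow-up action for each returned key; a production version would need persistent state, which is outside the scope of this sketch.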
Re: Coprosessors/Triggers in C*
Hello Edward,

I am curious - what about triggering on a TTL timeout delete (something I am most interested in doing - perhaps it doesn't make sense?)? Would you say that is something the user should implement themselves? Would you see intravert being able to do something with this at some later point (somehow?)?

thanks,
Gareth

On Tue, Jun 11, 2013 at 2:34 PM, Edward Capriolo wrote:
> This is arguably something you should do yourself. I have been
> investigating integrating vertx and cassandra together for a while to
> accomplish this type of work, mainly to move processing close to data and
> eliminate large batches that can be computed from a single map of data.
>
> https://github.com/zznate/intravert-ug/wiki/Service-Processor-for-trigger-like-functionality
>
> On Tue, Jun 11, 2013 at 5:06 AM, Tanya Malik wrote:
>> Thanks Romain.
>>
>> On Tue, Jun 11, 2013 at 1:44 AM, Romain HARDOUIN <romain.hardo...@urssaf.fr> wrote:
>>> Not yet but Cassandra 2.0 will provide experimental triggers:
>>> https://issues.apache.org/jira/browse/CASSANDRA-1311
>>>
>>> Tanya Malik wrote on 11/06/2013 04:12:44:
>>> > From: Tanya Malik
>>> > To: user@cassandra.apache.org,
>>> > Date: 11/06/2013 04:13
>>> > Subject: Coprosessors/Triggers in C*
>>> >
>>> > Hi,
>>> >
>>> > Does C* support something like co-processor functionality/triggers to
>>> > run client-supplied code in the address space of the server?
Re: Hector vs Astyanax dependency issue
Hi Renato, Are you sure that you don't have two copies of guava in your classpath? I don't have this problem (I was using both Hector and Astyanax for a while -> now transitioned completely to Astyanax). Probably the most problematic part of using the datastax or astyanax clients is that they both depend on the "cassandra-all" jar which by default brings in a massive number of dependencies. It took me a good couple of days to figure out what was really required (especially since I work in OSGi -> I had to OSGi all the non-OSGi dependencies, ugh). Gareth On Fri, May 24, 2013 at 7:02 PM, Renato Marroquín Mogrovejo < renatoj.marroq...@gmail.com> wrote: > Hi all, > > I am using Astyanax and Hector client within an application but right now > I am hitting a dependency issue [1] related to Guava version being used by > Hector and Astyanax which makes Maven headache. I have taken it out as > exclusions within my poms but I still get the dependency issue. > Do you guys think you could help me out with this one? > Thanks in advance! > > > Renato M. > > [1] https://github.com/Netflix/astyanax/issues/204 > >
Re: CQL3 And ReversedTypes Question
Added: https://issues.apache.org/jira/browse/CASSANDRA-5472 thanks, Gareth On Sun, Apr 14, 2013 at 2:33 PM, aaron morton wrote: > Bad Request: Type error: > org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318cannot be > passed as argument 0 of function dateof of type timeuuid > > Is there something I am missing here or should I open a new ticket? > > Yes please. > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 13/04/2013, at 4:40 PM, Gareth Collins > wrote: > > OK, trying out 1.2.4. The previous issue seems to be fine, but I am > experiencing a new one: > > cqlsh:location> create table test_y (message_id timeuuid, name text, > PRIMARY KEY (name,message_id)); > cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo'); > cqlsh:location> select dateOf(message_id) from test_y; > > dateOf(message_id) > -- > 2013-04-13 00:33:42-0400 > 2013-04-13 00:33:43-0400 > 2013-04-13 00:33:43-0400 > 2013-04-13 00:33:44-0400 > > cqlsh:location> create table test_x (message_id timeuuid, name text, > PRIMARY KEY (name,message_id)) WITH CLUSTERING ORDER BY (message_id DESC); > cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo'); > cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo'); > cqlsh:location> select dateOf(message_id) from test_x; > Bad Request: Type error: > org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318cannot be > passed as argument 0 of 
function dateof of type timeuuid > > Is there something I am missing here or should I open a new ticket? > > thanks in advance, > Gareth > > > On Tue, Mar 26, 2013 at 3:30 PM, Gareth Collins < > gareth.o.coll...@gmail.com> wrote: > >> Added: >> >> https://issues.apache.org/jira/browse/CASSANDRA-5386 >> >> Thanks very much for the quick answer! >> >> regards, >> Gareth >> >> On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne >> wrote: >> > You aren't missing anything obvious. That's a bug really. Would you mind >> > opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA? >> > >> > -- >> > Sylvain >> > >> > >> > On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins < >> gareth.o.coll...@gmail.com> >> > wrote: >> >> >> >> Hi, >> >> >> >> I created a table with the following structure in cqlsh (Cassandra >> >> 1.2.3 - cql 3): >> >> >> >> CREATE TABLE mytable ( column1 text, >> >> column2 text, >> >> messageId timeuuid, >> >> message blob, >> >> PRIMARY KEY ((column1, column2), messageId)); >> >> >> >> I can quite happily add values to this table. 
e.g: >> >> >> >> insert into client_queue (column1,column2,messageId,message) VALUES >> >> ('string1','string2',now(),'ABCCDCC123'); >> >> >> >> Yet if I decide I want to set the clustering order on messageId DESC: >> >> >> >> CREATE TABLE mytable ( column1 text, >> >> column2 text, >> >> messageId timeuuid, >> >> message blob, >> >> PRIMARY KEY ((column1, column2), messageId)) WITH CLUSTERING >> >> ORDER BY (messageId DESC); >> >> >> >> and try to do an insert: >> >> >> >> insert into client_queue2 (column1,column2,messageId,message) VALUES >> >> ('string1','string2',now(),'ABCCDCC123'); >> >> >> >> I get the following error: >> >> >> >> Bad Request: Type error: cannot assign result of function now (type >> >> timeuuid) to messageid (type >> >> >> >> >> 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)') >> >> >> >> I am sure I am missing something obvious here, but I don't understand. >> >> Why am I getting an error? What do I need >> >> to do to be able to add an entry to this table? >> >> >> >> thanks in advance, >> >> Gareth >> > >> > >> > > >
Re: Anyway To Query Just The Partition Key?
Edward, Thanks for the response. This is what I thought. The only reason why I am doing it like this is that I don't know these partition keys in advance (otherwise I would design this differently). So when I need to insert data, it looks like I need to insert to both the data table and the table containing the partition keys. Good thing writes in Cassandra are idempotent...:) thanks again, Gareth On Sat, Apr 13, 2013 at 7:26 AM, Edward Capriolo wrote: > You can 'list' or 'select *' the column family and you get them in a > pseudo random order. When you say subset it implies you might want a > specific range which is something this schema can not do. > > > > > On Sat, Apr 13, 2013 at 2:05 AM, Gareth Collins < > gareth.o.coll...@gmail.com> wrote: > >> Hello, >> >> If I have a cql3 table like this (I don't have a table with this data - >> this is just for example): >> >> create table ( >> surname text, >> city text, >> country text, >> event_id timeuuid, >> data text, >> PRIMARY KEY ((surname, city, country),event_id)); >> >> there is no way of (easily) getting the set (or a subset) of partition >> keys, is there (i.e. surname/city/country)? If I want easy access to do >> queries to get a subset of the partition keys, I have to create another >> table? >> >> I am assuming yes but just making sure I am not missing something obvious >> here. >> >> thanks in advance, >> Gareth >> > >
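The "insert to both the data table and the table containing the partition keys" approach can be sketched as below. This is only an illustration: the table names `events` and `event_partition_keys` are hypothetical, and the lookup table is assumed to have `(surname, city, country)` as its whole primary key, so re-inserting the same key simply overwrites the same row.

```python
def dual_write_statements(surname, city, country, event_id, data):
    """Return (cql, parameters) pairs for the data table and the key table.

    Both table names are hypothetical. The second insert carries only the
    partition key columns, so repeating it for every event is idempotent.
    """
    key = (surname, city, country)
    return [
        (
            "INSERT INTO events (surname, city, country, event_id, data) "
            "VALUES (?, ?, ?, ?, ?)",
            key + (event_id, data),
        ),
        (
            "INSERT INTO event_partition_keys (surname, city, country) "
            "VALUES (?, ?, ?)",
            key,
        ),
    ]
```

Listing the known partition keys then means reading the small lookup table rather than scanning every row of the data table.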
Re: Anyway To Query Just The Partition Key?
Thank you for the answer.

My apologies. I should have been clearer with my question.

Say, for example, I have 1,000 partition keys and 10,000 rows per partition key. I am trying to avoid bringing back 10 million rows to find the 1,000 partition keys. I assume I cannot avoid bringing back the 10 million rows (or at least an order of magnitude more than 1,000 rows) without having another table?

thanks,
Gareth

On Sat, Apr 13, 2013 at 4:13 AM, Jabbar Azam wrote:
> With your example you can do an equality search with surname and city and
> then use "in" with country
>
> Eg. Select * from yourtable where surname="blah" and city="blah blah" and
> country in ("country1", "country2")
>
> Hope that helps
>
> Jabbar Azam
> On 13 Apr 2013 07:06, "Gareth Collins" wrote:
>
>> Hello,
>>
>> If I have a cql3 table like this (I don't have a table with this data -
>> this is just for example):
>>
>> create table (
>> surname text,
>> city text,
>> country text,
>> event_id timeuuid,
>> data text,
>> PRIMARY KEY ((surname, city, country),event_id));
>>
>> there is no way of (easily) getting the set (or a subset) of partition
>> keys, is there (i.e. surname/city/country)? If I want easy access to do
>> queries to get a subset of the partition keys, I have to create another
>> table?
>>
>> I am assuming yes but just making sure I am not missing something obvious
>> here.
>>
>> thanks in advance,
>> Gareth
Anyway To Query Just The Partition Key?
Hello,

If I have a cql3 table like this (I don't have a table with this data - this is just for example):

create table (
    surname text,
    city text,
    country text,
    event_id timeuuid,
    data text,
    PRIMARY KEY ((surname, city, country),event_id));

there is no way of (easily) getting the set (or a subset) of partition keys, is there (i.e. surname/city/country)? If I want easy access to do queries to get a subset of the partition keys, I have to create another table?

I am assuming yes but just making sure I am not missing something obvious here.

thanks in advance,
Gareth
Re: CQL3 And ReversedTypes Question
OK, trying out 1.2.4. The previous issue seems to be fine, but I am experiencing a new one:

cqlsh:location> create table test_y (message_id timeuuid, name text, PRIMARY KEY (name,message_id));
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> select dateOf(message_id) from test_y;

 dateOf(message_id)
--------------------------
 2013-04-13 00:33:42-0400
 2013-04-13 00:33:43-0400
 2013-04-13 00:33:43-0400
 2013-04-13 00:33:44-0400

cqlsh:location> create table test_x (message_id timeuuid, name text, PRIMARY KEY (name,message_id)) WITH CLUSTERING ORDER BY (message_id DESC);
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> select dateOf(message_id) from test_x;
Bad Request: Type error: org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318 cannot be passed as argument 0 of function dateof of type timeuuid

Is there something I am missing here or should I open a new ticket?

thanks in advance,
Gareth

On Tue, Mar 26, 2013 at 3:30 PM, Gareth Collins wrote:
> Added:
>
> https://issues.apache.org/jira/browse/CASSANDRA-5386
>
> Thanks very much for the quick answer!
>
> regards,
> Gareth
>
> On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne wrote:
> > You aren't missing anything obvious. That's a bug really. Would you mind
> > opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA?
> > > > -- > > Sylvain > > > > > > On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins < > gareth.o.coll...@gmail.com> > > wrote: > >> > >> Hi, > >> > >> I created a table with the following structure in cqlsh (Cassandra > >> 1.2.3 - cql 3): > >> > >> CREATE TABLE mytable ( column1 text, > >> column2 text, > >> messageId timeuuid, > >> message blob, > >> PRIMARY KEY ((column1, column2), messageId)); > >> > >> I can quite happily add values to this table. e.g: > >> > >> insert into client_queue (column1,column2,messageId,message) VALUES > >> ('string1','string2',now(),'ABCCDCC123'); > >> > >> Yet if I decide I want to set the clustering order on messageId DESC: > >> > >> CREATE TABLE mytable ( column1 text, > >> column2 text, > >> messageId timeuuid, > >> message blob, > >> PRIMARY KEY ((column1, column2), messageId)) WITH CLUSTERING > >> ORDER BY (messageId DESC); > >> > >> and try to do an insert: > >> > >> insert into client_queue2 (column1,column2,messageId,message) VALUES > >> ('string1','string2',now(),'ABCCDCC123'); > >> > >> I get the following error: > >> > >> Bad Request: Type error: cannot assign result of function now (type > >> timeuuid) to messageid (type > >> > >> > 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)') > >> > >> I am sure I am missing something obvious here, but I don't understand. > >> Why am I getting an error? What do I need > >> to do to be able to add an entry to this table? > >> > >> thanks in advance, > >> Gareth > > > > >
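Until dateOf() accepts a reversed timeuuid column on the server, one possible stopgap is to select the raw message_id and derive the date on the client. The Python sketch below is only an illustration of that idea, not a fix for the error above; the helper name `date_of` is made up, and it assumes the client receives the column as a standard version 1 UUID.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100-nanosecond intervals between the UUID epoch
# (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_of(value: uuid.UUID) -> datetime:
    """Return the UTC timestamp embedded in a version 1 (time-based) UUID."""
    if value.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # value.time is the 60-bit timestamp in 100 ns ticks since 1582-10-15.
    ticks_since_unix_epoch = value.time - UUID_EPOCH_OFFSET
    # Ten ticks per microsecond; sub-microsecond precision is dropped.
    return UNIX_EPOCH + timedelta(microseconds=ticks_since_unix_epoch // 10)
```

The result is timezone-aware and in UTC, so it will differ from the cqlsh output above by the local offset (-0400 in that session) unless it is converted.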
CQL3 And Map Literals
Hello,

I have been playing with map literals in CQL3 queries. I see that single quotes work:

    {'foo':'bar'}

but double quotes do not:

    {"foo":"bar"}

I am curious: was there a specific reason why single quotes were chosen? I ask because double quotes would make this valid JSON.

thanks in advance,
Gareth
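In CQL, single quotes delimit string literals while double quotes delimit quoted identifiers, which is why the JSON form is rejected. If the data already exists as JSON on the client, a small conversion step is one way around it. The Python sketch below assumes a map<text, text> column; the function names are made up for illustration. Where the driver supports bound parameters, passing the map as a parameter is usually preferable to building literals by hand.

```python
import json


def cql_string(value: str) -> str:
    """Render a Python string as a CQL string literal.

    CQL escapes a single quote inside a string by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def cql_map_literal(mapping: dict) -> str:
    """Render a dict of str -> str as a CQL map literal."""
    pairs = (f"{cql_string(k)}: {cql_string(v)}" for k, v in mapping.items())
    return "{" + ", ".join(pairs) + "}"


def json_to_cql_map(text: str) -> str:
    """Convert a flat JSON object of strings into a CQL map literal."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"value for {key!r} is not a string")
    return cql_map_literal(data)
```

For example, `json_to_cql_map('{"foo":"bar"}')` returns `{'foo': 'bar'}`.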
Returning A Generated Id From An Insert
Hi,

I have a question about whether I can do something in Cassandra similar to what I can do in SQL. In SQL (e.g. SQL Server), if I have a generated primary key, I can get the generated key back as a result of the insert statement. Is it possible to do something similar with CQL (e.g. could I somehow be returned the timeuuid generated by now())? It would certainly make my client code cleaner if this were possible (it is a "nice to have").

thanks in advance,
Gareth
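One common alternative, offered here only as a sketch, is to generate the time-based UUID on the client instead of calling now() on the server, so the id is known before the insert is sent. The table name `messages`, its columns, and the `%s` placeholder style are hypothetical. Whether client-generated ids are acceptable depends on the application, since their timestamps come from the client's clock.

```python
import uuid
from typing import Tuple


def prepare_insert(name: str) -> Tuple[uuid.UUID, str, tuple]:
    """Build an insert whose timeuuid is generated on the client.

    Returns the generated id together with the statement and its
    parameters, so the caller keeps the id without a read-back.
    """
    # uuid1() produces a version 1 (time-based) UUID, the same UUID
    # version that the CQL timeuuid type holds.
    message_id = uuid.uuid1()
    # Hypothetical table and column names, for illustration only.
    statement = "INSERT INTO messages (message_id, name) VALUES (%s, %s)"
    return message_id, statement, (message_id, name)
```

The caller would execute the returned statement with its parameters through whatever driver is in use, then carry on using `message_id` directly.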
Re: CQL3 And ReversedTypes Question
Added:

https://issues.apache.org/jira/browse/CASSANDRA-5386

Thanks very much for the quick answer!

regards,
Gareth

On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne wrote:
> You aren't missing anything obvious. That's a bug really. Would you mind
> opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA?
>
> --
> Sylvain
>
> On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins wrote:
>>
>> Hi,
>>
>> I created a table with the following structure in cqlsh (Cassandra
>> 1.2.3 - cql 3):
>>
>> CREATE TABLE mytable ( column1 text,
>>                        column2 text,
>>                        messageId timeuuid,
>>                        message blob,
>>                        PRIMARY KEY ((column1, column2), messageId));
>>
>> I can quite happily add values to this table. e.g:
>>
>> insert into client_queue (column1,column2,messageId,message) VALUES
>> ('string1','string2',now(),'ABCCDCC123');
>>
>> Yet if I decide I want to set the clustering order on messageId DESC:
>>
>> CREATE TABLE mytable ( column1 text,
>>                        column2 text,
>>                        messageId timeuuid,
>>                        message blob,
>>                        PRIMARY KEY ((column1, column2), messageId)) WITH CLUSTERING
>> ORDER BY (messageId DESC);
>>
>> and try to do an insert:
>>
>> insert into client_queue2 (column1,column2,messageId,message) VALUES
>> ('string1','string2',now(),'ABCCDCC123');
>>
>> I get the following error:
>>
>> Bad Request: Type error: cannot assign result of function now (type
>> timeuuid) to messageid (type
>> 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')
>>
>> I am sure I am missing something obvious here, but I don't understand.
>> Why am I getting an error? What do I need
>> to do to be able to add an entry to this table?
>>
>> thanks in advance,
>> Gareth
CQL3 And ReversedTypes Question
Hi,

I created a table with the following structure in cqlsh (Cassandra 1.2.3 - cql 3):

    CREATE TABLE mytable ( column1 text,
                           column2 text,
                           messageId timeuuid,
                           message blob,
                           PRIMARY KEY ((column1, column2), messageId));

I can quite happily add values to this table. e.g:

    insert into client_queue (column1,column2,messageId,message) VALUES ('string1','string2',now(),'ABCCDCC123');

Yet if I decide I want to set the clustering order on messageId DESC:

    CREATE TABLE mytable ( column1 text,
                           column2 text,
                           messageId timeuuid,
                           message blob,
                           PRIMARY KEY ((column1, column2), messageId)) WITH CLUSTERING ORDER BY (messageId DESC);

and try to do an insert:

    insert into client_queue2 (column1,column2,messageId,message) VALUES ('string1','string2',now(),'ABCCDCC123');

I get the following error:

    Bad Request: Type error: cannot assign result of function now (type timeuuid) to messageid (type 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')

I am sure I am missing something obvious here, but I don't understand. Why am I getting an error? What do I need to do to be able to add an entry to this table?

thanks in advance,
Gareth