from:"Gareth Collins"

What Happened To Alternate Storage And Rocksandra?

2021-03-12 Thread Gareth Collins

Hi,

I remember a couple of years ago there was some noise about Rocksandra
(Cassandra using rocksdb for storage) and opening up Cassandra to alternate
storage mechanisms.

I haven't seen anything about it for a while now though. The last commit to
Rocksandra on github was in Nov 2019. The associated JIRA items
(CASSANDRA-13474 and CASSANDRA-13476) haven't had any activity since 2019
either.

I was wondering whether anyone knew anything about it. Was it decided that
this wasn't a good idea after all (the alleged performance differences
weren't worth it...or were exaggerated)? Or is it just that it still may be
a good idea, but there are no resources available to make this happen (e.g.
perhaps the original sponsor moved onto other things)?

I ask because I was looking at RocksDB/Kafka Streams for another project
(which may replace some functionality which currently uses Cassandra)...and
was wondering if there could be some important info about RocksDB I may be
missing.

thanks in advance,
Gareth Collins

Re: Stumped By Cassandra delays

2018-07-22 Thread Gareth Collins

Hi Shalom,

Thanks very much for the response!

We are only using batches within one Cassandra partition to improve
performance. Batches are NEVER used in this app across Cassandra partitions.
And if you look at the trace messages I showed, there is only one statement
per batch anyway.

In fact, what I see in the trace is that the responses to the writes may be
being held up by the reads. Here is a more complete example which is
consistent across nodes. We are using datastax client 3.1.2. Note that all
the requests appear to be processed on nio-worker-5, which suggests that
this may all be on the one connection (even though I can see two
connections to each C* server from each client):



2018-07-20 05:32:43,185 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 9322 ms: [2 bound
values] select a, b, c, d from  where token(a)>? and
token(a)<=?; << slow read
2018-07-20 05:32:43,185 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 5950 ms: [1
statements, 6 bound values] BEGIN BATCH INSERT INTO  (a, b, c,
d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write response
received immediately after the read
2018-07-20 05:32:43,185 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 511 ms: [1
statements, 6 bound values] BEGIN BATCH INSERT INTO  (a, b, c,
d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write response
received immediately after the read
2018-07-20 05:32:43,607 [luster1-nio-worker-5] [  ] [
 ] [] (   core.QueryLogger.NORMAL) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query completed normally, took 33 ms: [2
bound values] select CustomerID, ds_, data_, AudienceList from
data.customer_b01be157931bcbfa32b7f240a638129d where token(CustomerID)>?
and token(CustomerID)<=?; << normal read
2018-07-20 05:32:45,938 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 1701 ms: [2 bound
values] select a, b, c, d from  where token(a)>? and
token(a)<=?; << slow read
2018-07-20 05:32:46,257 [luster1-nio-worker-5] [  ] [
 ] [] (   core.QueryLogger.NORMAL) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query completed normally, took 0 ms: [1
statements, 6 bound values] BEGIN BATCH INSERT INTO  (a, b, c,
d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << normal write – no
overlap with the read
2018-07-20 05:32:46,336 [luster1-nio-worker-5] [  ] [
 ] [] (   core.QueryLogger.NORMAL) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query completed normally, took 30 ms: [2
bound values] select a, b, c, d from  where token(a)>? and
token(a)<=?; << normal read

2018-07-20 05:32:48,622 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 1626 ms: [2 bound
values] select a, b, c, d from  where token(a)>? and
token(a)<=?; << slow read
2018-07-20 05:32:48,622 [luster1-nio-worker-5] [  ] [
 ] [] ( core.QueryLogger.SLOW) DEBUG   -
[cluster1] [/10.123.4.52:9042] Query too slow, took 425 ms: [1
statements, 6 bound values] BEGIN BATCH INSERT INTO  (a, b, c,
d, e) VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH; << write appears
immediately after the read

I would suspect some sort of bug on the client holding up the
thread...but I don't know why I would only have a problem on one C* node at
any one time (the clients process reads and writes to other nodes at the
same time without delays).
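
One way to check the "writes are released together with the slow read" pattern across a whole log file is to group QueryLogger entries by completion timestamp and worker thread. The following is only a sketch: the regular expression is an assumption based on the excerpt above, and the table name `t` in the sample is a placeholder because the real name is not shown.

```python
import re
from collections import defaultdict

# Pattern for the driver's QueryLogger lines as they appear in this thread.
# The field layout is an assumption based on the excerpt, not a documented format.
LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<thread>[^\]]+)\].*?"
    r"took (?P<ms>\d+) ms: "
    r"\[(?:\d+ statements, )?\d+ bound values\] (?P<kind>\w+)",
    re.DOTALL,
)

def group_by_completion(log_text):
    """Map (timestamp, thread) -> list of (first statement keyword, latency in ms)."""
    groups = defaultdict(list)
    for m in LINE.finditer(log_text):
        groups[(m["ts"], m["thread"])].append((m["kind"].upper(), int(m["ms"])))
    return dict(groups)

# Two simplified entries modelled on the log above ("t" is a placeholder table).
sample = (
    "2018-07-20 05:32:43,185 [luster1-nio-worker-5] (core.QueryLogger.SLOW) DEBUG - "
    "[cluster1] [/10.123.4.52:9042] Query too slow, took 9322 ms: "
    "[2 bound values] select a, b, c, d from t where token(a)>? and token(a)<=?;\n"
    "2018-07-20 05:32:43,185 [luster1-nio-worker-5] (core.QueryLogger.SLOW) DEBUG - "
    "[cluster1] [/10.123.4.52:9042] Query too slow, took 5950 ms: "
    "[1 statements, 6 bound values] BEGIN BATCH INSERT INTO t (a, b, c, d, e) "
    "VALUES (?, ?, ?, ?, ?) using ttl ?; APPLY BATCH;\n"
)

for key, entries in group_by_completion(sample).items():
    print(key, entries)
```

Groups that contain one SELECT plus several BEGIN (batch) entries with the same timestamp would be consistent with responses queuing behind the read on a single connection, though this alone does not show where the delay originates.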

thanks in advance,
Gareth


On Sun, Jul 22, 2018 at 4:12 AM, shalom sagges 
wrote:

> Hi Gareth,
>
> If you're using batches for multiple partitions, this may be the root
> cause you've been looking for.
>
> https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/
>
> If batches are optimally used and only one node is misbehaving, check if
> NTP on the node is properly synced.
>
> Hope this helps!
>
>
> On Sat, Jul 21, 2018 at 9:31 PM, Gareth Collins <
> gareth.o.coll...@gmail.com> wrote:
>
>> Hello,
>>
>> We are running Cassandra 2.1.14 in AWS, with c5.4xlarge machines
>> (initially these were m4.xlarge) for our cassandra servers and
>> m4.xlarge for our application servers

Stumped By Cassandra delays

2018-07-21 Thread Gareth Collins

the past before the C* server
upgrade and we still had problems, but I could always try again).

Any ideas/suggestions are greatly appreciated.

thanks in advance,
Gareth Collins

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Gareth Collins

Thanks for the response!

I could understand that being the case if the Cassandra cluster is not
loaded. Splitting the work across multiple nodes would obviously make
the query faster.

But if this was just a single node, shouldn't one IN query be faster
than multiple due to the fact that, if I understand correctly,
Cassandra should need to do less work?

thanks in advance,
Gareth

On Wed, Feb 21, 2018 at 7:27 AM, Rahul Singh
 wrote:
> That depends on the driver you use, but separate queries issued
> asynchronously around the cluster would be faster.
>
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 20, 2018, 6:48 PM -0500, Eric Stevens wrote:
>
> Someone can correct me if I'm wrong, but I believe if you do a large IN() on
> a single partition's clustering keys, all the reads are going to be served
> from a single replica.  With many concurrent individual equality statements,
> by comparison, you can get the performance gain of leaning on several
> replicas for parallelism.
>
> On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins 
> wrote:
>>
>> Hello,
>>
>> When querying large wide rows for multiple specific values is it
>> better to do separate queries for each value...or do it with one query
>> and an "IN"? I am using Cassandra 2.1.14
>>
>> I am asking because I had changed my app to use 'IN' queries and it
>> **appears** to be slower rather than faster. I had assumed that the
>> "IN" query should be faster...as I assumed it only needs to go down
>> the read path once (i.e. row cache -> memtable -> key cache -> bloom
>> filter -> index summary -> index -> compression offsets -> sstable) rather than
>> once for each entry? Or are there some additional caveats that I
>> should be aware of for 'IN' query performance (e.g. ordering of 'IN'
>> query entries, closeness of 'IN' query values in the SSTable etc.)?
>>
>> thanks in advance,
>> Gareth Collins
>>
>


Performance Of IN Queries On Wide Rows

2018-02-20 Thread Gareth Collins

Hello,

When querying large wide rows for multiple specific values is it
better to do separate queries for each value...or do it with one query
and an "IN"? I am using Cassandra 2.1.14

I am asking because I had changed my app to use 'IN' queries and it
**appears** to be slower rather than faster. I had assumed that the
"IN" query should be faster...as I assumed it only needs to go down
the read path once (i.e. row cache -> memtable -> key cache -> bloom
filter -> index summary -> index -> compression offsets -> sstable) rather than
once for each entry? Or are there some additional caveats that I
should be aware of for 'IN' query performance (e.g. ordering of 'IN'
query entries, closeness of 'IN' query values in the SSTable etc.)?
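
To make the two options being compared concrete, here is a minimal sketch. It does not talk to Cassandra: `run_query` is a hypothetical stand-in for a driver call, and the table and column names (`wide_rows`, `key`, `column1`) are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_with_in(run_query, partition_key, cluster_keys):
    """Option 1: a single statement with IN over the clustering values."""
    placeholders = ", ".join("?" for _ in cluster_keys)
    cql = f"SELECT * FROM wide_rows WHERE key = ? AND column1 IN ({placeholders})"
    return run_query(cql, [partition_key, *cluster_keys])

def fetch_individually(run_query, partition_key, cluster_keys, max_workers=8):
    """Option 2: one equality query per value, issued concurrently."""
    cql = "SELECT * FROM wide_rows WHERE key = ? AND column1 = ?"
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_query, cql, [partition_key, ck]) for ck in cluster_keys]
        # Collect in submission order so both options return rows in the same order.
        return [row for f in futures for row in f.result()]

# Stand-in for a driver: filters an in-memory "partition" instead of Cassandra.
DATA = {("p1", "a"): 1, ("p1", "b"): 2, ("p1", "c"): 3}

def fake_run_query(cql, params):
    key, wanted = params[0], params[1:]
    return [(key, ck, DATA[(key, ck)]) for ck in wanted if (key, ck) in DATA]

print(fetch_with_in(fake_run_query, "p1", ["a", "c"]))
print(fetch_individually(fake_run_query, "p1", ["a", "c"]))
```

Both functions return the same rows; which one is faster on a real cluster is exactly the open question here, so timing both against representative data would be the way to settle it.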

thanks in advance,
Gareth Collins


Weird Bootstrapping Issue

2017-05-01 Thread Gareth Collins

Hi,

We are running Cassandra 2.1.14 on an IBM AIX cluster using IBM Java 7
(1.7.1.64). I am having problems adding new nodes to the cluster. I am
seeing the following exception. It appears like the new node is
getting stuck trying to send the magic number on the first streaming
socket...whilst the receiving node never receives it and times out
after 10 seconds.
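
The two receiver-side outcomes in the logs below (a read timeout, then an EOF) can be reproduced outside Cassandra with a toy handshake. This is only an illustration of the failure mode, not Cassandra's code; the magic value 0xCA552DFA is, as far as I know, the constant Cassandra's messaging layer uses, and the 0.2 second timeout is an arbitrary stand-in for the 10 seconds seen here.

```python
import socket
import struct

PROTOCOL_MAGIC = 0xCA552DFA  # believed to match Cassandra's messaging magic; treat as an assumption

def read_magic(conn, timeout_s):
    """Read a 4-byte big-endian int with a read timeout, similar to readInt on a socket stream."""
    conn.settimeout(timeout_s)
    buf = b""
    try:
        while len(buf) < 4:
            chunk = conn.recv(4 - len(buf))
            if not chunk:
                return "eof"          # peer closed before sending 4 bytes
            buf += chunk
    except socket.timeout:
        return "timeout"              # peer connected but never wrote
    (value,) = struct.unpack(">I", buf)
    return "ok" if value == PROTOCOL_MAGIC else "bad magic"

sender, receiver = socket.socketpair()
stalled = read_magic(receiver, 0.2)   # nothing written yet
sender.sendall(struct.pack(">I", PROTOCOL_MAGIC))
healthy = read_magic(receiver, 0.2)   # magic arrives
sender.close()
closed = read_magic(receiver, 0.2)    # sender gave up and closed
receiver.close()
print(stalled, healthy, closed)
```

In this toy version the "timeout" case corresponds to the SocketTimeoutException on the existing node and the "eof" case to the EOFException that follows once the new node closes its side.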

New Node:

INFO  [StreamConnectionEstablisher:1] 2017-04-28 17:39:20,196
StreamSession.java:220 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Starting streaming to /1.2.3.4

INFO  [StreamConnectionEstablisher:2] 2017-04-28 17:39:20,197
StreamSession.java:220 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Starting streaming to /5.6.7.8

INFO  [StreamConnectionEstablisher:1] 2017-04-28 17:39:20,209
StreamCoordinator.java:209 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92, ID#0] Beginning stream session
with /1.2.3.4

INFO  [STREAM-IN-/1.2.3.4] 2017-04-28 17:39:20,276
StreamResultFuture.java:166 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92 ID#0] Prepare completed.
Receiving 2 files(43103 bytes), sending 0 files(0 bytes)

INFO  [StreamReceiveTask:2] 2017-04-28 17:39:20,410
StreamResultFuture.java:180 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Session with /1.2.3.4 is
complete

ERROR [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,207
StreamSession.java:505 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Streaming error occurred

java.nio.channels.AsynchronousCloseException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:224) ~[na:1.7.0]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:538) ~[na:1.7.0]
    at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.sendInitMessage(ConnectionHandler.java:191) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.ConnectionHandler.initiate(ConnectionHandler.java:81) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.StreamSession.start(StreamSession.java:223) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.streaming.StreamCoordinator$StreamSessionConnector.run(StreamCoordinator.java:208) [apache-cassandra-2.1.14.jar:2.1.14]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157) [na:1.7.0]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627) [na:1.7.0]
    at java.lang.Thread.run(Thread.java:809) [na:1.7.0]

INFO  [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,208
StreamResultFuture.java:180 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Session with /5.6.7.8 is
complete

WARN  [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,211
StreamResultFuture.java:207 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92] Stream failed

INFO  [StreamConnectionEstablisher:2] 2017-04-28 17:39:30,212
StreamCoordinator.java:209 - [Stream
#22c10290-2c5b-11e7-a33c-8f9ab3a4bd92, ID#0] Beginning stream session
with /5.6.7.8

ERROR [main] 2017-04-28 17:39:30,213 CassandraDaemon.java:581 -
Exception encountered during startup

java.lang.RuntimeException: Error during boostrap: Stream failed
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1166) ~[apache-cassandra-2.1.14.jar:2.1.14]


Existing node:

DEBUG [ACCEPT-/5.6.7.8] 2017-04-28 17:39:29,914
MessagingService.java:1014 - Error reading the socket
Socket[addr=/9.0.1.2,port=55848,localport=7000]

java.net.SocketTimeoutException: null
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:242) ~[na:1.7.0]
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:116) ~[na:1.7.0]
    at java.io.DataInputStream.readFully(DataInputStream.java:207) ~[na:1.7.0]
    at java.io.DataInputStream.readInt(DataInputStream.java:399) ~[na:1.7.0]
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:988) ~[apache-cassandra-2.1.14.jar:2.1.14]

TRACE [MessagingService-Incoming-/9.0.1.2] 2017-04-28 17:39:29,989
IncomingTcpConnection.java:92 - eof reading from socket; closing

java.io.EOFException: null
    at java.io.DataInputStream.readFully(DataInputStream.java:209) ~[na:1.7.0]
    at java.io.DataInputStream.readInt(DataInputStream.java:399) ~[na:1.7.0]
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:171) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88) ~[apache-cassandra-2.1.14.jar:2.1.14]

TRACE [MessagingService-Incoming-/9.0.1.2] 2017-04-28 17:39:29,990
IncomingTcpConnection.

Cassandra Memory Question

2014-03-09 Thread Gareth Collins

Hello,

I have a question about CQL memory usage. I am currently using 1.2.9.

If I have a Cassandra table like this (created using Astyanax API):

CREATE TABLE table_name (
  key text,
  column1 text,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;

and I run a query like this:

select key from table_name;

Will Cassandra extract the "key" from each row as it goes...or will it
get all the rows first (i.e. requiring the whole table in memory),
then filter out the "key"?

I ask because I am researching an OOM on our Cassandra system. I
believe there must be a query "select * from table_name" (each value
blob is very large - I see the value blobs in the Cassandra hprof),
which would explain the OOM. However I am told the query is "select
key from table_name". If it needs to read the whole table into memory
anyway, this would explain the OOM (BTW - I know that this type of
query is usually a bad idea without some type of paging).
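
For reference, the "some type of paging" mentioned above could look roughly like the sketch below, assuming the client has to page manually by token. It says nothing about what the server materializes per page, which is the actual question. `execute` is a hypothetical callable standing in for the client library, and the in-memory `ROWS` list only imitates token order.

```python
def iter_partition_keys(execute, page_size=1000):
    """Yield distinct partition keys page by page instead of in one unbounded query."""
    first = "SELECT key FROM table_name LIMIT %d" % page_size
    following = "SELECT key FROM table_name WHERE token(key) > token(?) LIMIT %d" % page_size
    rows = execute(first, [])
    last = None
    while rows:
        for key in rows:
            if key != last:      # one CQL row per (key, column1), so keys repeat
                yield key
                last = key
        # Resume strictly after the last partition seen.
        rows = execute(following, [last])

# Stand-in for the database: one entry per (key, column1) cell, already in "token" order.
ROWS = ["a", "a", "b", "c", "c", "c", "d"]

def fake_execute(cql, params):
    limit = int(cql.rsplit(" ", 1)[1])
    start = params[0] if params else None
    matching = [k for k in ROWS if start is None or k > start]
    return matching[:limit]

print(list(iter_partition_keys(fake_execute, page_size=2)))
```

Because each page resumes after the last partition token, a partition with more cells than the page size is simply cut short, which is harmless when only the keys are wanted.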

As a supplementary question, is there any way to actually trace the
CQL query text? I turned on the tracing described here:

http://www.datastax.com/dev/blog/advanced-request-tracing-in-cassandra-1-2

Whilst I found the bad query (I was able to match it to the thread
name from the OOM Exception), the trace did not appear to be storing
the original query text. The only CQL text I saw in the trace was from
those queries done from cqlsh.

thanks in advance,
Gareth

Re: Secondary Indexes On Partitioned Time Series Data Question

2013-08-02 Thread Gareth Collins

OK, thanks for the information.

Gareth

On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli  wrote:
> On Thu, Aug 1, 2013 at 12:49 PM, Gareth Collins 
> wrote:
>>
>> Would this be correct? Just making sure I understand how to best use
>> secondary indexes in Cassandra with time series data.
>
>
> In general unless you ABSOLUTELY NEED the one unique feature of built-in
> Secondary Indexes (atomic update of base row and index) you should just use
> a normal column family for secondary index cases.
>
> =Rob

Secondary Indexes On Partitioned Time Series Data Question

2013-08-01 Thread Gareth Collins

Hello,

Say I have time series data for a table like this:

CREATE TABLE mytimeseries (
pk_part1  text,
partition bigint, << e.g. partition per day or per hour
pk_part2  text, << this is part of the partition key so I can split write load
message_id  timeuuid,
secondary_key1  text,
secondary_key2   text,
.

.
PRIMARY KEY ((pk_part1, partition, pk_part2), message_id));

Most of the time I will need to do queries with
pk_part1/partition/pk_part2/message_id range. So this is what I
optimize for.

Sometimes, however, I will need to do queries with
pk_part1/partition/message_id range and some combination of
secondary_key1 (95% of the time there is a one-to-one relationship
with pk_part1) or secondary_key2 (for each secondary_key2 there will
be many pk_part2 values).

In this time series scenario, to efficiently make use of
secondary_key1/secondary_key2 as Cassandra secondary indexes for these
queries I assume that secondary_key1/secondary_key2 would really need
to be composites combined into one column (in SQL I would create
multi-column indexes)? i.e.:

secondary_key1 - pk_part1 + partition_key + real_secondary_key1
secondary_key2 - pk_part2 + partition_key + real_secondary_key2
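
As an illustration of what such a combined value might look like on the client side, here is a minimal sketch. The ":" separator and the sample values are assumptions made up for the example, not part of the schema above.

```python
SEP = ":"  # assumed delimiter; it must not occur inside any component

def composite_index_value(*parts):
    """Join key components into the single text value stored in an indexed column."""
    parts = [str(p) for p in parts]
    if any(SEP in p for p in parts):
        raise ValueError("component contains the separator")
    return SEP.join(parts)

# pk_part1 + partition + real secondary key, with made-up values
secondary_key1 = composite_index_value("customerA", 15918, "device42")
print(secondary_key1)
```

An equality lookup on the indexed column would then need the same three components at query time, which is what ties the index entry back to a single time partition.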

Would this be correct? Just making sure I understand how to best use
secondary indexes in Cassandra with time series data.

thanks in advance,
Gareth

Re: Coprosessors/Triggers in C*

2013-06-13 Thread Gareth Collins

Edward, Michal,

Thanks very much for the answers. I hadn't really thought before about how
Cassandra would implement the TTL feature. I had foolishly assumed that it
would be like a delete (which I would eventually be able to trigger on to
execute another action) but it makes sense how it is really implemented.

I will need to find another way outside of Cassandra to implement my "do
something if not deleted before TTL" requirement (ugh).
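
One possible shape for that outside-of-Cassandra piece is an application-side watcher that remembers each write's deadline and forgets it on delete. This is only a sketch under the assumption that a single process sees every write and delete; the class and method names are hypothetical.

```python
import heapq

class ExpiryWatcher:
    """Track TTL deadlines in the application and report keys that were never deleted."""

    def __init__(self):
        self._heap = []        # (deadline, key), possibly containing stale entries
        self._deadline = {}    # key -> current deadline for keys still considered live

    def written(self, key, now, ttl_s):
        deadline = now + ttl_s
        self._deadline[key] = deadline
        heapq.heappush(self._heap, (deadline, key))

    def deleted(self, key):
        self._deadline.pop(key, None)

    def expired(self, now):
        """Return keys whose TTL passed without an explicit delete."""
        out = []
        while self._heap and self._heap[0][0] <= now:
            deadline, key = heapq.heappop(self._heap)
            # Skip entries superseded by a delete or by a later rewrite.
            if self._deadline.get(key) == deadline:
                del self._deadline[key]
                out.append(key)
        return out

watcher = ExpiryWatcher()
watcher.written("a", now=0, ttl_s=10)
watcher.written("b", now=0, ttl_s=10)
watcher.deleted("a")
print(watcher.expired(now=10))
```

The state here lives in memory, so a real deployment would need to persist or rebuild it after a restart.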

Anyway, thanks again for the clarification.

Gareth



On Thu, Jun 13, 2013 at 2:19 AM, Michal Michalski  wrote:

> I understood it as a "run trigger when column gets deleted due to TTL", so
> - as you said - it doesn't sound like something that can be done.
>
> Gareth, TTL'd columns in Cassandra are not really removed after TTL - they
> are just ignored from that time (so they're not returned by queries), but
> they still exist as long as they're not tombstoned and then removed after
> grace period. Cassandra doesn't know about the exact moment they become
> "outdated" due to TTL. It could be doable to do something when they get
> converted to tombstone, but I don't think it's the use case you're looking
> for.
>
> M.
>
>
>> I do not understand what feature you are suggesting. Columns can already
>> have a ttl. Are you speaking of a ttl column that could delete something
>> besides itself?
>>
>> That does not sound easy because a ttl column is dormant until read or
>> compacted.
>>
>> On Tuesday, June 11, 2013, Gareth Collins wrote:
>>
>>> Hello Edward,
>>>
>>> I am curious - What about triggering on a TTL timeout delete (something I
>>> am most interested in doing - perhaps it doesn't make sense?)? Would you
>>> say that is something the user should implement themselves? Would you see
>>> intravert being able to do something with this at some later point
>>> (somehow?)?
>>>
>>> thanks,
>>> Gareth
>>>
>>> On Tue, Jun 11, 2013 at 2:34 PM, Edward Capriolo wrote:
>>>
>>>> This is arguably something you should do yourself. I have been
>>>> investigating integrating vertx and cassandra together for a while to
>>>> accomplish this type of work, mainly to move processing close to data and
>>>> eliminate large batches that can be computed from a single map of data.
>>>>
>>>> https://github.com/zznate/intravert-ug/wiki/Service-Processor-for-trigger-like-functionality
>>>>
>>>> On Tue, Jun 11, 2013 at 5:06 AM, Tanya Malik wrote:
>>>>
>>>>> Thanks Romain.
>>>>>
>>>>> On Tue, Jun 11, 2013 at 1:44 AM, Romain HARDOUIN <
>>>>> romain.hardo...@urssaf.fr> wrote:
>>>>>
>>>>>> Not yet but Cassandra 2.0 will provide experimental triggers:
>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-1311
>>>>>>
>>>>>> Tanya Malik wrote on 11/06/2013 04:12:44:
>>>>>>
>>>>>>> From: Tanya Malik
>>>>>>> To: user@cassandra.apache.org,
>>>>>>> Date: 11/06/2013 04:13
>>>>>>> Subject: Coprosessors/Triggers in C*
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does C* support something like co-processor functionality/triggers
>>>>>>> to run client-supplied code in the address space of the server?
>

Re: Coprosessors/Triggers in C*

2013-06-11 Thread Gareth Collins

Hello Edward,

I am curious - What about triggering on a TTL timeout delete (something I
am most interested in doing - perhaps it doesn't make sense?)? Would you
say that is something the user should implement themselves? Would you see
intravert being able to do something with this at some later point
(somehow?)?

thanks,
Gareth

On Tue, Jun 11, 2013 at 2:34 PM, Edward Capriolo wrote:

> This is arguably something you should do yourself. I have been
> investigating integrating vertx and cassandra together for a while to
> accomplish this type of work, mainly to move processing close to data and
> eliminate large batches that can be computed from a single map of data.
>
>
> https://github.com/zznate/intravert-ug/wiki/Service-Processor-for-trigger-like-functionality
>
>
> On Tue, Jun 11, 2013 at 5:06 AM, Tanya Malik wrote:
>
>> Thanks Romain.
>>
>>
>> On Tue, Jun 11, 2013 at 1:44 AM, Romain HARDOUIN <
>> romain.hardo...@urssaf.fr> wrote:
>>
>>> Not yet but Cassandra 2.0 will provide experimental triggers:
>>> https://issues.apache.org/jira/browse/CASSANDRA-1311
>>>
>>>
>>> Tanya Malik wrote on 11/06/2013 04:12:44:
>>>
>>> > From: Tanya Malik
>>> > To: user@cassandra.apache.org,
>>> > Date: 11/06/2013 04:13
>>> > Subject: Coprosessors/Triggers in C*
>>> >
>>> > Hi,
>>> >
>>> > Does C* support something like co-processor functionality/triggers to
>>> > run client-supplied code in the address space of the server?
>>>
>>
>>
>

Re: Hector vs Astyanax dependency issue

2013-05-26 Thread Gareth Collins

Hi Renato,

Are you sure that you don't have two copies of guava in your classpath? I
don't have this problem (I was using both Hector and Astyanax for a while
-> now transitioned completely to Astyanax).
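
A quick way to check for two copies of the same library is to scan the resolved classpath for artifacts that appear in more than one version. The sketch below assumes the usual `name-version.jar` file naming; the jar names and versions in the sample are made up for illustration.

```python
import re
from collections import defaultdict

# Assumes Maven-style "artifact-version.jar" file names.
JAR = re.compile(r"(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")

def conflicting_artifacts(classpath_entries):
    """Return {artifact: sorted versions} for artifacts present in more than one version."""
    versions = defaultdict(set)
    for entry in classpath_entries:
        m = JAR.search(entry.rsplit("/", 1)[-1])
        if m:
            versions[m["name"]].add(m["version"])
    return {name: sorted(v) for name, v in versions.items() if len(v) > 1}

# Illustrative classpath; the versions are invented for the example.
classpath = [
    "lib/guava-11.0.2.jar",
    "lib/guava-14.0.1.jar",
    "lib/astyanax-1.56.26.jar",
    "lib/hector-core-1.1-2.jar",
]
print(conflicting_artifacts(classpath))
```

With Maven, the same information is available from the dependency tree, which also shows which dependency pulled each copy in.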

Probably the most problematic part of using the datastax or astyanax
clients is that they both depend on the "cassandra-all" jar which by
default brings in a massive number of dependencies. It took me a good
couple of days to figure out what was really required (especially since I
work in OSGi -> I had to OSGi all the non-OSGi dependencies, ugh).

Gareth

On Fri, May 24, 2013 at 7:02 PM, Renato Marroquín Mogrovejo <
renatoj.marroq...@gmail.com> wrote:

> Hi all,
>
> I am using the Astyanax and Hector clients within an application, but right
> now I am hitting a dependency issue [1] related to the Guava versions used
> by Hector and Astyanax, which is giving Maven a headache. I have added
> exclusions to my poms but I still get the dependency issue.
> Do you guys think you could help me out with this one?
> Thanks in advance!
>
>
> Renato M.
>
> [1] https://github.com/Netflix/astyanax/issues/204
>
>

Re: CQL3 And ReversedTypes Question

2013-04-15 Thread Gareth Collins

Added:

https://issues.apache.org/jira/browse/CASSANDRA-5472

thanks,
Gareth


On Sun, Apr 14, 2013 at 2:33 PM, aaron morton wrote:

> Bad Request: Type error:
> org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318 cannot
> be passed as argument 0 of function dateof of type timeuuid
>
> Is there something I am missing here or should I open a new ticket?
>
> Yes please.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 13/04/2013, at 4:40 PM, Gareth Collins 
> wrote:
>
> OK, trying out 1.2.4. The previous issue seems to be fine, but I am
> experiencing a new one:
>
> cqlsh:location> create table test_y (message_id timeuuid, name text,
> PRIMARY KEY (name,message_id));
> cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
> cqlsh:location> select dateOf(message_id) from test_y;
>
>  dateOf(message_id)
> --
>  2013-04-13 00:33:42-0400
>  2013-04-13 00:33:43-0400
>  2013-04-13 00:33:43-0400
>  2013-04-13 00:33:44-0400
>
> cqlsh:location> create table test_x (message_id timeuuid, name text,
> PRIMARY KEY (name,message_id)) WITH CLUSTERING ORDER BY (message_id DESC);
> cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
> cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
> cqlsh:location> select dateOf(message_id) from test_x;
> Bad Request: Type error:
> org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318 cannot
> be passed as argument 0 of function dateof of type timeuuid
>
> Is there something I am missing here or should I open a new ticket?
>
> thanks in advance,
> Gareth
>
>
> On Tue, Mar 26, 2013 at 3:30 PM, Gareth Collins <
> gareth.o.coll...@gmail.com> wrote:
>
>> Added:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-5386
>>
>> Thanks very much for the quick answer!
>>
>> regards,
>> Gareth
>>
>> On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne 
>> wrote:
>> > You aren't missing anything obvious. That's a bug really. Would you mind
>> > opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA?
>> >
>> > --
>> > Sylvain
>> >
>> >
>> > On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins <
>> gareth.o.coll...@gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I created a table with the following structure in cqlsh (Cassandra
>> >> 1.2.3 - cql 3):
>> >>
>> >> CREATE TABLE mytable ( column1 text,
>> >>   column2 text,
>> >>   messageId timeuuid,
>> >>   message blob,
>> >>   PRIMARY KEY ((column1, column2), messageId));
>> >>
>> >> I can quite happily add values to this table. e.g:
>> >>
>> >> insert into client_queue (column1,column2,messageId,message) VALUES
>> >> ('string1','string2',now(),'ABCCDCC123');
>> >>
>> >> Yet if I decide I want to set the clustering order on messageId DESC:
>> >>
>> >> CREATE TABLE mytable ( column1 text,
>> >>   column2 text,
>> >>   messageId timeuuid,
>> >>   message blob,
>> >>   PRIMARY KEY ((column1, column2), messageId)) WITH CLUSTERING
>> >> ORDER BY (messageId DESC);
>> >>
>> >> and try to do an insert:
>> >>
>> >> insert into client_queue2 (column1,column2,messageId,message) VALUES
>> >> ('string1','string2',now(),'ABCCDCC123');
>> >>
>> >> I get the following error:
>> >>
>> >> Bad Request: Type error: cannot assign result of function now (type
>> >> timeuuid) to messageid (type
>> >> 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')
>> >>
>> >> I am sure I am missing something obvious here, but I don't understand.
>> >> Why am I getting an error? What do I need
>> >> to do to be able to add an entry to this table?
>> >>
>> >> thanks in advance,
>> >> Gareth
>> >
>> >
>>
>
>
>

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Gareth Collins

Edward,

Thanks for the response. This is what I thought. The only reason why I am
doing it like this is that I don't know these partition keys in advance
(otherwise I would design this differently). So when I need to insert data,
it looks like I need to insert to both the data table and the table
containing the partition keys. Good thing writes in Cassandra are
idempotent...:)
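
The double write described above might be sketched as follows. The table names `events` and `event_partitions` and the fixed `bucket` value are hypothetical, and the in-memory `apply_writes` only imitates upsert semantics to show why repeating the writes is harmless.

```python
def insert_statements(surname, city, country, event_id, data):
    """Return the two writes: one to the data table, one to the key lookup table."""
    return [
        ("INSERT INTO events (surname, city, country, event_id, data) VALUES (?, ?, ?, ?, ?)",
         (surname, city, country, event_id, data)),
        ("INSERT INTO event_partitions (bucket, surname, city, country) VALUES (?, ?, ?, ?)",
         (0, surname, city, country)),
    ]

def apply_writes(tables, statements):
    """Toy upsert model: a row is identified by its primary key, so rewriting it changes nothing."""
    for cql, params in statements:
        table = cql.split()[2]
        if table == "events":
            tables.setdefault(table, {})[params[:4]] = params[4]
        else:
            tables.setdefault(table, {})[params] = True
    return tables

tables = {}
apply_writes(tables, insert_statements("smith", "leeds", "uk", "e1", "x"))
apply_writes(tables, insert_statements("smith", "leeds", "uk", "e2", "y"))
print(len(tables["events"]), list(tables["event_partitions"]))
```

Two events in the same partition produce two data rows but only one lookup row, so listing the known partition keys means reading the small table rather than the large one.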

thanks again,
Gareth

On Sat, Apr 13, 2013 at 7:26 AM, Edward Capriolo wrote:

> You can 'list' or 'select *' the column family and you get them in a
> pseudo random order. When you say subset it implies you might want a
> specific range which is something this schema can not do.
>
>
>
>
> On Sat, Apr 13, 2013 at 2:05 AM, Gareth Collins <
> gareth.o.coll...@gmail.com> wrote:
>
>> Hello,
>>
>> If I have a cql3 table like this (I don't have a table with this data -
>> this is just for example):
>>
>> create table (
>> surname text,
>> city text,
>> country text,
>> event_id timeuuid,
>> data text,
>> PRIMARY KEY ((surname, city, country),event_id));
>>
>> there is no way of (easily) getting the set (or a subset) of partition
>> keys, is there (i.e. surname/city/country)? If I want easy access to do
>> queries to get a subset of the partition keys, I have to create another
>> table?
>>
>> I am assuming yes but just making sure I am not missing something obvious
>> here.
>>
>> thanks in advance,
>> Gareth
>>
>
>

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Gareth Collins

Thank you for the answer.

My apologies. I should have been clearer with my question.

Say, for example, I have 1000 partition keys and 10,000 rows per partition
key. I am trying to avoid bringing back 10 million rows to find the 1000
partition keys. I assume I cannot avoid bringing back the 10 million rows
(or at least an order of magnitude more than 1000 rows) without having
another table?

thanks,
Gareth

On Sat, Apr 13, 2013 at 4:13 AM, Jabbar Azam  wrote:

> With your example you can do an equality search with surname and city and
> then use "in" with country
>
> Eg.  Select * from yourtable where surname='blah' and city='blah blah' and
> country in ('country1', 'country2')
>
> Hope that helps
>
> Jabbar Azam
> On 13 Apr 2013 07:06, "Gareth Collins"  wrote:
>
>> Hello,
>>
>> If I have a cql3 table like this (I don't have a table with this data -
>> this is just for example):
>>
>> create table (
>> surname text,
>> city text,
>> country text,
>> event_id timeuuid,
>> data text,
>> PRIMARY KEY ((surname, city, country),event_id));
>>
>> there is no way of (easily) getting the set (or a subset) of partition
>> keys, is there (i.e. surname/city/country)? If I want easy access to do
>> queries to get a subset of the partition keys, I have to create another
>> table?
>>
>> I am assuming yes but just making sure I am not missing something obvious
>> here.
>>
>> thanks in advance,
>> Gareth
>>
>

Anyway To Query Just The Partition Key?

2013-04-12 Thread Gareth Collins

Hello,

If I have a cql3 table like this (I don't have a table with this data -
this is just for example):

create table (
surname text,
city text,
country text,
event_id timeuuid,
data text,
PRIMARY KEY ((surname, city, country),event_id));

there is no way of (easily) getting the set (or a subset) of partition
keys, is there (i.e. surname/city/country)? If I want easy access to do
queries to get a subset of the partition keys, I have to create another
table?

I am assuming yes but just making sure I am not missing something obvious
here.

thanks in advance,
Gareth

Re: CQL3 And ReversedTypes Question

2013-04-12 Thread Gareth Collins

OK, trying out 1.2.4. The previous issue seems to be fine, but I am
experiencing a new one:

cqlsh:location> create table test_y (message_id timeuuid, name text,
PRIMARY KEY (name,message_id));
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_y (message_id,name) VALUES (now(),'foo');
cqlsh:location> select dateOf(message_id) from test_y;

 dateOf(message_id)
--
 2013-04-13 00:33:42-0400
 2013-04-13 00:33:43-0400
 2013-04-13 00:33:43-0400
 2013-04-13 00:33:44-0400

cqlsh:location> create table test_x (message_id timeuuid, name text,
PRIMARY KEY (name,message_id)) WITH CLUSTERING ORDER BY (message_id DESC);
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> insert into test_x (message_id,name) VALUES (now(),'foo');
cqlsh:location> select dateOf(message_id) from test_x;
Bad Request: Type error:
org.apache.cassandra.cql3.statements.Selection$SimpleSelector@1e7318 cannot
be passed as argument 0 of function dateof of type timeuuid

Is there something I am missing here or should I open a new ticket?

thanks in advance,
Gareth


On Tue, Mar 26, 2013 at 3:30 PM, Gareth Collins
wrote:

> Added:
>
> https://issues.apache.org/jira/browse/CASSANDRA-5386
>
> Thanks very much for the quick answer!
>
> regards,
> Gareth
>
> On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne 
> wrote:
> > You aren't missing anything obvious. That's a bug really. Would you mind
> > opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA?
> >
> > --
> > Sylvain
> >
> >
> > On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins <gareth.o.coll...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> I created a table with the following structure in cqlsh (Cassandra
> >> 1.2.3 - cql 3):
> >>
> >> CREATE TABLE client_queue (
> >>   column1 text,
> >>   column2 text,
> >>   messageId timeuuid,
> >>   message blob,
> >>   PRIMARY KEY ((column1, column2), messageId));
> >>
> >> I can quite happily add values to this table. e.g:
> >>
> >> insert into client_queue (column1,column2,messageId,message) VALUES
> >> ('string1','string2',now(),'ABCCDCC123');
> >>
> >> Yet if I decide I want to set the clustering order on messageId DESC:
> >>
> >> CREATE TABLE client_queue2 (
> >>   column1 text,
> >>   column2 text,
> >>   messageId timeuuid,
> >>   message blob,
> >>   PRIMARY KEY ((column1, column2), messageId))
> >>   WITH CLUSTERING ORDER BY (messageId DESC);
> >>
> >> and try to do an insert:
> >>
> >> insert into client_queue2 (column1,column2,messageId,message) VALUES
> >> ('string1','string2',now(),'ABCCDCC123');
> >>
> >> I get the following error:
> >>
> >> Bad Request: Type error: cannot assign result of function now (type
> >> timeuuid) to messageid (type
> >> 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')
> >>
> >> I am sure I am missing something obvious here, but I don't understand.
> >> Why am I getting an error? What do I need
> >> to do to be able to add an entry to this table?
> >>
> >> thanks in advance,
> >> Gareth
> >
> >
>

CQL3 And Map Literals

2013-03-28 Thread Gareth Collins

Hello,

I have been playing with map literals in CQL3 queries. I see that
single-quotes work:

{'foo':'bar'}

but double-quotes do not:

{"foo":"bar"}

I am curious. Was there a specific reason why it was decided to use
single-quotes?
I ask because double-quotes would make this valid json.

thanks in advance,
Gareth
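
For context, CQL3 uses double quotes for case-sensitive identifiers and single quotes for string values, which is presumably why the JSON form is rejected. If the data starts out as JSON on the client, it can be converted before it is placed in a statement. This is a minimal Python sketch, not an official driver feature: the function name to_cql_map_literal is hypothetical and it handles only maps of text to text.

```python
import json
from typing import Dict


def to_cql_map_literal(values: Dict[str, str]) -> str:
    """Render a dict of strings as a CQL3 map literal with single quotes."""

    def quote(text: str) -> str:
        # CQL string literals use single quotes; an embedded single quote
        # is escaped by doubling it.
        return "'" + text.replace("'", "''") + "'"

    pairs = [f"{quote(key)}:{quote(value)}" for key, value in values.items()]
    return "{" + ",".join(pairs) + "}"


if __name__ == "__main__":
    as_json = '{"foo":"bar"}'
    parsed = json.loads(as_json)
    print(to_cql_map_literal(parsed))  # {'foo':'bar'}
```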

Returning A Generated Id From An Insert

2013-03-26 Thread Gareth Collins

Hi,

I have a question about whether I can do something in Cassandra similar to
what I can do in SQL.

In SQL (e.g. SQL Server), if I have a generated primary key, I can get
the generated primary key
back as a result for the insert statement.

Is it possible to do something similar with CQL (e.g. could the timeuuid
generated by now() be returned to me somehow)? It would certainly make my
client code cleaner if this were possible (it is a "nice to have").

thanks in advance,
Gareth
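
One workaround sometimes used for this (it is not confirmed anywhere in this thread) is to generate the time-based UUID on the client instead of calling now() in the statement, so the caller already holds the id before the insert is sent. This is a minimal Python sketch, assuming a hypothetical table named messages with columns name and message_id. It only builds the statement text and does not talk to a cluster.

```python
import uuid
from typing import Tuple


def build_insert(name: str) -> Tuple[uuid.UUID, str]:
    """Return a client-generated time-based UUID and an INSERT that uses it.

    The table and column names (messages, name, message_id) are
    hypothetical and exist only for this illustration.
    """
    # uuid1() produces a version 1 (time-based) UUID, the kind a CQL
    # timeuuid column stores.
    message_id = uuid.uuid1()
    # Escape single quotes by doubling them, as CQL string literals require.
    safe_name = name.replace("'", "''")
    statement = (
        "INSERT INTO messages (name, message_id) "
        f"VALUES ('{safe_name}', {message_id});"
    )
    return message_id, statement


if __name__ == "__main__":
    generated_id, cql = build_insert("foo")
    print(generated_id)
    print(cql)
```

In real client code a prepared statement with bound values would be preferable to string formatting; the string form is used here only to keep the sketch self-contained.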

Re: CQL3 And ReversedTypes Question

2013-03-26 Thread Gareth Collins

Added:

https://issues.apache.org/jira/browse/CASSANDRA-5386

Thanks very much for the quick answer!

regards,
Gareth

On Tue, Mar 26, 2013 at 3:55 AM, Sylvain Lebresne  wrote:
> You aren't missing anything obvious. That's a bug really. Would you mind
> opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA?
>
> --
> Sylvain
>
>
> On Tue, Mar 26, 2013 at 2:48 AM, Gareth Collins 
> wrote:
>>
>> Hi,
>>
>> I created a table with the following structure in cqlsh (Cassandra
>> 1.2.3 - cql 3):
>>
>> CREATE TABLE client_queue (
>>   column1 text,
>>   column2 text,
>>   messageId timeuuid,
>>   message blob,
>>   PRIMARY KEY ((column1, column2), messageId));
>>
>> I can quite happily add values to this table. e.g:
>>
>> insert into client_queue (column1,column2,messageId,message) VALUES
>> ('string1','string2',now(),'ABCCDCC123');
>>
>> Yet if I decide I want to set the clustering order on messageId DESC:
>>
>> CREATE TABLE client_queue2 (
>>   column1 text,
>>   column2 text,
>>   messageId timeuuid,
>>   message blob,
>>   PRIMARY KEY ((column1, column2), messageId))
>>   WITH CLUSTERING ORDER BY (messageId DESC);
>>
>> and try to do an insert:
>>
>> insert into client_queue2 (column1,column2,messageId,message) VALUES
>> ('string1','string2',now(),'ABCCDCC123');
>>
>> I get the following error:
>>
>> Bad Request: Type error: cannot assign result of function now (type
>> timeuuid) to messageid (type
>> 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')
>>
>> I am sure I am missing something obvious here, but I don't understand.
>> Why am I getting an error? What do I need
>> to do to be able to add an entry to this table?
>>
>> thanks in advance,
>> Gareth
>
>

CQL3 And ReversedTypes Question

2013-03-25 Thread Gareth Collins

Hi,

I created a table with the following structure in cqlsh (Cassandra
1.2.3 - cql 3):

CREATE TABLE client_queue (
  column1 text,
  column2 text,
  messageId timeuuid,
  message blob,
  PRIMARY KEY ((column1, column2), messageId));

I can quite happily add values to this table. e.g:

insert into client_queue (column1,column2,messageId,message) VALUES
('string1','string2',now(),'ABCCDCC123');

Yet if I decide I want to set the clustering order on messageId DESC:

CREATE TABLE client_queue2 (
  column1 text,
  column2 text,
  messageId timeuuid,
  message blob,
  PRIMARY KEY ((column1, column2), messageId))
  WITH CLUSTERING ORDER BY (messageId DESC);

and try to do an insert:

insert into client_queue2 (column1,column2,messageId,message) VALUES
('string1','string2',now(),'ABCCDCC123');

I get the following error:

Bad Request: Type error: cannot assign result of function now (type
timeuuid) to messageid (type
'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')

I am sure I am missing something obvious here, but I don't understand.
Why am I getting an error? What do I need
to do to be able to add an entry to this table?

thanks in advance,
Gareth
