Re: Best way to do a multi_get using CQL
Yes, I am using the CQL datastax drivers. It was good advice, thanks a lot Jonathan. []s

2014-06-20 0:28 GMT-03:00 Jonathan Haddad j...@jonhaddad.com: The only case in which it might be better to use an IN clause is if the entire query can be satisfied from that machine. Otherwise, go async. The native driver reuses connections and intelligently manages the pool for you. It can also multiplex queries over a single connection. I am assuming you're using one of the datastax drivers for CQL, btw. Jon

On Thu, Jun 19, 2014 at 7:37 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: This is interesting, I didn't know that! It might make sense then to use select = + async + token aware; I will try to change my code. But would it be a recommended solution for these cases? Any other options? I still wonder if this is the right use case for Cassandra, looking up random keys in a huge cluster. After all, the number of connections to Cassandra will still be huge, right? Wouldn't that be a problem? Or does the driver reuse the connection when you use async? []s

2014-06-19 22:16 GMT-03:00 Jonathan Haddad j...@jonhaddad.com: If you use async and your driver is token aware, it will go to the proper node, rather than requiring the coordinator to do so. Realistically you're going to have a connection open to every server anyway. It's the difference between you querying for the data directly and using a coordinator as a proxy. It's faster to just ask the node with the data.

On Thu, Jun 19, 2014 at 6:11 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: But wouldn't using async queries be even worse than using SELECT IN? The justification in the docs is that I could query many nodes, but I would still do that. Today, I use both async queries AND SELECT IN:

    SELECT_ENTITY_LOOKUP = "SELECT entity_id FROM " + ENTITY_LOOKUP + " WHERE name=%s and value in(%s)"

    for name, values in identifiers.items():
        query = self.SELECT_ENTITY_LOOKUP % ('%s', ','.join(['%s'] * len(values)))
        args = [name] + values
        query_msg = query % tuple(args)
        futures.append((query_msg, self.session.execute_async(query, args)))

    for query_msg, future in futures:
        try:
            rows = future.result(timeout=10)
            for row in rows:
                entity_ids.add(row.entity_id)
        except:
            logging.error("Query '%s' returned ERROR" % (query_msg,))
            raise

Using async just with select = would mean that instead of 1 async query (example: in (0, 1, 2)), I would do several, one for each value of the values array above. In my head, this would mean more connections to Cassandra and the same amount of work, right? What would be the advantage? []s

2014-06-19 22:01 GMT-03:00 Jonathan Haddad j...@jonhaddad.com: Your other option is to fire off async queries. It's pretty straightforward w/ the java or python drivers.

On Thu, Jun 19, 2014 at 5:56 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I was taking a look at the Cassandra anti-patterns list: http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html Among them is "SELECT ... IN or index lookups: SELECT ... IN and index lookups (formerly secondary indexes) should be avoided except for specific scenarios. See When not to use IN in SELECT and When not to use an index in Indexing in CQL for Cassandra 2.0." And looking at the SELECT doc, I saw: "When not to use IN: The recommendations about when not to use an index apply to using IN in the WHERE clause. Under most conditions, using IN in the WHERE clause is not recommended. Using IN can degrade performance because usually many nodes must be queried. For example, in a single, local data center cluster having 30 nodes, a replication factor of 3, and a consistency level of LOCAL_QUORUM, a single key query goes out to two nodes, but if the query uses the IN condition, the number of nodes being queried is most likely even higher, up to 20 nodes depending on where the keys fall in the token range." In my system, I have a column family called entity_lookup:

    CREATE KEYSPACE IF NOT EXISTS Identification1
      WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };
    USE Identification1;
    CREATE TABLE IF NOT EXISTS entity_lookup (
      name varchar,
      value varchar,
      entity_id uuid,
      PRIMARY KEY ((name, value), entity_id));

And I use the following select to query it:

    SELECT entity_id FROM entity_lookup WHERE name=%s and value in(%s)

Is this an anti-pattern? If not using SELECT IN, which other way would you recommend for lookups like that?
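Jonathan's suggestion upthread (fully-keyed SELECT + async + token aware) could be sketched roughly like this. The helper and query text below are illustrative, not the poster's actual code; only execute_async is real DataStax driver API, and the session usage is shown as a comment since it needs a live cluster:

```python
# Sketch: expand one "value IN (...)" lookup into per-key queries so a
# token-aware driver can route each one directly to a replica.
# The query template mirrors the entity_lookup schema from the thread.

SELECT_ONE = "SELECT entity_id FROM entity_lookup WHERE name=%s AND value=%s"

def split_in_query(name, values):
    """Turn one IN(...) lookup into per-key (query, args) pairs.

    Each resulting query names a full partition key, so it can be
    satisfied by the node owning that token."""
    return [(SELECT_ONE, (name, value)) for value in values]

# With a live session, the fan-out would look roughly like:
#   futures = [session.execute_async(q, args)
#              for q, args in split_in_query('email', emails)]
#   entity_ids = {row.entity_id for f in futures for row in f.result()}
```

Note the driver multiplexes these over its existing per-host connections, so the fan-out does not open one connection per query.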
Sending BLOBs to Cassandra +
Hi, I read in Cassandra's FAQ that it is fine with BLOBs up to 64MB. Here I am trying to send a 1.6MB BLOB using CQL, and Cassandra rejects my query with the following message: Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Request is too big: length 409600086 exceeds maximum allowed length 268435456. Is there a better way than CQL to send BLOBs? Is there a way to concat them so that I can use several queries of the right size to upload my BLOB? I'd like to be able to send BLOBs up to 20MB. Thanks! -- Simon
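There is no built-in server-side concat, but a common workaround is to chunk the blob client-side into rows under one partition and reassemble on read. A minimal sketch, with a hypothetical table of my own naming:

```python
# Sketch (not a Cassandra feature): store a large blob as numbered
# chunks sharing a partition key, reassemble on read.
# Hypothetical table:
#   CREATE TABLE blobs (id uuid, seq int, data blob,
#                       PRIMARY KEY (id, seq));

CHUNK_SIZE = 1024 * 1024  # 1MB per chunk, well under the request limit

def chunk_blob(data, chunk_size=CHUNK_SIZE):
    """Split a bytes payload into ordered (seq, chunk) pairs."""
    return [(i // chunk_size, data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def reassemble(chunks):
    """Rebuild the original payload from (seq, chunk) pairs."""
    return b"".join(c for _, c in sorted(chunks))

# Writing is then one INSERT per chunk, e.g. (assumed session/blob_id):
#   for seq, chunk in chunk_blob(payload):
#       session.execute(
#           "INSERT INTO blobs (id, seq, data) VALUES (%s, %s, %s)",
#           (blob_id, seq, chunk))
```

Each individual write then stays far below both the native-protocol request limit and the commit log mutation limit discussed later in this thread.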
Bug on 2.1-rc1 with BLOBs?
Hi, When I am sending BLOBs _below_ the max query size (blob size=0.6MB), it works fine on Cassandra 2.0, but on 2.1-rc1 I get the following error within the Cassandra server (from the logs) and the query just dies:

WARN [SharedPool-Worker-2] 2014-06-20 10:06:00,263 AbstractTracingAwareExecutorService.java:166 - Uncaught exception on thread Thread[SharedPool-Worker-2,5,main]: {}
java.lang.RuntimeException: java.lang.IllegalArgumentException: Mutation of 122880122 bytes is too large for the maxiumum size of 16777216
    at org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2052) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_05]
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:162) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:103) [apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_05]
Caused by: java.lang.IllegalArgumentException: Mutation of 122880122 bytes is too large for the maxiumum size of 16777216
    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:205) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:192) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:374) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:354) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.db.Mutation.apply(Mutation.java:210) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.service.StorageProxy$7.runMayThrow(StorageProxy.java:958) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    at org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2048) ~[apache-cassandra-2.1.0-rc1.jar:2.1.0-rc1]
    ... 4 common frames omitted

I checked JIRA for something similar but didn't find anything (then again, I'm not sure which keywords to search for). Should I open an issue? Cheers, Simon
Re: Bug on 2.1-rc1 with BLOBs?
Hi Simon, On 20/06/14 10:18, Simon Chemouil wrote: When I am sending BLOBs _below_ the max query size (blob size=0.6MB), it works fine on Cassandra 2.0, but on 2.1-rc1 I get the following error within the Cassandra server (from the logs) and the query just dies: java.lang.RuntimeException: java.lang.IllegalArgumentException: Mutation of 122880122 bytes is too large for the maxiumum size of 16777216. 122880122 bytes is a lot more than 0.6MB... How are you sending your blob? Ciao, Duncan.
Re: Bug on 2.1-rc1 with BLOBs?
On 20/06/2014 10:41, Duncan Sands wrote: Hi Simon, 122880122 bytes is a lot more than 0.6MB... How are you sending your blob? Turns out there was a mistake in my code. The blob in this case was actually 122MB! Still, the same code works fine on Cassandra 2.0.x, so there might be a bug lurking, even if it's definitely above the recommended limit. Simon
Re: Sending BLOBs to Cassandra +
So it looks like I was sending more than I expected. Still, the question stands: is CQL the best way to send BLOBs? Are there any remote operations available on BLOBs? Thanks, Simon

On 20/06/2014 10:03, Simon Chemouil wrote: Hi, I read in Cassandra's FAQ that it is fine with BLOBs up to 64MB. Here I am trying to send a 1.6MB BLOB using CQL and Cassandra rejects my query: Request is too big: length 409600086 exceeds maximum allowed length 268435456. I'd like to be able to send BLOBs up to 20MB. Thanks!
Re: Bug on 2.1-rc1 with BLOBs?
For the record, I could reproduce the problem with blobs of size below 64MB: Caused by: java.lang.IllegalArgumentException: Mutation of 32000122 bytes is too large for the maxiumum size of 16777216. 32000122 bytes is just ~30MB, and it fails on 2.1-rc1 while it works on 2.0.x for even larger values (up to 64MB works fine). Simon

On 20/06/2014 11:00, Simon Chemouil wrote: Turns out there was a mistake in my code. The blob in this case was actually 122MB! Still, the same code works fine on Cassandra 2.0.x, so there might be a bug lurking.
Re: Bug on 2.1-rc1 with BLOBs?
OK, so Cassandra 2.1 now rejects writes it considers too big. It is possible to increase the limit by changing commitlog_segment_size_in_mb in cassandra.yaml. It defaults to 32MB, and the maximum mutation size for a write is half that value. From CommitLog.java:

    // we only permit records HALF the size of a commit log, to ensure we don't spin allocating many mostly
    // empty segments when writing large records
    private static final long MAX_MUTATION_SIZE = DatabaseDescriptor.getCommitLogSegmentSize() >> 1;

which explains (with the request overhead) why my ~30.5MB blob was rejected. Simon

On 20/06/2014 11:24, Simon Chemouil wrote: For the record, I could reproduce the problem with blobs of size below 64MB: Mutation of 32000122 bytes is too large for the maxiumum size of 16777216.
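The arithmetic above, spelled out as a small sketch (the function is mine, just mirroring the `>> 1` in CommitLog.java):

```python
# A mutation may be at most half the configured commit log segment size.

def max_mutation_size(commitlog_segment_size_in_mb=32):
    """Half the segment size, in bytes (the '>> 1' in CommitLog.java)."""
    return (commitlog_segment_size_in_mb * 1024 * 1024) >> 1

# Default 32MB segments give a 16777216-byte cap, matching the error
# message; a ~30.5MB mutation would need segments of at least ~64MB.
```

So raising commitlog_segment_size_in_mb to 64 would admit the ~30.5MB blob, though chunking the blob client-side is usually the safer route.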
Re: Best practices for repair
Thank you very much. I recompiled it with 2.0 and it works well; now I will try to figure out which granularity works better. Your example was really a boost, thanks again! Regards, Paolo

On 19/06/2014 22:42, Paulo Ricardo Motta Gomes wrote: Hello Paolo, I just published an open source version of the dsetool list_subranges command, which will enable you to perform subrange repair as described in the post. You can find the code and usage instructions here: https://github.com/pauloricardomg/cassandra-list-subranges Currently available for 1.2.16, but I guess that just changing the version in the pom.xml and recompiling will make it work on 2.0.x. Cheers, Paulo

On Thu, Jun 19, 2014 at 4:40 PM, Jack Krupansky j...@basetechnology.com wrote: The DataStax doc should be current best practices: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html If you or anybody else finds it inadequate, speak up. -- Jack Krupansky

-Original Message- From: Paolo Crosato Sent: Thursday, June 19, 2014 10:13 AM To: user@cassandra.apache.org Subject: Best practices for repair

Hi everybody, we have some problems running repairs on a timely schedule. We have a three node deployment, and we start repair on one node every week, repairing one column family at a time. However, when we get to the big column families, repair sessions usually hang indefinitely and we have to restart them manually. The script runs commands like "nodetool repair keyspace columnfamily", one by one. This has not been a major issue for some time, since we never delete data, but we would like to sort the issue out once and for all. Reading resources on the net, I came to the conclusion that we could: 1) either run a repair session like the one above, but with the -pr switch, and run it on every node, not just on one; 2) or run subrange repair as described here: http://www.datastax.com/dev/blog/advanced-repair-techniques , which would be the best option. However, the latter procedure would require us to write some Java program that calls describe_splits to get the tokens to feed nodetool repair with. Is it true that the second procedure is available out of the box only in the commercial version of OpsCenter? I would like to know if these are the current best practices for repairs, or if there is some other option that makes repair easier to perform and more reliable than it is now. Regards, Paolo Crosato -- Paolo Crosato Software engineer/Custom Solutions e-mail: paolo.cros...@targaubiest.com

-- Paulo Motta Chaordic | Platform www.chaordic.com.br +55 48 3232.3200
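The subrange idea from the blog post boils down to splitting a node's primary token range into small pieces and repairing each with nodetool's -st/-et flags. The split itself is plain arithmetic over the token space; a rough sketch (describe_splits, or the list_subranges tool above, gives size-aware splits instead of the even split shown here):

```python
# Sketch: split a token range into N contiguous subranges, each of which
# could be repaired with "nodetool repair <ks> <cf> -st <start> -et <end>".

def split_range(start_token, end_token, parts):
    """Split (start_token, end_token] into `parts` contiguous subranges."""
    step = (end_token - start_token) // parts
    bounds = [start_token + i * step for i in range(parts)] + [end_token]
    return list(zip(bounds[:-1], bounds[1:]))

# e.g. driving nodetool from the shell (illustrative, not a real script):
#   for st, et in split_range(node_start, node_end, 16):
#       run("nodetool repair ks cf -st %d -et %d" % (st, et))
```

Smaller subranges mean each repair session validates less data, so a hung session costs less to restart, at the price of more sessions overall.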
Re: Best way to do a multi_get using CQL
However, my extensive benchmarking this week of the python driver from master shows a performance *decrease* when using 'token_aware'. This is on a 12-node, 2-datacenter, RF-3 cluster in AWS. Also, why do the work the coordinator will do for you: send all the queries, wait for everything to come back in whatever order, and sort the result? I would rather keep my app code simple. But the real point is that you should benchmark in your own environment. ml

On Fri, Jun 20, 2014 at 3:29 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Yes, I am using the CQL datastax drivers. It was good advice, thanks a lot Jonathan. []s
Re: Batch of prepared statements exceeding specified threshold
The cluster is new, so no updates were done. Version 2.0.8. It happened when I did many writes (no reads). Writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100Kb). Any clues? Pavel

On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, Out of curiosity, did it start to happen after some update? Which version of Cassandra are you using? []s

2014-06-19 16:10 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: What a coincidence! It happened in my cluster of 7 nodes today as well. Regards, Pavel

On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I have a 10 node cluster with cassandra 2.0.8. I am getting these exceptions in the log when I run my code. What my code does is just read data from a CF and in some cases write new data:

WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 6165, exceeding specified threshold of 5120 by 1045.
WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 21266, exceeding specified threshold of 5120 by 16146.
WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 22978, exceeding specified threshold of 5120 by 17858.
INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481) CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for 1024 cells

After some time, one node of the cluster goes down. Then it comes back after some seconds and another node goes down. It keeps happening, and there is always a node down in the cluster; when it comes back, another one falls. The only exception I see in the log is "connection reset by peer", which seems to be related to the gossip protocol when a node goes down. Any hint of what I could do to investigate this problem further? Best regards, Marcelo Valle.
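The warning in those logs fires when a batch's serialized size exceeds batch_size_warn_threshold_in_kb (the default gives the 5120 in the messages). A rough client-side sketch of the usual mitigation, with helper names of my own: estimate the payload size and fall back to individual async writes for large values instead of batching them.

```python
# Sketch: crude pre-check against the batch size warning threshold.
# The threshold constant mirrors the 5120-byte figure from the logs.

WARN_THRESHOLD = 5 * 1024  # bytes

def exceeds_warn_threshold(payloads, threshold=WARN_THRESHOLD):
    """Rough size check: sum of value sizes, ignoring statement overhead."""
    return sum(len(p) for p in payloads) > threshold

# With ~100Kb blobs, even a 2-statement batch trips the warning, so
# (hypothetical session/statement names):
#   if exceeds_warn_threshold([blob_a, blob_b]):
#       futures = [session.execute_async(insert_stmt, (key, blob))
#                  for key, blob in rows]  # one async write per statement
```

Batches in Cassandra buy atomicity, not throughput; for unrelated large writes, separate async statements put less pressure on the coordinator's heap.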
Re: Bug on 2.1-rc1 with BLOBs?
Thanks Simon for the info. I didn't know that the maximum payload size is related to the commit log config, interesting...

On Fri, Jun 20, 2014 at 11:39 AM, Simon Chemouil schemo...@gmail.com wrote: OK, so Cassandra 2.1 now rejects writes it considers too big. It is possible to increase the limit by changing commitlog_segment_size_in_mb in cassandra.yaml. It defaults to 32MB, and the maximum mutation size for a write is half that value, which explains (with the request overhead) why my ~30.5MB blob was rejected. Simon
Re: Best way to do a multi_get using CQL
I've found that if you have any amount of latency between your client and nodes, and you are executing a large batch of queries, you'll usually want to send them together to one node unless execution time is of no concern. The tradeoff is resource usage on the connected node vs. time to complete all the queries, because you'll need fewer client-to-node network round trips. With large numbers of queries you will still want to make sure you split them into manageable batches before sending them, to control memory usage on the executing node. I've been limiting queries to batches of 100 keys in scenarios like this.

On Fri, Jun 20, 2014 at 5:59 AM, Laing, Michael michael.la...@nytimes.com wrote: However, my extensive benchmarking this week of the python driver from master shows a performance *decrease* when using 'token_aware'. This is on a 12-node, 2-datacenter, RF-3 cluster in AWS. But the real point is that you should benchmark in your own environment. ml
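Jeremy's 100-key cap is simple to apply client-side; a small helper of my own naming, shown with a hypothetical session call:

```python
# Sketch: cap multi-key queries at a fixed batch size so no single
# request forces the executing node to materialize too many rows.

def in_batches(keys, batch_size=100):
    """Yield successive slices of at most batch_size keys."""
    for i in range(0, len(keys), batch_size):
        yield keys[i:i + batch_size]

# e.g. (illustrative; query/session are assumptions):
#   for batch in in_batches(all_keys):
#       rows = session.execute(query_with_in_clause, [name] + batch)
```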
Re: Batch of prepared statements exceeding specified threshold
Pavel, In my case, the heap was filling up faster than it was draining. I am still looking for the cause, as I could drain really fast with SSDs. In your case, though, you could check (AFAIK) nodetool tpstats and see if there are too many pending write tasks, for instance. Maybe you really are writing more than the nodes are able to flush to disk. How many writes per second are you achieving? Also, I would look for GCInspector in the log:

    cat system.log* | grep GCInspector | wc -l
    tail -1000 system.log | grep GCInspector

Do you see it running a lot? Is it taking much more time each time it runs? I am no Cassandra expert, but I would try these things first and post the results here. Maybe other people on the list have more ideas. Best regards, Marcelo.

2014-06-20 8:50 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: The cluster is new, so no updates were done. Version 2.0.8. It happened when I did many writes (no reads). Writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100Kb). Any clues? Pavel
Re: Bug on 2.1-rc1 with BLOBs?
On Fri, Jun 20, 2014 at 2:39 AM, Simon Chemouil schemo...@gmail.com wrote: OK, so Cassandra 2.1 now rejects writes it considers too big. It is possible to increase the value by changing commitlog_segment_size_in_mb in cassandra.yaml. It defaults to 32MB, and the maximum segment size for a write is half that value: The previous behavior, IIRC, was to just not commitlog the gigantic thing... so that's probably a good change. :) =Rob
Re: Best way to do a multi_get using CQL
A question, not sure if you guys know the answer: Supose I async query 1000 rows using token aware and suppose I have 10 nodes. Suppose also each node would receive 100 row queries each. How does async work in this case? Would it send each row query to each node in a different connection? Different message? I guess if there was a way to use batch with async, once you commit the batch for the 1000 queries, it would create 1 connection to each host and query 100 rows in a single message to each host. This would decrease resource usage, am I wrong? []s 2014-06-20 12:12 GMT-03:00 Jeremy Jongsma jer...@barchart.com: I've found that if you have any amount of latency between your client and nodes, and you are executing a large batch of queries, you'll usually want to send them together to one node unless execution time is of no concern. The tradeoff is resource usage on the connected node vs. time to complete all the queries, because you'll need fewer client - node network round trips. With large numbers of queries you will still want to make sure you split them into manageable batches before sending them, to control memory usage on the executing node. I've been limiting queries to batches of 100 keys in scenarios like this. On Fri, Jun 20, 2014 at 5:59 AM, Laing, Michael michael.la...@nytimes.com wrote: However my extensive benchmarking this week of the python driver from master shows a performance *decrease* when using 'token_aware'. This is on 12-node, 2-datacenter, RF-3 cluster in AWS. Also why do the work the coordinator will do for you: send all the queries, wait for everything to come back in whatever order, and sort the result. I would rather keep my app code simple. But the real point is that you should benchmark in your own environment. ml On Fri, Jun 20, 2014 at 3:29 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Yes, I am using the CQL datastax drivers. It was a good advice, thanks a lot Janathan. 
Today, I use both async queries AND SELECT IN:

    SELECT_ENTITY_LOOKUP = ("SELECT entity_id FROM " + ENTITY_LOOKUP +
                            " WHERE name=%s AND value IN (%s)")

    for name, values in identifiers.items():
        query = self.SELECT_ENTITY_LOOKUP % ('%s', ','.join(['%s'] * len(values)))
        args = [name] + values
        query_msg = query % tuple(args)
        futures.append((query_msg, self.session.execute_async(query, args)))

    for query_msg, future in futures:
        try:
            rows = future.result(timeout=10)
            for row in rows:
                entity_ids.add(row.entity_id)
        except Exception:
            logging.error("Query '%s' returned ERROR" % query_msg)
            raise

Using async just with select = would mean that instead of 1 async query (example: IN (0, 1, 2)), I would do several, one for each value of the values array above. In my head, this would mean more connections to Cassandra and the same amount of work, right? What would be the advantage? []s

2014-06-19 22:01 GMT-03:00 Jonathan Haddad j...@jonhaddad.com: Your other option is to fire off async queries. It's pretty straightforward w/ the java or python drivers.
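For comparison, the per-key alternative Jonathan suggests (one async select-by-equality per value instead of one IN per name) looks roughly like this. The driver call is replaced by a hypothetical fake_execute so the sketch is self-contained; with the real DataStax python-driver, execute_async multiplexes these requests over a small set of pooled connections rather than opening one connection per query:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for session.execute_async("SELECT ... WHERE name=%s
# AND value=%s", [name, value]) -- returns one row per (name, value) key.
def fake_execute(name, value):
    return [("entity-for-%s-%s" % (name, value),)]

identifiers = {"email": ["a@x.com", "b@x.com"], "phone": ["555-0100"]}

entity_ids = set()
with ThreadPoolExecutor(max_workers=4) as pool:
    # One future per (name, value) pair: the "select = + async" pattern,
    # replacing the single IN (...) query per name.
    futures = [pool.submit(fake_execute, name, value)
               for name, values in identifiers.items()
               for value in values]
    for future in futures:
        for row in future.result(timeout=10):
            entity_ids.add(row[0])

print(len(entity_ids))  # 3 -- one result per key queried
```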
Re: Best way to do a multi_get using CQL
That depends on the connection pooling implementation in your driver. Astyanax will keep N connections open to each node (configurable) and route each query in a separate message over an existing connection, waiting until one becomes available if all are in use.
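The pooling behavior Jeremy describes (N connections per node, callers waiting when all are in use) can be sketched with a small blocking pool; the class and names are illustrative, not Astyanax's actual API:

```python
from queue import Queue

class NodePool:
    """Keep up to `size` connections to one node; borrowers block until a
    connection is free, mirroring the behavior described above. Connection
    objects here are plain strings for illustration."""
    def __init__(self, node, size=3):
        self._conns = Queue()
        for i in range(size):
            self._conns.put("%s#conn%d" % (node, i))

    def execute(self, query):
        conn = self._conns.get()        # blocks while all connections are busy
        try:
            return "result(%s via %s)" % (query, conn)
        finally:
            self._conns.put(conn)       # hand the connection back for reuse

pool = NodePool("10.0.0.1", size=2)
print(pool.execute("SELECT ..."))  # result(SELECT ... via 10.0.0.1#conn0)
```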
Re: Batch of prepared statements exceeding specified threshold
Hi Marcelo, No pending write tasks. I am writing a lot: about 100-200 writes, each up to 100Kb, every 15s. It is running on a decent cluster of 5 identical nodes: quad-core i7 with 32Gb RAM and 480Gb SSD. Regards, Pavel

On Fri, Jun 20, 2014 at 12:31 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, In my case, the heap was filling up faster than it was draining. I am still looking for the cause of it, as I could drain really fast with SSD. However, in your case you could check (AFAIK) nodetool tpstats and see if there are too many pending write tasks, for instance. Maybe you really are writing more than the nodes are able to flush to disk. How many writes per second are you achieving? Also, I would look for GCInspector in the log: cat system.log* | grep GCInspector | wc -l ; tail -1000 system.log | grep GCInspector. Do you see it running a lot? Is it taking much more time each time it runs? I am no Cassandra expert, but I would try these things first and post the results here. Maybe other people on the list have more ideas. Best regards, Marcelo.

2014-06-20 8:50 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: The cluster is new, so no updates were done. Version 2.0.8. It happened when I did many writes (no reads). Writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100Kb). Any clues? Pavel

On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, Out of curiosity, did it start to happen after some update? Which version of Cassandra are you using? []s

2014-06-19 16:10 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: What a coincidence! It happened in my cluster of 7 nodes today as well. Regards, Pavel

On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I have a 10-node cluster with Cassandra 2.0.8. I am getting these exceptions in the log when I run my code.
My code just reads data from a CF and in some cases writes new data.

    WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 6165, exceeding specified threshold of 5120 by 1045.
    WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 21266, exceeding specified threshold of 5120 by 16146.
    WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 22978, exceeding specified threshold of 5120 by 17858.
    INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481) CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for 1024 cells

After some time, one node of the cluster goes down. It comes back after some seconds and another node goes down. This keeps happening, and there is always a node down in the cluster; when one comes back, another one falls. The only exception I see in the log is "connection reset by peer", which seems to be related to the gossip protocol when a node goes down. Any hint of what I could do to investigate this problem further? Best regards, Marcelo Valle.
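To see how far over the 5120-byte threshold the batches actually are, those WARN lines can be parsed directly; this is a quick illustrative helper, not part of any Cassandra tooling:

```python
import re

# Matches the size figures in the BatchStatement warning shown above.
WARN_RE = re.compile(r"is of size (\d+), exceeding specified threshold of (\d+) by (\d+)")

line = ("WARN ... Batch of prepared statements for [identification1.entity, "
        "identification1.entity_lookup] is of size 6165, exceeding specified "
        "threshold of 5120 by 1045.")

size, threshold, excess = (int(g) for g in WARN_RE.search(line).groups())
print(size, threshold, excess)  # 6165 5120 1045
assert size - threshold == excess  # the excess figure is just size minus threshold
```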
Custom snitch classpath?
Where do I add my custom snitch JAR to the Cassandra classpath so I can use it?
Re: Batch of prepared statements exceeding specified threshold
If you have 32Gb RAM, the heap is probably 8Gb. 200 writes of 100Kb per second would be 20MB/s in the worst case, supposing all writes of a replica go to a single node. I really don't see any reason why it should be filling up the heap. Anyone else? But did you check the logs for the GCInspector? In my case, nodes are falling because of the heap; in your case, maybe it's something else. Do you see increased times when looking for GCInspector in the logs? []s
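Marcelo's worst-case estimate checks out arithmetically (numbers from the thread; the helper is just for illustration):

```python
def worst_case_mb_per_sec(writes_per_sec, write_size_kb):
    """Worst-case ingest on one node if every write of a replica lands there."""
    return writes_per_sec * write_size_kb / 1024.0

# 200 writes/s of 100Kb each, as discussed above:
print(worst_case_mb_per_sec(200, 100))  # 19.53125 -- roughly the 20MB/s cited
```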
Re: Custom snitch classpath?
The lib directory (where all the other jars are). bin/cassandra.in.sh does this:

    for jar in $CASSANDRA_HOME/lib/*.jar; do
        CLASSPATH=$CLASSPATH:$jar
    done

On Fri, Jun 20, 2014 at 12:58 PM, Jeremy Jongsma jer...@barchart.com wrote: Where do I add my custom snitch JAR to the Cassandra classpath so I can use it?

-- Tyler Hobbs DataStax http://datastax.com/
Re: Batch of prepared statements exceeding specified threshold
I think some figures from nodetool tpstats and nodetool compactionstats would help us see things more clearly. And Pavel, when you said batch, did you mean a LOGGED batch or an UNLOGGED batch?
Re: Custom snitch classpath?
Sharing in case anyone else wants to use this: https://github.com/barchart/cassandra-plugins/blob/master/src/main/java/com/barchart/cassandra/plugins/snitch/GossipingPropertyFileWithEC2FallbackSnitch.java Basically it is a proxy that attempts to use GossipingPropertyFileSnitch, and if that fails to initialize due to missing rack or datacenter values, it falls back to Ec2MultiRegionSnitch. We are using it for hybrid cloud deployments between AWS and our private datacenter.
Re: Custom snitch classpath?
This is nice! I was looking for something like this to implement a multi-DC cluster between OVH and Amazon. Thanks for sharing! []s
Re: Best way to do a multi_get using CQL
I am using python + the CQL driver. I wonder how they do it... These things seem like small details, but they are fundamental to getting good performance out of Cassandra... I wish there were a simpler way to query in batches. Opening a large number of connections and sending 1 message at a time seems bad to me, as sometimes you want to work with small rows. It's no surprise Cassandra performs better with average row sizes. But honestly, I disagree with this part of the Cassandra/driver design. []s
Re: Best way to do a multi_get using CQL
Well, it's kind of a trade-off. Either you send data directly to the primary replica nodes to take advantage of data locality using a token-aware strategy, and the price to pay is a high number of open connections on the client side; or you just batch data to a random node playing the coordinator role, which dispatches requests to the right nodes, and the price to pay is spike load on 1 node (the coordinator) plus intra-cluster bandwidth usage. The choice is yours; it has nothing to do with good or bad design.
Re: Best way to do a multi_get using CQL
There is nothing preventing that in Cassandra; it's just a matter of how intelligent the driver API is. Submit a feature request to the Astyanax or DataStax driver projects.

On Fri, Jun 20, 2014 at 2:27 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: The bad design part (just my opinion, no intention to offend) is not allowing the possibility of sending batches directly to the data nodes, without using a coordinator. I would choose that option. []s
On Fri, Jun 20, 2014 at 12:32 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: A question, not sure if you guys know the answer: Suppose I async query 1000 rows using token aware and suppose I have 10 nodes. Suppose also that each node would receive 100 row queries. How does async work in this case? Would it send each row query to each node in a different connection? A different message? I guess if there was a way to use batch with async, once you commit the batch for the 1000 queries, it would create 1 connection to each host and query 100 rows in a single message to each host. This would decrease resource usage, am I wrong? []s 2014-06-20 12:12 GMT-03:00 Jeremy Jongsma jer...@barchart.com: I've found that if you have any amount of latency between your client and nodes, and you are executing a large batch of queries, you'll usually want to send them together to one node unless execution time is of no concern. The tradeoff is resource usage on the connected node vs. time to complete all the queries, because you'll need fewer client-to-node network round trips. With large numbers of queries you will still want to make sure you split them into manageable batches before sending them, to control memory usage on the executing node. I've been limiting queries to batches of 100 keys in scenarios like this. On Fri, Jun 20, 2014 at 5:59 AM, Laing, Michael michael.la...@nytimes.com wrote: However my extensive benchmarking this week of the python driver from master shows a performance *decrease* when using 'token_aware'. This is on a 12-node, 2-datacenter, RF-3 cluster in AWS. Also, why do the work the coordinator will do for you: send all the queries, wait for everything to come back in whatever order, and sort the result. I would rather keep my app code simple. But the real point is that you should benchmark in your own environment. ml On Fri, Jun 20, 2014 at 3:29 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Yes, I am using the CQL datastax drivers. 
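Jeremy's advice above about splitting large query sets into manageable batches (he mentions limiting to roughly 100 keys) is easy to apply before building the IN clauses or the async fan-out. A small helper sketch — the names here are illustrative, not from the original code:

```python
# Split a list of keys into chunks of at most `size`, so each IN(...)
# query (or async batch) stays at a manageable size for the node that
# executes it -- 100 keys per chunk, per the advice in the thread.
def chunk(keys, size=100):
    return [keys[i:i + size] for i in range(0, len(keys), size)]

keys = list(range(1000))
batches = chunk(keys)
assert len(batches) == 10
assert all(len(b) == 100 for b in batches)
```

Each chunk would then be passed to one prepared statement via execute_async, as in the code earlier in the thread.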
Re: Best way to do a multi_get using CQL
I forgot to add that each connection can handle multiple simultaneous queries. This was part of the original protocol as of C* 1.2: http://www.datastax.com/dev/blog/binary-protocol Asynchronous: each connection can handle more than one active request at the same time. In practice, this means that client libraries will only need to maintain a relatively low number of open connections to a given Cassandra node to achieve good performance. This particularly matters with Cassandra, where a client usually wants to keep connections to all (or at least a good part of) the nodes of the cluster, and so having a low number of per-node connections helps scaling to large clusters. Technically, this is achieved by giving each message a stream ID, and by having responses to a request preserve the request’s stream ID. Clients can thus send multiple requests with different stream IDs on the same connection (i.e. without waiting for the response to a request to send the next one) while still being able to associate each received response to the right request, even if said responses come in a different order than the one in which requests were submitted. That asynchronicity is of course optional in the sense that a client library can still choose to use the protocol in a synchronous way if that is simpler. On Fri, Jun 20, 2014 at 12:30 PM, Jeremy Jongsma jer...@barchart.com wrote: There is nothing preventing that in Cassandra, it's just a matter of how intelligent the driver API is. Submit a feature request to the Astyanax or Datastax driver projects. On Fri, Jun 20, 2014 at 2:27 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: The bad design part (just my opinion, no intention to offend) is not allowing the possibility of sending batches directly to the data nodes, without using a coordinator. I would choose that option. []s 2014-06-20 16:05 GMT-03:00 DuyHai Doan doanduy...@gmail.com: Well it's kind of a trade-off. 
Either you send data directly to the primary replica nodes to take advantage of data locality using a token-aware strategy, and the price to pay is a high number of opened connections from the client side. Or you just batch data to a random node playing the coordinator role to dispatch requests to the right nodes. The price to pay is then spike load on 1 node (the coordinator) and intra-cluster bandwidth usage. The choice is yours, it has nothing to do with good or bad design. On Fri, Jun 20, 2014 at 8:55 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I am using python + CQL Driver. I wonder how they do... These things seem unimportant, but they are fundamental to getting good performance in Cassandra... I wish there was a simpler way to query in batches. Opening a large number of connections and sending 1 message at a time seems bad to me, as sometimes you want to work with small rows. It's no surprise Cassandra performs better when we use average row sizes. But honestly I disagree with this part of Cassandra/Driver's design. []s 2014-06-20 14:37 GMT-03:00 Jeremy Jongsma jer...@barchart.com: That depends on the connection pooling implementation in your driver. Astyanax will keep N connections open to each node (configurable) and route each query in a separate message over an existing connection, waiting until one becomes available if all are in use. On Fri, Jun 20, 2014 at 12:32 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: A question, not sure if you guys know the answer: Suppose I async query 1000 rows using token aware and suppose I have 10 nodes. Suppose also that each node would receive 100 row queries. How does async work in this case? Would it send each row query to each node in a different connection? A different message? I guess if there was a way to use batch with async, once you commit the batch for the 1000 queries, it would create 1 connection to each host and query 100 rows in a single message to each host. 
This would decrease resource usage, am I wrong? []s 2014-06-20 12:12 GMT-03:00 Jeremy Jongsma jer...@barchart.com: I've found that if you have any amount of latency between your client and nodes, and you are executing a large batch of queries, you'll usually want to send them together to one node unless execution time is of no concern. The tradeoff is resource usage on the connected node vs. time to complete all the queries, because you'll need fewer client-to-node network round trips. With large numbers of queries you will still want to make sure you split them into manageable batches before sending them, to control memory usage on the executing node. I've been limiting queries to batches of 100 keys in scenarios like this. On Fri, Jun 20, 2014 at 5:59 AM, Laing, Michael michael.la...@nytimes.com wrote: However my extensive benchmarking this week of the python driver from master shows a performance decrease when using 'token_aware'. This is on a 12-node, 2-datacenter, RF-3 cluster in AWS.
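The stream-ID multiplexing described in the binary-protocol excerpt above can be modeled in a few lines: the client tags each outgoing request with an ID and matches responses back by that ID, so replies may arrive in any order over one connection. This is a toy model of the mechanism, not actual driver code:

```python
# Minimal model of protocol stream IDs: each request is tagged with an
# ID, and responses -- possibly arriving out of order -- are matched
# back to their request by that same ID. This is why one connection
# can carry many in-flight queries.
class StreamMultiplexer:
    def __init__(self):
        self._next_id = 0
        self._pending = {}          # stream_id -> request payload

    def send(self, request):
        stream_id = self._next_id
        self._next_id += 1
        self._pending[stream_id] = request
        return stream_id

    def receive(self, stream_id, response):
        # Match the response to its original request by stream ID.
        request = self._pending.pop(stream_id)
        return request, response

mux = StreamMultiplexer()
a = mux.send("SELECT ... WHERE id=1")
b = mux.send("SELECT ... WHERE id=2")
# Responses can come back in reverse order on the same connection:
req2, _ = mux.receive(b, "rows-for-2")
req1, _ = mux.receive(a, "rows-for-1")
assert req1.endswith("id=1") and req2.endswith("id=2")
```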
Re: Batch of prepared statements exceeding specified threshold
Logged batch. On Fri, Jun 20, 2014 at 2:13 PM, DuyHai Doan doanduy...@gmail.com wrote: I think some figures from nodetool tpstats and nodetool compactionstats may help in seeing things more clearly. And Pavel, when you said batch, did you mean a LOGGED batch or an UNLOGGED batch? On Fri, Jun 20, 2014 at 8:02 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: If you have 32 Gb RAM, the heap is probably 8 Gb. 200 writes of 100 kB per second would be 20 MB/s in the worst case, supposing all writes of a replica go to a single node. I really don't see any reason why it should be filling up the heap. Anyone else? But did you check the logs for the GCInspector? In my case, nodes are falling because of the heap; in your case, maybe it's something else. Do you see increased times when looking for GCInspector in the logs? []s 2014-06-20 14:51 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: Hi Marcelo, No pending write tasks. I am writing a lot, about 100-200 writes each up to 100 kB every 15[s]. It is running on a decent cluster of 5 identical nodes, quad-core i7 with 32Gb RAM and 480Gb SSD. Regards, Pavel On Fri, Jun 20, 2014 at 12:31 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, In my case, the heap was filling up faster than it was draining. I am still looking for the cause of it, as I could drain really fast with SSD. However, in your case you could check (AFAIK) nodetool tpstats and see if there are too many pending write tasks, for instance. 
2014-06-20 8:50 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: The cluster is new, so no updates were done. Version 2.0.8. It happened when I did many writes (no reads). Writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100 kB). Any clues? Pavel On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, Out of curiosity, did it start to happen before some update? Which version of Cassandra are you using? []s 2014-06-19 16:10 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: What a coincidence! It happened today in my cluster of 7 nodes as well. Regards, Pavel On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I have a 10 node cluster with cassandra 2.0.8. I am getting these exceptions in the log when I run my code. What my code does is just read data from a CF and in some cases write new data. WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 6165, exceeding specified threshold of 5120 by 1045. WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 21266, exceeding specified threshold of 5120 by 16146. WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 22978, exceeding specified threshold of 5120 by 17858. INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481) CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for 1024 cells After some time, one node of the cluster goes down. 
Then it goes back after some seconds and another node goes down. It keeps happening and there is always a node down in the cluster; when it goes back, another one falls. The only exceptions I see in the log are 'connection reset by peer', which seem to be related to the gossip protocol, when a node goes down. Any hint of what I could do to investigate this problem further? Best regards, Marcelo Valle.
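Marcelo's GCInspector checks above (grep-ing system.log and watching whether the pauses grow) can also be scripted. A small Python sketch; the sample log lines below are made up for illustration, and the real format varies by Cassandra version:

```python
# Count GCInspector log lines and keep the most recent few, mirroring
# `grep GCInspector | wc -l` and `tail ... | grep GCInspector`.
def gc_summary(lines, keep=5):
    hits = [line for line in lines if "GCInspector" in line]
    return len(hits), hits[-keep:]

# Made-up sample lines, roughly in the 2.0.x log style:
sample_log = [
    "INFO [main] StorageService.java ... starting up",
    "INFO [ScheduledTasks:1] GCInspector.java GC for ParNew: 210 ms",
    "INFO [ScheduledTasks:1] GCInspector.java GC for ConcurrentMarkSweep: 1800 ms",
]
count, recent = gc_summary(sample_log)
assert count == 2   # two GC pause reports in the sample
```

If the counts climb quickly and the reported pause times grow between runs, the heap is the likely suspect, as discussed above.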
Re: Batch of prepared statements exceeding specified threshold
Ok, in my case it was straightforward. It is just a warning, which however says that batches with a large data size (above 5 kB) can sometimes lead to node instability (why?). This limit seems to be hard-coded; I didn't find any way to configure it externally. Anyway, removing the batch and giving up atomicity resolved the issue for me. http://mail-archives.apache.org/mod_mbox/cassandra-commits/201404.mbox/%3ceee5dd5bc4794ef0b5c5153fdb583...@git.apache.org%3E On Fri, Jun 20, 2014 at 3:55 PM, Pavel Kogan pavel.ko...@cortica.com wrote: Logged batch. On Fri, Jun 20, 2014 at 2:13 PM, DuyHai Doan doanduy...@gmail.com wrote: I think some figures from nodetool tpstats and nodetool compactionstats may help in seeing things more clearly. And Pavel, when you said batch, did you mean a LOGGED batch or an UNLOGGED batch? On Fri, Jun 20, 2014 at 8:02 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: If you have 32 Gb RAM, the heap is probably 8 Gb. 200 writes of 100 kB per second would be 20 MB/s in the worst case, supposing all writes of a replica go to a single node. I really don't see any reason why it should be filling up the heap. Anyone else? But did you check the logs for the GCInspector? In my case, nodes are falling because of the heap; in your case, maybe it's something else. Do you see increased times when looking for GCInspector in the logs? []s 2014-06-20 14:51 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: Hi Marcelo, No pending write tasks. I am writing a lot, about 100-200 writes each up to 100 kB every 15[s]. It is running on a decent cluster of 5 identical nodes, quad-core i7 with 32Gb RAM and 480Gb SSD. Regards, Pavel On Fri, Jun 20, 2014 at 12:31 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, In my case, the heap was filling up faster than it was draining. I am still looking for the cause of it, as I could drain really fast with SSD. However, in your case you could check (AFAIK) nodetool tpstats and see if there are too many pending write tasks, for instance. 
Maybe you really are writing more than the nodes are able to flush to disk. How many writes per second are you achieving? Also, I would look for GCInspector in the log: cat system.log* | grep GCInspector | wc -l tail -1000 system.log | grep GCInspector Do you see it running a lot? Is it taking much more time to run each time it runs? I am no Cassandra expert, but I would try these things first and post the results here. Maybe other people on the list have more ideas. Best regards, Marcelo. 2014-06-20 8:50 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: The cluster is new, so no updates were done. Version 2.0.8. It happened when I did many writes (no reads). Writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100 kB). Any clues? Pavel On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Pavel, Out of curiosity, did it start to happen before some update? Which version of Cassandra are you using? []s 2014-06-19 16:10 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com: What a coincidence! It happened today in my cluster of 7 nodes as well. Regards, Pavel On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: I have a 10 node cluster with cassandra 2.0.8. I am getting these exceptions in the log when I run my code. What my code does is just read data from a CF and in some cases write new data. WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 6165, exceeding specified threshold of 5120 by 1045. WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 21266, exceeding specified threshold of 5120 by 16146. 
WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 22978, exceeding specified threshold of 5120 by 17858. INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481) CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for 1024 cells After some time, one node of the cluster goes down. Then it goes back after some seconds and another node goes down. It keeps happening and there is always a node down in the cluster; when it goes back, another one falls. The only exceptions I see in the log are 'connection reset by peer', which seem to be related to the gossip protocol, when a node goes down. Any hint of what I could do to investigate this problem further? Best regards, Marcelo Valle.
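The 5120-byte figure in those warnings is the batch size warn threshold; Pavel found it effectively fixed in this 2.0.x release, and later Cassandra versions expose it as batch_size_warn_threshold_in_kb in cassandra.yaml. A rough client-side pre-check is possible before sending a batch — this sketch only sums the serialized values, which is an approximation of (not identical to) the server's accounting:

```python
# Rough client-side estimate of a batch's data size against the
# 5120-byte warn threshold seen in the log messages above. Only the
# values are summed; the server's accounting also includes other
# statement overhead, so treat this as an approximation.
WARN_THRESHOLD = 5120  # bytes, per "exceeding specified threshold of 5120"

def batch_too_big(serialized_values, threshold=WARN_THRESHOLD):
    total = sum(len(v) for v in serialized_values)
    return total > threshold, total

# One 6165-byte blob, like the first warning in the log:
too_big, size = batch_too_big([b"x" * 6165])
assert too_big and size == 6165
```

When the check trips, the options are the ones discussed in the thread: split the batch, or drop the batch entirely and issue individual async inserts.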
Using Cassandra as cache
Hi, In our project, many distributed modules send each other binary blobs, up to 100-200 kB each on average. Small JSONs are sent over a message queue, while Cassandra is used as temporary storage for blobs. We are using Cassandra instead of an in-memory distributed cache like Couch for the following reasons: (1) We don't want to be limited by RAM size (2) We make intensive use of ordered composite keys and ranges (it is not a simple key/value cache). We don't use the TTL mechanism, for several reasons. The major reason is that we need to reclaim free disk space immediately and not after 10 days (gc_grace). We are very limited in disk space because traffic is intensive and blobs are big. So what we did is create a new keyspace every hour, named _MM_dd_HH, and when disk becomes full, a script running in crontab on each node drops the keyspace with the IF EXISTS flag and deletes the whole keyspace folder. That way the whole process is very clean and no garbage is left on disk. The keyspace is created by the first module in the flow on an hourly basis and its name is sent over the message queue, to avoid possible problems. All modules read and write with consistency ONE and of course there is no replication. Actually it works nicely, but we have several problems: 1) When a new keyspace with its column families has just been created (every round hour), sometimes other modules fail to read/write data, and we lose the request. Can it be that creation of a keyspace and column families is an async operation, or that there is propagation time between nodes? 2) We are reading and writing intensively, and usually I don't need the data for more than 1-2 hours. What optimizations can I do to increase my small cluster's read performance? Cluster configuration - 3 identical nodes: i7 3GHz, SSD 120Gb, 16Gb RAM, CentOS 6. Hope not too much text :) Thanks, Pavel
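The hourly-keyspace scheme described above boils down to deriving the keyspace name from the clock, so every module computes the same name for the same hour. A sketch of the naming half — the "blobs" prefix and exact strftime pattern are assumptions to match the _MM_dd_HH fragment in the post:

```python
from datetime import datetime

# Derive the hourly keyspace name from a timestamp. The exact pattern
# (prefix, year handling) is assumed here; in the scheme above, the
# first module in the flow would CREATE KEYSPACE under this name and
# publish it on the message queue, and a cron job later drops it.
def hourly_keyspace(ts, prefix="blobs"):
    return ts.strftime(prefix + "_%m_%d_%H")

name = hourly_keyspace(datetime(2014, 6, 20, 23))
assert name == "blobs_06_20_23"
```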
Re: Using Cassandra as cache
Am 20.06.2014 um 23:48 schrieb Pavel Kogan pavel.ko...@cortica.com: 1) When a new keyspace with its column families has just been created (every round hour), sometimes other modules fail to read/write data, and we lose the request. Can it be that creation of a keyspace and column families is an async operation, or that there is propagation time between nodes? Schema needs to settle down (nodes actually agree on a common view) - this may take several seconds until all nodes have that common view. Turn on DEBUG output in the Java driver, for example, to see these messages. CL ONE requires the one node to be up and running - if that node's not running, your request will definitely fail. Maybe you want to try CL ANY or increase RF to 2. 2) We are reading and writing intensively, and usually I don't need the data for more than 1-2 hours. What optimizations can I do to increase my small cluster's read performance? Cluster configuration - 3 identical nodes: i7 3GHz, SSD 120Gb, 16Gb RAM, CentOS 6. Depending on the data, table layout, access patterns and C* version, try various key cache and maybe row cache configurations in both table options and cassandra.yaml
Re: Using Cassandra as cache
On Fri, Jun 20, 2014 at 2:48 PM, Pavel Kogan pavel.ko...@cortica.com wrote: So what we did is create a new keyspace every hour, named _MM_dd_HH, and when disk becomes full, a script running in crontab on each node drops the keyspace with the IF EXISTS flag and deletes the whole keyspace folder. That way the whole process is very clean and no garbage is left on disk. I've recommended a similar technique in the past, but with alternating between Keyspace_A and Keyspace_B. That way you just TRUNCATE them instead of having to DROP. DROP/CREATE keyspace has problems that TRUNCATE does not. Perhaps use a TRUNCATE-oriented technique? =Rob
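Rob's alternating scheme can be as simple as flipping between the two fixed keyspaces on some schedule, writing into one while the other is truncated. A toy sketch alternating on hour parity — the names and the parity rule are illustrative, not from the original posts:

```python
# Alternate between two fixed keyspaces so the idle one can be
# TRUNCATEd instead of DROPped and re-CREATEd. Hour parity is just
# one possible switching rule.
def active_keyspace(hour):
    return "Keyspace_A" if hour % 2 == 0 else "Keyspace_B"

def idle_keyspace(hour):
    # The idle keyspace is the one safe to truncate this hour.
    return "Keyspace_B" if hour % 2 == 0 else "Keyspace_A"

assert active_keyspace(10) == "Keyspace_A"
assert idle_keyspace(10) == "Keyspace_B"
assert active_keyspace(11) == "Keyspace_B"
```

The benefit is that the schema objects stay in place, avoiding the schema-propagation race discussed in this thread for freshly created keyspaces.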
Re: Using Cassandra as cache
Schema propagation takes time: https://issues.apache.org/jira/browse/CASSANDRA-5725 @Robert: do we still need to clean up the snapshot manually when truncating? I remember that on the 1.2 branch, even though the auto_snapshot param was set to false, truncating led to snapshot creation that forced us to manually remove the snapshot folder on disk On Sat, Jun 21, 2014 at 12:01 AM, Robert Stupp sn...@snazy.de wrote: Am 20.06.2014 um 23:48 schrieb Pavel Kogan pavel.ko...@cortica.com: 1) When a new keyspace with its column families has just been created (every round hour), sometimes other modules fail to read/write data, and we lose the request. Can it be that creation of a keyspace and column families is an async operation, or that there is propagation time between nodes? Schema needs to settle down (nodes actually agree on a common view) - this may take several seconds until all nodes have that common view. Turn on DEBUG output in the Java driver, for example, to see these messages. CL ONE requires the one node to be up and running - if that node's not running, your request will definitely fail. Maybe you want to try CL ANY or increase RF to 2. 2) We are reading and writing intensively, and usually I don't need the data for more than 1-2 hours. What optimizations can I do to increase my small cluster's read performance? Cluster configuration - 3 identical nodes: i7 3GHz, SSD 120Gb, 16Gb RAM, CentOS 6. Depending on the data, table layout, access patterns and C* version, try various key cache and maybe row cache configurations in both table options and cassandra.yaml
Re: Using Cassandra as cache
Thanks Robert, Can you please explain what problems DROP/CREATE keyspace may cause? Seems like truncate works per column family, and I have up to 10. What should I delete from disk in that case? I can't delete the whole folder, right? I need to delete all the content under each cf folder, but not the folders? Correct? Pavel On Fri, Jun 20, 2014 at 6:01 PM, Robert Coli rc...@eventbrite.com wrote: On Fri, Jun 20, 2014 at 2:48 PM, Pavel Kogan pavel.ko...@cortica.com wrote: So what we did is create a new keyspace every hour, named _MM_dd_HH, and when disk becomes full, a script running in crontab on each node drops the keyspace with the IF EXISTS flag and deletes the whole keyspace folder. That way the whole process is very clean and no garbage is left on disk. I've recommended a similar technique in the past, but with alternating between Keyspace_A and Keyspace_B. That way you just TRUNCATE them instead of having to DROP. DROP/CREATE keyspace has problems that TRUNCATE does not. Perhaps use a TRUNCATE-oriented technique? =Rob
Re: Using Cassandra as cache
Thanks, Is there any programmatic way to know when the schema has finished settling down? Can working with RF=2 and CL=ANY result in any problem with consistency? I am not sure I can have problems with consistency if I don't do updates, only writes and reads. Can I? By the way, I am using Cassandra 2.0.8. Pavel On Fri, Jun 20, 2014 at 6:01 PM, Robert Stupp sn...@snazy.de wrote: Am 20.06.2014 um 23:48 schrieb Pavel Kogan pavel.ko...@cortica.com: 1) When a new keyspace with its column families has just been created (every round hour), sometimes other modules fail to read/write data, and we lose the request. Can it be that creation of a keyspace and column families is an async operation, or that there is propagation time between nodes? Schema needs to settle down (nodes actually agree on a common view) - this may take several seconds until all nodes have that common view. Turn on DEBUG output in the Java driver, for example, to see these messages. CL ONE requires the one node to be up and running - if that node's not running, your request will definitely fail. Maybe you want to try CL ANY or increase RF to 2. 2) We are reading and writing intensively, and usually I don't need the data for more than 1-2 hours. What optimizations can I do to increase my small cluster's read performance? Cluster configuration - 3 identical nodes: i7 3GHz, SSD 120Gb, 16Gb RAM, CentOS 6. Depending on the data, table layout, access patterns and C* version, try various key cache and maybe row cache configurations in both table options and cassandra.yaml
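One common way to answer Pavel's first question programmatically is to compare the schema_version reported by system.local with the versions in system.peers until they all match; the DataStax drivers do essentially this internally when waiting for schema agreement. A sketch of the comparison logic only, with the actual driver queries left as comments:

```python
# Schema is "settled" when every node reports the same schema version.
# In a real client you would fetch these values with:
#   SELECT schema_version FROM system.local
#   SELECT schema_version FROM system.peers
# and retry (with a timeout) until they agree.
def schema_agreed(local_version, peer_versions):
    return all(v == local_version for v in peer_versions)

assert schema_agreed("uuid-1", ["uuid-1", "uuid-1"])
assert not schema_agreed("uuid-1", ["uuid-1", "uuid-2"])
```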
output interpretation of cassandra-stress
Hi all, I have a quick question on the unit of the latency in the output of cassandra-stress: is it milliseconds or seconds? I cannot find the answer in the documentation: http://www.datastax.com/documentation/cassandra/1.2/cassandra/tools/toolsCStressOutput_c.html Thanks, Senhua
Use Cassandra thrift API with collection type
Hi, I have a problem when inserting data of the map type into a Cassandra table. I tried all kinds of MapSerializer to serialize the Map data and did not succeed. My code is like this:

Column column = new Column();
column.name = columnSerializer.toByteBuffer(colname); // the column name of the map type; it works with other kinds of data type
column.value = MapSerializer.getInstance(AsciiSerializer.instance, DecimalSerializer.instance).serialize(someMapData);
column.timestamp = new Date().getTime();
Mutation mutation = new Mutation();
mutation.column_or_supercolumn = new ColumnOrSuperColumn();
mutation.column_or_supercolumn.column = column;
mutationList.add(mutation);

The data was inserted into the Cassandra DB; however, it cannot be retrieved by CQL3, failing with the following error:

ERROR 14:32:48,192 Exception in thread Thread[Thrift:4,5,main] java.lang.AssertionError
at org.apache.cassandra.cql3.statements.ColumnGroupMap.getCollection(ColumnGroupMap.java:88)
at org.apache.cassandra.cql3.statements.SelectStatement.getCollectionValue(SelectStatement.java:1185)
at org.apache.cassandra.cql3.statements.SelectStatement.handleGroup(SelectStatement.java:1169)
at org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1076)
...

So the question is how to write map data into Cassandra via the thrift API. Any help appreciated. Thanks, Huiliang
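The AssertionError above is consistent with a layout mismatch: CQL3 stores each map entry as its own internal column whose composite name includes the CQL column name and the map key, rather than one column holding a fully serialized map. A sketch of the CompositeType wire encoding (for each component: a 2-byte big-endian length, the component bytes, then one end-of-component byte) — note that the exact component layout also depends on the table's clustering columns, so treat this as an illustration rather than a drop-in fix:

```python
import struct

# CompositeType wire format: for each component, a 2-byte big-endian
# length, the component bytes, then a 0x00 end-of-component byte.
# A CQL3 map entry's internal column name is (roughly) a composite of
# the CQL column name and the map key; the map value goes in the cell
# value, one internal column per entry.
def composite(*components):
    out = b""
    for c in components:
        out += struct.pack(">H", len(c)) + c + b"\x00"
    return out

name = composite(b"mymap", b"key1")
assert name == b"\x00\x05mymap\x00\x00\x04key1\x00"
```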
How is null handled in terms of storage when using static schemas?
Let's say we have a table with just an integer primary key named ID and a text column named VALUE… if we set the value to (0, 'hello world'), obviously that's a normal value. However, what happens if we update it with (0, null)? How is the 'null' stored? I couldn't find any documentation for this anywhere. The new null supersedes the older value of 'hello world', so I assume it has to write it into an SSTable before both SSTables are compacted. -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* Skype: *burtonator* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
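For what it's worth, the usual answer is that a null write is stored as a tombstone: a small marker cell whose newer timestamp shadows the older value until compaction removes both (and, after gc_grace, the tombstone itself). The reconciliation rule is essentially last-write-wins by timestamp; a toy sketch of that rule:

```python
# Toy last-write-wins reconciliation. Each cell is (timestamp, value);
# value None models a tombstone. A tombstone with a newer timestamp
# shadows the older live value; compaction eventually drops the
# shadowed value, and the tombstone itself after gc_grace.
def reconcile(cell_a, cell_b):
    return max(cell_a, cell_b, key=lambda cell: cell[0])

old = (100, "hello world")
tombstone = (200, None)     # the null write, with a newer timestamp
winner = reconcile(old, tombstone)
assert winner == (200, None)   # the tombstone wins until purged
```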