Re: Lots of replicate on write tasks pending, want to investigate

2013-07-04 Thread Sylvain Lebresne
 The write path (not replicate on write) for counters involves a read,


I'm afraid you got it wrong. The read done during counter writes *is* done
by the replicate on write tasks. Though really, the replicate on write tasks
are just one part of the counter write path (they are not the whole write
path).

--
Sylvain



 On Wed, Jul 3, 2013 at 1:03 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Jul 3, 2013 at 9:59 AM, Andrew Bialecki 
 andrew.biale...@gmail.com wrote:

 2. I'm assuming in our case the cause is incrementing counters because
 disk reads are part of the write path for counters and are not for
 appending columns to a row. Does that logic make sense?


 That's a pretty reasonable assumption if you are not doing any other
 reads and you see your disk busy doing non-compaction related reads. :)

 =Rob





What is best Cassandra client?

2013-07-04 Thread Tony Anecito
Hi All,
What is the best client to use? I want to use CQL 3.0.3 and have support for
prepared statements. I have tried JDBC and the Thrift client so far.
 
Thanks!

Re: What is best Cassandra client?

2013-07-04 Thread Theo Hultberg
Datastax Java driver: https://github.com/datastax/java-driver

T#


On Thu, Jul 4, 2013 at 10:25 AM, Tony Anecito adanec...@yahoo.com wrote:

 Hi All,
 What is the best client to use? I want to use CQL 3.0.3 and have support
 for prepared statements. I have tried JDBC and the Thrift client so far.

 Thanks!



Re: What is best Cassandra client?

2013-07-04 Thread Sachin Sinha
Datastax driver for me as well. 

Sent from my iPhone

On 4 Jul 2013, at 09:34, Theo Hultberg t...@iconara.net wrote:

 Datastax Java driver: https://github.com/datastax/java-driver
 
 T#
 
 
 On Thu, Jul 4, 2013 at 10:25 AM, Tony Anecito adanec...@yahoo.com wrote:
 Hi All,
 What is the best client to use? I want to use CQL 3.0.3 and have support for
 prepared statements. I have tried JDBC and the Thrift client so far.
  
 Thanks!
 


going down from RF=3 to RF=2, repair constantly falls over with JVM OOM

2013-07-04 Thread Evan Dandrea
Hi,

We've made the mistake of letting our nodes get too large, now holding
about 3TB each. We ran out of enough free space to have a successful
compaction, and because we're on 1.0.7, enabling compression to get
out of the mess wasn't feasible. We tried adding another node, but we
think this may have put too much pressure on the existing ones it was
replicating from, so we backed out.

So we decided to drop RF down to 2 from 3 to relieve the disk pressure
and started building a secondary cluster with lots of 1 TB nodes. We
ran repair -pr on each node, but it’s failing with a JVM OOM on one
node while another node is streaming from it for the final repair.

Does anyone know what we can tune to get the cluster stable enough to
put it in a multi-dc setup with the secondary cluster? Do we actually
need to wait for these RF3-RF2 repairs to stabilize, or could we
point it at the secondary cluster without worry of data loss?

We’ve set the heap on these two problematic nodes to 20GB, up from the
equally too high 12GB, but we’re still hitting OOM. I had seen in
other threads that tuning down compaction might help, so we’re trying
the following:

in_memory_compaction_limit_in_mb 32 (down from 64)
compaction_throughput_mb_per_sec 8 (down from 16)
concurrent_compactors 2 (the nodes have 24 cores)
flush_largest_memtables_at 0.45 (down from 0.50)
stream_throughput_outbound_megabits_per_sec 300 (down from 400)
reduce_cache_sizes_at 0.5 (down from 0.6)
reduce_cache_capacity_to 0.35 (down from 0.4)

-XX:CMSInitiatingOccupancyFraction=30

Here’s the log from the most recent repair failure:

http://paste.ubuntu.com/5843017/

The OOM starts at line 13401.

Thanks for whatever insight you can provide.


Re: going down from RF=3 to RF=2, repair constantly falls over with JVM OOM

2013-07-04 Thread Michał Michalski
I don't think you need to run repair if you decrease RF. At least I
wouldn't do it.


In the case of *decreasing* RF you have 3 nodes containing the data, but only
2 of them should store it from now on, so you should run cleanup rather than
repair, to get rid of the data on the 3rd replica. And I guess it should work
(in terms of disk space and memory) if you've been able to perform
compaction.


Repair makes sense if you *increase* RF, so the data are streamed to the 
new replicas.


M.


On 04.07.2013 12:20, Evan Dandrea wrote:

Hi,

We've made the mistake of letting our nodes get too large, now holding
about 3TB each. We ran out of enough free space to have a successful
compaction, and because we're on 1.0.7, enabling compression to get
out of the mess wasn't feasible. We tried adding another node, but we
think this may have put too much pressure on the existing ones it was
replicating from, so we backed out.

So we decided to drop RF down to 2 from 3 to relieve the disk pressure
and started building a secondary cluster with lots of 1 TB nodes. We
ran repair -pr on each node, but it’s failing with a JVM OOM on one
node while another node is streaming from it for the final repair.

Does anyone know what we can tune to get the cluster stable enough to
put it in a multi-dc setup with the secondary cluster? Do we actually
need to wait for these RF3-RF2 repairs to stabilize, or could we
point it at the secondary cluster without worry of data loss?

We’ve set the heap on these two problematic nodes to 20GB, up from the
equally too high 12GB, but we’re still hitting OOM. I had seen in
other threads that tuning down compaction might help, so we’re trying
the following:

in_memory_compaction_limit_in_mb 32 (down from 64)
compaction_throughput_mb_per_sec 8 (down from 16)
concurrent_compactors 2 (the nodes have 24 cores)
flush_largest_memtables_at 0.45 (down from 0.50)
stream_throughput_outbound_megabits_per_sec 300 (down from 400)
reduce_cache_sizes_at 0.5 (down from 0.6)
reduce_cache_capacity_to 0.35 (down from 0.4)

-XX:CMSInitiatingOccupancyFraction=30

Here’s the log from the most recent repair failure:

http://paste.ubuntu.com/5843017/

The OOM starts at line 13401.

Thanks for whatever insight you can provide.





Partitioner type

2013-07-04 Thread Vivek Mishra
Hi,
Is it possible to know the type of partitioner programmatically at runtime?

-Vivek


Re: Partitioner type

2013-07-04 Thread Shubham Mittal
Yes, it's possible. It depends on which client you're using.

For example, in pycassa (a Python client for Cassandra), I use
 import pycassa
 from pycassa.system_manager import *
 sys = SystemManager('hostname:portnumber')
 sys.describe_partitioner()




On Thu, Jul 4, 2013 at 5:32 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 Is it possible to know the type of partitioner programmatically at runtime?

 -Vivek



Re: Partitioner type

2013-07-04 Thread Haithem Jarraya
yes, you can query local CF in system keyspace:

 select partitioner from system.local;


H


On 4 July 2013 13:02, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 Is it possible to know the type of partitioner programmatically at runtime?

 -Vivek



Re: Partitioner type

2013-07-04 Thread Vivek Mishra
Just saw the Thrift API's describe_partitioner() method.

Thanks for quick suggestions.

-Vivek


On Thu, Jul 4, 2013 at 5:40 PM, Haithem Jarraya
haithem.jarr...@struq.comwrote:

 yes, you can query local CF in system keyspace:

  select partitioner from system.local;


 H


 On 4 July 2013 13:02, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 Is it possible to know the type of partitioner programmatically at runtime?

 -Vivek





Re: going down from RF=3 to RF=2, repair constantly falls over with JVM OOM

2013-07-04 Thread Alain RODRIGUEZ
@Michal: all true, a cleanup would certainly remove a lot of useless data
there, and I also advise Evan to do it. However, Evan may want to keep
repairing his cluster as a routine operation, and there is no reason an RF
change should lead to this kind of issue.

@Evan: With this amount of data, and since you are not on C* 1.2, you could
try tuning your bloom filters to use less memory. For example, disable them
while you recover from this issue (bloom_filter_fp_chance = 1.0), then
upgrade sstables and retry the repair.

This depends a lot on your needs and your context, but it might work if you
can afford it.

By the way, C* prior to 1.2 should not exceed 300-500 GB per node. I read once
that C* 1.2 aims to reach 3-5 TB per node. Still, horizontal scaling over a
peer-to-peer architecture is one of the main points of Cassandra. You should
be careful and scale out when needed, so you never reach that much data per
node.

As always, experts/committers, please correct me if I am wrong.

Alain
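
A sketch of that bloom-filter workaround in the 1.0-era cassandra-cli and
nodetool (keyspace and column family names are hypothetical, and the exact
option/command names are worth verifying against your version):

```shell
# In cassandra-cli, effectively disable the bloom filters for one CF:
#   update column family mycf with bloom_filter_fp_chance = 1.0;
#
# Then rewrite the sstables so the new setting takes effect on disk,
# and retry the repair afterwards:
nodetool -h 127.0.0.1 upgradesstables MyKeyspace mycf
```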


2013/7/4 Michał Michalski mich...@opera.com

 I don't think you need to run repair if you decrease RF. At least I
 wouldn't do it.

 In the case of *decreasing* RF you have 3 nodes containing the data, but
 only 2 of them should store it from now on, so you should run cleanup rather
 than repair, to get rid of the data on the 3rd replica. And I guess it
 should work (in terms of disk space and memory) if you've been able to
 perform compaction.

 Repair makes sense if you *increase* RF, so the data are streamed to the
 new replicas.

 M.


 On 04.07.2013 12:20, Evan Dandrea wrote:

  Hi,

 We've made the mistake of letting our nodes get too large, now holding
 about 3TB each. We ran out of enough free space to have a successful
 compaction, and because we're on 1.0.7, enabling compression to get
 out of the mess wasn't feasible. We tried adding another node, but we
 think this may have put too much pressure on the existing ones it was
 replicating from, so we backed out.

 So we decided to drop RF down to 2 from 3 to relieve the disk pressure
 and started building a secondary cluster with lots of 1 TB nodes. We
 ran repair -pr on each node, but it’s failing with a JVM OOM on one
 node while another node is streaming from it for the final repair.

 Does anyone know what we can tune to get the cluster stable enough to
 put it in a multi-dc setup with the secondary cluster? Do we actually
 need to wait for these RF3-RF2 repairs to stabilize, or could we
 point it at the secondary cluster without worry of data loss?

 We’ve set the heap on these two problematic nodes to 20GB, up from the
 equally too high 12GB, but we’re still hitting OOM. I had seen in
 other threads that tuning down compaction might help, so we’re trying
 the following:

 in_memory_compaction_limit_in_mb 32 (down from 64)
 compaction_throughput_mb_per_sec 8 (down from 16)
 concurrent_compactors 2 (the nodes have 24 cores)
 flush_largest_memtables_at 0.45 (down from 0.50)
 stream_throughput_outbound_megabits_per_sec 300 (down from 400)
 reduce_cache_sizes_at 0.5 (down from 0.6)
 reduce_cache_capacity_to 0.35 (down from 0.4)

 -XX:CMSInitiatingOccupancyFraction=30

 Here’s the log from the most recent repair failure:

 http://paste.ubuntu.com/5843017/

 The OOM starts at line 13401.

 Thanks for whatever insight you can provide.





Restart node = hinted handoff flood

2013-07-04 Thread Alain RODRIGUEZ
Hi,

Using C*1.2.2 12 EC2 xLarge cluster.

When I restart a node, if it spends a few minutes down, then when I bring it
back up all the CPUs are stuck at 100%, even with compactions disabled,
inducing a very big and intolerable latency in my app. I suspect Hinted
Handoff to be the cause: disabling gossip fixes the problem, and enabling it
again brings the latency back (with a lot of GC, dropped messages...).

Is there a way to disable HH? Is it responsible for this issue?

I currently have this node down, any fast insight would be appreciated.

Alain


Re: What is best Cassandra client?

2013-07-04 Thread Tony Anecito
Where can I get a compiled jar? I found out about this yesterday but do not
have an environment set up to compile it.
 
Thanks!

From: Theo Hultberg t...@iconara.net
To: user@cassandra.apache.org; Tony Anecito adanec...@yahoo.com 
Sent: Thursday, July 4, 2013 2:34 AM
Subject: Re: What is best Cassandra client?



Datastax Java driver: https://github.com/datastax/java-driver 

T#



On Thu, Jul 4, 2013 at 10:25 AM, Tony Anecito adanec...@yahoo.com wrote:

Hi All,
What is the best client to use? I want to use CQL 3.0.3 and have support for
prepared statements. I have tried JDBC and the Thrift client so far.

Thanks!

Re: What is best Cassandra client?

2013-07-04 Thread Michael Klishin
2013/7/4 Tony Anecito adanec...@yahoo.com

 Where can I get a compiled jar?


http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22cassandra-driver-core%22
-- 
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin
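
For Maven users, the coordinates there translate to a dependency fragment
along these lines (the version number is an assumption; check the search link
above for the latest):

```xml
<dependency>
  <groupId>com.datastax.cassandra</groupId>
  <artifactId>cassandra-driver-core</artifactId>
  <version>1.0.1</version>
</dependency>
```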


Re: What is best Cassandra client?

2013-07-04 Thread Tony Anecito
Thanks I found the jar in the maven repository.
 
-Tony

From: Theo Hultberg t...@iconara.net
To: user@cassandra.apache.org; Tony Anecito adanec...@yahoo.com 
Sent: Thursday, July 4, 2013 2:34 AM
Subject: Re: What is best Cassandra client?



Datastax Java driver: https://github.com/datastax/java-driver 

T#



On Thu, Jul 4, 2013 at 10:25 AM, Tony Anecito adanec...@yahoo.com wrote:

Hi All,
What is the best client to use? I want to use CQL 3.0.3 and have support for
prepared statements. I have tried JDBC and the Thrift client so far.

Thanks!

Re: Restart node = hinted handoff flood

2013-07-04 Thread Alain RODRIGUEZ
The point is that there is, afaik, no way to limit the speed of these hinted
handoffs, since they are not a stream like repair or bootstrap, and no way
either to keep the node out of the ring while it is receiving hints, since
hints and normal traffic both go through the gossip protocol on port 7000.

How can we avoid this hinted handoff flood on returning nodes?

Alain
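
For what it's worth, 1.2 does expose a couple of knobs for this; a sketch,
assuming the 1.2 option and command names (worth verifying against your exact
version):

```shell
# Temporarily stop hint creation/delivery involving this node
# (not persistent across restarts):
nodetool -h 127.0.0.1 disablehandoff

# Re-enable once the node has caught up:
nodetool -h 127.0.0.1 enablehandoff
```

There are also cassandra.yaml settings (hinted_handoff_enabled,
hinted_handoff_throttle_in_kb, max_hints_delivery_threads) that can disable or
throttle hint delivery persistently.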


2013/7/4 Alain RODRIGUEZ arodr...@gmail.com

 Hi,

 Using C*1.2.2 12 EC2 xLarge cluster.

 When I restart a node, if it spends a few minutes down, then when I bring it
 back up all the CPUs are stuck at 100%, even with compactions disabled,
 inducing a very big and intolerable latency in my app. I suspect Hinted
 Handoff to be the cause: disabling gossip fixes the problem, and enabling it
 again brings the latency back (with a lot of GC, dropped messages...).

 Is there a way to disable HH? Is it responsible for this issue?

 I currently have this node down, any fast insight would be appreciated.

 Alain



Migrating data from 2 node cluster to a 3 node cluster

2013-07-04 Thread srmore
We are planning to move data from a 2-node cluster to a 3-node cluster. We
plan to copy the data (a snapshot) from the two nodes to the new 2 nodes,
hoping that Cassandra will sync it to the third node. Will this work? Are
there any other commands to run after we are done migrating, like nodetool
repair?

Thanks all.


videos of 2013 summit

2013-07-04 Thread S Ahmed
Hi,

Are the videos online anywhere for the 2013 summit?


Re: videos of 2013 summit

2013-07-04 Thread Jabbar Azam
http://www.youtube.com/playlist?list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU

Thanks

Jabbar Azam
On 4 Jul 2013 18:17, S Ahmed sahmed1...@gmail.com wrote:

 Hi,

 Are the videos online anywhere for the 2013 summit?



Re: Migrating data from 2 node cluster to a 3 node cluster

2013-07-04 Thread Jonathan Haddad
You should run a nodetool repair after you copy the data over. You could
also use sstableloader, which streams the data to the proper nodes.
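
Either route can be sketched as follows (hosts and paths are hypothetical, and
the sstableloader flags are per recent 1.1/1.2 versions, so verify against
your version):

```shell
# Route A: copy snapshots, then repair so the third node gets its replicas.
# On each old node:
nodetool snapshot -t migrate MyKeyspace
# Copy the snapshotted sstables into the matching data directory on a new
# node, then on each new node:
nodetool repair MyKeyspace

# Route B: stream the sstables so every row lands on its proper replicas:
sstableloader -d newnode1,newnode2 /path/to/MyKeyspace/snapshots/migrate/
```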


On Thu, Jul 4, 2013 at 10:03 AM, srmore comom...@gmail.com wrote:

 We are planning to move data from a 2-node cluster to a 3-node cluster. We
 plan to copy the data (a snapshot) from the two nodes to the new 2 nodes,
 hoping that Cassandra will sync it to the third node. Will this work? Are
 there any other commands to run after we are done migrating, like nodetool
 repair?

 Thanks all.




-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Re: videos of 2013 summit

2013-07-04 Thread Shamim
http://www.planetcassandra.org/blog/post/cassandra-summit-2013---use-cases-and-technical-presentations


CQL and IN

2013-07-04 Thread Tony Anecito
Hi All,
 
I am using the DataStax driver and got prepared statements to work. When I
tried to use the IN keyword in a query it did not work. According to DataStax,
IN should work.
 
So I tried:
 
Select * from items Where item_id IN (Select item_id FROM users where
user_id = ?)
 
 
Thanks for the feedback.
-Tony

Re: CQL and IN

2013-07-04 Thread Rui Vieira
CQL does not support sub-queries.


On 4 July 2013 22:53, Tony Anecito adanec...@yahoo.com wrote:

 Hi All,

 I am using the DataStax driver and got prepared statements to work. When I
 tried to use the IN keyword in a query it did not work. According to
 DataStax, IN should work.

 So I tried:

 Select * from items Where item_id IN (Select item_id FROM users where
 user_id = ?)


 Thanks for the feedback.
 -Tony



Re: CQL and IN

2013-07-04 Thread Rui Vieira
You can, however, use the actual item_ids:

Select * from items Where item_id IN (1, 2, 3, ..., n)
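
In practice that means two round-trips from the client, along these lines
(table and column names taken from the example above):

```sql
-- 1. Fetch the user's item ids:
SELECT item_id FROM users WHERE user_id = ?;

-- 2. Bind the returned ids into the IN clause of a second statement:
SELECT * FROM items WHERE item_id IN (?, ?, ?);
```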


On 4 July 2013 23:16, Rui Vieira ruidevie...@googlemail.com wrote:

 CQL does not support sub-queries.


 On 4 July 2013 22:53, Tony Anecito adanec...@yahoo.com wrote:

 Hi All,

 I am using the DataStax driver and got prepared statements to work. When I
 tried to use the IN keyword in a query it did not work. According to
 DataStax, IN should work.

 So I tried:

 Select * from items Where item_id IN (Select item_id FROM users where
 user_id = ?)


 Thanks for the feedback.
 -Tony





Installing specific version

2013-07-04 Thread Ben Gambley
Hi all

Can anyone point me in the right direction for installing a specific
version from the DataStax repo? We need 1.2.4 to stay consistent with our QA
environment.

It's for a new production cluster, on Debian 6.

I thought it may be a value in /etc/apt/sources.list?

The latest, 1.2.6, does not appear to be compatible with our phpcassa Thrift
drivers.

After many late nights my googling ability seems to have evaporated!

Cheers
Ben
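
Assuming the DataStax repo still publishes the older package, apt can usually
install an explicit version; a sketch (the exact version string is an
assumption — `apt-cache madison` shows what is actually available):

```shell
apt-cache madison cassandra            # list versions the configured repos offer
sudo apt-get install cassandra=1.2.4   # request the specific version
echo "cassandra hold" | sudo dpkg --set-selections   # pin it against upgrades
```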


How to build indexes?

2013-07-04 Thread Tony Anecito
Hi All,

I updated a table with a secondary index, and discovered via the CLI's
describe that the index was not built.

How do I build an index after altering an existing table that already has
data?

I looked at nodetool and the CLI and saw no command with "build index" in its
name, and most of the postings I have found so far cover creating an index,
not building it or verifying that it was built.

Thanks,
-Tony
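
For reference, nodetool does have an index-rebuild command, though it is easy
to miss; a sketch (keyspace, column family, and index names are hypothetical —
check `nodetool help` on your version for the exact argument format):

```shell
# Rebuild the named secondary index(es) for a column family:
nodetool -h 127.0.0.1 rebuild_index MyKeyspace mycf mycf.myindex
```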