Re: OOM under high write throughputs on 2.2.5

2016-05-24 Thread Bryan Cheng
Hi Zhiyan, Silly question but are you sure your heap settings are actually being applied? "697,236,904 (51.91%)" would represent a sub-2GB heap. What's the real memory usage for Java when this crash happens? Other thing to look into might be memtable_heap_space_in_mb, as it looks like you're
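The "sub-2GB heap" estimate above can be checked with a little arithmetic. The sketch below assumes the quoted percentage is the object's share of the total heap in the dump report; if it is a share of something else (for example, retained size of one region), the estimate does not apply. By this calculation the heap comes out at roughly 1.25 GiB.

```python
# Figures quoted in the message: bytes held, and the share of heap they represent.
used_bytes = 697_236_904
used_fraction = 0.5191  # assumption: 51.91% of the *total* heap

# If used_bytes is 51.91% of the heap, the whole heap is used_bytes / 0.5191.
estimated_heap_bytes = used_bytes / used_fraction
estimated_heap_gib = estimated_heap_bytes / 1024**3

print(f"Estimated heap: {estimated_heap_gib:.2f} GiB")
```

An estimate well under 2 GiB is what prompts the question of whether the configured heap settings are actually being applied.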

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Bryan Cheng
Hi Luke, I've never found nodetool status' load to be useful beyond a general indicator. You should expect some small skew, as this will depend on your current compaction status, tombstones, etc. IIRC repair will not provide consistency of intermediate states nor will it remove tombstones, it

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread kurt Greaves
Not necessarily, considering RF is 2, so both nodes should have all partitions. Luke, are you sure the repair is succeeding? You don't have other keyspaces/duplicate data/extra data in your cassandra data directory? Also, you could try querying on the node with less data to confirm if it has the

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Bhuvan Rawal
For the other DC, it can be acceptable because a partition resides on one node, so if you have a large partition, it may skew things a bit. On May 25, 2016 2:41 AM, "Luke Jolly" wrote: > So I guess the problem may have been with the initial addition of the > 10.128.0.20

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Luke Jolly
So I guess the problem may have been with the initial addition of the 10.128.0.20 node because when I added it in it never synced data I guess? It was at around 50 MB when it first came up and transitioned to "UN". After it was in I did the 1->2 replication change and tried repair but it didn't

Re: Cassandra and Kubernetes and scaling

2016-05-24 Thread Aiman Parvaiz
Looking forward to hearing from the community about this. > On May 24, 2016, at 10:19 AM, Mike Wojcikiewicz wrote: > > I saw a thread from April 2016 talking about Cassandra and Kubernetes, and > have a few follow up questions. It seems that especially

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Bhuvan Rawal
Hi Luke, You mentioned that the replication factor was increased from 1 to 2. In that case, was the node with IP 10.128.0.20 carrying around 3 GB of data earlier? You can run nodetool repair with the -local option to repair only the local datacenter, gce-us-central1. Also you may suspect that if a lot

Re: Cassandra event notification on INSERT/DELETE of records

2016-05-24 Thread Mark Reddy
+1 to what Eric said, a queue is a classic C* anti-pattern. Something like Kafka or RabbitMQ might fit your use case better. Mark On 24 May 2016 at 18:03, Eric Stevens wrote: > It sounds like you're trying to build a queue in Cassandra, which is one > of the classic

Re: Too many keyspaces causes cql connection to time out ?

2016-05-24 Thread Justin Lin
ion. So we suspect this Stop-The-World GC might block the connection >> until it times out. This is the log that i think is relevant. >> >> INFO 20160524-060930.028882 :: Initializing >> sandbox_20160524_t06_09_18.table1 >> >> INFO 20160524-060933.908008 :: G1

Re: Increasing replication factor and repair doesn't seem to work

2016-05-24 Thread Luke Jolly
Here's my setup: Datacenter: gce-us-central1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 10.128.0.3 6.4 GB 256 100.0%

Cassandra and Kubernetes and scaling

2016-05-24 Thread Mike Wojcikiewicz
I saw a thread from April 2016 talking about Cassandra and Kubernetes, and have a few follow-up questions. It seems that, especially after v1.2 of Kubernetes and with the upcoming 1.3 features, this would be a very viable option for running Cassandra. My questions pertain to HostIds and Scaling

Re: Thrift client creates massive amounts of network packets

2016-05-24 Thread Eric Stevens
I'm not familiar with Titan's usage patterns for Cassandra, but I wonder if this is because of the consistency level it's querying Cassandra at - i.e. if CL isn't LOCAL_[something], then this might just be lots of little checksums required to satisfy consistency requirements. On Mon, May 23, 2016

Re: Too many keyspaces causes cql connection to time out ?

2016-05-24 Thread Eric Stevens
rld GC might block the connection until it times > out. This is the log that i think is relevant. > > INFO 20160524-060930.028882 :: Initializing > sandbox_20160524_t06_09_18.table1 > > INFO 20160524-060933.908008 :: G1 Young Generation GC in 551ms. G1 Eden > Spa

Too many keyspaces causes cql connection to time out ?

2016-05-24 Thread Justin Lin
-The-World GC might block the connection until it times out. This is the log that i think is relevant. INFO 20160524-060930.028882 :: Initializing sandbox_20160524_t06_09_18.table1 INFO 20160524-060933.908008 :: G1 Young Generation GC in 551ms. G1 Eden Space: 98112 -> 0; G1 Old

Re: Cassandra event notification on INSERT/DELETE of records

2016-05-24 Thread Eric Stevens
It sounds like you're trying to build a queue in Cassandra, which is one of the classic anti-pattern use cases for Cassandra. You may be able to do something clever with triggers, but I highly recommend you look at purpose-built queuing software such as Kafka to solve this instead. On Tue, May

Re: Removing a datacenter

2016-05-24 Thread Jeff Jirsa
The fundamental difference between a removenode and a decommission is which node(s) stream data. In decom, the leaving node streams. In removenode, other owners of the data stream. If you set replication factor for that DC to 0, there’s nothing to stream, so it’s irrelevant – do whichever you

RE: Removing a datacenter

2016-05-24 Thread Anubhav Kale
Sorry, I should have been more clear. What I meant was doing exactly what you wrote, but doing a “removenode” instead of a “decommission” to make it even faster. Will that have any side effects (I think it shouldn’t)? From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Monday, May 23, 2016 4:43 PM

Cassandra event notification on INSERT/DELETE of records

2016-05-24 Thread Aaditya Vadnere
Hi experts, We are evaluating Cassandra as a messaging infrastructure for a project. In our workflow, a Cassandra database will be synchronized across two nodes; a component will INSERT/UPDATE records on one node and another component (which has registered for the specific table) on the second node will

Re: UUID coming as int while using SPARK SQL

2016-05-24 Thread Laing, Michael
Yes - a UUID is just a 128 bit value. You can view it using any base or format. If you are looking at the same row, you should see the same 128 bit value, otherwise my theory is incorrect :) Cheers, ml On Tue, May 24, 2016 at 6:57 AM, Rajesh Radhakrishnan < rajesh.radhakrish...@phe.gov.uk>

RE: UUID coming as int while using SPARK SQL

2016-05-24 Thread Rajesh Radhakrishnan
Hi Michael, Thank you for the quick reply. So you are suggesting converting this int value (the UUID comes back as an int via Spark SQL) to hex? And the selection is just an example to highlight the UUID conversion issue. So in Cassandra it should be SELECT id, workflow FROM sam WHERE dept='blah'; And in

Re: UUID coming as int while using SPARK SQL

2016-05-24 Thread Laing, Michael
Try converting that int from decimal to hex and inserting dashes in the appropriate spots - or go the other way. Also, you are looking at different rows, based upon your selection criteria... ml On Tue, May 24, 2016 at 6:23 AM, Rajesh Radhakrishnan < rajesh.radhakrish...@phe.gov.uk> wrote: >
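The decimal-to-hex conversion suggested above can be sketched in Python as follows. This is only an illustration: it assumes the integer seen in Spark SQL is the UUID's full 128-bit value as a non-negative number, and the helper names and the sample UUID are hypothetical, not taken from the thread.

```python
import uuid


def int_to_uuid(value: int) -> uuid.UUID:
    """Interpret a non-negative 128-bit integer as a UUID."""
    return uuid.UUID(int=value)


def int_to_uuid_text(value: int) -> str:
    """Same conversion by hand: 32 hex digits with dashes in 8-4-4-4-12 groups."""
    hex_digits = format(value, "032x")
    groups = [
        hex_digits[0:8],
        hex_digits[8:12],
        hex_digits[12:16],
        hex_digits[16:20],
        hex_digits[20:32],
    ]
    return "-".join(groups)


def uuid_to_int(text: str) -> int:
    """Go the other way: canonical UUID text to its integer value."""
    return uuid.UUID(text).int


# Hypothetical sample value, used only to show the round trip.
sample = "123e4567-e89b-12d3-a456-426614174000"
as_int = uuid_to_int(sample)
print(as_int)
print(int_to_uuid(as_int))
print(int_to_uuid_text(as_int))
```

If both views describe the same row, the converted value should match the UUID shown in cqlsh; a mismatch would suggest the two queries are returning different rows.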

UUID coming as int while using SPARK SQL

2016-05-24 Thread Rajesh Radhakrishnan
Hi, I have a Cassandra keyspace, but reading the data (especially UUIDs) via Spark SQL from Python does not return the correct value. Cassandra: -- My table 'SAM' is described below: CREATE table ks.sam (id uuid, dept text, workflow text, type double, primary key (id,

Re: cqlsh problem

2016-05-24 Thread joseph gao
I used to think it was a firewall/network issue too, so I set ufw to inactive. I really don't know what the reason is. 2016-05-09 19:01 GMT+08:00 kurt Greaves : > Don't be fooled, despite saying tcp6 and :::*, it still listens on IPv4. > As far as I'm aware this happens on all

Re: sstableloader: Stream failed

2016-05-24 Thread Ralf Steppacher
Thanks for the hint! Indeed I could not telnet to the host. It was the listen_address that was not properly configured. Thanks again! Ralf > On 23.05.2016, at 21:01, Paulo Motta wrote: > > Can you telnet 10.211.55.8 7000? This is the port used for streaming >
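The telnet check mentioned above can also be scripted. A minimal sketch, assuming a plain TCP connect is a good enough reachability test; the function name `can_connect` is hypothetical, and a successful connect only shows that something is listening on the port, not that streaming will work.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the case in this thread, `can_connect("10.211.55.8", 7000)` would be the equivalent of the telnet test; a False result points at the network path or at settings such as listen_address.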