RPC timeout error while exporting data from CQL

2013-09-18 Thread pradeep kumar
Hello all, I am trying to export data from Cassandra using the CQL client. A column family has about 10 rows in it. When I copy data into a CSV file using the COPY TO command, I get the following rpc_timeout error: copy mycolfamily to '/root/mycolfamily.csv' Request did not complete within

cassandra not responding, log full of gc invocations

2013-09-18 Thread Alexander Shutyaev
Hi all! We have a problem with Cassandra 2.0. We have installed Cassandra from the DataStax community repository. We haven't changed any Java options from the default ones. De facto, Xmx is 1GB. Recently we have encountered a couple of cases when Cassandra stopped responding and the log was showing

datastax ops center shows a lot of activity when idle

2013-09-18 Thread Alexander Shutyaev
Hi all! We have experienced a strange issue with DataStax OpsCenter. It showed many more write requests than we should actually have. We've set everything up on an isolated node, and there, without any activity, it showed 100+ writes per second. Is this some OpsCenter bug? Does it maybe count some

cassandra just gone..no heap dump, no log info

2013-09-18 Thread Hiller, Dean
Anyone know how to debug Cassandra processes just exiting? There is no info in the Cassandra logs and there is no heap dump file (which in the past has shown up in the /opt/cassandra/bin directory for me). This occurs when running a map/reduce job that puts severe load on the system. The logs look

Re: cassandra not responding, log full of gc invocations

2013-09-18 Thread David McNelis
It is a little more involved than just changing the heap size. Every cluster is different, so there isn't much of a set formula. Some areas to look into, though: **Caveat, we're still running in the 1.2 branch and 2.0 has some differences in what is on versus off heap memory usage, but the

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Franc Carter
A random guess - possibly an OOM (Out of Memory) where Linux will kill a process to recover memory when it is desperately low on memory. Have a look in either your syslog output or the output of dmesg. cheers On Wed, Sep 18, 2013 at 10:21 PM, Hiller, Dean dean.hil...@nrel.gov wrote: Anyone

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Vara Kumar
Check whether the Java process crashed. You can find an hs_err*.log file in the root directory, the Cassandra working directory, or the temporary files directory. Information in this log file can give an idea about the failure. On Wed, Sep 18, 2013 at 5:51 PM, Hiller, Dean dean.hil...@nrel.gov wrote: Anyone know how

Re: Multi-dc restart impact

2013-09-18 Thread Chris Burroughs
On 09/17/2013 04:44 PM, Robert Coli wrote: On Thu, Sep 5, 2013 at 6:14 AM, Chris Burroughs chris.burrou...@gmail.com wrote: We have a 2 DC cluster running cassandra 1.2.9. They are in actual physically separate DCs on opposite coasts of the US, not just logical ones. The primary use of this

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Juan Manuel Formoso
This shouldn't happen if you have swap active in the server On Wednesday, September 18, 2013, Franc Carter wrote: A random guess - possibly an OOM (Out of Memory) where Linux will kill a process to recover memory when it is desperately low on memory. Have a look in either your syslog output

Re: I don't understand shuffle progress

2013-09-18 Thread Chris Burroughs
On 09/17/2013 09:41 PM, Paulo Motta wrote: So you're saying the only feasible way of enabling VNodes on an upgraded C* 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables from the old cluster? Or is it possible to succeed on shuffling, even if that means waiting some weeks

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Hiller, Dean
Ah neat, I didn't know the dmesg command…that works great. Dean From: Franc Carter franc.car...@sirca.org.au Reply-To: user@cassandra.apache.org Date: Wednesday, September

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Hiller, Dean
Swappiness is set to 60, though the Cassandra recommendation is to turn swap completely off (we still have not done that as far as I know), and sure enough Linux killed it. Dean From: Juan Manuel Formoso jform...@gmail.com Reply-To:

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Ken Hancock
We ran into this while tuning heap sizes. With Cassandra 1.2 making use of off-heap memory, if we made our JVM too large relative to the server memory, the system would just bail. We found for our app that the limit of the JVM size relative to server memory was about 50%. On Wed, Sep 18, 2013
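The sizing rule described in this message can be written down as a small calculation. This is only a sketch: the 50% figure is what the poster reports for their own application, not a general recommendation, and the function names are hypothetical.

```python
def max_heap_mb(total_ram_mb, heap_fraction=0.5):
    """Largest JVM heap (in MB) allowed under a heap-to-RAM ratio.

    heap_fraction=0.5 mirrors the roughly 50% limit reported in the
    message above; it is an assumption, not a Cassandra default.
    """
    if not 0 < heap_fraction <= 1:
        raise ValueError("heap_fraction must be in (0, 1]")
    return int(total_ram_mb * heap_fraction)


def xmx_flag(total_ram_mb, heap_fraction=0.5):
    """Format the result as a standard JVM -Xmx option."""
    return "-Xmx{}M".format(max_heap_mb(total_ram_mb, heap_fraction))
```

For a server with 16 GB of RAM, `xmx_flag(16384)` gives `-Xmx8192M`, leaving the other half of memory for off-heap structures and the OS page cache.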

Re: RPC timeout error while exporting data from CQL

2013-09-18 Thread pradeep kumar
Experts.. Any help? On Wed, Sep 18, 2013 at 2:55 PM, pradeep kumar pradeepkuma...@gmail.com wrote: Hello all, I am trying to export data from Cassandra using the CQL client. A column family has about 10 rows in it. When I copy data into a CSV file using the COPY TO command, I get the following

RE: cassandra just gone..no heap dump, no log info

2013-09-18 Thread java8964 java8964
We sometimes faced the same issue too. 1) The Linux OOM killer kills your Cassandra process. You should find this event logged in /var/log/messages. 2) The JVM crashed. You should be able to find the hs_err_pid file under the /tmp folder, if you didn't specify the location when you started your JVM. We still
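The first check suggested in this thread (looking for an OOM-killer event in dmesg or syslog output) can be scripted. The following is a minimal sketch; the function name is hypothetical, and the regular expression assumes the usual kernel wording ("Kill process" / "Killed process" followed by the PID and the process name in parentheses), which may differ between kernel versions.

```python
import re

# Matches kernel log lines such as:
#   "Out of memory: Kill process 1234 (java) score 900 or sacrifice child"
#   "Killed process 1234 (java) total-vm:..."
OOM_LINE = re.compile(r"Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines, process_name="java"):
    """Return (pid, line) pairs for OOM-killer entries naming process_name."""
    hits = []
    for line in log_lines:
        match = OOM_LINE.search(line)
        if match and match.group(2) == process_name:
            hits.append((int(match.group(1)), line.strip()))
    return hits
```

Feed it the lines of `dmesg` output or of the syslog file. An empty result does not rule out a JVM crash, so the hs_err_pid file mentioned above is still worth looking for.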

Revisit with another spin: is there any type of table existing on all nodes?

2013-09-18 Thread Hiller, Dean
The meta information stored on behalf of CQL must exist on all nodes and must update all nodes as well. What table type is that meta information stored in? And is it possible we can use that same type of table? After all, this makes M/R blazingly fast to do local lookups in the database (and

Re: cassandra just gone..no heap dump, no log info

2013-09-18 Thread Hiller, Dean
We had hs_err_pid files months ago, and it was happening every 6 days or so. We switched to this JVM and have not seen one since (including today)…that worked for us at least. java version 1.6.0_41 Java(TM) SE Runtime Environment (build 1.6.0_41-b02) Java HotSpot(TM) 64-Bit Server VM (build

Re: Revisit with another spin: is there any type of table existing on all nodes?

2013-09-18 Thread Sylvain Lebresne
On Wed, Sep 18, 2013 at 3:09 PM, Hiller, Dean dean.hil...@nrel.gov wrote: The meta information stored on behalf of CQL must exist on all nodes and must update all nodes as well. What table type is that meta information stored in? And is it possible we can use that same type of table? It's

Why don't you start off with a “single small” Cassandra server as you usually do it with MySQL?

2013-09-18 Thread Ertio Lew
For any website just starting out, the load is initially minimal and grows at a slow pace. People usually start their MySQL-based sites with a single server (and a VPS at that, not a dedicated server) running as both app server and DB server, and usually get quite far with this setup

Re: Why don't you start off with a “single small” Cassandra server as you usually do it with MySQL?

2013-09-18 Thread Michał Michalski
You might be interested in this: http://mail-archives.apache.org/mod_mbox/cassandra-user/201308.mbox/%3ccaeqobhpav25pcgjfwbkmd1rzxvrif94e6lpybpj3mu_bqn9...@mail.gmail.com%3E M. On 18.09.2013 15:34, Ertio Lew wrote: For any website just starting out, the load is initially minimal and grows

Re: I don't understand shuffle progress

2013-09-18 Thread Juan Manuel Formoso
I really like this idea. I can create a new cluster and have it replicate the old one, after it finishes I can remove the original. Any good resource that explains how to add a new datacenter to a live single dc cluster that anybody can recommend? On Wed, Sep 18, 2013 at 9:58 AM, Chris

TTL and gc_grace_Seconds

2013-09-18 Thread Christopher Wirt
I have a column family that contains time series events; all columns have a 24 hour TTL and gc_grace_seconds is currently 20 days. There is a TimeUUID in part of the key. It takes 15 days to repair the entire ring. Consistency is not my main worry. Speed is. We currently write to this CF at

Re: I don't understand shuffle progress

2013-09-18 Thread Chris Burroughs
http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html This is a basic outline. On 09/18/2013 10:32 AM, Juan Manuel Formoso wrote: I really like this idea. I can create a new cluster and have it replicate the old one, after it

Re: TTL and gc_grace_Seconds

2013-09-18 Thread horschi
Hi Christopher, in 2.0 gc_grace should be capped by TTL anyway: see CASSANDRA-4917 cheers, Christian On Wed, Sep 18, 2013 at 4:29 PM, Christopher Wirt chris.w...@struq.com wrote: I have a column family that contains time series events; all columns have a 24 hour TTL and gc_grace_seconds is

Re: Why don't you start off with a “single small” Cassandra server as you usually do it with MySQL?

2013-09-18 Thread Jonathan Haddad
For future reference, here is a blog post on this topic: http://rustyrazorblade.com/2013/09/cassandra-faq-can-i-start-with-a-single-node/ On Wed, Sep 18, 2013 at 6:38 AM, Michał Michalski mich...@opera.com wrote: You might be interested in this:

Re: I don't understand shuffle progress

2013-09-18 Thread Juan Manuel Formoso
Awesome, thanks! A few final questions: 1) Can I change the Snitch in the live source cluster? I'm using SimpleSnitch, I'd change it to GossipingPropertyFileSnitch (in preparation for changing the replication strategy when the new cluster is up and running). 2) Can I have different Partitioners

Re: Why don't you start off with a “single small” Cassandra server as you usually do it with MySQL?

2013-09-18 Thread Vegard Berget
Hi, The idea behind Cassandra is not the same as for MySQL.  First of all you can't get fault tolerance with one node.  I don't think Cassandra nodes are more prone to be unavailable, but by using replication you can get more availability right away.  If you have multiple instances when you start

Re: datastax ops center shows a lot of activity when idle

2013-09-18 Thread Nick Bailey
Yes, OpsCenter's writes for monitoring data will show up in the request/latency graphs. 100/sec may be reasonable depending on the number of nodes and column families OpsCenter is monitoring. On Wed, Sep 18, 2013 at 7:15 AM, Alexander Shutyaev shuty...@gmail.com wrote: Hi all! We have

cassandra and sqoop

2013-09-18 Thread Grga Pitich
Is there a vanilla Cassandra Sqoop driver for importing data into Hadoop? I know DataStax Cassandra comes with the utility; however, I'm interested in vanilla Cassandra. Many thanks.

Problem with counter columns

2013-09-18 Thread Yulian Oifa
Hello to all, I am using counter columns in a Cassandra cluster with 3 nodes. All 3 nodes are up and synchronized with an NTP time server, as is the client. I am using the libthrift Java client. The current problem I am having is that some of the writes to counter columns simply disappear (most of the time different

Re: TTL and gc_grace_Seconds

2013-09-18 Thread sankalp kohli
You might want to do some stuff in the application layer. If you can deal with deleted deletes in the application layer, you can reduce your gc-grace period. On Wed, Sep 18, 2013 at 7:42 AM, horschi hors...@gmail.com wrote: Hi Christopher, in 2.0 gc_grace should be capped by TTL anyway: see

Re: What is the ideal value for sstable_size_in_mb when using LeveledCompactionStrategy ?

2013-09-18 Thread Hiller, Dean
1. Always raise your file descriptor limits on Linux when running Cassandra; even in 0.7 that was the recommendation, so Cassandra could open tons of files. 2. We use 50M for our LCS with no performance issues. We had it at 10M on our previous with no issues but a huge amount of files of course with our

What is the ideal value for sstable_size_in_mb when using LeveledCompactionStrategy ?

2013-09-18 Thread Jayadev Jayaraman
We have set up a 24 node (m1.xlarge nodes, 1.7 TB per node) cassandra cluster on Amazon EC2 : version=1.2.9 replication factor = 2 snitch=EC2Snitch placement_strategy=NetworkTopologyStrategy (with 12 nodes each in 2 availability zones) Background on our use-case : We plan on using hadoop with

Re: What is the ideal value for sstable_size_in_mb when using LeveledCompactionStrategy ?

2013-09-18 Thread Jayadev Jayaraman
Thanks for the quick reply. We've already upped the ulimit as high as our Linux distro allows us to ( around 1.8 million ). I have a follow-up question. I see that the size of individual nodes in your use case is quite massive. Does the safe number vary widely based on differences in underlying

Re: What is the ideal value for sstable_size_in_mb when using LeveledCompactionStrategy ?

2013-09-18 Thread Hiller, Dean
Sorry, bad bad typo…..300G is what I meant. Cassandra heavily advises to stay under 1T per node or you run into big troubles and most people stay under 500G per node. Later, Dean From: Jayadev Jayaraman jdisal...@gmail.com Reply-To:

Re: RPC timeout error while exporting data from CQL

2013-09-18 Thread Arthur Zubarev
Hello Pradeep, Let me try to help you; I faced a similar issue, too. The thing is, I was told that selecting all the records at once is not an ideal approach. No matter how strong the hardware is, an arbitrarily raised RPC timeout would not help; whatever value you give it, the ‘SELECT *’

hadoop 12 T recommendation vs. cassandra 1T recommendation

2013-09-18 Thread Hiller, Dean
This article looks like it came out just one month ago or not even http://blog.cloudera.com/blog/2013/08/how-to-select-the-right-hardware-for-your-new-hadoop-cluster/ And recommends 12-24 1-4TB disks in a JBOD configuration. I know hadoop is used a lot in analytics but can also be used in some

What are the steps to go from SimpleSnitch to GossipingPropertyFileSnitch in a live cluster?

2013-09-18 Thread Juan Manuel Formoso
Besides making sure the datacenter name is the same in the cassandra-rackdc.properties file and the one originally created (datacenter1), what else do I have to take into account? Can I do a rolling restart or should I kill the entire cluster and then startup one at a time? -- *Juan Manuel

Need help configuring WAN replication over slow WAN

2013-09-18 Thread Oleg Dulin
Here is a problem: My customer has a 45 Megabit connection to their off-site DR data center. They have about 500G worth of data. That connection is shared. Needless to say, this is not an optimal configuration. To replicate all that in real time, it'll take a week. My primary cluster is 4
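The transfer-time estimate in this message can be checked with a little arithmetic. The sketch below assumes decimal units (500G = 500 × 10^9 bytes, 45 Megabit = 45 × 10^6 bits per second) and ignores protocol overhead and compression; the `usable_fraction` parameter and the 15% share used in the usage note are hypothetical, since the message only says the link is shared.

```python
def transfer_time_hours(data_bytes, link_bits_per_second, usable_fraction=1.0):
    """Hours needed to move data_bytes over a link at the given usable share."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_second * usable_fraction)
    return seconds / 3600


DATA_BYTES = 500 * 10**9   # about 500G of data (decimal units assumed)
LINK_BPS = 45 * 10**6      # 45 Megabit per second

full_rate_hours = transfer_time_hours(DATA_BYTES, LINK_BPS)
shared_days = transfer_time_hours(DATA_BYTES, LINK_BPS, usable_fraction=0.15) / 24
```

At the full line rate the copy takes roughly 24.7 hours. With only a 15% share of the link it takes close to 7 days, which is in the same range as the "a week" figure quoted above.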

nodetool tpstats

2013-09-18 Thread Kanwar Sangha
Hi - During a write heavy load, the tpstats show the following -

Message type    Dropped
RANGE_SLICE           0
READ_REPAIR           0
BINARY                0
READ                  0
MUTATION          65570
_TRACE                0
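When watching for dropped messages over time, it can help to extract these counters from the `nodetool tpstats` output programmatically. The following is a minimal sketch that parses only the "Message type / Dropped" section shown in this message; the function name is hypothetical, and the layout of the rest of the tpstats output is not assumed.

```python
def parse_dropped(tpstats_text):
    """Return {message_type: dropped_count} from the dropped-messages section."""
    dropped = {}
    in_section = False
    for line in tpstats_text.splitlines():
        fields = line.split()
        if fields[:3] == ["Message", "type", "Dropped"]:
            in_section = True   # rows after this header are the counters
            continue
        if in_section and len(fields) == 2 and fields[1].isdigit():
            dropped[fields[0]] = int(fields[1])
    return dropped


# The values quoted in the message above.
SAMPLE = """\
Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
BINARY                       0
READ                         0
MUTATION                 65570
_TRACE                       0
"""

nonzero = {name: count for name, count in parse_dropped(SAMPLE).items() if count}
```

For the sample above, `nonzero` contains only the MUTATION entry, which matches the write-heavy load described in the message.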

Re: What is the ideal value for sstable_size_in_mb when using LeveledCompactionStrategy ?

2013-09-18 Thread Nate McCall
The analysis on https://issues.apache.org/jira/browse/CASSANDRA-5727 will be of interest. As of 1.2.9 160mb is the new default for LCS. On Wed, Sep 18, 2013 at 3:35 PM, Jayadev Jayaraman jdisal...@gmail.com wrote: m1.xlarge ( total ephemeral volume size 1.7TB ) is the most widely used node

Re: nodetool tpstats

2013-09-18 Thread Tyler Hobbs
On Wed, Sep 18, 2013 at 3:43 PM, Kanwar Sangha kan...@mavenir.com wrote: what does the request_response signify? That the node accepted the message but was not able to process it in the timeout? Yes, I'm pretty sure it's referring to requests that the node dropped when acting as a

Rebalancing vnodes cluster

2013-09-18 Thread Nimi Wariboko Jr
Hi, When I started with cassandra I had originally set it up to use tokens. I then migrated to vnodes (using shuffle), but my cluster isn't balanced (http://imgur.com/73eNhJ3). What steps can I take to balance my cluster? Thanks, Nimi

Re: Rebalancing vnodes cluster

2013-09-18 Thread Nick Bailey
OpsCenter only supports vnodes minimally at this point. More specifically, it chooses a random token that a node owns in order to display that node on the ring. So a vnode cluster will always appear unbalanced in OpsCenter. Your cluster is probably balanced fine, but 'nodetool status' should

Re: Rebalancing vnodes cluster

2013-09-18 Thread Nimi Wariboko Jr
This isn't the case. I noticed the error because of some unusual hotspotting. `nodetool status` also shows the cluster is unbalanced. root@cass1:~# nodetool status Datacenter: 129 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns