Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 02:06, Jayadev Jayaraman jdisal...@gmail.com wrote: We use vnodes with num_tokens = 256 (256 tokens per node). After loading some data with sstableloader, we find that the cluster is heavily imbalanced: How did you select the tokens? Is this a brand new cluster

Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
Hi all, I use Cassandra 1.0.11 If I do cfstats for a particular column family, I see a Compacted row maximum size of 43388628 However, when I do a cfhistograms I do not see such a big row in the Row Size column. The biggest row there is 126934. Can someone explain this? Thanks! Rene

cqlsh startup error Can't locate transport factory function cqlshlib.tfactory.regular_transport_factory

2013-09-19 Thread Oisin Kim
Hi, cqlsh stopped working for me recently, I'm unsure how / why it broke and I couldn't find anything from the mail archives (or google) that gave me an indication of how to fix the problem. Here's the output I see when I have cassandra running locally (default config except using Random

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Michał Michalski
I believe the reason is that cfhistograms tells you about the sizes of the rows returned by a given node in response to read requests, while cfstats tracks the largest row stored on a given node. M. On 19.09.2013 11:31, Rene Kochen wrote: Hi all, I use Cassandra 1.0.11 If I do cfstats

Re: cqlsh startup error Can't locate transport factory function cqlshlib.tfactory.regular_transport_factory

2013-09-19 Thread Oisin Kim
Fixed this issue. For anyone else who runs into it: the version of Python installed via brew was 2.7.5 and needed to be put on the path, as OS X has its own version of Python (2.7.2 currently). On Thursday 19 September 2013 at 10:33, Oisin Kim wrote: Hi, cqlsh stopped working
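A minimal sketch of how one might check which Python the shell resolves first, since the fix above came down to PATH order. The helper name find_on_path is hypothetical and not part of cqlsh; the code only walks PATH and is written to run on both Python 2.7 and Python 3.

```python
import os
import sys


def find_on_path(command, path=None):
    """Return every executable named `command` found on PATH, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, command)
        # Keep only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first entry is the interpreter a script started via PATH would get.
    for location in find_on_path("python"):
        print(location)
    print("running: {0} ({1}.{2}.{3})".format(sys.executable, *sys.version_info[:3]))
```

If the brew-installed interpreter is listed after the system one, moving its directory earlier in PATH is the kind of change the message describes.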

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Richard Low
On 19 September 2013 10:31, Rene Kochen rene.koc...@schange.com wrote: I use Cassandra 1.0.11 If I do cfstats for a particular column family, I see a Compacted row maximum size of 43388628 However, when I do a cfhistograms I do not see such a big row in the Row Size column. The biggest row

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
And how does cfstats track the maximum size? What does "Compacted" mean in "Compacted row maximum size"? Thanks again! Rene 2013/9/19 Michał Michalski mich...@opera.com I believe the reason is that cfhistograms tells you about the sizes of the rows returned by given node in a response to the

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
That is indeed how I read it. The maximal size is 3 rows with an offset of 126934, while cfstats reports 43388628. Thanks, Rene 2013/9/19 Richard Low rich...@wentnet.com On 19 September 2013 10:31, Rene Kochen rene.koc...@schange.com wrote: I use Cassandra 1.0.11 If I do cfstats for a
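The word "offset" in the message above refers to a histogram bucket boundary. As a rough sketch, assuming the bucket scheme of Cassandra's EstimatedHistogram (start at 1, multiply by about 1.2 and round, always advancing by at least 1), the boundaries can be regenerated as follows. The function name bucket_offsets is hypothetical and the integer rounding is a stand-in for the Java original.

```python
def bucket_offsets(limit):
    """Generate bucket boundaries: start at 1 and grow by roughly 20% per step."""
    offsets = [1]
    while offsets[-1] < limit:
        last = offsets[-1]
        # Integer form of round-half-up(last * 1.2); avoids float rounding surprises.
        nxt = (12 * last + 5) // 10
        if nxt == last:
            nxt += 1  # small values would otherwise stall at the same boundary
        offsets.append(nxt)
    return offsets


if __name__ == "__main__":
    offsets = bucket_offsets(50000000)
    for value in (126934, 43388628):
        print(value, "is a bucket boundary:", value in offsets)
```

Under this assumed scheme both 126934 and 43388628 fall exactly on bucket boundaries, which would mean that both figures are bucket-rounded estimates and not exact byte counts.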

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Richard, This is a brand new cluster which started with num_tokens = 256 on first boot and chose random tokens. The attached ring status is after data is loaded into the cluster for the first time using sstableloader and remains that way even after Cassandra is restarted. Thanks, Suruchi

Reverse compaction on 1.1.11?

2013-09-19 Thread Michael Theroux
Hello, Quick question. Is there a tool that allows sstablesplit (reverse compaction) against 1.1.11 sstables? I seem to recall a separate utility somewhere, but I'm having difficulty locating it, Thanks, -Mike

Re: Cannot get secondary indexes on fields in compound primary key to work (Cassandra 2.0.0)

2013-09-19 Thread Petter von Dolwitz (Hem)
For the record: https://issues.apache.org/jira/browse/CASSANDRA-5975 (2.0.1) resolved this issue for me. 2013/9/8 Petter von Dolwitz (Hem) petter.von.dolw...@gmail.com Thank you for your reply. I will look into this. I cannot get my head around why the scenario I am describing does

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
I think what has happened is that Cassandra was started with num_tokens = 1, then shutdown and num_tokens set to 256. When this happens, the first time Cassandra chooses a single random token. Then when restarted it splits the token into 256 adjacent ranges. You can see something like this has
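To illustrate why the scenario described above would produce imbalance, here is a toy simulation. It is not Cassandra's token allocation code: tokens are modelled as random floats on a [0, 1) ring, and the names ownership and spread are hypothetical. A node that picked one random token and later split it into 256 adjacent ranges still owns the same contiguous slice, so it behaves like the tokens_per_node=1 case below.

```python
import random


def ownership(tokens_per_node, nodes, seed=0):
    """Fraction of a [0, 1) ring owned by each node when tokens are picked at random."""
    rng = random.Random(seed)
    ring = sorted((rng.random(), node)
                  for node in range(nodes)
                  for _ in range(tokens_per_node))
    owned = [0.0] * nodes
    previous = ring[-1][0] - 1.0  # wrap around: the first token's range starts at the last token
    for token, node in ring:
        owned[node] += token - previous  # a token owns the range (previous, token]
        previous = token
    return owned


def spread(fractions):
    """Gap between the most and least loaded node."""
    return max(fractions) - min(fractions)


if __name__ == "__main__":
    for tokens in (1, 256):
        shares = ownership(tokens, nodes=12)
        print("tokens per node = {0:3d}: min {1:.3f}, max {2:.3f}".format(
            tokens, min(shares), max(shares)))
```

With many randomly scattered tokens per node the shares average out; with a single contiguous slice per node they do not, which is consistent with the uneven ring reported in this thread.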

Re: Reverse compaction on 1.1.11?

2013-09-19 Thread Hiller, Dean
Can you describe what you mean by reverse compaction? I mean once you put a row together and blow away sstables that contained it before, you can't possibly know how to split it since that information is gone. Perhaps you want the simple sstable2json script in the bin directory so you can inspect

Re: Reverse compaction on 1.1.11?

2013-09-19 Thread Nate McCall
See https://issues.apache.org/jira/browse/CASSANDRA-4766 The original gist posted by Rob therein might be helpful/work with earlier versions (I have not tried). Worst case, might be a good reason to upgrade to 1.2.x (if you are suffering pressure from a large SSTable, the additional offheap

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 7:03 AM, Richard Low rich...@wentnet.com wrote: I think what has happened is that Cassandra was started with num_tokens = 1, then shutdown and num_tokens set to 256. When this happens, the first time Cassandra chooses a single random token. Then when restarted it

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread Nate McCall
As opposed to stopping compaction altogether, have you experimented with turning down compaction_throughput_mb_per_sec (16mb default) and/or explicitly setting concurrent_compactors (defaults to the number of cores, iirc). On Thu, Sep 19, 2013 at 10:58 AM, rash aroskar
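As a sketch of the throttling suggestion above: besides editing compaction_throughput_mb_per_sec in cassandra.yaml, the limit can be changed on a running node with nodetool setcompactionthroughput. The wrapper below only builds the command line; the function name, the default host and the value 8 are illustrative assumptions, not recommendations from the thread.

```python
def throttle_command(mb_per_sec, host="localhost"):
    """Build a nodetool invocation that caps compaction throughput on one node."""
    if mb_per_sec < 0:
        raise ValueError("throughput must be zero or positive")
    return ["nodetool", "-h", host, "setcompactionthroughput", str(int(mb_per_sec))]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(throttle_command(8)))
```

A change made this way is not written back to cassandra.yaml, so it would need to be repeated or persisted in the file to survive a restart.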

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Robert, I downloaded apache-cassandra-1.2.9.tar.gz from http://cassandra.apache.org/download/ ( http://apache.mirrors.tds.net/cassandra/1.2.9/apache-cassandra-1.2.9-bin.tar.gz) and installed it on the individual nodes of the cassandra cluster. Thanks, Suruchi On Thu, Sep 19, 2013 at 12:35 PM,

Re: Problem with counter columns

2013-09-19 Thread Robert Coli
On Wed, Sep 18, 2013 at 11:07 AM, Yulian Oifa oifa.yul...@gmail.com wrote: I am using counter columns in a Cassandra cluster with 3 nodes. Current Cassandra version is 0.8.10. How can I debug and find the problem? The problem is using Counters in Cassandra 0.8. But seriously, I don't know

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 3:08 AM, Rene Kochen rene.koc...@schange.com wrote: And how does cfstats track the maximum size? What does Compacted mean in Compacted row maximum size. That maximum size is the largest row that I have encountered in the course of compaction, since I started. Hence

1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread rash aroskar
Hi, In general leveled compactions are I/O heavy, so when there is a bunch of writes, do we need to stop leveled compactions at all? I found nodetool stop COMPACTION, which states that it stops compaction from happening; does this work for any type of compaction? Also it states in documents 'eventually

Re: questions related to the SSTable file

2013-09-19 Thread Robert Coli
On Tue, Sep 17, 2013 at 6:51 PM, java8964 java8964 java8...@hotmail.com wrote: I thought I was clearer, but your clarification confused me again. But there is no way we can be sure that these SSTable files will ONLY contain modified data. So the statement being quoted above is not exactly

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Rob, Do you suggest I should try with some other installation mechanism? Are there any known problems with the tar installation of cassandra 1.2.9 that I should be aware of? Please do let me know. Thanks, Suruchi On Thu, Sep 19, 2013 at 1:04 PM, Suruchi Deodhar

Re: What are the steps to go from SimpleSnitch to GossipingPropertyFileSnitch in a live cluster?

2013-09-19 Thread Juan Manuel Formoso
Just FYI, I did it with a rolling restart and everything worked great. On Wed, Sep 18, 2013 at 5:01 PM, Juan Manuel Formoso jform...@gmail.com wrote: Besides making sure the datacenter name is the same in the cassandra-rackdc.properties file and the one originally created (datacenter1), what

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Do you suggest I should try with some other installation mechanism? Are there any known problems with the tar installation of cassandra 1.2.9 that I should be aware of? I was asking in the context

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread sankalp kohli
You cannot start level compaction. It will run based on data in each level. On Thu, Sep 19, 2013 at 9:19 AM, Nate McCall n...@thelastpickle.com wrote: As opposed to stopping compaction altogether, have you experimented with turning down compaction_throughput_mb_per_sec (16mb default) and/or

Re: Rebalancing vnodes cluster

2013-09-19 Thread Robert Coli
On Wed, Sep 18, 2013 at 4:26 PM, Nimi Wariboko Jr nimiwaribo...@gmail.com wrote: When I started with cassandra I had originally set it up to use tokens. I then migrated to vnodes (using shuffle), but my cluster isn't balanced ( http://imgur.com/73eNhJ3). Are you saying that (other than the

Re: AssertionError: sstableloader

2013-09-19 Thread Yuki Morishita
Sounds like a bug. Would you mind filing JIRA at https://issues.apache.org/jira/browse/CASSANDRA? Thanks, On Thu, Sep 19, 2013 at 2:12 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I am trying to use sstableloader to load some external data and getting given below error: Established

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
The only thing you need to guarantee is that Cassandra doesn't start with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all the data before starting it with higher num_tokens. On 19 September 2013 19:07, Robert Coli rc...@eventbrite.com wrote: On Thu, Sep 19, 2013 at 10:59

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then uploaded data using sstableloader. However, I am still not able to see a uniform distribution of data across nodes of the clusters. The output of the

AssertionError: sstableloader

2013-09-19 Thread Vivek Mishra
Hi, I am trying to use sstableloader to load some external data and getting given below error: Established connection to initial hosts Opening sstables and calculating sections to stream Streaming relevant part of /home/impadmin/source/Examples/data/Demo/Users/Demo-Users-ja-1-Data.db to [/

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread rash aroskar
Thanks for the responses. Nate - I haven't tried changing compaction_throughput_mb_per_sec. In my cassandra.yaml I had set it to 32 to begin with. Do you think 32 can be too much if Cassandra gets writes only once in a while, but when it does get them, they arrive as one big chunk? On Thu, Sep 19, 2013 at

Re: how can i get the column value? Need help!.. cassandra 1.28 and pig 0.11.1

2013-09-19 Thread Cyril Scetbon
Hi, Did you try to build 1.2.10 and to use it for your tests ? I've got the same issue and will give it a try as soon as it's released (expected at the end of the week). Regards -- Cyril SCETBON On Sep 2, 2013, at 3:09 PM, Miguel Angel Martin junquera mianmarjun.mailingl...@gmail.com wrote:

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Yes, the key distribution does vary across the nodes. For example, on the node with the highest data, Number of Keys (estimate) is 6527744 for a particular column family, whereas for the same column family on the node with least data, Number of Keys (estimate) = 3840. Is there a way to control

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you check cfstats to see number of keys per node? On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then

Storing binary blobs data in Cassandra Column family?

2013-09-19 Thread Raihan Jamal
I need to store binary byte data in a Cassandra column family in all my columns. Each column will have its own binary byte data. Below is the code where I will be getting binary byte data. My rowKey is going to be a String, but all my columns have to store binary blob data.
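The code referred to in the message above is not included in this listing. As a loosely related sketch only: when blob values are written through CQL3, a blob constant is a hexadecimal string prefixed with 0x. The two helper names below are hypothetical, and the example says nothing about the client library the poster was using.

```python
import binascii


def to_cql_blob_literal(data):
    """Encode raw bytes as a CQL3 blob constant, for example b'hi' -> '0x6869'."""
    return "0x" + binascii.hexlify(data).decode("ascii")


def from_cql_blob_literal(literal):
    """Decode a 0x-prefixed hex constant back into raw bytes."""
    if not literal.lower().startswith("0x"):
        raise ValueError("blob constants are expected to start with 0x")
    return binascii.unhexlify(literal[2:])


if __name__ == "__main__":
    payload = b"\x00\xffhi"
    literal = to_cql_blob_literal(payload)
    print(literal)
    print(from_cql_blob_literal(literal) == payload)
```

Most client drivers accept raw bytes directly through bound parameters, in which case no manual hex encoding is needed.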

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread Juan Manuel Formoso
concurrent_compactors is ignored when using leveled compactions On Thu, Sep 19, 2013 at 1:19 PM, Nate McCall n...@thelastpickle.com wrote: As opposed to stopping compaction altogether, have you experimented with turning down compaction_throughput_mb_per_sec (16mb default) and/or explicitly

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 20:36, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then uploaded data using sstableloader. However, I am still not

Re: AssertionError: sstableloader

2013-09-19 Thread Vivek Mishra
More to add on this: This is happening for column families created via CQL3 with collection type columns and without WITH COMPACT STORAGE. On Fri, Sep 20, 2013 at 12:51 AM, Yuki Morishita mor.y...@gmail.com wrote: Sounds like a bug. Would you mind filing JIRA at

Re: Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Not forever, while I decommission the nodes I assume. What I don't understand is the wording "no longer reference". On Thu, Sep 19, 2013 at 6:17 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Sep 19, 2013 at 1:52 PM, Juan Manuel Formoso jform...@gmail.com wrote:

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 1:52 PM, Juan Manuel Formoso jform...@gmail.com wrote: http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/../../cassandra/operations/ops_decomission_dc_t.html When it says Change all keyspaces so they no longer reference the data

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you run nodetool repair on all the nodes first and look at the keys? On Thu, Sep 19, 2013 at 1:22 PM, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Yes, the key distribution does vary across the nodes. For example, on the node with the highest data, Number of Keys (estimate)

Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Quick question. http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/../../cassandra/operations/ops_decomission_dc_t.html When it says Change all keyspaces so they no longer reference the data center being removed., does that mean setting my

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-19 Thread srmore
I hit this issue again today and it looks like changing the -Xss option does not work :( I am on 1.0.11 (I know it's old, we are upgrading to 1.2.9 right now) and have about 800-900GB of data. I can see cassandra is spending a lot of time reading the data files before it quits with

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-19 Thread srmore
Was too fast on the send button, sorry. The thing I wanted to add was the pending signals (-i) 515038, which looks odd to me. Could that be related? On Thu, Sep 19, 2013 at 4:53 PM, srmore comom...@gmail.com wrote: I hit this issue again today and looks like changing -Xss

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 2:43 PM, Juan Manuel Formoso jform...@gmail.com wrote: Not forever, while I decommission the nodes I assume. What I don't understand is the wording no longer reference Why does your replication strategy need to be aware of nodes which receive zero replicas? No longer

Re: I don't understand shuffle progress

2013-09-19 Thread Jeremiah D Jordan
http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/configuration/configVnodesProduction_t.html On Sep 18, 2013, at 9:41 AM, Chris Burroughs chris.burrou...@gmail.com wrote:

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 3:03 PM, Juan Manuel Formoso jform...@gmail.com wrote: Oh, so just datacenter2:N then. Yes. Sorry, not a native English speaker, and also tired :) NP! :D =Rob
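The conclusion above (list only datacenter2:N so the keyspace no longer references the data center being removed) could be expressed as a CQL3 statement along these lines. This is a sketch under assumptions: the keyspace name my_keyspace and the replica count 3 are placeholders, and the helper only builds the statement text without executing it.

```python
def alter_keyspace_replication(keyspace, replicas_by_dc):
    """Build an ALTER KEYSPACE statement that lists only the data centers to keep."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc in sorted(replicas_by_dc):
        options.append("'{0}': {1}".format(dc, int(replicas_by_dc[dc])))
    return "ALTER KEYSPACE {0} WITH replication = {{{1}}};".format(
        keyspace, ", ".join(options))


if __name__ == "__main__":
    # Only datacenter2 is listed, so the removed data center receives no replicas.
    print(alter_keyspace_replication("my_keyspace", {"datacenter2": 3}))
```

The same statement would be issued for every keyspace that still names the data center being decommissioned.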

Re: Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Oh, so just datacenter2:N then. Sorry, not a native English speaker, and also tired :) On Thu, Sep 19, 2013 at 6:57 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Sep 19, 2013 at 2:43 PM, Juan Manuel Formoso jform...@gmail.com wrote: Not forever, while I decommission the nodes I

NetworkTopologyStrategy Error

2013-09-19 Thread Ashley Martens
I tried to split my cluster and ran into this error, which I did not see in the tests I performed. ERROR [pool-1-thread-52165] 2013-09-19 21:48:08,262 Cassandra.java (line 3250) Internal error processing describe_ring java.lang.IllegalStateException: datacenter (DC103) has no more endpoints, (3)

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Jayadev Jayaraman
We ran nodetool repair on all nodes for all Keyspaces / CFs, restarted cassandra and this is what we get for nodetool status : bin/nodetool -h localhost status Datacenter: us-east === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns

Re: Rebalancing vnodes cluster

2013-09-19 Thread Nimi Wariboko Jr
We had originally started with 3 nodes w/ 32GB ram and 768GB SSDs. I pretty much Google'd my way into setting up cassandra and set it up using tokens because I was following an older docco. We were using Cassandra 1.2.5, I learned about vnodes later on and regretted waking up that morning. 1.)

Re: NetworkTopologyStrategy Error

2013-09-19 Thread sankalp kohli
Do any of your keyspaces still reference this DC? On Thu, Sep 19, 2013 at 3:03 PM, Ashley Martens ashley.mart...@dena.com wrote: I tried to split my cluster and ran into this error, which I did not see in the tests I performed. ERROR [pool-1-thread-52165] 2013-09-19 21:48:08,262

Re: Cassandra column family using Composite Columns

2013-09-19 Thread Raihan Jamal
Can anyone help me on this? Any help will be appreciated.. Thanks.. *Raihan Jamal* On Tue, Sep 17, 2013 at 4:44 PM, Raihan Jamal jamalrai...@gmail.com wrote: I am designing the Column Family for our use case in Cassandra. I am planning to go with Dynamic Column Structure. Below is my

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
The other thing I noticed is that you are using multiple racks, and that might be a contributing factor. However, I am not sure. Can you paste the output of nodetool cfstats and ring? Is it possible to run the same test but keeping all the nodes in one rack? I think you should open a JIRA if you

Re: I don't understand shuffle progress

2013-09-19 Thread Juan Manuel Formoso
Thanks. I did this and I finished rebuilding the new cluster in about 8 hours... much better option than shuffle (you have to have the hardware for duplicating your environment though) On Thu, Sep 19, 2013 at 7:21 PM, Jeremiah D Jordan jeremiah.jor...@gmail.com wrote:

BigTable-like Versioned Cells, Importing PostgreSQL Data

2013-09-19 Thread Keith Bogs
I've been playing with Cassandra and have a few questions that I've been stuck on for awhile, and Googling around didn't seem to help much: 1. What's the quickest way to import a bunch of data from PostgreSQL? I have ~20M rows with mostly text (some long text with newlines, and blob files.) I