select count query not working at cassandra 2.0.0

2013-09-19 Thread Katsutoshi
I would like to use a select count query. Although it worked on Cassandra 1.2.9, there is a situation in which it does not work on Cassandra 2.0.0: if some rows are deleted, the 'select count' query seems to return the wrong value. Did anything change in Cassandra 2.0.0? Or have I made a mistake? My

BigTable-like Versioned Cells, Importing PostgreSQL Data

2013-09-19 Thread Keith Bogs
I've been playing with Cassandra and have a few questions that I've been stuck on for awhile, and Googling around didn't seem to help much: 1. What's the quickest way to import a bunch of data from PostgreSQL? I have ~20M rows with mostly text (some long text with newlines, and blob files.) I trie

Re: I don't understand shuffle progress

2013-09-19 Thread Juan Manuel Formoso
Thanks. I did this and I finished rebuilding the new cluster in about 8 hours... much better option than shuffle (you have to have the hardware for duplicating your environment though) On Thu, Sep 19, 2013 at 7:21 PM, Jeremiah D Jordan < jeremiah.jor...@gmail.com> wrote: > > http://www.datastax.

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Another thing I noticed is that you are using multiple racks, and that might be a contributing factor. However, I am not sure. Can you paste the output of nodetool cfstats and ring? Is it possible to run the same test but keeping all the nodes in one rack? I think you should open a JIRA if you ar

Re: Cassandra column family using Composite Columns

2013-09-19 Thread Raihan Jamal
Can anyone help me on this? Any help will be appreciated.. Thanks.. *Raihan Jamal* On Tue, Sep 17, 2013 at 4:44 PM, Raihan Jamal wrote: > I am designing the Column Family for our use case in Cassandra. I am > planning to go with Dynamic Column Structure. > > Below is my requirement per o

Re: NetworkTopologyStrategy Error

2013-09-19 Thread sankalp kohli
Do any of your keyspaces still reference this DC? On Thu, Sep 19, 2013 at 3:03 PM, Ashley Martens wrote: > I tried to split my cluster and ran into this error, which I did not see > in the tests I performed. > > ERROR [pool-1-thread-52165] 2013-09-19 21:48:08,262 Cassandra.java (line > 3250) Inte

Re: Rebalancing vnodes cluster

2013-09-19 Thread Nimi Wariboko Jr
We had originally started with 3 nodes w/ 32GB ram and 768GB SSDs. I pretty much Google'd my way into setting up cassandra and set it up using tokens because I was following an older docco. We were using Cassandra 1.2.5, I learned about vnodes later on and regretted waking up that morning. 1.)

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Jayadev Jayaraman
We ran nodetool repair on all nodes for all Keyspaces / CFs, restarted cassandra and this is what we get for nodetool status : bin/nodetool -h localhost status Datacenter: us-east === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns

NetworkTopologyStrategy Error

2013-09-19 Thread Ashley Martens
I tried to split my cluster and ran into this error, which I did not see in the tests I performed. ERROR [pool-1-thread-52165] 2013-09-19 21:48:08,262 Cassandra.java (line 3250) Internal error processing describe_ring java.lang.IllegalStateException: datacenter (DC103) has no more endpoints, (3) r

Re: Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Oh, so just "datacenter2:N" then. Sorry, not a native English speaker, and also tired :) On Thu, Sep 19, 2013 at 6:57 PM, Robert Coli wrote: > On Thu, Sep 19, 2013 at 2:43 PM, Juan Manuel Formoso > wrote: > >> Not forever, while I decommission the nodes I assume. What I don't >> understand is

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 3:03 PM, Juan Manuel Formoso wrote: > Oh, so just "datacenter2:N" then. > Yes. > Sorry, not a native English speaker, and also tired :) > NP! :D =Rob
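The conclusion of this exchange (the keyspace's replication options should list only the data centers that keep replicas) can be sketched as a small helper that builds the corresponding CQL statement. This is only an illustration: the function name, the keyspace name `my_keyspace`, the data center being removed (`datacenter1`), and the replication factor of 3 are all hypothetical and do not come from the thread.

```python
def alter_keyspace_without_dc(keyspace, replication, removed_dc):
    """Build an ALTER KEYSPACE statement that no longer references removed_dc.

    `replication` maps data center name -> replication factor, as used by
    NetworkTopologyStrategy. All names passed in here are illustrative.
    """
    # Keep every data center except the one being decommissioned.
    remaining = {dc: rf for dc, rf in replication.items() if dc != removed_dc}
    if not remaining:
        raise ValueError("keyspace would be left with no replicas")

    # Render the remaining data centers as CQL map entries, in a stable order.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in sorted(remaining.items()))
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical example: datacenter1 is being removed, datacenter2 stays.
statement = alter_keyspace_without_dc(
    "my_keyspace", {"datacenter1": 3, "datacenter2": 3}, "datacenter1"
)
print(statement)
```

The generated statement would still have to be run once per keyspace (for example from cqlsh) before the nodes in the removed data center are decommissioned.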

Re: I don't understand shuffle progress

2013-09-19 Thread Jeremiah D Jordan
http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/configuration/configVnodesProduction_t.html On Sep 18, 2013, at 9:41 AM, Chris Burroughs wrote: > http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.h

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 2:43 PM, Juan Manuel Formoso wrote: > Not forever, while I decommission the nodes I assume. What I don't > understand is the wording "no longer reference" > Why does your replication strategy need to be aware of nodes which receive zero replicas? "No longer reference" alm

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-19 Thread srmore
Was too fast on the send button, sorry. The thing I wanted to add was that the pending signals value (-i) of 515038 looks odd to me. Could that be related? On Thu, Sep 19, 2013 at 4:53 PM, srmore wrote: > > I hit this issue again today and looks like changing -Xss option does not > wo

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-19 Thread srmore
I hit this issue again today, and it looks like changing the -Xss option does not work :( I am on 1.0.11 (I know it's old; we are upgrading to 1.2.9 right now) and have about 800-900GB of data. I can see Cassandra is spending a lot of time reading the data files before it quits with "java.lang.OutOfMemoryE

Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Quick question. http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/../../cassandra/operations/ops_decomission_dc_t.html When it says "Change all keyspaces so they no longer reference the data center being removed.", does that mean setting my replication_str

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you run nodetool repair on all the nodes first and look at the keys? On Thu, Sep 19, 2013 at 1:22 PM, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Yes, the key distribution does vary across the nodes. For example, on the > node with the highest data, Number of Keys (estima

Re: Decomissioning a datacenter

2013-09-19 Thread Juan Manuel Formoso
Not forever, while I decommission the nodes I assume. What I don't understand is the wording "no longer reference" On Thu, Sep 19, 2013 at 6:17 PM, Robert Coli wrote: > On Thu, Sep 19, 2013 at 1:52 PM, Juan Manuel Formoso > wrote: > >> >> http://www.datastax.com/documentation/cassandra/1.2/web

Re: Decomissioning a datacenter

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 1:52 PM, Juan Manuel Formoso wrote: > > http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/../../cassandra/operations/ops_decomission_dc_t.html > > When it says "Change all keyspaces so they no longer reference the data > center bein

Re: AssertionError: sstableloader

2013-09-19 Thread Vivek Mishra
More to add on this: This is happening for column families created via CQL3 with collection type columns and without "WITH COMPACT STORAGE". On Fri, Sep 20, 2013 at 12:51 AM, Yuki Morishita wrote: > Sounds like a bug. > Would you mind filing JIRA at > https://issues.apache.org/jira/browse/CASS

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 20:36, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Thanks for your replies. I wiped out my data from the cluster and also > cleared the commitlog before restarting it with num_tokens=256. I then > uploaded data using sstableloader. > > However, I am still

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread Juan Manuel Formoso
concurrent_compactors is ignored when using leveled compactions On Thu, Sep 19, 2013 at 1:19 PM, Nate McCall wrote: > As opposed to stopping compaction altogether, have you experimented with > turning down compaction_throughput_mb_per_sec (16mb default) and/or > explicitly setting concurrent_co

Storing binary blobs data in Cassandra Column family?

2013-09-19 Thread Raihan Jamal
I need to store binary byte data in all the columns of a Cassandra column family. Each column will have its own binary byte data. Below is the code where I will be getting the binary byte data. My rowKey is going to be a String, but all my columns have to store binary blob data. GenericDatumWriter writ

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Mohit Anchlia
Can you check cfstats to see number of keys per node? On Thu, Sep 19, 2013 at 12:36 PM, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Thanks for your replies. I wiped out my data from the cluster and also > cleared the commitlog before restarting it with num_tokens=256. I then

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Yes, the key distribution does vary across the nodes. For example, on the node with the highest data, Number of Keys (estimate) is 6527744 for a particular column family, whereas for the same column family on the node with least data, Number of Keys (estimate) = 3840. Is there a way to control thi

Re: how can i get the column value? Need help!.. cassandra 1.28 and pig 0.11.1

2013-09-19 Thread Cyril Scetbon
Hi, Did you try to build 1.2.10 and to use it for your tests ? I've got the same issue and will give it a try as soon as it's released (expected at the end of the week). Regards -- Cyril SCETBON On Sep 2, 2013, at 3:09 PM, Miguel Angel Martin junquera wrote: > hi all: > > More info : > >

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread rash aroskar
Thanks for the responses. Nate - I haven't tried changing compaction_throughput_mb_per_sec. In my cassandra.yaml I had set it to 32 to begin with. Do you think 32 can be too much if Cassandra gets writes only once in a while, but when it does, they arrive as one big chunk? On Thu, Sep 19, 2013 at 12:3

AssertionError: sstableloader

2013-09-19 Thread Vivek Mishra
Hi, I am trying to use sstableloader to load some external data and am getting the error below: Established connection to initial hosts Opening sstables and calculating sections to stream Streaming relevant part of /home/impadmin/source/Examples/data/Demo/Users/Demo-Users-ja-1-Data.db to [/ 127.0.0.

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then uploaded data using sstableloader. However, I am still not able to see a uniform distribution of data across nodes of the clusters. The output of the bin/n

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
The only thing you need to guarantee is that Cassandra doesn't start with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all the data before starting it with higher num_tokens. On 19 September 2013 19:07, Robert Coli wrote: > On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar

Re: AssertionError: sstableloader

2013-09-19 Thread Yuki Morishita
Sounds like a bug. Would you mind filing JIRA at https://issues.apache.org/jira/browse/CASSANDRA? Thanks, On Thu, Sep 19, 2013 at 2:12 PM, Vivek Mishra wrote: > Hi, > I am trying to use sstableloader to load some external data and getting > given below error: > Established connection to initial

Re: Rebalancing vnodes cluster

2013-09-19 Thread Robert Coli
On Wed, Sep 18, 2013 at 4:26 PM, Nimi Wariboko Jr wrote: > When I started with cassandra I had originally set it up to use tokens. I > then migrated to vnodes (using shuffle), but my cluster isn't balanced ( > http://imgur.com/73eNhJ3). > Are you saying that (other than the imbalance that is the

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread sankalp kohli
You cannot start level compaction. It will run based on data in each level. On Thu, Sep 19, 2013 at 9:19 AM, Nate McCall wrote: > As opposed to stopping compaction altogether, have you experimented with > turning down compaction_throughput_mb_per_sec (16mb default) and/or > explicitly setting c

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Do you suggest I should try with some other installation mechanism? Are > there any known problems with the tar installation of cassandra 1.2.9 that > I should be aware of? > I was asking in the con

Re: What are the steps to go from SimpleSnitch to GossipingPropertyFileSnitch in a live cluster?

2013-09-19 Thread Juan Manuel Formoso
Just FYI, I did it with a rolling restart and everything worked great. On Wed, Sep 18, 2013 at 5:01 PM, Juan Manuel Formoso wrote: > Besides making sure the datacenter name is the same in the > cassandra-rackdc.properties file and the one originally created ( > datacenter1), what else do I have

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Rob, Do you suggest I should try with some other installation mechanism? Are there any known problems with the tar installation of cassandra 1.2.9 that I should be aware of? Please do let me know. Thanks, Suruchi On Thu, Sep 19, 2013 at 1:04 PM, Suruchi Deodhar < suruchi.deod...@generalsentime

1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread rash aroskar
Hi, In general leveled compactions are I/O heavy, so when there is a bunch of writes, do we need to stop leveled compactions at all? I found nodetool stop COMPACTION, which states that it stops compaction from happening; does this work for any type of compaction? Also, the documentation states 'eventually cassa

Re: questions related to the SSTable file

2013-09-19 Thread Robert Coli
On Tue, Sep 17, 2013 at 6:51 PM, java8964 java8964 wrote: > I thought I was clearer, but your clarification confused me again. > > But there is no way we can be sure that these SSTable files will ONLY > contain modified data. So the statement being quoted above is not exactly > right. I agree th

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 3:08 AM, Rene Kochen wrote: > And how does cfstats track the maximum size? What does "Compacted" mean in > "Compacted row maximum size". > That maximum size is "the largest row that I have encountered in the course of compaction, since I started." Hence "compacted," to tr

Re: Problem with counter columns

2013-09-19 Thread Robert Coli
On Wed, Sep 18, 2013 at 11:07 AM, Yulian Oifa wrote: > i am using counter columns in cassandra cluster with 3 nodes. > > Current cassandra version is 0.8.10. > > How can i debug , find the problem > The problem is using Counters in Cassandra 0.8. But seriously, I don't know whether the par

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Robert, I downloaded apache-cassandra-1.2.9.tar.gz from http://cassandra.apache.org/download/ ( http://apache.mirrors.tds.net/cassandra/1.2.9/apache-cassandra-1.2.9-bin.tar.gz) and installed it on the individual nodes of the cassandra cluster. Thanks, Suruchi On Thu, Sep 19, 2013 at 12:35 PM,

Re: 1.2 leveled compactions can affect big bunch of writes? how to stop/restart them?

2013-09-19 Thread Nate McCall
As opposed to stopping compaction altogether, have you experimented with turning down compaction_throughput_mb_per_sec (16mb default) and/or explicitly setting concurrent_compactors (defaults to the number of cores, iirc). On Thu, Sep 19, 2013 at 10:58 AM, rash aroskar wrote: > Hi, > In general

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Robert Coli
On Thu, Sep 19, 2013 at 7:03 AM, Richard Low wrote: > I think what has happened is that Cassandra was started with num_tokens = > 1, then shutdown and num_tokens set to 256. When this happens, the first > time Cassandra chooses a single random token. Then when restarted it > splits the token in

Re: Reverse compaction on 1.1.11?

2013-09-19 Thread Nate McCall
See https://issues.apache.org/jira/browse/CASSANDRA-4766 The original gist posted by Rob therein might be helpful/work with earlier versions (I have not tried). Worst case, might be a good reason to upgrade to 1.2.x (if you are suffering pressure from a large SSTable, the additional offheap structure

Re: Reverse compaction on 1.1.11?

2013-09-19 Thread Hiller, Dean
Can you describe what you mean by reverse compaction? I mean once you put a row together and blow away sstables that contained it before, you can't possibly know how to split it since that information is gone. Perhaps you want the simple sstable2json script in the bin directory so you can inspect

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
I think what has happened is that Cassandra was started with num_tokens = 1, then shutdown and num_tokens set to 256. When this happens, the first time Cassandra chooses a single random token. Then when restarted it splits the token into 256 adjacent ranges. You can see something like this has h
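The effect described here can be illustrated with a small simulation. It is a sketch under stated assumptions, not Cassandra's actual token-allocation code: the ring is modelled as the integers 0 to 2**64 - 1, each token owns the range from the previous token (exclusive) up to itself (inclusive), and the helper names `ownership` and `split_adjacent` are invented for this example. It shows that cutting one node's single range into 256 adjacent pieces leaves that node's share of the ring unchanged, so any imbalance from the original single random token remains.

```python
import random

RING = 2**64  # assumed size of the token space for this illustration


def ownership(tokens_by_node):
    """Return {node: size of the token space owned by that node}.

    Each token owns the range (previous token, token], wrapping around the ring.
    """
    ring = sorted((t, n) for n, toks in tokens_by_node.items() for t in toks)
    owned = {n: 0 for n in tokens_by_node}
    for i, (tok, node) in enumerate(ring):
        prev = ring[i - 1][0]  # for i == 0 this wraps to the last token
        owned[node] += (tok - prev) % RING
    return owned


def split_adjacent(single_tokens, parts):
    """Cut each node's single range into `parts` adjacent sub-ranges."""
    ring = sorted((t, n) for n, t in single_tokens.items())
    result = {}
    for i, (tok, node) in enumerate(ring):
        prev = ring[i - 1][0]
        width = (tok - prev) % RING
        # The new tokens all lie inside the node's original range; the last is `tok`.
        result[node] = [(prev + width * k // parts) % RING for k in range(1, parts + 1)]
    return result


rng = random.Random(2013)  # fixed seed so the run is repeatable
nodes = [f"node{i}" for i in range(6)]

# Case 1: every node first picked ONE random token, later split into 256 pieces.
single = {n: rng.randrange(RING) for n in nodes}
before = ownership({n: [t] for n, t in single.items()})
after = ownership(split_adjacent(single, 256))

# Case 2: every node picked 256 independent random tokens on first boot.
vnodes = ownership({n: [rng.randrange(RING) for _ in range(256)] for n in nodes})

for n in nodes:
    print(n, f"split: {after[n] / RING:.1%}", f"random vnodes: {vnodes[n] / RING:.1%}")
```

In this model the per-node shares in the "split" column are identical to the single-token shares, whereas the 256 independently chosen tokens average out across the ring.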

Re: Cannot get secondary indexes on fields in compound primary key to work (Cassandra 2.0.0)

2013-09-19 Thread Petter von Dolwitz (Hem)
For the record: https://issues.apache.org/jira/browse/CASSANDRA-5975 (2.0.1) resolved this issue for me. 2013/9/8 Petter von Dolwitz (Hem) > Thank you for your reply. > > I will look into this. I cannot get my head around why the scenario I > am describing does not work though. Should I

Reverse compaction on 1.1.11?

2013-09-19 Thread Michael Theroux
Hello, Quick question. Is there a tool that allows sstablesplit (reverse compaction) against 1.1.11 sstables? I seem to recall a separate utility somewhere, but I'm having difficulty locating it, Thanks, -Mike

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Suruchi Deodhar
Hi Richard, This is a brand new cluster which started with num_tokens=256 on first boot and chose random tokens. The attached ring status is after data is loaded into the cluster for the first time using sstableloader and remains that way even after Cassandra is restarted. Thanks, Suruchi On

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
That is indeed how I read it. The maximal size is 3 rows with an offset of 126934, while cfstats reports 43388628. Thanks, Rene 2013/9/19 Richard Low > On 19 September 2013 10:31, Rene Kochen wrote: > > I use Cassandra 1.0.11 >> >> If I do cfstats for a particular column family, I see a "Com

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
And how does cfstats track the maximum size? What does "Compacted" mean in "Compacted row maximum size". Thanks again! Rene 2013/9/19 Michał Michalski > I believe the reason is that cfhistograms tells you about the sizes of the > rows returned by given node in a response to the read request,

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Richard Low
On 19 September 2013 10:31, Rene Kochen wrote: I use Cassandra 1.0.11 > > If I do cfstats for a particular column family, I see a "Compacted row > maximum size" of 43388628 > > However, when I do a cfhistograms I do not see such a big row in the Row > Size column. The biggest row there is 126934.

Re: cqlsh startup error "Can't locate transport factory function cqlshlib.tfactory.regular_transport_factory"

2013-09-19 Thread Oisin Kim
Fixed this issue. For anyone else who hits it: the version of Python installed via brew was 2.7.5 and needed to be put on the path, as OS X has its own version of Python (2.7.2 currently). On Thursday 19 September 2013 at 10:33, Oisin Kim wrote: > Hi, > > cqlsh stopped workin

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Michał Michalski
I believe the reason is that cfhistograms tells you about the sizes of the rows returned by given node in a response to the read request, while cfstats tracks the largest row stored on given node. M. W dniu 19.09.2013 11:31, Rene Kochen pisze: Hi all, I use Cassandra 1.0.11 If I do cfstats

cqlsh startup error "Can't locate transport factory function cqlshlib.tfactory.regular_transport_factory"

2013-09-19 Thread Oisin Kim
Hi, cqlsh stopped working for me recently, I'm unsure how / why it broke and I couldn't find anything from the mail archives (or google) that gave me an indication of how to fix the problem. Here's the output I see when I have cassandra running locally (default config except using Random Parti

Row size in cfstats vs cfhistograms

2013-09-19 Thread Rene Kochen
Hi all, I use Cassandra 1.0.11 If I do cfstats for a particular column family, I see a "Compacted row maximum size" of 43388628 However, when I do a cfhistograms I do not see such a big row in the Row Size column. The biggest row there is 126934. Can someone explain this? Thanks! Rene

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 02:06, Jayadev Jayaraman wrote: We use vnodes with num_tokens = 256 ( 256 tokens per node ) . After loading > some data with sstableloader , we find that the cluster is heavily > imbalanced : > How did you select the tokens? Is this a brand new cluster which started on firs