Re: slow read

2012-03-05 Thread ruslan usifov
2012/3/5 Jeesoo Shin bsh...@gmail.com Hi all. I have very SLOW READ here. :-( I made a cluster with three nodes (aws xlarge, replication = 3). Cassandra version is 1.0.6. I have inserted 1,000,000 rows (standard column). Each row has 200 columns. Each column has a 16 byte key and a 512 byte value.
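As a rough sizing check for the setup described above, the raw payload can be worked out from the figures in the message. This is only a back-of-the-envelope sketch: it counts column key and value bytes and ignores row keys, timestamps, per-column overhead, indexes and compaction, so the real on-disk size will be larger. The function and variable names are illustrative and do not come from the thread.

```python
def raw_payload_bytes(rows, columns_per_row, key_bytes, value_bytes):
    """Raw column data for one copy of the data set, without any storage overhead."""
    return rows * columns_per_row * (key_bytes + value_bytes)


# Figures quoted in the message: 1,000,000 rows, 200 columns each,
# 16 byte column key, 512 byte column value.
per_copy = raw_payload_bytes(1_000_000, 200, 16, 512)

# replication = 3 on a three-node cluster, as stated in the message.
replication_factor = 3
node_count = 3
cluster_total = per_copy * replication_factor
per_node = cluster_total // node_count

print(f"one copy:      {per_copy / 1e9:.1f} GB")
print(f"whole cluster: {cluster_total / 1e9:.1f} GB")
print(f"per node:      {per_node / 1e9:.1f} GB")
```

With replication = 3 on three nodes, every node holds a full copy of the data, so the per-node figure equals the size of one copy. That figure is worth comparing with the memory available on each node when judging read latency.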

Re: Secondary indexes don't go away after metadata change

2012-03-05 Thread aaron morton
The secondary index CFs are marked as no longer required / marked as compacted. Under 1.x they would then be deleted reasonably quickly, and definitely deleted after a restart. Is there a zero length .Compacted file there? Also, when adding a new node to the ring the new node will build

Re: slow read

2012-03-05 Thread Jeesoo Shin
Thank you for the reply. :) Yes, I used multiple threads. 160 and 320 gave me the same result. On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote: 2012/3/5 Jeesoo Shin bsh...@gmail.com Hi all. I have very SLOW READ here. :-( I made a cluster with three nodes (aws xlarge, replication = 3). Cassandra

Re: can't find rows

2012-03-05 Thread aaron morton
am guessing a lot here, but I would check if auto_bootstrap is enabled. It is by default. When a new node joins reads are not directed to it until it is marked as UP (writes are sent to it as it is joining). So reads should continue to go to the original UP node. Sounds like it's all running

Re: Schema change causes exception when adding data

2012-03-05 Thread aaron morton
I don't have a lot of Hector experience but it sounds like the way to go. The CLI and cqlsh will take care of this. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 2/03/2012, at 10:12 AM, Tharindu Mathew wrote: There are 2. I'd like to

Re: slow read

2012-03-05 Thread ruslan usifov
And the sum of rq/s across all threads is 160?? 2012/3/5 Jeesoo Shin bsh...@gmail.com Thank you for the reply. :) Yes, I used multiple threads. 160 and 320 gave me the same result. On 3/5/12, ruslan usifov ruslan.usi...@gmail.com wrote: 2012/3/5 Jeesoo Shin bsh...@gmail.com Hi all. I have very SLOW READ

Re: composite types in CQL

2012-03-05 Thread aaron morton
It's not currently supported in CQL https://issues.apache.org/jira/browse/CASSANDRA-3761 You can do it using the CLI, see the online help. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 2/03/2012, at 10:39 AM, Bayle Shanks wrote: hi,

Re: Test Data creation in Cassandra

2012-03-05 Thread aaron morton
try tools/stress in the source distribution. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/03/2012, at 6:01 AM, A J wrote: What is the best way to create millions of test data in Cassandra ? I would like to have some script where

RE: cli question

2012-03-05 Thread Rishabh Agrawal
I faced the same issue some time back. Solution which fit my bill is as follows: CREATE COLUMN FAMILY aaa with comparator = 'CompositeType(UTF8Type,UTF8Type)' and default_validation_class = 'UTF8Type' and key_validation_class = 'CompositeType(UTF8Type,UTF8Type,UTF8Type,)'; notice I

running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Hi! I have a Cassandra cluster with two nodes:
nodetool ring -h localhost
Address     DC           Rack   Status  State   Load       Owns    Token
                                                                   85070591730234615865843651857942052864
10.0.0.19   datacenter1  rack1  Up      Normal  488.74 KB  50.00%  0
10.0.0.28

Re: cli question

2012-03-05 Thread Tamar Fraenkel
Thanks! I decided to just replace all : with ^ and I can simply run: get a_b_indx ['AAA:BBB^CCC']; Tamar Fraenkel, Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 5, 2012 at

Re: Maximum Row Size in Cassandra : Potential Bottleneck

2012-03-05 Thread aaron morton
Is there any way in which the writes can be made pretty slow on different nodes? Ideally I would like data to be written on one node and eventually replicated across other nodes. I don't really need a real time update, so can pretty much live with slow writes. Replicating inside the mutation

Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hi All, While benchmarking Cassandra I found Mutation Dropped messages in the logs. Now I know this is a good old question. It would be really great if someone could provide a checklist for recovering when such a thing happens. I am looking for answers to the following questions - 1. Which

Re: slow read

2012-03-05 Thread aaron morton
Where is the client running from? To see if a node is keeping up with requests, look at nodetool tpstats and check if the read stage is backing up. To see how long a read takes, use nodetool cfstats and look at the read latency (this is the latency of a read on that node, not cluster wide). To

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Would the rings be separate? Yes. But I would recommend you give them different cluster names. It's a good protection against nodes accidentally joining the wrong cluster. cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 5/03/2012, at

Re: running two rings on the same subnet

2012-03-05 Thread Hontvári József Levente
You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a multi-datacenter setup with two circles. You can start reading from this page: http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy Moreover all

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
1. Which parameters to tune in the config files? – Especially looking for heavy writes The node is overloaded. It may be because there are not enough nodes, or the node is under temporary stress such as GC or repair. If you have spare IO / CPU capacity you could increase the

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Do you want to create two separate clusters or a single cluster with two data centres? If it's the latter, token selection is discussed here http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra Moreover all tokens must be unique (even across datacenters), although -

Re: how stable is 1.0 these days?

2012-03-05 Thread Viktor Jevdokimov
1.0.7 is very stable, weeks in high-load production environment without any exception, 1.0.8 should be even more stable, check changes.txt for what was fixed. 2012/3/2 Marcus Eriksson krum...@gmail.com beware of https://issues.apache.org/jira/browse/CASSANDRA-3820 though if you have many keys

Re: Huge amount of empty files in data directory.

2012-03-05 Thread Viktor Jevdokimov
After running Cassandra for 2 years in production on Windows servers, starting from 0.7 beta2 up to 1.0.7 we have moved to Linux and forgot all the hell we had on Windows. Having JNA, off-heap row cache and normally working MMAP on Linux you're getting a lot better performance and stability

Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
I want two separate clusters. Tamar Fraenkel, Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 5, 2012 at 12:48 PM, aaron morton aa...@thelastpickle.com wrote: Do you want to

Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Hontvári József Levente
I am thinking about the frequent example:
dc1 - node1: 0
dc1 - node2: large...number
dc2 - node1: 1
dc2 - node2: large...number + 1
In theory using the same tokens in dc2 as in dc1 does not significantly affect key distribution, specifically the two keys on the border will move to the next
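The token layout in the example above can be sketched as follows. This is a minimal illustration, assuming RandomPartitioner's token space of 0 to 2**127 and evenly spaced tokens per datacenter; the function name and the choice of offsets 0 and 1 are illustrative and only mirror the example, they are not a recommendation from the thread.

```python
RING_SIZE = 2 ** 127  # token space assumed for RandomPartitioner


def dc_tokens(node_count, dc_offset):
    """Evenly spaced tokens for one datacenter, shifted by a small per-DC offset."""
    return [i * RING_SIZE // node_count + dc_offset for i in range(node_count)]


dc1 = dc_tokens(2, 0)  # first datacenter, no offset
dc2 = dc_tokens(2, 1)  # second datacenter, every token incremented by one

for name, tokens in (("dc1", dc1), ("dc2", dc2)):
    for index, token in enumerate(tokens, start=1):
        print(f"{name} - node{index}: {token}")

# Every token must be unique across the whole cluster, not just within a DC.
all_tokens = dc1 + dc2
assert len(set(all_tokens)) == len(all_tokens)
```

For two nodes, the second dc1 token comes out as 85070591730234615865843651857942052864, the same value that appears in the nodetool ring output quoted earlier in this listing.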

Adding a second datacenter

2012-03-05 Thread David Koblas
Everything that I've read about data centers focuses on setting things up at the beginning of time. I have the following situation: 10 machines in a datacenter (DC1), with replication factor of 2. I want to set up a second data center (DC2) with the following configuration: 20 machines

Re: Adding a second datacenter

2012-03-05 Thread Jeremiah Jordan
You need to make sure your clients are reading using LOCAL_* settings so that they don't try to get data from the other data center. But you shouldn't get errors while replication_factor is 0. Once you change the replication factor to 4, you should get missing data if you are using LOCAL_*
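For the LOCAL_* levels mentioned above, LOCAL_QUORUM needs a majority of the replicas held in the client's own data center. The arithmetic is sketched below; the per-data-center replica counts are hypothetical values chosen for illustration and are not taken from the thread.

```python
def quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


# Hypothetical replica counts within a single data center.
for local_replicas in (2, 3, 4):
    needed = quorum(local_replicas)
    tolerated = local_replicas - needed
    print(f"{local_replicas} local replicas: LOCAL_QUORUM needs {needed}, "
          f"tolerates {tolerated} down")
```

Note that with only two local replicas a quorum is both of them, so a single unavailable replica in that data center is enough to fail a LOCAL_QUORUM request.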

Division by zero

2012-03-05 Thread Vanger
After upgrading from version 1.0.1 to 1.0.8 we started to get exception: ERROR [http-8095-1 WideEntityServiceImpl.java:142] - get: key1 - {type=RANGE, start=0, end=9223372036854775807, orderDesc=false, limit=1} me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra

Re: Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Jeremiah Jordan
There is a requirement that all nodes have a unique token. There is still one global cluster/ring that each node needs to be unique on. The logically separate rings that NetworkTopologyStrategy puts them into are hidden from the rest of the code. -Jeremiah On 03/05/2012 05:13 AM, Hontvári

Re: how stable is 1.0 these days?

2012-03-05 Thread Thibaut Britz
Thanks for the feedback. I will certainly execute scrub after the update. On Mon, Mar 5, 2012 at 11:55 AM, Viktor Jevdokimov vjevdoki...@gmail.com wrote: 1.0.7 is very stable, weeks in high-load production environment without any exception, 1.0.8 should be even more stable, check changes.txt

Re: Adding a second datacenter

2012-03-05 Thread David Koblas
Jeremiah, Thanks! I'm running 1.0.8, two interesting things to note: - I don't have sufficient disk space to handle the straight bump to a replication factor of 4, so I think I'm going to have to do it one by one (1,2,3 and 4) with a bunch of cleanups in between. - Also, using a

Re: Issue with nodetool clearsnapshot

2012-03-05 Thread aaron morton
It seems that instead of removing the snapshot, clearsnapshot moved the data files from the snapshot directory to the parent directory and the size of the data for that keyspace has doubled. That is not possible, there is only code there to delete files in the snapshot. Note that in the

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Create nodes that do not share seeds, and give the clusters different names as a safety measure. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote: I want two separate clusters. Tamar Fraenkel

Re: Division by zero

2012-03-05 Thread aaron morton
(Commented in the ticket as well) What is the error in the server log ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/03/2012, at 5:04 AM, Vanger wrote: After upgrading from version 1.0.1 to 1.0.8 we started to get exception:

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
I increased the size of the cluster also the concurrent_writes parameter. Still there is a node which keeps on dropping the mutation messages. Ensure all the nodes have the same spec, and the nodes have the same config. In a virtual environment consider moving the node. Is this due to some

Re: Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Hi Aaron, 1)Since you mentioned hard links, I would like to add that our data directory itself is a sym-link. Could that be causing an issue ? 2)Yes, there are 0 byte files of the same numbers in Keyspace1 directory 0 Mar 4 01:33 Standard1-g-7317-Compacted 0 Mar 3 22:58

hector connection pool

2012-03-05 Thread Daning Wang
I just got this error: "All host pools marked down. Retry burden pushed out to client." in a few clients recently. The clients could not recover, and we had to restart the client application. We are using Hector 0.8.0.3. At that time we were running a compaction for a CF, which takes several hours, and the server was busy. But

RE: Secondary indexes don't go away after metadata change

2012-03-05 Thread Frisch, Michael
Thank you very much for your response. It is true that the older, previously existing nodes are not snapshotting the indexes that I had removed. I'll go ahead and just delete those SSTables from the data directory. They may be around still because they were created back when we used 0.8.

Cassandra cache patterns with thiny and wide rows

2012-03-05 Thread Maciej Miklas
I've asked this question already on stackoverflow but without answer - I will try again: My use case expects heavy read load - there are two possible model design strategies: 1. Tiny rows with row cache: In this case row is small enough to fit into RAM and all columns are being cached.

Re: hector connection pool

2012-03-05 Thread Maciej Miklas
Have you tried to change: me.prettyprint.cassandra.service.CassandraHostConfigurator#retryDownedHostsDelayInSeconds ? Hector will ping down hosts every xx seconds and recover connection. Regards, Maciej On Mon, Mar 5, 2012 at 8:13 PM, Daning Wang dan...@netseer.com wrote: I just got this

Re: Cassandra cache patterns with thiny and wide rows

2012-03-05 Thread Viktor Jevdokimov
Depends on how large the data set is, specifically the hot data, compared to available RAM, what counts as a heavy read load, and what the latency requirements are. 2012/3/6 Maciej Miklas mac.mik...@googlemail.com I've asked this question already on stackoverflow but without answer - I will try again:

Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Works. But during the night my setup encountered a problem. I have two VMs on my cluster (running on VMware ESXi). Each VM has 1 GB memory and two virtual disks of 16 GB. They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB memory (together with two other VMs). I put cassandra data on