Re: understanding tombstones

2011-03-10 Thread Wangpei (Peter)
My question: what would the client get when the following happens (RF=3, N=3): 1. write with timestamp T, succeeding on all nodes. 2. delete with timestamp T+1, CL=Q, succeeding on node1 and node2 but failing on node3. 3. force flush + compaction. 4. read with CL=Q. Will the client get the row and

Re: understanding tombstones

2011-03-10 Thread Sylvain Lebresne
2011/3/10 Wangpei (Peter) peter.wang...@huawei.com My question: what would the client get when the following happens (RF=3, N=3): 1. write with timestamp T, succeeding on all nodes. 2. delete with timestamp T+1, CL=Q, succeeding on node1 and node2 but failing on node3. 3. force flush +
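The scenario in the question can be walked through with a toy model. This is only a sketch, not Cassandra code: it assumes the tombstones are still present on node1 and node2 at read time (Cassandra only purges tombstones during compaction once GCGraceSeconds has elapsed), and the node names, the timestamp value and the quorum_read helper are all made up for illustration.

```python
from itertools import combinations

T = 100  # hypothetical timestamp of the original write

# Each replica holds (timestamp, value); a value of None stands for a tombstone.
replicas = {
    "node1": (T + 1, None),        # delete succeeded here
    "node2": (T + 1, None),        # delete succeeded here
    "node3": (T, "row-data"),      # delete failed, old write still visible
}

def quorum_read(nodes):
    """Reconcile the contacted replicas: the highest timestamp wins."""
    ts, value = max((replicas[n] for n in nodes), key=lambda cell: cell[0])
    return value

# With RF=3 a quorum is any 2 replicas; try every possible pair.
results = {pair: quorum_read(pair) for pair in combinations(sorted(replicas), 2)}
for pair, value in results.items():
    print(pair, "->", "deleted" if value is None else value)
```

Every pair of replicas contains at least one node holding the T+1 tombstone, so under the assumption above each quorum read reconciles to "deleted".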

On 0.6.6 to 0.7.3 migration, DC-aware traffic and minimising data transfer

2011-03-10 Thread Jedd Rashbrooke
Howdi, Assortment of questions relating to an upgrade combined with a possible migration between Data Centers (or perhaps a multi-DC redesign). Apologies if some of these have been asked before - I have kept half an eye on the list in recent times but haven't seen anything covering these

mutator.execute() timings - big variance noted - pointers needed on understanding/improving it

2011-03-10 Thread Roshan Dawrani
Hi, I am in the middle of some load testing on a 1-node Cassandra setup. We are not on very high loads yet. We have recorded the timings taken up by mutator.execute() calls and we see this kind of variation during the test run: So, 25% of the time, execute() calls come back in 25 milliseconds,

Re: problem with bootstrap

2011-03-10 Thread Patrik Modesto
Hi, I'm still fighting the "Exception in thread main java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (2)". When I have a 2-server cluster and create a Keyspace with RF 3, I'm able to add (without auto_bootstrap) another node but cluster nodetool commands don't work

Re: mutator.execute() timings - big variance noted - pointers needed on understanding/improving it

2011-03-10 Thread sridhar basam
Sounds like GC from your description of fast-slow-fast. Collect GC times from both the client and server side and plot against your application timing. If you uncomment the verbose GC entries in the cassandra-env.sh file you should get timing for the server side, pass in the same arguments for

Cassandra LongType data insertion problem for secondary index usage

2011-03-10 Thread Adi
Environment: Cassandra 0.7.0 , C++ Thrift client on windows I have a column family with a secondary index ColumnFamily: Page Columns sorted by: org.apache.cassandra.db.marshal.BytesType Built indexes: [Page.index_domain, Page.index_content_size] Column Metadata: Column

Re: Nodes frozen in GC

2011-03-10 Thread Peter Schuller
I think it would be very useful to get to the bottom of this but without further details (like the asked for GC logs) I'm not sure what to do/suggest. It's clear that a single CF with a 64 MB memtable flush threshold and without key cache and row cache and some bulk insertion, should not be

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Jonathan Ellis
If you read the bugs I linked, you would see that this is expected behavior with 0.7.3 once you get more data than you can index in-memory. You should wait for the next Hudson build (which will include 2295) and use that. Or, create your indexes before adding the data. On Thu, Mar 10, 2011 at

Re: Cassandra LongType data insertion problem for secondary index usage

2011-03-10 Thread Adi
That was it. Thanks thobbs :-) The queries work as expected now. -Adi On Thu, Mar 10, 2011 at 1:01 PM, Tyler Hobbs ty...@datastax.com wrote: I looked again at the original

Re: problem with bootstrap

2011-03-10 Thread mcasandra
mcasandra wrote: aaron morton wrote: The issue I think you and Patrik are seeing occurs when you *remove* nodes from the ring. The ring does not know if they are up or down. E.g. you have a ring of 3 nodes, and add a keyspace with RF 3. Then for whatever reason 2 nodes are removed

how to force a GC in cronjob to free up disk space?

2011-03-10 Thread Karl Hiramoto
Reading the FAQ http://wiki.apache.org/cassandra/FAQ "SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs a GC. You can force a GC from jconsole if necessary." How can I force the GC with a simple java commandline? Is

Re: Modeling Multi-Valued Fields

2011-03-10 Thread aaron morton
Two approaches here. First the many columns approach. Have a super column called Email, for each email address store the type as the column name and the email address as the column value. In cassandra you can store information in the column names as well as the column values. And you do not
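The "many columns" layout can be pictured as nested maps. The sketch below is plain Python, not a client API call; it assumes each column name holds the address type and each column value holds the address, and the row key and addresses are invented for illustration.

```python
# Hypothetical user row: one super column named "Email",
# one sub-column per address (name = type, value = address).
user_row = {
    "Email": {
        "home": "alice@example.com",
        "work": "alice@example.org",
    }
}

# Reading every address for the user is a single super column read.
for address_type, address in sorted(user_row["Email"].items()):
    print(address_type, address)
```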

Re: mutator.execute() timings - big variance noted - pointers needed on understanding/improving it

2011-03-10 Thread aaron morton
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts Aaron On 11 Mar 2011, at 05:08, sridhar basam wrote: Sounds like GC from your description of fast-slow-fast. Collect GC times from both the client and server side and plot against your application timing. If you

Re: Modeling Multi-Valued Fields

2011-03-10 Thread Sasha Dolgy
hm. i use this approach and have secondary indexes configured on the columns if i need to do a specific search for an address. alternately, in the user cf, if you wanted to be very uncool, but optimized for always retrieving the user email addresses, you could have the uuid for the user record

Re: problem with bootstrap

2011-03-10 Thread aaron morton
Can you include this info... - output from nodetool ring for all nodes so we can see whats in the ring - what you've run on the node you are trying to bring in - the nodetool command you are trying to run - error logs In general asking the cluster to replicate data more times than the number of
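The error reported earlier in this thread boils down to a comparison between the replication factor and the number of endpoints the ring knows about. The following is a minimal sketch of that check, not Cassandra's implementation; check_replication and the endpoint addresses are hypothetical, and only the message wording is taken from the thread.

```python
def check_replication(rf, endpoints):
    """Refuse to place replicas when there are fewer endpoints than RF."""
    if rf > len(endpoints):
        raise ValueError(
            "replication factor (%d) exceeds number of endpoints (%d)"
            % (rf, len(endpoints))
        )

ring = ["10.0.0.1", "10.0.0.2"]  # made-up two-node ring

check_replication(2, ring)       # fine: RF 2 on 2 nodes
try:
    check_replication(3, ring)   # RF 3 on 2 nodes
except ValueError as err:
    print(err)
```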

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Matt Kennedy
Sorry, I wasn't clear on the timeline of events. I started the index build and then posted this message to the list. Once I read the links you posted, I did expect the cluster to crash, but I let it run until it blew up anyway, since I didn't really know how to stop the index build. Which is

RE: Nodes frozen in GC

2011-03-10 Thread Gregory Szorc
I do believe there is a fundamental issue with compactions allocating too much memory and incurring too many garbage collections (at least with 0.6.12). On nearly every Cassandra node I operate, garbage collections simply get out of control during compactions of any reasonably sized CF (1GB). I

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Jonathan Ellis
Drop the index, then restart once more. It shouldn't try to rebuild the index after that. On Thu, Mar 10, 2011 at 3:36 PM, Matt Kennedy stinkym...@gmail.com wrote: Sorry, I wasn't clear on the timeline of events. I started the index build and then posted this message to the list. Once I read

Re: problem with bootstrap

2011-03-10 Thread mcasandra
I am completely confused. I repeated the same test after setting auto_bootstrap to true and it worked this time. I did it exactly the same way, killing 2 nodes, and this time it started with no issues. Could it be because once auto_bootstrap is off it's off forever? I am using hector and

Re: problem with bootstrap

2011-03-10 Thread Peter Schuller
Could it be because once auto_bootstrap is off it's off forever? I am not entirely sure if this answers your question (I revisited the thread history but I'm a bit confused myself): If by that you mean that given a node which was started with auto_bootstrap=false, and it successfully joined the

Re: problem with bootstrap

2011-03-10 Thread Peter Schuller
Bootstrapping uses the same mechanisms as a repair to stream data from other nodes. This can be a heavyweight process and you may want to control when it starts. Joining the ring just tells the other nodes you exist and this is your token. And in general, except when initially setting

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Matt Kennedy
Great, that worked, thanks for your time. On Thu, Mar 10, 2011 at 4:57 PM, Jonathan Ellis jbel...@gmail.com wrote: Drop the index, then restart once more. It shouldn't try to rebuild the index after that. On Thu, Mar 10, 2011 at 3:36 PM, Matt Kennedy stinkym...@gmail.com wrote: Sorry, I

Re: Exception when running a clean up

2011-03-10 Thread Stu King
I have upgraded from 0.7.0 to 0.7.3. I then run nodetool scrub on my keyspace and now see this exception: Exception in thread main java.io.IOError: java.io.IOException: Cannot run program ln: java.io.IOException: error=24, Too many open files at

Re: Cassandra LongType data insertion problem for secondary index usage

2011-03-10 Thread buddhasystem
Tyler, as a collateral issue - I've been wondering for a while what advantage if any it buys me, if I declare a value 'long' (which it roughly is) as opposed to passing around strings. String is flattened onto a replica of itself, I assume? No conversion? Maybe it even means better speed. Thanks,
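One concrete difference between a long and a string shows up in the bytes themselves. Cassandra's LongType expects an 8-byte big-endian value; the sketch below uses only Python's struct module to contrast that with an ASCII string. encode_long is a hypothetical helper name, and the example says nothing about relative speed.

```python
import struct

def encode_long(value):
    """Pack an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)

as_long = encode_long(1234)
as_text = str(1234).encode("ascii")
print(len(as_long), len(as_text))

# Byte-wise comparison of decimal strings does not follow numeric order...
string_order_ok = b"9" < b"10"
# ...while fixed-width big-endian longs do, for non-negative values.
long_order_ok = encode_long(9) < encode_long(10)
print(string_order_ok, long_order_ok)
```

A long is always 8 bytes regardless of magnitude, whereas the string form grows with the number of digits and sorts lexically rather than numerically.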

Re: mutator.execute() timings - big variance noted - pointers needed on understanding/improving it

2011-03-10 Thread Roshan Dawrani
Hi All, Thanks for the inputs. I will start investigating this morning with the help of these. Regards, Roshan On Fri, Mar 11, 2011 at 2:49 AM, aaron morton aa...@thelastpickle.com wrote: http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts

Cassandra startup port problem, apache-cassandra-0.7.3 on Snow Leopard.

2011-03-10 Thread Bob Futrelle
After a reboot, cassandra spits out many lines on startup but then appears to stall. Worse, trying to run cassandra a second time stops immediately because of a port problem: apache-cassandra-0.7.3: sudo ./bin/cassandra -f -p pidfile Password: Error: Exception thrown by the agent :

Re: Cassandra startup port problem, apache-cassandra-0.7.3 on Snow Leopard.

2011-03-10 Thread Jeremy Hanna
Comments in-line. On Mar 10, 2011, at 8:10 PM, Bob Futrelle wrote: After a reboot, cassandra spits out many lines on startup but then appears to stall. Worse, trying to run cassandra a second time stops immediately because of a port problem: apache-cassandra-0.7.3: sudo

memory utilization

2011-03-10 Thread Bill Hastings
Hi All, Memory utilization reported by JConsole for Cassandra seems to be much lower than that reported by top (RES memory). Can someone explain this? Maybe off topic but would appreciate a response. -- Cheers Bill

Re: Exception when running a clean up

2011-03-10 Thread Jonathan Ellis
Unrelated to either upgrade or scrub. That just means you need to install JNA to get native linking instead of having to fork to run ln. On Thu, Mar 10, 2011 at 5:54 PM, Stu King s...@stuartrexking.com wrote: I have upgraded from 0.7.0 to 0.7.3. I then run nodetool scrub on my keyspace and now

Re: memory utilization

2011-03-10 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#mmap On Thu, Mar 10, 2011 at 8:26 PM, Bill Hastings bllhasti...@gmail.com wrote: Hi All, Memory utilization reported by JConsole for Cassandra seems to be much lower than that reported by top (RES memory). Can someone explain this? Maybe off topic but

Re: FW: Very slow batch insert using version 0.7.2

2011-03-10 Thread Erik Forkalsrud
I see the same behavior with smaller batch sizes. It appears to happen when starting Cassandra with the defaults on relatively large systems. Attached is a script I created to reproduce the problem. (usage: mutate.sh /path/to/apache-cassandra-0.7.3-bin.tar.gz) It extracts a stock

Fatal configuration error, so how to change listen_address:storage_port in cassandra.yaml ?

2011-03-10 Thread Bob Futrelle
Now that I've made the JMX_PORT change cassandra will attempt to run. (Dumb me, I didn't need to ask - the answer about changing JMX_PORT was already in the archives. I'm getting with it now, so I know to look there first. Just finding my way around cassandra) Made the change:

Secondary Index not working?

2011-03-10 Thread Rommel Garcia
I tried the tutorial on this site - http://www.datastax.com/docs/0.7/data_model/secondary_indexes and worked on creating an index on a new column. That went well. But when I indexed an existing column, my query below returns 0 rows when in fact it should return 1. Query: get users where state

Re: Secondary Index not working?

2011-03-10 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2244 On Thu, Mar 10, 2011 at 9:28 PM, Rommel Garcia groups.no...@gmail.com wrote: I tried the tutorial on this site - http://www.datastax.com/docs/0.7/data_model/secondary_indexes and worked on creating an index on a new column. That went well.

How long will all nodes data sync.

2011-03-10 Thread Vincent Lu (ECL)
Hi all, I have a question about eventual consistency. If there are 3 nodes and RF=3, Write-C=Quorum, how long will it take for all 3 nodes' data to sync? Can any configuration change that? Thanks in advance. Vincent
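The numbers in this question can be made concrete with a little arithmetic. The sketch below assumes the usual definition of quorum as a majority of the replicas (RF // 2 + 1); the function names are made up, and it does not model how or when the remaining replica catches up.

```python
def quorum(rf):
    """Number of replicas that must acknowledge at CL=QUORUM."""
    return rf // 2 + 1

def reads_see_latest_write(rf, write_acks, read_replicas):
    """True when every read set must overlap every write set."""
    return write_acks + read_replicas > rf

rf = 3
w = quorum(rf)
print("acks needed for a quorum write:", w)
print("replicas that may still be behind:", rf - w)
print("quorum read overlaps quorum write:", reads_see_latest_write(rf, w, quorum(rf)))
print("CL=ONE read overlaps quorum write:", reads_see_latest_write(rf, w, 1))
```

With RF=3 a quorum write returns after 2 acknowledgements, so one replica may lag; a quorum read still overlaps the write, while a read at CL=ONE may not.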

Pig output to Cassandra

2011-03-10 Thread Mark
I thought I read somewhere that Pig has an output format that can write to Cassandra but I am unable to find any documentation on this. Is this possible and if so can someone please point me in the right direction. Thanks

Re: Pig output to Cassandra

2011-03-10 Thread Matt Kennedy
On its way... https://issues.apache.org/jira/browse/CASSANDRA-1828 On Mar 10, 2011, at 11:17 PM, Mark wrote: I thought I read somewhere that Pig has an output format that can write to Cassandra but I am unable to find any documentation on this. Is this possible and if so can someone please

Re: Fatal configuration error, so how to change listen_address:storage_port in cassandra.yaml ?

2011-03-10 Thread Aaron Morton
Something else is using the port, perhaps an existing Cassandra process? Use lsof -i | grep 7000 to see what is. If you need to change it, you are looking for storage_port in the config. Aaron On 11/03/2011, at 3:43 PM, Bob Futrelle bob.futre...@gmail.com wrote: Now that I've made the
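The same "is something already listening?" question can be asked from a script. This is a rough Python equivalent of the lsof check, using only the standard socket module; port_in_use is a hypothetical helper, and the default host of 127.0.0.1 is an assumption (Cassandra binds to whatever listen_address is set to, so pass that address if it differs).

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    # 7000 is the storage port mentioned in the reply above.
    print("port 7000 in use:", port_in_use(7000))
```

Unlike lsof, this only tells you that something is listening, not which process owns the port.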

Re: Pig output to Cassandra

2011-03-10 Thread Mark
Sweet! This is exactly what I was looking for and it looks like it was just resolved. Are there any working examples or documentation on this feature? Thanks On 3/10/11 8:57 PM, Matt Kennedy wrote: On its way... https://issues.apache.org/jira/browse/CASSANDRA-1828 On Mar 10, 2011, at 11:17

Secondary indices: Why low cardinality?

2011-03-10 Thread Kevin
There's pretty limited information on Cassandra's built-in secondary index facility as is, but trying to find out why the secondary index has to have low cardinality has been like finding a needle in a haystack... that is floating somewhere in the Atlantic. Can someone explain why low