Re: Concurrency Control

2012-05-30 Thread Filipe Gonçalves
It's the timestamps provided in the columns that do concurrency control/conflict resolution. Basically, the newer timestamp wins. For counters I think there is no such mechanism (i.e. counter updates are not idempotent). From https://wiki.apache.org/cassandra/DataModel : All values are supplied
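A minimal sketch of the "newer timestamp wins" rule described above, not Cassandra's actual code. The `Column` class and `reconcile` function are hypothetical names used only for illustration, and the behaviour on equal timestamps (keep the first argument) is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: bytes
    timestamp: int  # supplied by the client with each write


def reconcile(a: Column, b: Column) -> Column:
    """Pick the surviving version of a column: the newer timestamp wins."""
    # Assumption: on equal timestamps this sketch simply keeps the first argument.
    return a if a.timestamp >= b.timestamp else b


old = Column("email", b"old@example.com", timestamp=1338364800000000)
new = Column("email", b"new@example.com", timestamp=1338364800000001)

# The order in which the two writes arrive does not matter.
winner = reconcile(old, new)
same_winner = reconcile(new, old)
```

Because the result depends only on the timestamps, replaying the same write is harmless, which is the idempotence that counter updates lack.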

Re: Retrieving old data version for a given row

2012-05-30 Thread Felipe Schmidt
I have further questions: -Is there any other way to extract the content of an SSTable, writing a Java program for example instead of using sstable2json? -I tried to get tombstones using the Thrift API, but it seems not to be possible, is that right? When I try, the program throws an exception. thanks in

Moving to 1.1

2012-05-30 Thread Vanger
I haven't tracked the mailing list since 1.1-rc came out and now I have several questions. 1) We want to upgrade from 1.0.9. How stable is 1.1? I mean work under high load, running compactions and clean-ups? Is it faster than 1.0.9? 2) If I want to use hector as cassandra client which version is

Renaming a keyspace in 1.1

2012-05-30 Thread Oleg Dulin
Is it possible? How?

Re: commitlog_sync_batch_window_in_ms change in 0.7

2012-05-30 Thread osishkin osishkin
Thank you all. We're planning to move soon to a more advanced version. But for now I have a lot of data on my 0.7 cluster which I don't want to lose, just because of some schema error on restart etc. I don't mind losing any writes during the shutdown, however losing ALL the data would require me to

tokens and RF for multiple phases of deployment

2012-05-30 Thread Chong Zhang
Hi all, We are planning to deploy a small cluster with 4 nodes in one DC first, and will expand that to 8 nodes, then add another DC with 8 nodes for fail over (not active-active), so all the traffic will go to the 1st cluster, and switch to 2nd cluster if the whole 1st cluster is down or on

Re: Moving to 1.1

2012-05-30 Thread Edward Capriolo
1) Stable is a hard word to define. History shows it is better to let anything .0 burn in a bit. If you are pre-production it probably does not matter, otherwise I would say play safe. Wait for a .1 or .2, or until the .0 has been in the wild for a few weeks. 2) I worked on one of the patches to get hector

unsubscribe

2012-05-30 Thread Maxim Potekhin

Cassandra 1.1.1 release?

2012-05-30 Thread Roland Mechler
Anyone have a rough idea of when Cassandra 1.1.1 is likely to be released? -Roland

Re: Replication factor

2012-05-30 Thread aaron morton
Ah. The lack of page cache hits after compaction makes sense. But I don't think the drastic effect it appears to have is expected. Do you have an idea of how much slower local reads get ? If you are selecting coordinators based on token ranges the DS is not as much. It still has some utility

Re: what about an hybrid partitioner for CF with composite row key ?

2012-05-30 Thread aaron morton
* with the RP: for one ui action, many nodes may be requested, but it's simpler to balance the cluster Many nodes good. You will have increased availability if the data is more widely distributed. one sweeter(?) partitioner would be a partitioner that would distribute a row according only

Re: Moving to 1.1

2012-05-30 Thread Rob Coli
On Wed, May 30, 2012 at 4:08 AM, Vanger disc...@gmail.com wrote: 3) Java 7 now recommended for use by Oracle. We have several developers running local cassandra instances on it for a while without problems. Anybody tried it in production? Some time ago java 7 wasn't recommended for use with

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Jeff Williams
First, note that replication is done at the row level, not at the node level. That line should look more like: placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 1,DC2: 1,DC3: 1 } This means that each row will have one copy in each DC and within each DC it's

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread David Fischer
Thanks! My misunderstanding was the snitch names are broken up by DC1:RAC1 and the strategy_options takes only the first part of the snitch names? On Wed, May 30, 2012 at 12:14 PM, Jeff Williams je...@wherethebitsroam.com wrote: First, note that replication is done at the row level, not at

Re: commitlog_sync_batch_window_in_ms change in 0.7

2012-05-30 Thread Rob Coli
On Tue, May 29, 2012 at 10:29 PM, Pierre Chalamet pie...@chalamet.net wrote: You'd better use version 1.0.9 (using this one in production) or 1.0.10. 1.1 is still a bit young to be ready for prod unfortunately. OP described himself as experimenting which I inferred to mean not-production. I

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Edward Capriolo
You can avoid the confusion by using the term natural endpoints. For example, with a replication factor of 3 natural endpoints for key x are node1, node2, node11. The snitch does use the datacenter and the rack but almost all deployments use a single rack per DC, because when you have more than

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Jeff Williams
On May 30, 2012, at 10:32 PM, Edward Capriolo wrote: The snitch does use the datacenter and the rack but almost all deployments use a single rack per DC, because when you have more than one rack in a data center the NTS snitch has some logic to spread the data between racks. (most people

Re: unknown exception with hector

2012-05-30 Thread aaron morton
i'm not sure if using framed transport is an option with hector. http://hector-client.github.com/hector//source/content/API/core/0.8.0-2/me/prettyprint/cassandra/service/CassandraHostConfigurator.html#setUseThriftFramedTransport(boolean) what should i be in the logs looking for to find the

Re: about multitenant datamodel

2012-05-30 Thread aaron morton
- Do a lot of keyspaces cause some problems? (If I have 1,000 users, cassandra creates 1,000 keyspaces…) It's not keyspaces, but the number of column families. Without storing any data each CF uses about 1MB of ram. When they start storing and reading data they use more. IMHO a model that

Re: High CPU load on Cassandra Node

2012-05-30 Thread aaron morton
Further I need to understand whether for internal read/write cassandra uses thrift over an rpc connection (port 9160) or 7000 as for inter node communication. Maybe that also could be a reason for so many connections on 9160. Uses 7000 What I could see from Ganglia is high

Re: Schema changes not getting picked up from different process

2012-05-30 Thread aaron morton
What clients are the scripts using ? This sounds like something that should be handled in the client. I would worry about holding a long running connection to a single node. There are several situations where the correct behaviour for a client is to kill a connection and connect to another

Re: Frequent exception with Cassandra 1.0.9

2012-05-30 Thread aaron morton
Still getting this ? Was there some more to the message ? Here's an example from the internets http://pastebin.com/WdD7181x it may be an issue with the JVM on windows. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/05/2012, at 6:07

Re: will compaction delete empty rows after all columns expired?

2012-05-30 Thread aaron morton
Minor compaction will remove the tombstones if the row only exists in the sstable being compacted. Are these very wide rows that are constantly written to ? Cheers p.s. cassandra 1.0 really does rock. - Aaron Morton Freelance Developer @aaronmorton

Re: cassandra read latency help

2012-05-30 Thread aaron morton
80 ms per request sounds high. I'm doing some guessing here, I am guessing memory usage is the problem. * I assume you are no longer seeing excessive GC activity. * The key cache will not get used when you hit the row cache. I would disable the row cache if you have a random workload,

Re: TimedOutException caused by Stop the world activity

2012-05-30 Thread aaron morton
The cluster is running into GC problems and this is slowing it down under the stress test. When it slows down one or more of the nodes is failing to perform the write within rpc_timeout. This causes the coordinator of the write to raise the TimedOutException. Your options are: * allocate

Re: Snapshot failing on JSON files in 1.1.0

2012-05-30 Thread aaron morton
CASSANDRA-4230 is a bug in 1.1 I am not aware of issues using snapshot on 1.0.9. But errno 0 is a bit odd. On the server side there should be a log message at ERROR level that contains the string Unable to create hard link and the error message. What does that say ? Can you also include the

Re: Doubts regarding compaction

2012-05-30 Thread aaron morton
Also, I want to make sure, if Major compactions could only be done manually ? Major compactions are the ones you run using nodetool Is the author referring to this time period as no minor compactions being triggered automatically ? The minor compaction will be triggered less frequently

Re: About Composite range queries

2012-05-30 Thread aaron morton
Composite Columns compare each part in turn, so the values are ordered as you've shown them. However the rows are not ordered according to key value. They are ordered using the random token generated by the partitioner see http://wiki.apache.org/cassandra/FAQ#range_rp What is the real
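A rough sketch of the two orderings described above, using only the Python standard library. Tuple sorting stands in for part-by-part composite comparison, and the hypothetical `token` helper uses an MD5 hash as a stand-in for the partitioner's token; neither is Cassandra's own code, and the column and row names are made up for illustration.

```python
import hashlib


def token(row_key: bytes) -> int:
    """Hypothetical stand-in for a partitioner token: an integer from a hash of the key."""
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


# Composite column names compare part by part, like tuples.
composite_columns = [("b", 1), ("a", 2), ("a", 1)]
sorted_columns = sorted(composite_columns)

# Rows are ordered by token, not by the key value itself,
# so the result generally differs from plain key order.
row_keys = [b"row1", b"row2", b"row3"]
rows_in_token_order = sorted(row_keys, key=token)
```

This is why a range of composite columns inside one row is meaningful, while a range of row keys under a hashing partitioner does not follow key order.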

Re: All host pools Marked Down

2012-05-30 Thread aaron morton
I would remove the load balancer from the equation. Compactions do not stop the world, they may degrade performance for a while but that's about it. Look in the logs on the servers, are the nodes logging that other nodes are going DOWN ? Cheers - Aaron Morton Freelance

Re: Nodetool talking to an old IP address (and timing out)

2012-05-30 Thread aaron morton
nodetool passes the host name unmodified to the JMX library to connect to the host. The JMX server will, by default, bind to the ip address of the machine. If the host name was wrong, I would guess the JMX service failed to bind. Cheers - Aaron Morton Freelance Developer

RE: will compaction delete empty rows after all columns expired?

2012-05-30 Thread Curt Allred
No, these were not wide rows. They are rows that formerly had one or 2 columns. The columns are deleted but the empty rows don't go away, even after gc_grace_secs. So if I understand... the empty row will only be removed after gc_grace if enough compactions have occurred so that all the column

Re: will compaction delete empty rows after all columns expired?

2012-05-30 Thread Zhu Han
On Thu, May 31, 2012 at 9:31 AM, Curt Allred c...@mediosystems.com wrote: No, these were not wide rows. They are rows that formerly had one or 2 columns. The columns are deleted but the empty rows don't go away, even after gc_grace_secs. The empty row goes away only during a compaction after

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Edward Capriolo
http://answers.oreilly.com/topic/2408-replica-placement-strategies-when-using-cassandra/ As mentioned it does this: The Network Topology Strategy places some replicas in another data center and the remainder in other racks in the first data center, as specified Which is not what most would