Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Radim Kolar
From my point of view, vector clocks are too much overhead. If you sync clocks in your cluster using NTP (which you should do anyway), you will get clock precision of 1/1000 s, which is good enough. All my machines running NTP have an offset of 1/1000 s. They are FreeBSD; Linux is not that precise in clock

Re: preloading entire CF with SEQ access on startup

2011-08-24 Thread aaron morton
Nothing automatic; you can do it by using range slices that request 0 columns. Once you have a hot cache, it will be automatically saved and reloaded at startup if you have enabled row_cache_save_period or key_cache_save_period for the CF. Cheers - Aaron Morton Freelance
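A minimal sketch of the warm-up loop described above, assuming a client that can page through row keys with range slices requesting 0 columns. The `fetch_keys` callable and its signature are hypothetical stand-ins for whatever range-slice call your client library provides; only the paging logic is shown.

```python
def warm_cache(fetch_keys, page_size=1000):
    """Touch every row once by paging through range slices.

    fetch_keys(start_key, count) is a hypothetical wrapper around the
    client's range-slice call (requesting 0 columns). It is assumed to
    return up to `count` row keys, beginning at `start_key` inclusive,
    with "" meaning "start of the range".
    """
    if page_size < 2:
        # The start key is repeated on each page, so a page of 1 never advances.
        raise ValueError("page_size must be at least 2")
    start = ""
    touched = 0
    while True:
        keys = fetch_keys(start, page_size)
        # Skip the inclusive start key on every page after the first.
        if start and keys and keys[0] == start:
            keys = keys[1:]
        if not keys:
            break
        touched += len(keys)
        start = keys[-1]
    return touched
```

The function returns the number of rows touched, which can be logged to confirm that the whole CF was walked.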

Re: multi-node cassandra config doubt

2011-08-24 Thread aaron morton
Did you get this sorted ? At a guess I would say there are no nodes listed in the Hadoop JobConf. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 23/08/2011, at 9:51 PM, Thamizh wrote: Hi All, This is regarding multi-node

Re: run Cassandra tutorial example

2011-08-24 Thread aaron morton
HColumn(city=Austin) Is the data you are after. Have a look in src/main/resources/log4j.properties if you want to change the logging settings. Have fun. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 4:06 AM, Alvin

Re: Customized Secondary Index Schema

2011-08-24 Thread aaron morton
IMHO it's only a scalability problem if those nodes have trouble handling the throughput. The load will go to all replicas, not one, unless you turn off Read Repair. If it is a problem then you could manually partition the index into multiple rows, bit of a pain though. I'd wait and see, or
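One way to read "manually partition the index into multiple rows" is to spread the entries of one logical index over a fixed number of bucket rows. The sketch below is an assumption, not the approach from the thread: the `index_name:bucket` key format, the bucket count, and both helper names are made up for illustration.

```python
import zlib


def index_row_key(index_name, indexed_value, buckets=16):
    """Pick the bucket row that holds the index entry for indexed_value."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(indexed_value.encode("utf-8")) % buckets
    return f"{index_name}:{bucket:02d}"


def all_index_row_keys(index_name, buckets=16):
    """Row keys to read when scanning the whole index."""
    return [f"{index_name}:{b:02d}" for b in range(buckets)]
```

A lookup for a single value reads only the row returned by `index_row_key`; a full scan has to read every row from `all_index_row_keys`, which is the "bit of a pain" part.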

Re: checksumming

2011-08-24 Thread aaron morton
At the file level see https://issues.apache.org/jira/browse/CASSANDRA-674 At the higher level there is node tool repair http://wiki.apache.org/cassandra/AntiEntropy. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011,

Re: Could Not connect to cassandra-cli on windows

2011-08-24 Thread aaron morton
Not off the top of my head. Can you get 0.7.8 running with a pre-packaged client ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 12:16 PM, Alaa Zubaidi wrote: Hi Aaron, We are using Thrift 5..

Re: cassandra unexpected shutdown

2011-08-24 Thread aaron morton
First thing is are you on 0.8 ? It has some automagical memory management that is both automatic and magical http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ Secondly if you are OOM'ing you need to look at how much memory your schema is taking. See the link above, or just use

Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread Renato Bacelar da Silveira
Hi All Good day, A question concerning Cassandra-Cli. I have a Column Family named 11001500. I have inserted the CF with Hector, and it did not throw any exception concerning the name of the column. If I am issuing the command list 1105115; I incur the following error: [default@unknown]

Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread Renato Bacelar da Silveira
Just some information about the Column family in question: ColumnFamily: 1105100 Key Validation Class: org.apache.cassandra.db.marshal.BytesType Default column value validator: org.apache.cassandra.db.marshal.BytesType Columns sorted by:

nodetool repair does not return...

2011-08-24 Thread Boris Yen
Hi, In our testing environment, we have two nodes with RF=2 running 0.8.4. We tried to test the repair functions of Cassandra; however, every once in a while, nodetool repair never returns. We have checked the system.log and nothing seems to be out of the ordinary: no errors, no exceptions. The data is

Re: multi-node cassandra config doubt

2011-08-24 Thread Thamizh
Hi Aaron, This is yet to be resolved. I have set up Cassandra multi-node clustering and am facing issues in pushing HDFS data to Cassandra. When I ran the MapReduce program I got an UnknownHostException. In hadoop(0.20.1), I have configured node01 as master and node01, node02, node03 as

Re: help creating data model

2011-08-24 Thread Helder Oliveira
Thanks Indranath Ghosh for your tip! I will continue here the question. Aaron, i have read your suggestion and tried to design your suggestion and i have one question regarding it. Let's forget for now the Requests and Events! Just keep the Visitants and the Sessions. My goal is when having

Re: Customized Secondary Index Schema

2011-08-24 Thread Alvin UW
Thanks. 2011/8/24 aaron morton aa...@thelastpickle.com IMHO it's only a scalability problem if those nodes have trouble handling the throughput. The load will go to all replicas, not one, unless you turn off Read Repair. If it is a problem then you could manually partition the index into

Re: checksumming

2011-08-24 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-1717 added block level checksums. On Wed, Aug 24, 2011 at 4:28 AM, aaron morton aa...@thelastpickle.com wrote: At the file level see https://issues.apache.org/jira/browse/CASSANDRA-674 At the higher level there is node tool repair 

Re: Could Not connect to cassandra-cli on windows

2011-08-24 Thread Alaa Zubaidi
Hi Aaron, I cannot at this point of time.. Thanks for your help.. Alaa On 8/24/2011 2:30 AM, aaron morton wrote: Not off the top of my head. Can you get 0.7.8 running with a pre-packaged client ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
So, I restarted the cluster (not rolling), but it is still maintaining hints for the IP's that are no longer part of the ring. nodetool ring shows things correctly (as only 3 nodes). When I check thru the jmx hintedhandoff manager, it shows it is maintaining the hints for those non existent IP's.

Re: Commit log fills up in less than a minute

2011-08-24 Thread Anand Somani
So I have looked at the cluster from - Cassandra-client - describe cluster = shows correctly - 3 nodes - used the StorageService - JMX bean =UnreachableNodes - shows 0 If all these show the correct ring state, why are hints being maintained, looks like that is the only way to find out

Re: cassandra unexpected shutdown

2011-08-24 Thread Ernst D Schoen-René
If by magical, you mean magically shuts down randomly, then yes, that is magical. We're on 8, but we discovered that 8 has an undocumented feature where turning off the commitlog doesn't work, so we're upgrading to 8.1 or whatever is current. It doesn't seem to be tied to high or low load,

Cassandra-cli not able to find CF after fresh CF insert.

2011-08-24 Thread Renato Bacelar da Silveira
Hi All Good day, I have again come across a situation where the CF is not being found by the list command... it would be too painful at this stage to restart the node just to be able to query the CF... *ColumnFamily: a1307* Key Validation Class:

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Kevin Burton
This is really interesting… I can track it down but there are a number of references to Cassandra HAVING vector clocks … which would make sense that I can't find out how much memory they are using :-P Cassandra: The Definitive Guide … which I was reading the other night says that they were

Re: Memory overhead of vector clocks…. how often are they pruned?

2011-08-24 Thread Jeremy Hanna
At the point that book was written (about a year ago it was finalized), vector clocks were planned. In August or September of last year, they were removed. 0.7 was released in January. The ticket for vector clocks is here and you can see the reasoning for not using them at the bottom.

how to migrate?

2011-08-24 Thread William Oberman
I was hoping to transition my simple cassandra cluster (where each node is a cassandra + hadoop tasktracker) to a cluster with two virtual datacenters (vanilla cassandra vs. cassandra + hadoop tasktracker), based on this: http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig The problem

Re: cassandra unexpected shutdown

2011-08-24 Thread Ernst D Schoen-René
So, we're on 8, so I don't think there's a key cache setting. Am I wrong? here's my newest crash log: ERROR [Thread-210] 2011-08-24 06:29:53,247 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-210,5,main] java.util.concurrent.RejectedExecutionException:

question about cassandra.in.sh

2011-08-24 Thread Koert Kuipers
i have an existing cassandra instance on my machine, it came with brisk and lives in /usr/share/brisk/cassandra. it also created /usr/share/cassandra/cassandra.in.sh now i wanted to run another instance of cassandra (i needed a 0.7 version for compatibility reasons), so i downloaded it from

Cassandra Node Requirements

2011-08-24 Thread Jacob, Arun
I'm trying to determine a node configuration for Cassandra. From what I've been able to determine from reading around: 1. we need to cap data size at 50% of total node storage capacity for compaction 2. with RF=3, that means that I need to effectively assume that I have 1/6th of total
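The arithmetic behind the "1/6th" figure can be sketched as follows. The function name and the example cluster size are made up for illustration; the 50% headroom and RF=3 come from the message, and a later reply in this thread notes that the 50% figure only applies when major compactions are forced manually.

```python
def usable_capacity_gb(raw_gb_per_node, nodes, replication_factor=3,
                       compaction_headroom=0.5):
    """Estimate how much unique data a cluster can hold.

    compaction_headroom is the fraction of each disk that may be filled
    (0.5 means half is kept free for compaction). Every row is stored
    replication_factor times, so the remainder is divided by that.
    """
    raw_total = raw_gb_per_node * nodes
    return raw_total * compaction_headroom / replication_factor


# Hypothetical cluster: 6 nodes with 600 GB of disk each (3600 GB raw).
# 3600 * 0.5 / 3 = 600 GB of unique data, i.e. 1/6 of the raw total.
print(usable_capacity_gb(600, 6))
```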

Re: Cassandra-CLI does not allow list 1105115; with Syntax error

2011-08-24 Thread aaron morton
Similar to https://issues.apache.org/jira/browse/CASSANDRA-3054 can you create a new ticket and link to that one. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 24/08/2011, at 10:33 PM, Renato Bacelar da Silveira wrote: Just some

Re: Cassandra Node Requirements

2011-08-24 Thread Edward Capriolo
On Wed, Aug 24, 2011 at 2:54 PM, Jacob, Arun arun.ja...@disney.com wrote: I'm trying to determine a node configuration for Cassandra. From what I've been able to determine from reading around: 1. we need to cap data size at 50% of total node storage capacity for compaction 2. with

Re: help creating data model

2011-08-24 Thread aaron morton
I normally suggest trying a model with Standard CF's first as there are some downsides to super CF's. If you know there will only be a few sub columns they are probably OK (see http://wiki.apache.org/cassandra/CassandraLimitations). Your alternative design is fine. Test it out and see what

Re: run Cassandra tutorial example

2011-08-24 Thread Thairu
The error stack traces are output from Maven. The mvn -e option turns on error reporting. From: aaron morton aa...@thelastpickle.com Reply-To: user@cassandra.apache.org Date: Wed, 24 Aug

Atomic or Non-Atomic Counters

2011-08-24 Thread Sal Fuentes
The design document that is referenced on the Cassandra wiki page ( http://wiki.apache.org/cassandra/Counters) describes the Counters in Cassandra as non-atomic ( https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf). However, the DataStax post on counters (

Re: Customized Secondary Index Schema

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW alvi...@gmail.com wrote: Hello, As mentioned by Ed Anuff in his blog and slides, one way to build customized secondary index is: We use one CF, each row to represent a secondary index, with the secondary index name as row key. For example,

Re: Cassandra Node Requirements

2011-08-24 Thread Jacob, Arun
Thanks for the links and the answers. The vagueness of my initial questions reflects the fact that I'm trying to configure for a general case — I will clarify below: I need to account for a variety of use cases. (1) they will be both read and write heavy. I was assuming that SSDs would be

Re: question about cassandra.in.sh

2011-08-24 Thread Eric Evans
On Wed, Aug 24, 2011 at 1:28 PM, Koert Kuipers ko...@tresata.com wrote: my problem is that the scripts for my cassandra 0.7 instance don't work properly. the problem lies in the code snippets below. when i run the scripts they source /usr/share/cassandra/cassandra.in.sh, which has the wrong

Re: Atomic or Non-Atomic Counters

2011-08-24 Thread Jonathan Ellis
They are atomic in the sense that if you increment from N to M, readers will never see any intermediate values, just N or M itself. On Wed, Aug 24, 2011 at 6:50 PM, Sal Fuentes fuente...@gmail.com wrote: The design document that is referenced on the Cassandra wiki page

Re: Cassandra Node Requirements

2011-08-24 Thread Jonathan Ellis
On Wed, Aug 24, 2011 at 1:54 PM, Jacob, Arun arun.ja...@disney.com wrote: we need to cap data size at 50% of total node storage capacity for compaction Sort of. There's some fine print, such as the 50% number is only if you're manually forcing major compactions, which is not recommended, but a

Re: nodetool repair does not return...

2011-08-24 Thread Boris Yen
Would CASSANDRA-2433 cause this? On Wed, Aug 24, 2011 at 7:23 PM, Boris Yen yulin...@gmail.com wrote: Hi, In our testing environment, we got two nodes with RF=2 running 0.8.4. We tried to test the repair functions of cassandra, however, every once in a while, the nodetool repair never returns.

Re: how to know if nodetool cleanup is safe?

2011-08-24 Thread Yan Chunlu
got it! thanks a lot for the explanation! On Wed, Aug 24, 2011 at 1:06 AM, Edward Capriolo edlinuxg...@gmail.com wrote: On Tue, Aug 23, 2011 at 11:56 AM, Sam Overton sover...@acunu.com wrote: On 21 August 2011 12:34, Yan Chunlu springri...@gmail.com wrote: since nodetool cleanup could

For multi-tenant, is it good to have a key space for each tenant?

2011-08-24 Thread Guofeng Zhang
I wonder if it is a good practice to create a key space for each tenant. Any advice is appreciated. Thanks