Re: Error messages after rolling updating cassandra from 0.7.0 to 0.7.2

2011-04-04 Thread Kazuo YAGI
Solution: upgrade to 0.7.4, run scrub Although I upgraded all my cassandra nodes from 0.7.0 to 0.7.4 and ran nodetool scrub on all keyspaces, these EOFException error messages didn't go away. Do you have any ideas how to deal with it next? Besides, it would be really useful if I could know

Strange nodetool repair behaviour

2011-04-04 Thread Jonas Borgström
Hi, I have a 6 node 0.7.4 cluster with replication_factor=3 where nodetool repair keyspace behaves really strangely. The keyspace contains three column families and about 60GB data in total (i.e. 30GB on each node). Even though no data has been added or deleted since the last repair, a repair

AW: Strange nodetool repair behaviour

2011-04-04 Thread Roland Gude
I am experiencing the same behavior but had it on previous versions of 0.7 as well. -Original Message- From: Jonas Borgström [mailto:jonas.borgst...@trioptima.com] Sent: Monday, 4 April 2011 12:26 To: user@cassandra.apache.org Subject: Strange nodetool repair behaviour Hi,

Re: balance between concurrent_[reads|writes] and feeding/reading threads i clients

2011-04-04 Thread aaron morton
What do your TP Stats look like under load? Are you actually using the 100 read/write threads? What is the IO platform, what sort of load is it under, and how many cores do the machines have? It's interesting that you seem to be having a better time with such high values. If you are

Re: Abnormal memory consumption

2011-04-04 Thread aaron morton
For background see the JVM Heap Size section here http://wiki.apache.org/cassandra/MemtableThresholds You can also add a fudge factor of anywhere from X2 to X8 to the size of the memtables. You are in for a very difficult time trying to run cassandra with under 500MB of heap space. Is this
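A rough sketch of the sizing arithmetic discussed in this thread. The formula (memtable throughput times a multiplier times the number of actively written column families, plus a fixed base, plus caches) is an assumption modelled on the rule of thumb from the linked wiki page as best recalled, and should be checked against that page. The function name, the parameter names and the 1024 MB base are illustrative; `factor` stands in for the 2x to 8x fudge factor mentioned above.

```python
def estimate_heap_mb(memtable_throughput_mb, hot_cfs, cache_mb=0,
                     base_mb=1024, factor=3):
    """Back-of-envelope heap estimate in MB (assumed rule of thumb, not an official formula).

    memtable_throughput_mb: per-CF memtable size threshold in MB
    hot_cfs: number of column families being actively written
    cache_mb: memory set aside for key/row caches
    base_mb: fixed overhead for the JVM and Cassandra internals (assumed)
    factor: fudge factor applied to memtable size (the post suggests 2x to 8x)
    """
    if not 2 <= factor <= 8:
        raise ValueError("factor is expected to be between 2 and 8")
    return memtable_throughput_mb * factor * hot_cfs + base_mb + cache_mb


if __name__ == "__main__":
    # Compare how the estimate grows with the number of column families.
    for cfs in (1, 20, 50):
        print(cfs, "CFs ->", estimate_heap_mb(64, cfs), "MB")
```

Under these assumptions the estimate grows linearly with the number of hot column families, so lowering the per-CF memtable threshold is the main lever when many CFs are needed.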

Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread aaron morton
cleanup reads each SSTable on disk and writes a new file that contains the same data with the exception of rows that are no longer in a token range the node is a replica for. It's not compacting the files into fewer files or purging tombstones. But it is re-writing all the data for the CF.
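The behaviour described above can be sketched as a toy model. This is not Cassandra code: an SSTable is modelled as a plain dict of token to row, token ranges are (start, end] pairs that may wrap around the ring, and the names `in_range` and `cleanup` are made up for illustration.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (start, end]: start exclusive, end inclusive


def in_range(token: int, rng: Range) -> bool:
    """Return True if token falls inside the (start, end] range."""
    start, end = rng
    if start < end:
        return start < token <= end
    # Wrapping range: covers tokens after start or up to and including end.
    return token > start or token <= end


def cleanup(sstables: List[Dict[int, str]], owned: List[Range]) -> List[Dict[int, str]]:
    """Rewrite every SSTable, keeping only rows whose token the node still replicates.

    One output table is produced per input table: nothing is merged and
    nothing else is purged, but every surviving row is written again.
    """
    rewritten = []
    for table in sstables:
        kept = {token: row for token, row in table.items()
                if any(in_range(token, rng) for rng in owned)}
        rewritten.append(kept)
    return rewritten


if __name__ == "__main__":
    tables = [{5: "a", 15: "b"}, {25: "c", 95: "d"}]
    print(cleanup(tables, owned=[(10, 20), (90, 5)]))
```

In this model the old and the rewritten tables exist side by side until the old ones are dropped, which is one plausible reason for disk use rising while cleanup runs.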

Re: Phantom node keeps coming back

2011-04-04 Thread aaron morton
Sounds like http://comments.gmane.org/gmane.comp.db.cassandra.user/14498 https://issues.apache.org/jira/browse/CASSANDRA-2371 Aaron On 3 Apr 2011, at 09:42, Jason Harvey wrote: Greetings all, I removetoken'd a node a few weeks back and completely shut down the node which owned that

Re: urgent

2011-04-04 Thread aaron morton
Using a single directory will be the most efficient use of space; multiple directories are useful when you accidentally run out of space http://www.mail-archive.com/user@cassandra.apache.org/msg07874.html Can you put the SSDs in a stripe set? Also this may be of interest

Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread Jonathan Colby
hi Aaron - The Datastax documentation brought to light the fact that over time, major compactions will be performed on bigger and bigger SSTables. They actually recommend against performing too many major compactions. Which is why I am wary of triggering too many major compactions ...

Re: NullPointerException with 0.7.4

2011-04-04 Thread aaron morton
Was this using one of the included stress tests or your own system? If it was something from contrib/, what was the command line you used? If it was your own system, what was the schema and what were the tests doing? If it's reproducible could you create a ticket here

Re: compaction behaviour

2011-04-04 Thread aaron morton
Is nodetool compact what you are looking for? Aaron On 4 Apr 2011, at 05:35, Anurag Gujral wrote: Hi Zhu, I did not get that SSDs have read latency of 0.1ms. Since there is only one data file I would expect the read of any key to take 0.1ms; maybe I am missing something

Re: change row cache size in cassandra

2011-04-04 Thread aaron morton
You can also set it when creating the schema, either via yaml or the cassandra-cli. Aaron On 4 Apr 2011, at 05:45, Anurag Gujral wrote: Hi All, I looked at nodetool; there is an option to change cache sizes. Thanks Anurag On Sun, Apr 3, 2011 at 12:25 PM, Anurag

Re: Embedding Cassandra in Java code w/o using ports

2011-04-04 Thread aaron morton
I'm interested to know more about the problems using the CLI. Aaron. On 2 Apr 2011, at 15:07, Bob Futrelle wrote: Connecting via CLI to local host with a port number has never been successful for me in Snow Leopard. No amount of reading suggestions and varying the approach has worked.

Questions on combining custom with built-in secondary indexes

2011-04-04 Thread Miroslav Madecki
Hi, I would like to combine a custom secondary index with Cassandra's built-in secondary indexes. The custom secondary index uses multiple columns and is based on historical but incomplete info about matching data in the target column family. It produces a list of candidate rows which could be too large to

Re: Strange nodetool repair behaviour

2011-04-04 Thread Mateusz Korniak
On Monday 04 of April 2011, Jonas Borgström wrote: I have a 6 node 0.7.4 cluster with replication_factor=3 where nodetool repair keyspace behaves really strange. I think I am observing a similar issue. I have three 0.7.4 nodes with RF=3. After compaction I see about 7GB load in node but after

Re: AW: Strange nodetool repair behaviour

2011-04-04 Thread aaron morton
Jonas, AFAIK if repair completed successfully there should be no streaming the next time round. This sounds odd; please look into it if you can. Can you run at DEBUG logging? There will be some messages about receiving streams from files and which ranges are being requested. I would be

Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread aaron morton
mmm, interesting. My theory was
t0 - major compaction runs, there is now one sstable
t1 - x new sstables have been created
t2 - minor compaction runs and determines there are two buckets, one with the x new sstables and one with the single big file. The bucket of many files is compacted
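The bucketing theory above can be sketched as follows. This is only an illustration: the rule that a file joins a bucket when its size is within 50% to 150% of the bucket's average, and the minimum of 4 files before a bucket is compacted, are assumptions about the minor compaction defaults, and the function names are hypothetical.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group sstable sizes into buckets of similar size.

    A size joins the first bucket whose current average it lies within
    low..high times of (assumed thresholds); otherwise it starts a new bucket.
    """
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def buckets_to_compact(sizes, min_threshold=4):
    """Return only the buckets with enough files to trigger a minor compaction."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    # One big file left by a major compaction (t0) plus four new small ones (t1).
    sizes_mb = [1000, 10, 12, 9, 11]
    print(bucket_by_size(sizes_mb))
    print(buckets_to_compact(sizes_mb))
```

Under these assumptions the single large file sits alone in its bucket and is not picked up again until enough files of comparable size accumulate.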

Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread shimi
The bigger the file, the longer it will take for it to be part of a compaction again. Compacting a bucket of large files takes longer than compacting a bucket of small files Shimi On Mon, Apr 4, 2011 at 3:58 PM, aaron morton aa...@thelastpickle.com wrote: mmm, interesting. My theory was t0 -

Re: Abnormal memory consumption

2011-04-04 Thread openvictor Open
Hey Aaron, Thank you for your kind answer. This is a test server; the production server (single instance at the moment) has 8 GB (or 12 GB, not decided yet) of RAM. But with it there are other things running such as: Solr, Redis, PostgreSQL, Tomcat. In total they take up to 1 GB of RAM when running

Re: Embedding Cassandra in Java code w/o using ports

2011-04-04 Thread Edward Capriolo
On Mon, Apr 4, 2011 at 8:29 AM, aaron morton aa...@thelastpickle.com wrote: I'm interested to know more about the problems using the CLI. Aaron. On 2 Apr 2011, at 15:07, Bob Futrelle wrote: Connecting via CLI to local host with a port number has never been successful for me in Snow

index file contains a different key or row size

2011-04-04 Thread shimi
It makes sense to me that compaction should solve this as well, since compaction creates new index files. Am I missing something here? WARN [CompactionExecutor:1] 2011-04-04 14:50:54,105 CompactionManager.java (line 602) Row scrubbed successfully but index file contains a different key or row

statistcs query on cassandra

2011-04-04 Thread Donal Zang
Can we do count like this? /count cf[startKey:endKey] where column = value/ -- Donal Zang Computing Center, IHEP 19B YuquanLu, Shijingshan District,Beijing, 100049 zan...@ihep.ac.cn 86 010 8823 6018

mmap segment underflow

2011-04-04 Thread Or Yanay
Hi All, I have upgraded from 0.7.0 to 0.7.4, and while running scrub I get the following exception quite a lot: java.lang.AssertionError: mmap segment underflow; remaining is 73936639 but 1970430821 requested at

Re: Abnormal memory consumption

2011-04-04 Thread Peter Schuller
My last concern, and for me it is a flaw in Cassandra and I am sad to admit it because I love Cassandra: how come that for 6 MB of data, Cassandra feels the need to fill 500 MB of RAM? I can understand the need for, let's say, 100 MB because of cache and several Memtables being alive at the

Re: Abnormal memory consumption

2011-04-04 Thread Peter Schuller
You can change VM settings and tweak things like memtable thresholds and in-memory compaction limits to get it down and get away with a smaller heap size, but honestly I don't recommend doing so unless you're willing to spend some time getting that right and probably repeating some of the

Re: Abnormal memory consumption

2011-04-04 Thread Victor Kabdebon
And for production, is 7 GB of RAM sufficient? Or is 11 GB the minimum? Thank you for your input; for the JVM I'll try to tune that 2011/4/4 Peter Schuller peter.schul...@infidyne.com You can change VM settings and tweak things like memtable thresholds and in-memory compaction limits

Re: Abnormal memory consumption

2011-04-04 Thread Peter Schuller
And for production, is 7 GB of RAM sufficient? Or is 11 GB the minimum? Thank you for your input; for the JVM I'll try to tune that Production mem reqs are mostly dependent on memtable thresholds: http://www.datastax.com/docs/0.7/operations/tuning If you enable key caching or row

Re: Abnormal memory consumption

2011-04-04 Thread openvictor Open
Okay, I see. But isn't there a big issue for scaling here? Imagine that I am the developer of a certain very successful website: in year 1 I need 20 CF. I might need to have 8 GB of RAM. In year 2 I need 50 CF because I added functionality to my wonderful website; will I need 20 GB of RAM? And if

Re: statistcs query on cassandra

2011-04-04 Thread aaron morton
No. You could build a custom secondary index where column=value is the key and startKey and endKey are column names. Then call get_count() with a SlicePredicate that specifies the startKey and endKey as the start and finish column names. Aaron On 5 Apr 2011, at 01:45, Donal Zang wrote:
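The index layout suggested above can be sketched with an in-memory stand-in. This is not the Thrift API: the class `CountIndex` and its methods are hypothetical, a Python list of sorted names plays the role of the index row's columns, and the slice is assumed to be inclusive at both ends, mirroring what get_count() with a SlicePredicate would be asked to do.

```python
import bisect
from collections import defaultdict


class CountIndex:
    """Toy custom secondary index: one index row per "column=value".

    The data row keys are stored as the (sorted) column names of the index row,
    so counting rows between startKey and endKey becomes a column slice count.
    """

    def __init__(self):
        self._rows = defaultdict(list)  # index row key -> sorted data row keys

    def add(self, column, value, row_key):
        """Record that the data row row_key has column == value."""
        names = self._rows[f"{column}={value}"]
        pos = bisect.bisect_left(names, row_key)
        if pos == len(names) or names[pos] != row_key:
            names.insert(pos, row_key)  # keep names sorted and unique

    def count(self, column, value, start_key, end_key):
        """Count indexed row keys in [start_key, end_key] (assumed inclusive)."""
        names = self._rows.get(f"{column}={value}", [])
        lo = bisect.bisect_left(names, start_key)
        hi = bisect.bisect_right(names, end_key)
        return max(0, hi - lo)


if __name__ == "__main__":
    idx = CountIndex()
    for key in ("k1", "k2", "k5"):
        idx.add("status", "ok", key)
    idx.add("status", "bad", "k3")
    print(idx.count("status", "ok", "k1", "k3"))
```

The trade-off is that the application has to maintain the index row on every write (and remove stale entries when a value changes), since this index is not managed by Cassandra.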

nodetool repair compact

2011-04-04 Thread Maki Watanabe
Hello, On reading O'Reilly's Cassandra book and wiki, I'm a bit confused about nodetool repair and compact. I believe we need to run nodetool repair regularly, and it synchronizes all replica nodes at the end. According to the documents the repair invokes major compaction also (as a side effect?). Will

maven repository

2011-04-04 Thread Mikael Wikblom
Hi, is there a maven repository where I can download the latest version of cassandra? I've found a few versions at riptano http://mvn.riptano.com/content/repositories/riptano/org/apache/cassandra/apache-cassandra/ but not the latest 0.7.4 regards Mikael Wikblom -- Mikael Wikblom Software