[RELEASE] Apache Cassandra 1.0.8 released

2012-02-27 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 1.0.8. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here:

How to reduce the memory consumed by cassandra (so as to prevent crashes OOMs) ?

2012-02-27 Thread Aditya Gupta
I'm running a 4-node Cassandra cluster of VMware Ubuntu instances, each with 768MB of memory (on a single machine, for development purposes). I need to reduce the heap size appropriately, as my nodes have been crashing at times with OOMs. How do I configure this? I think I would need to make some tweaks

Using cassandra at minimal expenditures

2012-02-27 Thread Ertio Lew
Hi, I'm creating a networking site using Cassandra. I want to host this application initially with the lowest possible resources, then slowly increase the resources as the service's demand requires. 1. I am wondering: what is the minimum recommended cluster size to start with?

Re: MemtableThroughput test in ColumnFamily.apply

2012-02-27 Thread aaron morton
That sounds odd because it is checked after each row is added to the memtable. What are you seeing logged when the memtable flushes? It will say how many ops and how many (tracked) bytes. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

Re: Using cassandra at minimal expenditures

2012-02-27 Thread Dave Brosius
I guess the issue with 2 machines and RF=2 is that a consistency level of QUORUM is the same as ALL, so you have little flexibility with this setup. Of course, this might be fine depending on what you want to do. In addition, RF=2 also means that you get no data-storage improvements
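To illustrate why QUORUM behaves like ALL at RF=2, here is a minimal sketch (not part of the thread). It assumes the usual quorum rule of floor(RF / 2) + 1 replicas; the function name `quorum` is made up for this example.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# Compare how many replicas may be down while QUORUM still succeeds.
for rf in (1, 2, 3, 5):
    needed = quorum(rf)
    print(f"RF={rf}: QUORUM needs {needed} replica(s), tolerates {rf - needed} down")
```

With RF=2 the quorum is 2, so every replica must answer, exactly as with ALL. With RF=3 the quorum is still 2, which leaves room for one replica to be unavailable.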

Re: CounterColumn java.lang.AssertionError: Wrong class type.

2012-02-27 Thread aaron morton
To rule out the obvious problem, can you check that the nodes have the same schema? Use cassandra-cli and run describe cluster. It looks like one of the nodes involved in the read has sent the wrong sort of column for the CF. That's not the sort of thing that normally happens. Otherwise are you

Re: Frequency of Flushing in 1.0

2012-02-27 Thread aaron morton
Isn't decommission meant to do the same thing as disablethrift and gossip? decommission removes a node entirely from the cluster, including streaming its data to other nodes. disablethrift and disablegossip just stop it from responding to clients and other nodes. Cheers -

Re: Frequency of Flushing in 1.0

2012-02-27 Thread aaron morton
Yes, reducing commitlog_total_space_in_mb will reduce the amount of space needed by the commit logs. memtable_total_space_in_mb controls how often memtables are flushed to disk as sstables; this does not really affect the commit log. Other than the fact that a commit log segment cannot be deleted until

Re: newer Cassandra + Hadoop = TimedOutException()

2012-02-27 Thread aaron morton
What settings do you have for cassandra.range.batch.size and rpc_timeout_in_ms? Have you tried reducing the first and/or increasing the second? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 27/02/2012, at 8:02 PM, Patrik Modesto

Re: How to reduce the memory consumed by cassandra (so as to prevent crashes OOMs) ?

2012-02-27 Thread aaron morton
MAX_HEAP_SIZE=500M, HEAP_NEWSIZE=100M: That is a very small amount of memory for Java; you are probably going to have problems. Take a look at reducing these settings in cassandra.yaml to reduce the amount of memory used: memtable_total_space_in_mb, memtable_flush_queue_size

Re: Using cassandra at minimal expenditures

2012-02-27 Thread aaron morton
1. I am wondering what is the minimum recommended cluster size to start with? IMHO 3: http://thelastpickle.com/2011/06/13/Down-For-Me/ A - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/02/2012, at 8:17 AM, Ertio Lew wrote: Hi I'm

Cassndra 1.0.6 GC query

2012-02-27 Thread Roshan
Hi Experts, after getting an OOM error in production, I reduced -XX:CMSInitiatingOccupancyFraction to .45 (from .75) and flush_largest_memtables_at to .45 (from .75). But I still get a warning message in production for the same Cassandra node regarding OOM. Also reduce the concurrent

Re: Cassndra 1.0.6 GC query

2012-02-27 Thread Jonathan Ellis
Take a heap dump (there should be one from when you OOMed) and see what is consuming your memory. On Mon, Feb 27, 2012 at 3:45 PM, Roshan codeva...@gmail.com wrote: Hi Experts After getting an OOM error in production, I reduce the -XX:CMSInitiatingOccupancyFraction to .45 (from .75) and

Re: Cassndra 1.0.6 GC query

2012-02-27 Thread Roshan
Due to a configuration issue, I haven't enabled the heap dump directory. Is there another way to find the cause of this and identify possible configuration changes? Thanks.

Re: Please advise -- 750MB object possible?

2012-02-27 Thread Ben Coverston
GridFS for Cassandra here, take it FWIW. AFAIK Joaquin spent a few hours putting this together at most. https://github.com/joaquincasares/gratefs -- Ben Coverston DataStax -- The Apache Cassandra Company

Re: Cassndra 1.0.6 GC query

2012-02-27 Thread Ben Coverston
Heap dump is really the gold standard for analysis, but if you don't want to take a heap dump for some reason: 1. Decrease the cache sizes. 2. Increase the index interval size. These in combination may reduce pressure on the heap enough so you do not see these warnings in the log. On Mon, Feb 27,

TimeUUID

2012-02-27 Thread Tamar Fraenkel
Hi! I have a column family where I use rows as time buckets. What I do is take epoch time in seconds, and round it to 1 hour (taking the result of time_since_epoc_second divided by 3600). My key validation type is LongType. I wonder whether it is better to use TimeUUID or even readable string
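The hourly bucketing described above can be sketched as follows. This is only an illustration of the arithmetic, not the poster's actual code; the names `hour_bucket` and `bucket_start` are hypothetical, and integer division is assumed for the "divided by 3600" step.

```python
import time

SECONDS_PER_HOUR = 3600


def hour_bucket(epoch_seconds: int) -> int:
    """Row key for the hour containing the timestamp (integer division)."""
    return epoch_seconds // SECONDS_PER_HOUR


def bucket_start(bucket: int) -> int:
    """Epoch second at which the given hourly bucket begins."""
    return bucket * SECONDS_PER_HOUR


now = int(time.time())
key = hour_bucket(now)
print(f"timestamp {now} falls in bucket {key}, which starts at {bucket_start(key)}")
```

A key computed this way fits a LongType row key, and every timestamp within the same hour maps to the same row.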

Is this the correct data model thinking?

2012-02-27 Thread Blake Starkenburg
Using a user/member as an example I am curious which of the data models would be the best fit for performance and longevity of data in Cassandra? Consider the simple staples of user/member details like username,email,address,state,preferences,etc. Fairly simple, storing this data into a row key

sstable image/pic ?

2012-02-27 Thread Franc Carter
Hi, does anyone know of a picture/image that shows the layout of keys/columns/values in an sstable - I haven't been able to find one and am having a hard time visualising the layout from various descriptions and various overviews thanks -- *Franc Carter* | Systems architect | Sirca Ltd

Re: newer Cassandra + Hadoop = TimedOutException()

2012-02-27 Thread Patrik Modesto
Hi aaron, these are our current settings: <property><name>cassandra.range.batch.size</name><value>1024</value></property> <property><name>cassandra.input.split.size</name><value>16384</value></property> rpc_timeout_in_ms: 3 Regards, P. On Mon,