Re: Option for ordering columns by timestamp in CF

2012-10-12 Thread Ertio Lew
Make column timestamps optional? Kidding me, right? :) I do understand that this won't be possible, as then Cassandra won't be able to distinguish the latest among several copies of the same column. I don't mean that. I just want that, while ordering the columns, Cassandra (in an optional mode per CF)

Re: READ messages dropped

2012-10-12 Thread Tamar Fraenkel
Hi! Thanks for the response. My cluster has been in a bad state these recent days. I have 29 CFs, and my disk is 5% full... So I guess the VMs still have more space to go, and I am not sure this is considered many CFs. But maybe I have memory issues. I enlarged Cassandra memory from ~2G to ~4G

Re: unnecessary tombstone's transmission during repair process

2012-10-12 Thread Alexey Zotov
Sylvain, I've looked at the code. Yes, you are right about the local deletion time. But it contradicts the test results. Do you have any thoughts on how to explain the result of the second test after applying the patch? Our patch: diff --git a/src/java/org/apache/cassandra/db/DeletedColumn.java

Re: Cassandra nodes loaded unequally

2012-10-12 Thread Alexey Zotov
Hi Ben, I suggest you compare the number of queries for each node. Maybe the problem is on the client side. You can do that using JMX: org.apache.cassandra.db:type=ColumnFamilies,keyspace=YOUR KEYSPACE,columnfamily=YOUR CF,ReadCount org.apache.cassandra.db:type=ColumnFamilies,keyspace=YOUR

Re: cassandra 1.0.8 memory usage

2012-10-12 Thread Daniel Woo
Hi Rob, What version of Cassandra? What JVM? Are JNA and Jamm working? cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works. It sounds like the two nodes that are pathological right now have exhausted the perm gen with actual non-garbage, probably mostly the Bloom filters and

what is more important (RAM vs Cores)

2012-10-12 Thread Hagos, A.S.
Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a Quad-Core processor, or 6GB RAM and a Hexa-Core processor. Which one do you recommend, more RAM or a higher number of cores? Greetings, Ambes

Re: what is more important (RAM vs Cores)

2012-10-12 Thread wang liang
Hi Hagos, I think it depends on your business case. More RAM reduces latency and improves responsiveness; a higher number of cores increases the concurrency of your app. Thanks. On Fri, Oct 12, 2012 at 4:23 PM, Hagos, A.S. a.s.ha...@tue.nl wrote: Hi All, For one of my projects I want to buy a machine to

Re: what is more important (RAM vs Cores)

2012-10-12 Thread Romain HARDOUIN
Hi, Sure it depends... but IMHO 6 GB is suboptimal for big data because it means 1.5 GB or 2 GB for Cassandra. Maybe you could elaborate on your use case. Do you really want a one-node cluster? Cheers, Romain wang liang wla...@gmail.com wrote on 12/10/2012 10:36:15: Hi, Hagos, I think it

RE: what is more important (RAM vs Cores)

2012-10-12 Thread Hagos, A.S.
Hi there, My application uses Cassandra to store abstracted sensor data from a sensor network in a large building (up to 3000 sensors). For now I am starting with one node on one floor of the building; in the future it will definitely be a cluster. Some of the sensors have up to a 16 Hz sampling rate.
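
For a sense of scale, the figures in the message above give a rough upper bound on the write load. This is only a back-of-the-envelope sketch: it assumes one write per sample and that every one of the 3000 sensors runs at the maximum 16 Hz, whereas the message says only some of them do, so the real rate should be lower. The function names are made up for illustration.

```python
# Back-of-the-envelope peak write rate for the sensor network.
# Inputs are taken from the message; the "all sensors at 16 Hz" scenario
# is a deliberately pessimistic assumption, not a measured load.
SENSORS = 3000
MAX_SAMPLE_RATE_HZ = 16
SECONDS_PER_DAY = 24 * 60 * 60


def peak_writes_per_second(sensors: int, rate_hz: int) -> int:
    """Upper bound on samples per second, assuming one write per sample."""
    return sensors * rate_hz


def peak_writes_per_day(sensors: int, rate_hz: int) -> int:
    """Upper bound on samples per day under the same assumption."""
    return peak_writes_per_second(sensors, rate_hz) * SECONDS_PER_DAY


if __name__ == "__main__":
    print(peak_writes_per_second(SENSORS, MAX_SAMPLE_RATE_HZ))  # 48000
    print(peak_writes_per_day(SENSORS, MAX_SAMPLE_RATE_HZ))     # 4147200000
```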

RE: what is more important (RAM vs Cores)

2012-10-12 Thread Viktor Jevdokimov
IMO, in most cases you'll be limited by the RAM first. Take into account the size of the SSTables: you will need to keep Bloom filters and indexes in RAM, and if they do not fit, 4 cores or 24 cores doesn't matter, unless you're on SSD. You need to design first, stress test second, conclude last.
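
To see why Bloom filters alone can eat a noticeable share of RAM, here is a minimal sizing sketch using the textbook formula for an optimally configured Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n keys and false-positive rate p. The key count and false-positive rate below are hypothetical, and Cassandra's own filter sizing may differ from this idealized formula, so treat the result as an order-of-magnitude estimate only.

```python
import math


def bloom_bits_per_key(false_positive_rate: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_bytes(num_keys: int, false_positive_rate: float) -> float:
    """Approximate filter size in bytes for num_keys keys."""
    return num_keys * bloom_bits_per_key(false_positive_rate) / 8


if __name__ == "__main__":
    # Hypothetical workload: one billion row keys at a 1% false-positive rate.
    size_gib = bloom_bytes(1_000_000_000, 0.01) / 2**30
    print(f"about {size_gib:.2f} GiB of Bloom filter")
```

At roughly ten bits per key, a filter for a billion keys needs on the order of a gigabyte, which is a large slice of a 6 GB machine before indexes and caches are counted.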

Super columns and arrays

2012-10-12 Thread Thierry Templier
Hello, I wonder if it's possible to specify an array of values as a value of a super column... If it's not possible, is there another way to do that? Thanks very much for your help. Thierry

RE: what is more important (RAM vs Cores)

2012-10-12 Thread Tim Wintle
On Fri, 2012-10-12 at 10:20 +, Viktor Jevdokimov wrote: IMO, in most cases you'll be limited by the RAM first. +1 - I've seen our 8-core boxes limited by RAM and inter-rack networking, but not by CPU (yet). Tim

RE: what is more important (RAM vs Cores)

2012-10-12 Thread Romain HARDOUIN
Also, take I/O into account, since it is often a limiting factor.

RE: Super columns and arrays

2012-10-12 Thread Viktor Jevdokimov
struct SuperColumn { 1: required binary name, 2: required list<Column> columns, } Best regards, Viktor Jevdokimov

Re: unnecessary tombstone's transmission during repair process

2012-10-12 Thread Hiller, Dean
+1 I want to see how this plays out as well. Anyone know the answer? Dean From: Alexey Zotov azo...@griddynamics.com Reply-To: user@cassandra.apache.org Date: Friday,

read performance plumetted

2012-10-12 Thread Brian Tarbox
I have a two node cluster hosting a 45 gig dataset. I periodically have to read a high fraction (20% or so) of my 'rows', grabbing a few thousand at a time and then processing them. This used to result in about 300-500 reads a second which seemed quite good. Recently that number has plummeted

Re: Option for ordering columns by timestamp in CF

2012-10-12 Thread Derek Williams
You probably already know this, but I'm pretty sure it wouldn't be a trivial change, since to efficiently look up a column by name requires the columns to be ordered by name. A separate index would be needed in order to provide lookup by column name if the row was sorted by timestamp (which is the
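
The trade-off described above can be illustrated with a small, self-contained sketch. This is not Cassandra code: the row is just a Python list of hypothetical (name, value, timestamp) tuples. With name ordering, a binary search finds a column directly; with timestamp ordering, a separate name-to-position index has to be built and kept in sync.

```python
import bisect

# Hypothetical row: (column name, value, write timestamp).
columns = [
    ("email", "a@example.org", 30),
    ("age", "41", 10),
    ("city", "Vilnius", 20),
]

# Case 1: columns kept sorted by name, so lookup is a binary search.
by_name = sorted(columns, key=lambda c: c[0])
names = [c[0] for c in by_name]


def get_from_name_sorted(name):
    """Binary search over the name-ordered row; no extra index needed."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return by_name[i]
    return None


# Case 2: columns kept sorted by timestamp. Names are no longer ordered,
# so a separate name -> position index is required for direct lookup.
by_timestamp = sorted(columns, key=lambda c: c[2])
name_index = {c[0]: pos for pos, c in enumerate(by_timestamp)}


def get_from_timestamp_sorted(name):
    """Lookup through the auxiliary index, then into the time-ordered row."""
    pos = name_index.get(name)
    return None if pos is None else by_timestamp[pos]
```

In the second case every write that moves a column to a new position in the time order also has to update the index, which is the extra cost the message points at.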

Re: Repair Failing due to bad network

2012-10-12 Thread David Koblas
Jim, Great idea - though it doesn't look like it's in 1.1.3 (which is what I'm running). My lame idea of the morning is that I'm going to just read the whole keyspace with QUORUM reads to force read repairs - the unfortunate truth is that this is about 2B reads... --david On 10/11/12

Re: read performance plumetted

2012-10-12 Thread B. Todd Burruss
did the amount of data finally exceed your per machine RAM capacity? is it the same 20% each time you read? or do your periodic reads eventually work through the entire dataset? if you are essentially table scanning your data set, and the size exceeds available RAM, then a degradation like that

Re: Repair Failing due to bad network

2012-10-12 Thread Rob Coli
https://issues.apache.org/jira/browse/CASSANDRA-3483 Is directly on point for the use case in question, and introduces rebuild concept.. https://issues.apache.org/jira/browse/CASSANDRA-3487 https://issues.apache.org/jira/browse/CASSANDRA-3112 Are for improvements in repair sessions..

Re: READ messages dropped

2012-10-12 Thread Tyler Hobbs
On Fri, Oct 12, 2012 at 2:24 AM, Tamar Fraenkel ta...@tok-media.com wrote: Thanks for the response. My cluster has been in a bad state these recent days. I have 29 CFs, and my disk is 5% full... So I guess the VMs still have more space to go, and I am not sure this is considered many CFs. That's

Re: cassandra 1.0.8 memory usage

2012-10-12 Thread Tyler Hobbs
On Fri, Oct 12, 2012 at 3:26 AM, Daniel Woo daniel.y@gmail.com wrote: Disable swap for cassandra node I am gonna change swappiness to 20% Dead nodes are better than crippled nodes. I'll echo Rob's suggestion that you disable swap entirely. -- Tyler Hobbs DataStax http://datastax.com/

Re: Option for ordering columns by timestamp in CF

2012-10-12 Thread B. Todd Burruss
trying to think of a use case where you would want to order by timestamp, and also have unique column names for direct access. not really trying to challenge the use case, but you can get ordering by timestamp and still maintain a name for the column using composites. if the first component of
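
The composite approach mentioned above can be sketched without any Cassandra client. The message is cut off, so this assumes the suggestion is to make the timestamp the first component of the composite column name and the logical name the second. The sketch only models the resulting sort order with Python tuples; the helper names are hypothetical and no real driver API is used.

```python
# Model a row as a mapping from composite column name to value.
# Assumed composite layout: (timestamp, logical_name).
row = {}


def insert(timestamp: int, name: str, value: str) -> None:
    """Store a value under the composite column name (timestamp, name)."""
    row[(timestamp, name)] = value


def slice_by_time(start: int, end: int):
    """Return columns whose timestamp component lies in [start, end],
    in composite order: by timestamp first, then by name."""
    return [(key, value) for key, value in sorted(row.items())
            if start <= key[0] <= end]


insert(20, "city", "Vilnius")
insert(10, "age", "41")
insert(30, "email", "a@example.org")
insert(20, "alias", "vj")
```

Tuples compare component by component, which mirrors how a composite comparator orders columns. Note the cost: fetching a single column directly now requires knowing its timestamp as well as its name.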

Re: cassandra 1.0.8 memory usage

2012-10-12 Thread Rob Coli
On Fri, Oct 12, 2012 at 1:26 AM, Daniel Woo daniel.y@gmail.com wrote: What version of Cassandra? What JVM? Are JNA and Jamm working? cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works. The unusual aspect here is Sun JDK 1.7. Can you use 1.6 on an affected node and see if

RE: Read latency issue

2012-10-12 Thread Arindam Barua
We instrumented the Cassandra and Hector code, adding more logs to check where the time was being spent. We found the Cassandra read times to be very low, e.g. CassandraServer.getSlice() is only 3ms. However, on Hector's side, creating a ColumnFamilyTemplate<String, Composite>, and doing