Re: Option for ordering columns by timestamp in CF
"Make column timestamps optional" - kidding me, right? :) I do understand that this won't be possible, as then Cassandra won't be able to distinguish the latest among several copies of the same column. That isn't what I mean. I just want that, while ordering the columns, Cassandra (in an optional mode per CF) should not look at column names (they would still exist, but for retrieval purposes, not for ordering); instead, Cassandra would order the columns by looking at the timestamp values (timestamps would still exist!). So the change would just be to provide a mode in which Cassandra, while ordering, uses timestamps instead of column names. On Fri, Oct 12, 2012 at 2:26 AM, Tyler Hobbs ty...@datastax.com wrote: Without thinking too deeply about it, this is basically equivalent to disabling timestamps for a column family and using timestamps for column names, though in a very indirect (and potentially confusing) manner. So, if you want to open a ticket, I would suggest framing it as "make column timestamps optional". On Wed, Oct 10, 2012 at 4:44 AM, Ertio Lew ertio...@gmail.com wrote: I think Cassandra should provide a configurable option on a per-column-family basis to sort columns by timestamp rather than by column name. This would be really helpful for maintaining time-sorted columns without using up the column name as a timestamp, which might otherwise be used to store more meaningful column names useful for retrievals. Very frequently we need to store data sorted in time order, so I think this may be a very general requirement, not specific to my use case alone. Does it make sense to create an issue for this? On Fri, Mar 25, 2011 at 2:38 AM, aaron morton aa...@thelastpickle.com wrote: If you mean order by the column timestamp (as passed by the client), that is not possible. Can you use your own timestamps as the column name and store them as long values?
Aaron On 25 Mar 2011, at 09:30, Narendra Sharma wrote: Cassandra 0.7.4. Column names in my CF are of type byte[], but I want to order columns by timestamp. What is the best way to achieve this? Does it make sense for Cassandra to support ordering of columns by timestamp as an option for a column family, irrespective of the column name type? Thanks, Naren -- Tyler Hobbs DataStax http://datastax.com/
Re: READ messages dropped
Hi! Thanks for the response. My cluster is in a bad state these recent days. I have 29 CFs, and my disk is 5% full... So I guess the VMs still have more space to go, and I am not sure this is considered many CFs. But maybe I have memory issues. I enlarged Cassandra's memory from ~2G to ~4G (out of ~8G). This was done because at that stage I had lots of key caches. I then reduced them to almost 0 on all CFs. I guess now I can reduce the memory back to ~2 or ~3 G. Will that help? Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Thu, Oct 11, 2012 at 10:46 PM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Oct 10, 2012 at 3:10 PM, Tamar Fraenkel ta...@tok-media.com wrote: What I did notice while looking at the logs (the nodes are also running OpsCenter) is that there is some correlation between the dropped reads and flushes of OpsCenter column families to disk and/or compactions. What are the rollups CFs? Why is there so much traffic in them? The rollups CFs hold the performance metric data that OpsCenter stores about your cluster. Typically these aren't actually very high traffic column families, but that depends on how many column families you have (more CFs require more metrics to be stored). If you have a lot of column families, you have a couple of options for reducing the amount of metric data that's stored: http://www.datastax.com/docs/opscenter/trouble_shooting_opsc#limiting-the-metrics-collected-by-opscenter Assuming you don't have a large number of CFs, your nodes may legitimately be nearing capacity. -- Tyler Hobbs DataStax http://datastax.com/
Re: unnecessary tombstone's transmission during repair process
Sylvain, I've looked at the code. Yes, you're right about the local deletion time. But it contradicts the test results. Do you have any thoughts on how to explain the result of the second test after applying the patch? Our patch:

diff --git a/src/java/org/apache/cassandra/db/DeletedColumn.java b/src/java/org/apache/cassandra/db/DeletedColumn.java
index 18faeef..31744f6 100644
--- a/src/java/org/apache/cassandra/db/DeletedColumn.java
+++ b/src/java/org/apache/cassandra/db/DeletedColumn.java
@@ -17,10 +17,13 @@
  */
 package org.apache.cassandra.db;

+import java.io.IOException;
 import java.nio.ByteBuffer;
+import java.security.MessageDigest;

 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.marshal.MarshalException;
+import org.apache.cassandra.io.util.DataOutputBuffer;
 import org.apache.cassandra.utils.Allocator;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.HeapAllocator;
@@ -46,6 +49,25 @@ public class DeletedColumn extends Column
     }

     @Override
+    public void updateDigest(MessageDigest digest)
+    {
+        digest.update(name.duplicate());
+        // commented out to prevent consideration of the localDeletionTime in Merkle tree construction
+        //digest.update(value.duplicate());
+
+        DataOutputBuffer buffer = new DataOutputBuffer();
+        try
+        {
+            buffer.writeLong(timestamp);
+            buffer.writeByte(serializationFlags());
+        }
+        catch (IOException e)
+        {
+            throw new RuntimeException(e);
+        }
+        digest.update(buffer.getData(), 0, buffer.getLength());
+    }
+
+    @Override
     public long getMarkedForDeleteAt()
     {
         return timestamp;

--
Best regards,
Alexey Zotov
Grid Dynamics
Skype: azotcsit
Re: Cassandra nodes loaded unequally
Hi Ben, I suggest you compare the number of queries on each node. Maybe the problem is on the client side. You can do that using JMX:

org.apache.cassandra.db:type=ColumnFamilies,keyspace=<YOUR KEYSPACE>,columnfamily=<YOUR CF>,ReadCount
org.apache.cassandra.db:type=ColumnFamilies,keyspace=<YOUR KEYSPACE>,columnfamily=<YOUR CF>,WriteCount

Also I suggest checking the output of nodetool compactionstats. -- Alexey
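To make the suggestion above concrete, here is a small sketch of comparing per-node ReadCount values once they have been collected over the same time window. The node addresses and counts are invented for illustration, not taken from this thread:

```python
# Compare per-node read counts to spot client-side imbalance.
# In practice these numbers would come from the JMX ReadCount
# attribute mentioned above; the values here are made up.
node_read_counts = {
    "10.0.0.1": 1_200_000,
    "10.0.0.2": 1_180_000,
    "10.0.0.3": 310_000,  # far below the others: check client load balancing
}

mean = sum(node_read_counts.values()) / len(node_read_counts)
for node, count in sorted(node_read_counts.items()):
    skew = (count - mean) / mean * 100
    print(f"{node}: {count:,} reads ({skew:+.0f}% vs mean)")
```

A node far below the mean usually points at client-side connection pooling or token assignment rather than the node itself.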
Re: cassandra 1.0.8 memory usage
Hi Rob, "What version of Cassandra? What JVM? Are JNA and Jamm working?" Cassandra 1.0.8, Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works. "It sounds like the two nodes that are pathological right now have exhausted the perm gen with actual non-garbage, probably mostly the Bloom filters and the JMX MBeans." jmap shows that the perm gen is only 40% used. "Do you have a large number of ColumnFamilies? How large is the data stored per node?" I have very few column families, maybe 30-50. nodetool shows each node has a 5 GB load. "Disable swap for cassandra node" I am gonna change swappiness to 20%. Thanks, Daniel On Fri, Oct 12, 2012 at 2:02 AM, Rob Coli rc...@palominodb.com wrote: On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo daniel.y@gmail.com wrote: I am running a mini cluster with 6 nodes; recently we see very frequent ParNew GCs on two nodes. They take 200-800 ms on average, sometimes 5 seconds. You know, the ParNew GC is a stop-the-world GC, and our client throws SocketTimeoutException every 3 minutes. What version of Cassandra? What JVM? Are JNA and Jamm working? I checked the load; it seems well balanced, and the two nodes are running on the same hardware: 2 * 4-core Xeons with 16G RAM. We give Cassandra a 4G heap, including an 800MB young generation. We did not see any swap usage during the GC. Any idea about this? It sounds like the two nodes that are pathological right now have exhausted the perm gen with actual non-garbage, probably mostly the Bloom filters and the JMX MBeans. Then I took a heap dump; it shows that 5 instances of JmxMBeanServer hold 500MB of memory, and most of the referenced objects are JMX MBean related. It's kind of weird to me and looks like a memory leak. Do you have a large number of ColumnFamilies? How large is the data stored per node? =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb -- Thanks Regards, Daniel
what is more important (RAM vs Cores)
Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a quad-core processor, or 6GB RAM and a hexa-core processor. Which one do you recommend, big RAM or a high number of cores? greetings Ambes
Re: what is more important (RAM vs Cores)
Hi, Hagos, I think it depends on your business case. Big RAM reduces latency and improves responsiveness; a high number of cores increases the concurrency of your app. thanks. On Fri, Oct 12, 2012 at 4:23 PM, Hagos, A.S. a.s.ha...@tue.nl wrote: Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a quad-core processor, or 6GB RAM and a hexa-core processor. Which one do you recommend, big RAM or a high number of cores? greetings Ambes -- Best wishes, Helping others is to help myself.
Re: what is more important (RAM vs Cores)
Hi, Sure it depends... but IMHO 6 GB is suboptimal for big data because it means only 1.5 GB or 2 GB for Cassandra. Maybe you could elaborate on your use case. Do you really want a one-node cluster? cheers, Romain wang liang wla...@gmail.com wrote on 12/10/2012 10:36:15: Hi, Hagos, I think it depends on your business case. Big RAM reduces latency and improves responsiveness; a high number of cores increases the concurrency of your app. thanks. On Fri, Oct 12, 2012 at 4:23 PM, Hagos, A.S. a.s.ha...@tue.nl wrote: Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a quad-core processor, or 6GB RAM and a hexa-core processor. Which one do you recommend, big RAM or a high number of cores? greetings Ambes -- Best wishes, Helping others is to help myself.
RE: what is more important (RAM vs Cores)
Hi there, My application uses Cassandra to store abstracted sensor data from a sensor network in a large building (up to 3000 sensors). For now I am starting with one node on one floor of the building; in the future it will definitely be a cluster. Some of the sensors have up to a 16 Hz sampling rate. And now I want to decide whether I have to focus on big RAM or a large number of cores. greetings Ambes From: Romain HARDOUIN [romain.hardo...@urssaf.fr] Sent: Friday, October 12, 2012 10:57 AM To: user@cassandra.apache.org Subject: Re: what is more important (RAM vs Cores) Hi, Sure it depends... but IMHO 6 GB is suboptimal for big data because it means only 1.5 GB or 2 GB for Cassandra. Maybe you could elaborate on your use case. Do you really want a one-node cluster? cheers, Romain wang liang wla...@gmail.com wrote on 12/10/2012 10:36:15: Hi, Hagos, I think it depends on your business case. Big RAM reduces latency and improves responsiveness; a high number of cores increases the concurrency of your app. thanks. On Fri, Oct 12, 2012 at 4:23 PM, Hagos, A.S. a.s.ha...@tue.nl wrote: Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a quad-core processor, or 6GB RAM and a hexa-core processor. Which one do you recommend, big RAM or a high number of cores? greetings Ambes -- Best wishes, Helping others is to help myself.
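A quick back-of-envelope calculation may help frame the RAM-vs-cores question. The sensor count and sampling rate are from the message above; the per-sample size is my assumption:

```python
# Worst-case write load if every sensor samples at 16 Hz.
sensors = 3000
sample_rate_hz = 16
bytes_per_sample = 100  # assumed on-disk size including overhead

writes_per_sec = sensors * sample_rate_hz
bytes_per_day = writes_per_sec * bytes_per_sample * 86_400

print(f"{writes_per_sec:,} writes/sec")            # 48,000 writes/sec
print(f"~{bytes_per_day / 1e9:.0f} GB/day (raw)")  # ~415 GB/day at 100 B/sample
```

At tens of thousands of writes per second, both write throughput (cores) and memtable/cache headroom (RAM) matter, which is why the list keeps answering "it depends".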
RE: what is more important (RAM vs Cores)
IMO, in most cases you'll be limited by RAM first. Take into account the size of your sstables: you will need to keep bloom filters and indexes in RAM, and if they don't fit, 4 cores or 24 cores doesn't matter, unless you're on SSDs. You need to design first, stress test second, and conclude last. Best regards / Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063 Fax: +370 5 261 0453 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies. -Original Message- From: Hagos, A.S. [mailto:a.s.ha...@tue.nl] Sent: Friday, October 12, 2012 12:17 To: user@cassandra.apache.org Subject: RE: what is more important (RAM vs Cores) Hi there, My application uses Cassandra to store abstracted sensor data from a sensor network in a large building (up to 3000 sensors). For now I am starting with one node on one floor of the building; in the future it will definitely be a cluster. Some of the sensors have up to a 16 Hz sampling rate. And now I want to decide whether I have to focus on big RAM or a large number of cores. greetings Ambes From: Romain HARDOUIN [romain.hardo...@urssaf.fr] Sent: Friday, October 12, 2012 10:57 AM To: user@cassandra.apache.org Subject: Re: what is more important (RAM vs Cores) Hi, Sure it depends... but IMHO 6 GB is suboptimal for big data because it means only 1.5 GB or 2 GB for Cassandra. Maybe you could elaborate on your use case. Do you really want a one-node cluster?
cheers, Romain wang liang wla...@gmail.com wrote on 12/10/2012 10:36:15: Hi, Hagos, I think it depends on your business case. Big RAM reduces latency and improves responsiveness; a high number of cores increases the concurrency of your app. thanks. On Fri, Oct 12, 2012 at 4:23 PM, Hagos, A.S. a.s.ha...@tue.nl wrote: Hi All, For one of my projects I want to buy a machine to host a Cassandra database. The options I am offered are machines with 16GB RAM and a quad-core processor, or 6GB RAM and a hexa-core processor. Which one do you recommend, big RAM or a high number of cores? greetings Ambes -- Best wishes, Helping others is to help myself.
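The point above about bloom filters fitting in RAM can be sized with the standard bloom filter formula, bits = -n * ln(p) / ln(2)^2. The key count and false-positive rate below are assumptions for illustration, not numbers from this thread:

```python
import math

keys_per_node = 1_000_000_000  # assumed number of row keys on one node
fp_rate = 0.01                 # assumed bloom filter false-positive rate

# Standard bloom filter sizing: bits = -n * ln(p) / ln(2)^2
bits = -keys_per_node * math.log(fp_rate) / (math.log(2) ** 2)
gib = bits / 8 / 1024**3
print(f"~{gib:.1f} GiB of bloom filter per node")  # ~1.1 GiB
```

With a billion keys per node, the bloom filters alone consume a noticeable slice of a 6 GB machine, before indexes and caches are even counted.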
Super columns and arrays
Hello, I wonder if it's possible to specify an array of values as a value of a super column... If it's not possible, is there another way to do that? Thanks very much for your help. Thierry
RE: what is more important (RAM vs Cores)
On Fri, 2012-10-12 at 10:20 +, Viktor Jevdokimov wrote: IMO, in most cases you'll be limited by the RAM first. +1 - I've seen our 8-core boxes limited by RAM and inter-rack networking, but not by CPU (yet). Tim
RE: what is more important (RAM vs Cores)
Also, take into account I/O, since it is often the limiting factor.
RE: Super columns and arrays
struct SuperColumn {
    1: required binary name,
    2: required list<Column> columns,
}

Best regards / Pagarbiai Viktor Jevdokimov Senior Developer Email: viktor.jevdoki...@adform.com Phone: +370 5 212 3063 Fax: +370 5 261 0453 J. Jasinskio 16C, LT-01112 Vilnius, Lithuania -Original Message- From: Thierry Templier [mailto:thierry.templ...@restlet.com] Sent: Friday, October 12, 2012 13:44 To: user@cassandra.apache.org Subject: Super columns and arrays Hello, I wonder if it's possible to specify an array of values as a value of a super column... If it's not possible, is there another way to do that? Thanks very much for your help. Thierry
Re: unnecessary tombstone's transmission during repair process
+1 I want to see how this plays out as well. Anyone know the answer? Dean From: Alexey Zotov azo...@griddynamics.com Reply-To: user@cassandra.apache.org Date: Friday, October 12, 2012 1:33 AM To: user@cassandra.apache.org Subject: Re: unnecessary tombstone's transmission during repair process

diff --git a/src/java/org/apache/cassandra/db/DeletedColumn.java b/src/java/org/apache/cassandra/db/DeletedColumn.java
index 18faeef..31744f6 100644
--- a/src/java/org/apache/cassandra/db/DeletedColumn.java
+++ b/src/java/org/apache/cassandra/db/DeletedColumn.java
@@ -17,10 +17,13 @@
  */
 package org.apache.cassandra.db;

+import java.io.IOException;
 import java.nio.ByteBuffer;
+import java.security.MessageDigest;

 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.marshal.MarshalException;
+import org.apache.cassandra.io.util.DataOutputBuffer;
 import org.apache.cassandra.utils.Allocator;
read performance plumetted
I have a two-node cluster hosting a 45 gig dataset. I periodically have to read a high fraction (20% or so) of my 'rows', grabbing a few thousand at a time and then processing them. This used to result in about 300-500 reads a second, which seemed quite good. Recently that number has plummeted to 20-50 reads a second. The obvious question is: what did I change? I certainly added more data, bringing my total load from around 38 gig to around 45 gig, but it's hard to imagine that causing this problem. The shape of my data has not changed and I haven't changed any Cassandra configuration. Running nodetool tpstats, I'm for the first time ever seeing entries under ReadStage Active and Pending, which correlates with slow reads. Running iostat, I'm seeing significant (10-50%) iowait where I previously never saw higher than 1-2%. I ran a full compaction on the relevant CF (which took 3.5 hours) to no avail. Any suggestions on where I can look next? Thanks.
Re: Option for ordering columns by timestamp in CF
You probably already know this, but I'm pretty sure it wouldn't be a trivial change, since efficiently looking up a column by name requires the columns to be ordered by name. A separate index would be needed to provide lookup by column name if the row were sorted by timestamp (which is the way Redis implements its sorted sets). On Fri, Oct 12, 2012 at 12:13 AM, Ertio Lew ertio...@gmail.com wrote: "Make column timestamps optional" - kidding me, right? :) I do understand that this won't be possible, as then Cassandra won't be able to distinguish the latest among several copies of the same column. That isn't what I mean. I just want that, while ordering the columns, Cassandra (in an optional mode per CF) should not look at column names (they would still exist, but for retrieval purposes, not for ordering); instead, Cassandra would order the columns by looking at the timestamp values (timestamps would still exist!). So the change would just be to provide a mode in which Cassandra, while ordering, uses timestamps instead of column names. On Fri, Oct 12, 2012 at 2:26 AM, Tyler Hobbs ty...@datastax.com wrote: Without thinking too deeply about it, this is basically equivalent to disabling timestamps for a column family and using timestamps for column names, though in a very indirect (and potentially confusing) manner. So, if you want to open a ticket, I would suggest framing it as "make column timestamps optional". On Wed, Oct 10, 2012 at 4:44 AM, Ertio Lew ertio...@gmail.com wrote: I think Cassandra should provide a configurable option on a per-column-family basis to sort columns by timestamp rather than by column name. This would be really helpful for maintaining time-sorted columns without using up the column name as a timestamp, which might otherwise be used to store more meaningful column names useful for retrievals. Very frequently we need to store data sorted in time order, so I think this may be a very general requirement, not specific to my use case alone.
Does it make sense to create an issue for this? On Fri, Mar 25, 2011 at 2:38 AM, aaron morton aa...@thelastpickle.com wrote: If you mean order by the column timestamp (as passed by the client), that is not possible. Can you use your own timestamps as the column name and store them as long values? Aaron On 25 Mar 2011, at 09:30, Narendra Sharma wrote: Cassandra 0.7.4. Column names in my CF are of type byte[], but I want to order columns by timestamp. What is the best way to achieve this? Does it make sense for Cassandra to support ordering of columns by timestamp as an option for a column family, irrespective of the column name type? Thanks, Naren -- Tyler Hobbs DataStax http://datastax.com/ -- Derek Williams
Re: Repair Failing due to bad network
Jim, Great idea - though it doesn't look like it's in 1.1.3 (which is what I'm running). My lame idea of the morning is that I'm going to just read the whole keyspace with QUORUM reads to force read repairs - the unfortunate truth is that this is about 2B reads... --david On 10/11/12 4:51 PM, Jim Cistaro wrote: I am not aware of any built-in mechanism for retrying repairs. I believe you will have to build that into your process. As for reducing the time of each repair command to fit in your windows: if you have multiple reasonably sized column families, and are not already doing this, one approach might be to do repairs on a per-CF basis. This will break your repairs up into smaller chunks that might fit in the window. If you are not using -pr (primary range), using that on each node causes the repair command to only repair the primary range on the node (not the ranges it is replicating). Depending on your version, there is also https://issues.apache.org/jira/browse/CASSANDRA-3912 which might help you - but I have no experience using this feature. jc On 10/11/12 4:09 PM, David Koblas da...@koblas.com wrote: I'm trying to bring up a new datacenter - while I probably could have brought things up another way, I've now got a DC that has a ready Cassandra with keys allocated. The problem is that I cannot get a repair to complete, since it appears that some part of my network decides to restart all connections twice a day (6am and 2pm - ok, 5 minutes before). So when I start a repair job, it usually gets a ways into things before one of the nodes goes DOWN, then comes back up. What I don't see is the repair restarting; it just stops. Is there a workaround for this case, or is there something else I could be doing? --david
Re: read performance plumetted
Did the amount of data finally exceed your per-machine RAM capacity? Is it the same 20% each time you read, or do your periodic reads eventually work through the entire dataset? If you are essentially table-scanning your data set, and its size exceeds available RAM, then a degradation like that isn't crazy, and this is indicated by your iowait %. On Fri, Oct 12, 2012 at 6:33 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a two-node cluster hosting a 45 gig dataset. I periodically have to read a high fraction (20% or so) of my 'rows', grabbing a few thousand at a time and then processing them. This used to result in about 300-500 reads a second, which seemed quite good. Recently that number has plummeted to 20-50 reads a second. The obvious question is: what did I change? I certainly added more data, bringing my total load from around 38 gig to around 45 gig, but it's hard to imagine that causing this problem. The shape of my data has not changed and I haven't changed any Cassandra configuration. Running nodetool tpstats, I'm for the first time ever seeing entries under ReadStage Active and Pending, which correlates with slow reads. Running iostat, I'm seeing significant (10-50%) iowait where I previously never saw higher than 1-2%. I ran a full compaction on the relevant CF (which took 3.5 hours) to no avail. Any suggestions on where I can look next? Thanks.
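One way to sanity-check this explanation with numbers: the dataset and node count are from the thread, but the RAM, heap, and replication figures below are my assumptions, since they aren't stated:

```python
# Rough page-cache budget for one of the two nodes.
total_data_gb = 45        # from the thread
nodes = 2                 # from the thread
replication_factor = 1    # assumed
ram_gb = 8                # assumed machine RAM
heap_gb = 4               # assumed Cassandra heap

data_per_node_gb = total_data_gb / nodes * replication_factor
page_cache_gb = ram_gb - heap_gb - 1  # ~1 GB reserved for OS and other processes

print(f"~{data_per_node_gb:.1f} GB data vs ~{page_cache_gb} GB page cache per node")
```

Once the hot fraction of the data no longer fits in the page cache, reads fall off a cliff much as described, and iowait climbs with each cache miss.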
Re: Repair Failing due to bad network
https://issues.apache.org/jira/browse/CASSANDRA-3483 is directly on point for the use case in question, and introduces the rebuild concept. https://issues.apache.org/jira/browse/CASSANDRA-3487 and https://issues.apache.org/jira/browse/CASSANDRA-3112 are for improvements in repair sessions. https://issues.apache.org/jira/browse/CASSANDRA-4767 is for unambiguous indication of repair session status. =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
Re: READ messages dropped
On Fri, Oct 12, 2012 at 2:24 AM, Tamar Fraenkel ta...@tok-media.com wrote: Thanks for the response. My cluster is in a bad state those recent days. I have 29 CFs, and my disk is 5% full... So I guess the VMs still have more space to go, and I am not sure this is considered many CFs. That's not too many CFs. I don't know how much 5% of your disk space is in absolute numbers, which is more important. The most important measure for whether you are approaching limits is really disk utilization (as in how busy the disk is, not how much data it's holding). OpsCenter exposes metrics for this that you should check. But maybe I have memory issues. I enlarge cassandra memory from about ~2G to ~4G (out of ~8G). This was done because at that stage I had lots of key caches. I then reduced them to almost 0 on all CF. I guess now I can reduce the memory back to ~2 or ~3 G. Will that help? I would leave your heap at 4G. You really do want key caching enabled in almost all circumstances; it can save you a lot of disk activity on reads. If you need to bump your heap up to 4.5G to accommodate key caches, it's worth it. -- Tyler Hobbs DataStax http://datastax.com/
Re: cassandra 1.0.8 memory usage
On Fri, Oct 12, 2012 at 3:26 AM, Daniel Woo daniel.y@gmail.com wrote: Disable swap for cassandra node I am gonna change swappiness to 20% Dead nodes are better than crippled nodes. I'll echo Rob's suggestion that you disable swap entirely. -- Tyler Hobbs DataStax http://datastax.com/
Re: Option for ordering columns by timestamp in CF
Trying to think of a use case where you would want to order by timestamp and also have unique column names for direct access. Not really trying to challenge the use case, but you can get ordering by timestamp and still maintain a name for the column using composites. If the first component of the composite is a timestamp, then you can order on it. When retrieved, you could have a name in the second component, and can even have duplicate names as long as the timestamp is unique (use TimeUUID). On Fri, Oct 12, 2012 at 7:20 AM, Derek Williams de...@fyrie.net wrote: You probably already know this, but I'm pretty sure it wouldn't be a trivial change, since efficiently looking up a column by name requires the columns to be ordered by name. A separate index would be needed to provide lookup by column name if the row were sorted by timestamp (which is the way Redis implements its sorted sets). On Fri, Oct 12, 2012 at 12:13 AM, Ertio Lew ertio...@gmail.com wrote: "Make column timestamps optional" - kidding me, right? :) I do understand that this won't be possible, as then Cassandra won't be able to distinguish the latest among several copies of the same column. That isn't what I mean. I just want that, while ordering the columns, Cassandra (in an optional mode per CF) should not look at column names (they would still exist, but for retrieval purposes, not for ordering); instead, Cassandra would order the columns by looking at the timestamp values (timestamps would still exist!). So the change would just be to provide a mode in which Cassandra, while ordering, uses timestamps instead of column names. On Fri, Oct 12, 2012 at 2:26 AM, Tyler Hobbs ty...@datastax.com wrote: Without thinking too deeply about it, this is basically equivalent to disabling timestamps for a column family and using timestamps for column names, though in a very indirect (and potentially confusing) manner. So, if you want to open a ticket, I would suggest framing it as "make column timestamps optional".
On Wed, Oct 10, 2012 at 4:44 AM, Ertio Lew ertio...@gmail.com wrote: I think Cassandra should provide a configurable option on a per-column-family basis to sort columns by timestamp rather than by column name. This would be really helpful for maintaining time-sorted columns without using up the column name as a timestamp, which might otherwise be used to store more meaningful column names useful for retrievals. Very frequently we need to store data sorted in time order, so I think this may be a very general requirement, not specific to my use case alone. Does it make sense to create an issue for this? On Fri, Mar 25, 2011 at 2:38 AM, aaron morton aa...@thelastpickle.com wrote: If you mean order by the column timestamp (as passed by the client), that is not possible. Can you use your own timestamps as the column name and store them as long values? Aaron On 25 Mar 2011, at 09:30, Narendra Sharma wrote: Cassandra 0.7.4. Column names in my CF are of type byte[], but I want to order columns by timestamp. What is the best way to achieve this? Does it make sense for Cassandra to support ordering of columns by timestamp as an option for a column family, irrespective of the column name type? Thanks, Naren -- Tyler Hobbs DataStax -- Derek Williams
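A tiny sketch of the composite-ordering idea discussed above: if the first component is the timestamp (a TimeUUID in practice) and the second is the name, columns sort by time while names remain addressable. Plain Python tuples stand in for Cassandra's composite comparator here; the columns are invented sample data:

```python
# Composite column names as (timestamp, name) tuples; Cassandra's
# CompositeType compares component by component, just like tuple sorting.
columns = [
    ((1350000300, "login"), "v3"),
    ((1350000100, "click"), "v1"),
    ((1350000200, "purchase"), "v2"),
]

ordered = sorted(columns)  # orders by timestamp first, then by name
for (ts, name), value in ordered:
    print(ts, name, value)
```

The names in the second component can even repeat, as long as the timestamps differ, which TimeUUIDs effectively guarantee.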
Re: cassandra 1.0.8 memory usage
On Fri, Oct 12, 2012 at 1:26 AM, Daniel Woo daniel.y@gmail.com wrote: What version of Cassandra? What JVM? Are JNA and Jamm working? cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works. The unusual aspect here is Sun JDK 1.7. Can you use 1.6 on an affected node and see if the problem disappears? https://issues.apache.org/jira/browse/CASSANDRA-4571 Exists in 1.1.x (not your case) and is for leaking descriptors and not memory, but affects both 1.6 and 1.7. JMAP shows that the per gen is only 40% used. What is the usage of the other gens? I have very few column families, maybe 30-50. The nodetool shows each node has 5 GB load. Most of your heap being consumed by 30-50 columnfamilies MBeans seems excessive. Disable swap for cassandra node I am gonna change swappiness to 20% Even setting swappiness to 0% does not prevent the kernel from swapping if swap is defined/enabled. I re-iterate my suggestion that you de-define/disable swap on any node running Cassandra. :) =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
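To follow the advice above in a way that survives reboot, swap has to be removed from /etc/fstab as well as turned off with a one-time swapoff -a. A sketch of the fstab edit in Python, applied to a sample string rather than the real file (the UUIDs are invented):

```python
# Comment out swap entries in fstab-style content so swap stays
# disabled after reboot (pair with a one-time `swapoff -a`).
sample_fstab = (
    "UUID=abcd / ext4 defaults 0 1\n"
    "UUID=ef01 none swap sw 0 0\n"
)

fixed = []
for line in sample_fstab.splitlines():
    if " swap " in line and not line.startswith("#"):
        line = "#" + line  # disable this mount entry
    fixed.append(line)

print("\n".join(fixed))
```

On a real node you would of course edit /etc/fstab itself (after taking a backup), not a sample string.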
RE: Read latency issue
We instrumented the Cassandra and Hector code, adding more logs to check where the time was being spent. We found the Cassandra read times to be very low, e.g. CassandraServer.getSlice() is only 3ms. However, on Hector's side, creating a ColumnFamilyTemplate<String, Composite> and doing queryColumns() on it takes 90ms. Looking at the breakup on the Hector side, it appears ExecutionResult.execute takes ~30ms, and ColumnFamilyResultWrapper takes ~47ms. (We are reading around 800 composite columns of 1000 bytes each.) Any idea if this is the expected time to process stuff on Hector/other clients? Btw, using Hector's SliceQuery() and reading into a List, or Astyanax, seems to result in similar times too. Thanks, Arindam -Original Message- From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Wednesday, October 03, 2012 10:54 AM To: user@cassandra.apache.org Subject: RE: Read latency issue Thanks for your responses. Just to be clear, our table declaration looks something like this:

CREATE TABLE sessionevents (
    atag text,
    col2 uuid,
    col3 text,
    col4 uuid,
    col5 text,
    col6 text,
    col7 blob,
    col8 text,
    col9 timestamp,
    col10 uuid,
    col11 int,
    col12 uuid,
    PRIMARY KEY (atag, col2, col3, col4)
)

My understanding was that the (full) row key in this case would be the 'atag' values. The column names would then be composites like (col2_value:col3_value:col4_value:col5), (col2_value:col3_value:col4_value:col6), (col2_value:col3_value:col4_value:col7) ... (col2_value:col3_value:col4_value:col12). The columns would be sorted first by col2 values, then by col3 values, etc. Hence, with a query like select * from sessionevents where atag=foo, we are specifying the entire row key, and Cassandra would return all the columns for that row. "Using read consistency of ONE reduces the read latency by ~20ms, compared to using QUORUM. It would only have read from the local node. (I think, may be confusing secondary index reads here.)"
For read consistency ONE, reading only from one node is my expectation as well, and hence I'm seeing the reduced read latency compared to read consistency QUORUM. Does that not sound right? Btw, with read consistency ONE, we found the read only happens from one node, but not necessarily the local node, even if the data is present on the local node. To check this, we turned on DEBUG logs on all the Cassandra hosts in the ring. We are using replication factor=3 on a 4-node ring, hence mostly the data is present locally. However, we noticed that the coordinator host, on receiving the same request multiple times (i.e. with the same row key), would sometimes return the data locally, but sometimes would contact another host in the ring to fetch the data. Thanks, Arindam -Original Message- From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Wednesday, October 03, 2012 12:32 AM To: user@cassandra.apache.org Subject: Re: Read latency issue "Running a query like select * from table_name where atag=foo, where 'atag' is the first column of the composite key, from either JDBC or Hector (equivalent code), results in read times of 200-300ms from a remote host on the same network." If you send a query to select columns from a row and do not fully specify the row key, Cassandra has to do a row scan. If you want fast performance, specify the full row key. "Using read consistency of ONE reduces the read latency by ~20ms, compared to using QUORUM." It would only have read from the local node. (I think, may be confusing secondary index reads here.) Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 3/10/2012, at 2:17 AM, Roshni Rajagopal roshni_rajago...@hotmail.com wrote: Arindam, Did you also try the cassandra stress tool to compare results?
I haven't done a performance test as yet; the only ones published on the internet are of YCSB on an older version of Apache Cassandra, and it doesn't seem to be actively supported or updated: http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf. The numbers you have sound very low for a read of a row by key, which should have been the fastest. I hope someone can help investigate or share numbers from their tests. Regards, Roshni From: dean.hil...@nrel.gov To: user@cassandra.apache.org Date: Tue, 2 Oct 2012 06:41:09 -0600 Subject: Re: Read latency issue Interesting results. With PlayOrm, we did a 6-node test of reading 100 rows from 1,000,000 using PlayOrm Scalable SQL. It only took 60ms. Maybe we have better hardware, though? We are using 7200 RPM drives, so nothing fancy on the disk side of things. More nodes gives higher throughput, though, as reading from more disks will be faster. Anyways, you may want to play with more nodes and re-run. If you run a test with PlayOrm, I would love to know the results there as well. Later, Dean From: Arindam Barua