Re: Amazingly bad compaction performance
Hello. Too much GC? Check JVM heap settings and real usage.

On 06/27/2012 01:37 AM, Dustin Wenz wrote:

We occasionally see fairly poor compaction performance on random nodes in our 7-node cluster, and I have no idea why. This is one example from the log:

[CompactionExecutor:45] 2012-06-26 13:40:18,721 CompactionTask.java (line 221) Compacted to [/raid00/cassandra_data/main/basic/main-basic.basic_id_index-hd-160-Data.db,]. 26,632,210 to 26,679,667 (~100% of original) bytes for 2 keys at 0.006250MB/s. Time: 4,071,163ms.

That particular event took over an hour to compact only 25 megabytes. During that time, there was very little disk IO, and the java process (OpenJDK 7) was pegged at 200% CPU. The node was also completely unresponsive to network requests until the compaction was finished. Most compactions run just over 7MB/s. This is an extreme outlier, but users definitely notice the hit when it occurs.

I grabbed a sample of the process using jstack, and this was the only thread in CompactionExecutor:

CompactionExecutor:54 daemon prio=1 tid=41247522816 nid=0x99a5ff740 runnable [140737253617664]
   java.lang.Thread.State: RUNNABLE
        at org.xerial.snappy.SnappyNative.rawCompress(Native Method)
        at org.xerial.snappy.Snappy.rawCompress(Snappy.java:358)
        at org.apache.cassandra.io.compress.SnappyCompressor.compress(SnappyCompressor.java:80)
        at org.apache.cassandra.io.compress.CompressedSequentialWriter.flushData(CompressedSequentialWriter.java:89)
        at org.apache.cassandra.io.util.SequentialWriter.flushInternal(SequentialWriter.java:196)
        at org.apache.cassandra.io.util.SequentialWriter.reBuffer(SequentialWriter.java:260)
        at org.apache.cassandra.io.util.SequentialWriter.writeAtMost(SequentialWriter.java:128)
        at org.apache.cassandra.io.util.SequentialWriter.write(SequentialWriter.java:112)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        - locked 36527862064 (a java.io.DataOutputStream)
        at org.apache.cassandra.db.compaction.PrecompactedRow.write(PrecompactedRow.java:142)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:156)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
        at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

Is it possible that there is an issue with snappy compression? Based on the lousy compression ratio, I think we could get by without it just fine. Can compression be changed or disabled on-the-fly with Cassandra?

- .Dustin
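On the "can compression be changed on-the-fly" question: compression is a per-column-family schema setting and can be altered live, so here is a minimal, hedged sketch assuming the 1.1-era cassandra-cli option names (verify them against your version before running in production); existing SSTables keep their old format until they are rewritten.

# cassandra-cli sketch - drop SSTable compression on the column family from the log above
[default@main] UPDATE COLUMN FAMILY basic WITH compression_options = null;

# ...or switch algorithms instead of disabling it outright
[default@main] UPDATE COLUMN FAMILY basic WITH compression_options = {sstable_compression: DeflateCompressor, chunk_length_kb: 64};

# rewrite data already on disk so the change actually applies to existing SSTables
nodetool -h <node> upgradesstables main basic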
Re: Amazingly bad compaction performance
Last I heard only Oracle's JDK was officially supported with Cassandra, possibly nitpicky but is this still the case? On Jun 26, 2012, at 3:37 PM, Dustin Wenz wrote: (OpenJDK 7) was pegged at 200% CPU
Re: bulk load problem
What are your yaml settings for the rpc and listen addresses on the destination node?

Nury

Tue, 26 Jun 2012 17:07:49 -0700 from James Pirz james.p...@gmail.com:

Dear all, I am trying to use sstableloader in cassandra 1.1.1 to bulk load some data into a single node cluster. I am running the following command: bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/ from another node (other than the node on which cassandra is running), while the data should be loaded into a keyspace named tpch. I made sure that the 2nd node, from which I run sstableloader, has the same copy of cassandra.yaml as the destination node. I have put tpch-cf0-hd-1-Data.db and tpch-cf0-hd-1-Index.db under the path I have passed to sstableloader. But I am getting the following error: Could not retrieve endpoint ranges: Any hint? Thanks in advance, James
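For reference, a hedged sketch of the two cassandra.yaml settings the reply asks about, on the destination node (the IP matches the one passed to -d above; per the resolution later in this thread, the rpc_address turned out to be the culprit, so setting both to an address reachable from the loader machine is the safe default):

# cassandra.yaml on the destination node (192.168.100.1)
listen_address: 192.168.100.1   # inter-node gossip and streaming (port 7000)
rpc_address: 192.168.100.1      # Thrift clients and the loader's endpoint/range discovery (port 9160)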
Re: How to use row caching to enable faster retrieval of rows in Cassandra
You should read this page for the settings: http://www.datastax.com/dev/blog/caching-in-cassandra-1-1 Basically, you have to set the row cache size in the yaml and create the table with a caching flag - all or rows_only - but beware that you shouldn't use the row cache if you are using wide rows.

Tue, 26 Jun 2012 02:35:32 -0500 from Prakrati Agrawal prakrati.agra...@mu-sigma.com:

Dear all, I am trying to understand whether I can speed up the retrieval process using the cache. Please can you help me write the code for setting the cache properties in Cassandra. Please help. Thanks and Regards, Prakrati
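A hedged sketch of what those settings look like under the 1.1 caching model described in the linked post (the column family name users is hypothetical, and the exact syntax is worth verifying against your version):

# cassandra.yaml - global cache capacities (0 disables the row cache)
row_cache_size_in_mb: 200
key_cache_size_in_mb: 100

# cassandra-cli - per column family caching flag: none, keys_only, rows_only or all
[default@MyKeyspace] UPDATE COLUMN FAMILY users WITH caching = 'rows_only';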
Re: Interpreting system.log MeteredFlusher messages
, but I do not understand the remedy to the problem. Is increasing this variable my only option?

There is nothing to be fixed. This is Cassandra flushing data to disk to free memory and checkpoint the commit log.

I see memtables of serialized size of 100-200 MB with estimated live size of 500 MB get flushed to produce sstables of around 10-15 MB sizes. Are these factors of 10-20 between serialized on disk and memory and 3-5 for liveRatio expected?

Do you have some log messages for this? The elevated estimated size may be due to a lot of overwrites.

Since the formula is "CF Count + Secondary Index Count + memtable_flush_queue_size (defaults to 4) + memtable_flush_writers (defaults to 1 per data directory)" memtables in memory in the JVM at once, shouldn't the limit be 6 (and not 7) memtables in memory?

It's 7 because https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/MeteredFlusher.java#L51

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/06/2012, at 4:41 AM, rohit bhatia wrote:

Hi, We have 8 cassandra 1.0.5 nodes with 16 cores and 32G ram. Heap size is 12G, memtable_total_space_in_mb is one third = 4G. There are 12 Hot CFs (write-read ratio of 10). memtable_flush_queue_size = 4 and memtable_flush_writers = 2.

I got this log entry: MeteredFlusher.java (line 74) estimated 423318 bytes used by all memtables pre-flush, following which cassandra flushed several of its largest memtables. I understand that this message is due to the memtable_total_space_in_mb setting being reached, but I do not understand the remedy to the problem. Is increasing this variable my only option?

Also, in standard MeteredFlusher flushes (the ones that trigger due to the "if my entire flush pipeline were full of memtables of this size, how big could I allow them to be" logic), I see memtables of serialized size of 100-200 MB with estimated live size of 500 MB get flushed to produce sstables of around 10-15 MB sizes. Are these factors of 10-20 between serialized on disk and memory and 3-5 for liveRatio expected?

Also, this very informative article http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ has this to say: "For example if memtable_total_space_in_mb is 100MB, and memtable_flush_writers is the default 1 (with one data directory), and memtable_flush_queue_size is the default 4, and a Column Family has no secondary indexes. The CF will not be allowed to get above one seventh of 100MB or 14MB, as if the CF filled the flush pipeline with 7 memtables of this size it would take 98MB."

Since the formula is "CF Count + Secondary Index Count + memtable_flush_queue_size (defaults to 4) + memtable_flush_writers (defaults to 1 per data directory)" memtables in memory in the JVM at once, shouldn't the limit be 6 (and not 7) memtables in memory?

Thanks
Rohit
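To make the arithmetic in that article concrete, here is a hedged worked example; the extra "+1" slot is my reading of the MeteredFlusher line Aaron links to, and the second calculation plugging in Rohit's settings is an estimate only:

per-CF ceiling ~= memtable_total_space_in_mb / (1 for the CF itself + secondary index count + memtable_flush_queue_size + memtable_flush_writers + 1 extra slot from the code)

Article's example:  100 MB / (1 + 0 + 4 + 1) = 100 / 6 by the quoted formula, but with the extra slot 100 / 7 ~= 14 MB
Rohit's settings:   4096 MB / (1 + 0 + 4 + 2 + 1) = 4096 / 8 = 512 MB per hot CF, if the same extra slot applies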
Re: Fat Client Commit Log
Fat clients aren't involved in writes or HH, and I think my previous thought about it having some info in the System KS may be wrong. Can you recreate the issue? Care to raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA ?

Thanks
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/06/2012, at 4:47 AM, Frank Ng wrote:

There are files that are recently created. One fat client node is storing 2.5GB of commit log files. The other 2 nodes are storing around 20MB of files. Question: would the commit log data be related to hinted handoff info in the system CF? In other words, are hinted handoffs being stored in the fat clients? thanks

On Sun, Jun 24, 2012 at 2:58 PM, aaron morton aa...@thelastpickle.com wrote:

The fat client would still have some information in the system CF. Are the files big? Are they continually created? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 23/06/2012, at 8:07 AM, Frank Ng wrote:

Hi All, We are using the Fat Client and notice that there are files written to the commit log directory on the Fat Client. Does anyone know what these files are storing? Is this hinted handoff data? The Fat Client has no files in the data directory, as expected. thanks
Re: Interpreting system.log MeteredFlusher messages
On Wed, Jun 27, 2012 at 2:27 PM, aaron morton aa...@thelastpickle.com wrote:

, but I do not understand the remedy to the problem. Is increasing this variable my only option?

There is nothing to be fixed. This is Cassandra flushing data to disk to free memory and checkpoint the commit log.

Yes, but it induces simultaneous flushes of around 7-8 column families, which exceeds the flush queue size. I believe this can lead cassandra to stop accepting writes.

I see memtables of serialized size of 100-200 MB with estimated live size of 500 MB get flushed to produce sstables of around 10-15 MB sizes. Are these factors of 10-20 between serialized on disk and memory and 3-5 for liveRatio expected?

Do you have some log messages for this? The elevated estimated size may be due to a lot of overwrites.

Sample log messages:

INFO [OptionalTasks:1] 2012-06-27 07:14:25,720 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='Stats', ColumnFamily='Minutewise_Adtype_Customer_Stats') (estimated 529810674 bytes)
INFO [OptionalTasks:1] 2012-06-27 07:14:25,721 ColumnFamilyStore.java (line 688) Enqueuing flush of Memtable-Minutewise_Adtype_Customer_Stats@1651281270(163641387/529810674 serialized/live bytes, 1633074 ops)
INFO [FlushWriter:3808] 2012-06-27 07:14:25,727 Memtable.java (line 239) Writing Memtable-Minutewise_Adtype_Customer_Stats@1651281270(163641387/529810674 serialized/live bytes, 1633074 ops)
INFO [FlushWriter:3808] 2012-06-27 07:14:26,131 Memtable.java (line 275) Completed flushing /mnt/data/cassandra/data/Stats/Minutewise_Adtype_Customer_Stats-hb-70-Data.db (6315581 bytes)

Yes, there are overwrites. Since these are counter column families, they see a lot of increments. Does cassandra store all the history for a column (and is there some way to not store it)?

Since the formula is CF Count + Secondary Index Count + memtable_flush_queue_size (defaults to 4) + memtable_flush_writers (defaults to 1 per data directory) memtables in memory the JVM at once., shouldn't the limit be 6 (and not 7) memtables in memory?

It's 7 because https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/MeteredFlusher.java#L51

Thanks a lot for this. I should have looked this up myself.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/06/2012, at 4:41 AM, rohit bhatia wrote:

Hi We have 8 cassandra 1.0.5 nodes with 16 cores and 32G ram, Heap size is 12G, memtable_total_space_in_mb is one third = 4G, There are 12 Hot CFs (write-read ratio of 10). memtable_flush_queue_size = 4 and memtable_flush_writers = 2. I got this log-entry MeteredFlusher.java (line 74) estimated 423318 bytes used by all memtables pre-flush, following which cassandra flushed several of its largest memtables. I understand that this message is due to the memtable_total_space_in_mb setting being reached, but I do not understand the remedy to the problem. Is increasing this variable my only option? Also, In standard MeteredFlusher flushes (the ones that trigger due to if my entire flush pipeline were full of memtables of this size, how big could I allow them to be. logic), I see memtables of serialized size of 100-200 MB with estimated live size of 500 MB get flushed to produce sstables of around 10-15 MB sizes. Are these factors of 10-20 between serialized on disk and memory and 3-5 for liveRatio expected?
Also, this very informative article http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/ has this to say For example if memtable_total_space_in_mb is 100MB, and memtable_flush_writers is the default 1 (with one data directory), and memtable_flush_queue_size is the default 4, and a Column Family has no secondary indexes. The CF will not be allowed to get above one seventh of 100MB or 14MB, as if the CF filled the flush pipeline with 7 memtables of this size it would take 98MB. Since the formula is CF Count + Secondary Index Count + memtable_flush_queue_size (defaults to 4) + memtable_flush_writers (defaults to 1 per data directory) memtables in memory the JVM at once., shouldn't the limit be 6 (and not 7) memtables in memory? Thanks Rohit
Re: Enable CQL3 from Astyanax
Had a quick look, the current master does not appear to support it. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/06/2012, at 7:46 AM, Thierry Templier wrote: Hello, How can I enable CQL3 support in Astyanax? Thanks very much for your help! Thierry
Re: repair never finishing 1.0.7
Setting up a Cassandra ring across NAT ( without a VPN ) is impossible in my experience.

The broadcast_address allows a node to broadcast an address that is different to the ones it's bound to on the local interfaces https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L270

1) How can I make sure that the JIRA issue above is my real problem? (I see no errors or warns in the logs; no other activity)

If the errors are not there it is not your problem.

- a full cluster restart allows the first attempted repair to complete (haven't tested yet; this is not practical even if it works)

A rolling restart of the nodes involved in the repair is sufficient.

Double check the networking, and check the logs on both sides of the transfer for errors or warnings. The code around streaming is better at failing loudly nowadays. If you don't see anything, set DEBUG logging on org.apache.cassandra.streaming.FileStreamTask. That will let you know if things start and progress.

Hope that helps.
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/06/2012, at 6:16 PM, Alexandru Sicoe wrote:

Hi Andras, I am not using a VPN. The system has been running successfully in this configuration for a couple of weeks until I noticed the repair is not working. What happens is that I configure the IP Tables of the machine on each Cassandra node to forward packets that are sent to any of the IPs in the other DC (on ports 7000, 9160 and 7199) to be sent to the gateway IP. The gateway does the NAT sending the packets on the other side to the real destination IP, having replaced the source IP with the initial sender's IP (at least in my understanding of it). What might be the problem given the configuration? How to fix this? Cheers, Alex

On Mon, Jun 25, 2012 at 12:47 PM, Andras Szerdahelyi andras.szerdahe...@ignitionone.com wrote:

The DCs are communicating over a gateway where I do NAT for ports 7000, 9160 and 7199. Ah, that sounds familiar. You don't mention if you are VPN'd or not. I'll assume you are not. So, your nodes are behind network address translation - is that to say they advertise ( broadcast ) their internal or translated/forwarded IP to each other? Setting up a Cassandra ring across NAT ( without a VPN ) is impossible in my experience. Either the nodes on your local network won't be able to communicate with each other, because they broadcast their translated ( public ) address which is normally ( router configuration ) not routable from within the local network, or the nodes broadcast their internal IP, in which case the outside nodes are helpless in trying to connect to a local net. On DC2 nodes/the node you issue the repair on, check for any sockets being opened to the internal addresses of the nodes in DC1. regards, Andras

On 25 Jun 2012, at 11:57, Alexandru Sicoe wrote:

Hello everyone, I have a 2 DC (DC1:3 and DC2:6) Cassandra 1.0.7 setup. I have about 300GB/node in the DC2. The DCs are communicating over a gateway where I do NAT for ports 7000, 9160 and 7199. I did a nodetool repair on a node in DC2 without any external load on the system. It took 5 hrs to finish the Merkle tree calculations (which is fine for me) but then in the streaming phase nothing happens (0% seen in nodetool netstats) and stays like that forever. Note: it has to stream to/from nodes in DC1! I tried another time and still the same. Looking around I found this thread http://www.mail-archive.com/user@cassandra.apache.org/msg22167.html which seems to describe the same problem.
The thread gives 2 suggestions:
- a full cluster restart allows the first attempted repair to complete (haven't tested yet; this is not practical even if it works)
- issue https://issues.apache.org/jira/browse/CASSANDRA-4223 can be the problem

Questions:
1) How can I make sure that the JIRA issue above is my real problem? (I see no errors or warns in the logs; no other activity)
2) What should I do to make the repairs work? (If the JIRA issue is the problem, then I see there is a fix for it in Version 1.0.11 which is not released yet)

Thanks, Alex
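A hedged sketch of the logging change Aaron suggests; on 1.0.x this property would go in the log4j-server.properties file shipped in conf/ (a restart picks it up; some versions also expose a JMX method to change levels at runtime):

# conf/log4j-server.properties - log the start and progress of each streamed file
log4j.logger.org.apache.cassandra.streaming.FileStreamTask=DEBUG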
Re: repair never finishing 1.0.7
Aaron,

The broadcast_address allows a node to broadcast an address that is different to the ones it's bound to on the local interfaces https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L270

Yes, and that's not where the problem is IMO. If you broadcast your translated address ( say 1.2.3.4, a public ip ), nodes outside your VPN'd network will have no problems connecting as long as they can route to this address ( which they should ), but any other nodes on the local net ( e.g. 10.0.1.2 ) won't be able to connect/route to their neighbor who's telling them to open the return socket to 1.2.3.4. Am I getting this right?

At least this is what I have experienced not so long ago:

DC1 nodes
a) 10.0.1.1 translated to 1.2.3.4 on NAT
b) 10.0.1.2 translated to 1.2.3.5 on NAT

DC2 nodes
a) 10.0.2.1 translated to 1.2.4.4 on NAT
b) 10.0.2.2 translated to 1.2.4.5 on NAT

Let's assume DC2 nodes' broadcast_addresses are their public addresses.

If DC1:a and DC1:b broadcast their public address, 1.2.3.4 and 1.2.3.5, they are advertising an address that is not routable on their network ( loopback ), but DC2:a and DC2:b can connect/route to them just fine. Nodetool ring on any DC1 node says the others in DC1 are down, everything else is up. Nodetool ring on any DC2 node says everything is up.

If DC1:a and DC1:b broadcast their private address, they can connect to each other fine, but DC2:a and DC2:b will have no chance to route to them. Nodetool ring on any DC1 node says everything is up. Nodetool ring on any DC2 node says DC1 nodes are down.

regards,
Andras

On 27 Jun 2012, at 11:29, aaron morton wrote:

Setting up a Cassandra ring across NAT ( without a VPN ) is impossible in my experience. The broadcast_address allows a node to broadcast an address that is different to the ones it's bound to on the local interfaces https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L270 1) How can I make sure that the JIRA issue above is my real problem? (I see no errors or warns in the logs; no other activity) If the errors are not there it is not your problem. - a full cluster restart allows the first attempted repair to complete (haven't tested yet; this is not practical even if it works) Rolling restart of the nodes involved in the repair is sufficient. Double checking the networking and check the logs on both sides of the transfer for errors or warnings. The code around streaming is better at failing loudly now days. If you dont see anything set DEBUG logging on org.apache.cassandra.streaming.FileStreamTask. That will let you know if things start and progress. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 26/06/2012, at 6:16 PM, Alexandru Sicoe wrote:

Hi Andras, I am not using a VPN. The system has been running successfully in this configuration for a couple of weeks until I noticed the repair is not working. What happens is that I configure the IP Tables of the machine on each Cassandra node to forward packets that are sent to any of the IPs in the other DC (on ports 7000, 9160 and 7199) to be sent to the gateway IP. The gateway does the NAT sending the packets on the other side to the real destination IP, having replaced the source IP with the initial sender's IP (at least in my understanding of it). What might be the problem given the configuration? How to fix this?
Cheers, Alex

On Mon, Jun 25, 2012 at 12:47 PM, Andras Szerdahelyi andras.szerdahe...@ignitionone.com wrote:

The DCs are communicating over a gateway where I do NAT for ports 7000, 9160 and 7199. Ah, that sounds familiar. You don't mention if you are VPN'd or not. I'll assume you are not. So, your nodes are behind network address translation - is that to say they advertise ( broadcast ) their internal or translated/forwarded IP to each other? Setting up a Cassandra ring across NAT ( without a VPN ) is impossible in my experience. Either the nodes on your local network won't be able to communicate with each other, because they broadcast their translated ( public ) address which is normally ( router configuration ) not routable from within the local network, or the nodes broadcast their internal IP, in which case the outside nodes are helpless in trying to connect to a local net. On DC2 nodes/the node you issue the repair on, check for any sockets being opened to the internal addresses of the nodes in DC1. regards, Andras

On 25 Jun 2012, at 11:57, Alexandru Sicoe wrote:

Hello everyone, I have a 2 DC (DC1:3 and DC2:6) Cassandra 1.0.7 setup. I have about 300GB/node in the DC2. The DCs are communicating over a gateway where I do NAT for ports 7000, 9160 and 7199. I did a nodetool repair on a node in DC2 without any external load on the system. It took 5 hrs to finish the Merkle tree calculations (which is fine for me)
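For readers following the broadcast_address point: a hedged sketch of the settings under discussion, reusing Andras's example addresses and assuming the 1.0.x yaml exposes broadcast_address as the trunk file Aaron links does (verify against your version). The caveat from the thread still applies: local peers must also be able to reach the advertised public address (e.g. via hairpin NAT), or the split view Andras describes reappears.

# cassandra.yaml on DC1 node a (private 10.0.1.1, NATed to public 1.2.3.4)
listen_address: 10.0.1.1       # bind gossip/storage traffic to the local interface
broadcast_address: 1.2.3.4     # address advertised to other nodes via gossip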
Re: Secondary index data gone after restart (1.1.1)
CASSANDRA-3954 disabled caches on secondary index CF's in 1.1.0 and CASSANDRA-4197 enabled it in 1.1.1. Can you create a ticket on https://issues.apache.org/jira/browse/CASSANDRA ?

I'm guessing this has something to do with the local partitioner used for the secondary index CF. That would explain the BigInteger value (from the RP), and the TimeUUIDType could come from the LocalToken.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 26/06/2012, at 8:11 PM, Ivo Meißner wrote:

Hi, but if the data must be converted, this is something that should be fixed inside cassandra… Is this a bug, should I file a bug report? Or is there some kind of setting I can change to make it work for now? Maybe it is related to this issue, but this should have been fixed in 1.1.0: https://issues.apache.org/jira/browse/CASSANDRA-3954 Thanks Ivo

On 26.06.2012 at 09:26, Fei Shan wrote:

Hi, please refer to the JDK nio package's ByteBuffer. I don't think a ByteBuffer can be cast from a BigInteger directly; it seems you need to do some conversion before putting it into a ByteBuffer. Thanks Fei

On Tue, Jun 26, 2012 at 12:07 AM, Ivo Meißner i...@overtronic.com wrote:

Hi, I am running into some problems with secondary indexes that I am unable to track down. When I restart the cassandra service, the secondary index data won't load and I get the following error during startup:

INFO 08:29:42,127 Opening /var/myproject/cassandra/data/mykeyspace/group_admin/mykeyspace-group_admin.group_admin_groupId_idx-hd-1 (20808 bytes)
ERROR 08:29:42,159 Exception in thread Thread[SSTableBatchOpen:1,5,main]
java.lang.ClassCastException: java.math.BigInteger cannot be cast to java.nio.ByteBuffer
        at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:37)
        at org.apache.cassandra.dht.LocalToken.compareTo(LocalToken.java:45)
        at org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:89)
        at org.apache.cassandra.db.DecoratedKey.compareTo(DecoratedKey.java:38)
        at java.util.TreeMap.getEntry(TreeMap.java:328)
        at java.util.TreeMap.containsKey(TreeMap.java:209)
        at java.util.TreeSet.contains(TreeSet.java:217)
        at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:396)
        at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:187)
        at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:225)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

When the service starts I can still select data from the column family, but not using the secondary index. After I execute nodetool rebuild_index the secondary index works fine again until the next restart. The error only seems to occur on the column groupId (TimeUUIDType). The other index on userId seems to work.
I have the following column family definition:

create column family group_admin
  with comparator = UTF8Type
  and key_validation_class = UTF8Type
  and column_metadata = [
    {column_name: id, validation_class: UTF8Type},
    {column_name: added, validation_class: LongType},
    {column_name: userId, validation_class: BytesType, index_type: KEYS},
    {column_name: requestMessage, validation_class: UTF8Type},
    {column_name: status, validation_class: LongType},
    {column_name: groupId, validation_class: TimeUUIDType, index_type: KEYS}
  ];

Thank you very much for your help! Ivo
Re: Enable CQL3 from Astyanax
Hello Aaron, Thanks very much for your answer! Could you give me some hints on how to implement that? Then I could contribute a patch for it. After having a look at the Astyanax code, I saw that CQL is executed in the ThriftColumnFamilyQueryImpl class, based on the org.apache.cassandra.thrift.Cassandra.Client class and its execute_cql_query method. Which class from Cassandra do I need to use for CQL3? I'll go on investigating this aspect. Cheers, Thierry

Had a quick look, the current master does not appear to support it. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 26/06/2012, at 7:46 AM, Thierry Templier wrote: Hello, How can I enable CQL3 support in Astyanax? Thanks very much for your help! Thierry
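Not an authoritative answer, but a hedged sketch of the mechanism in question: on the 1.1 Thrift interface the same execute_cql_query call serves both CQL versions, and the connection opts in to CQL3 via set_cql_version, so that is presumably the call a client library such as Astyanax would need to expose. Keyspace and table names below are hypothetical, and the method signatures should be checked against the cassandra-thrift jar you build against.

// Hedged sketch (raw Thrift usage, not actual Astyanax code)
import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Compression;
import org.apache.cassandra.thrift.CqlResult;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class Cql3Sketch {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();

        // The server parses CQL as version 2 unless the connection asks otherwise,
        // so this is the piece a client library has to surface to enable CQL3.
        client.set_cql_version("3.0.0");
        client.set_keyspace("mykeyspace");      // hypothetical keyspace

        String cql = "SELECT * FROM mytable";   // hypothetical table
        CqlResult result = client.execute_cql_query(
                ByteBuffer.wrap(cql.getBytes("UTF-8")), Compression.NONE);
        System.out.println(result.getRowsSize() + " rows returned");

        transport.close();
    }
}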
Re: Multi datacenter, WAN hiccups and replication
Therefore I was wondering if Cassandra already intelligently optimizes for HH-over-WAN (since this is common) or alternately if there's a way to enable HH for WAN replication?

When the coordinator is preparing to process the request, down nodes in a foreign DC are treated like down nodes in a local DC. So a hint is stored for each down node. So more than 1 cross-DC message is used when the hints are replayed.

When the local DC coordinator sends a forwarded message to a foreign DC coordinator, it sets up an expectation to hear back from each of the replicas in the foreign DC. If any of these fail to return, because say the foreign DC coordinator went down, a hint is stored. So again, more than 1 cross-DC message will be used when the hints are replayed.

Replaying hints is triggered when a node notices that another is up again. There would be no sense in forwarding the messages as they are sent directly to the recipient.

Hope that helps.
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/06/2012, at 5:14 AM, Karthik N wrote:

Let me attempt to articulate my question a little better. Say I choose LOCAL_QUORUM with a Replication Factor of 3. Cassandra stores three copies in my local datacenter. Therefore the cost associated with losing one node is not very high locally, and I usually HH, and use read repair/nodetool repair instead. However over the WAN network blips are quite normal and HH really helps. More so because for WAN replication Cassandra sends only one copy to a coordinator in the remote datacenter. Therefore I was wondering if Cassandra already intelligently optimizes for HH-over-WAN (since this is common) or alternately if there's a way to enable HH for WAN replication? Thank you.

On Tue, Jun 26, 2012 at 9:22 AM, Mohit Anchlia mohitanch...@gmail.com wrote:

On Tue, Jun 26, 2012 at 8:16 AM, Karthik N karthik@gmail.com wrote: Since Cassandra optimizes and sends only one copy over the WAN, can I opt in only for HH for WAN replication and avoid HH for the local quorum? (since I know I have more copies)

I am not sure if I understand your question. In general I don't think you can selectively decide on HH. Besides, HH should only be used when the outage is in minutes; for longer outages using HH would only create memory pressure.

On Tuesday, June 26, 2012, Mohit Anchlia wrote:

On Tue, Jun 26, 2012 at 7:52 AM, Karthik N karthik@gmail.com wrote: My Cassandra ring spans two DCs. I use local quorum with replication factor=3. I do a write in DC1 with local quorum. Data gets written to multiple nodes in DC1. For the same write to propagate to DC2, only one copy is sent from the coordinator node in DC1 to a coordinator node in DC2 for optimizing traffic over the WAN (from what I have read in the Cassandra documentation). Will a WAN hiccup result in a Hinted Handoff (HH) being created in DC1's coordinator for DC2, to be delivered when the WAN link is up again?

I have seen hinted handoff messages in the log files when the remote DC is unreachable. But this mechanism is only used for the time defined in the cassandra.yaml file.

-- Thanks, Karthik
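A hedged pointer for "the time defined in cassandra.yaml" mentioned above - the two hinted handoff settings in the 1.0/1.1 yaml (the window value shown here is an example, not necessarily your default):

# cassandra.yaml
hinted_handoff_enabled: true      # turn hint storage on or off cluster-wide
max_hint_window_in_ms: 3600000    # stop writing hints for a node that has been down longer than this (1 hour here)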
CQL / ASSUME for keys
Hi, I'm trying to do the following:

update keyspace.CF set '2' = '2' + 12 WHERE KEY = 'mykey';

And got this answer:

Bad Request: cannot parse 'mykey' as hex bytes

Using this doesn't help:

assume keyspace.CF(KEY) VALUES ARE text;

(Found here http://www.datastax.com/docs/1.0/references/cql/ASSUME and I'm using C* 1.0.9)

Show schema in the cli gives:

create column family CF
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'CounterColumnType'
  and key_validation_class = 'BytesType'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 2000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'ConcurrentLinkedHashCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';

What would be the consequences of changing the key_validation_class from 'BytesType' to 'UTF8Type' (in production)? Shouldn't my assume command allow me to update my data even if I don't give the key as bytes?

Alain
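A hedged sketch of two things that might be tried here; both the CQL 2 hex-literal form and the claim that key_validation_class is only validation metadata are assumptions worth verifying on a test cluster before touching production.

-- CQL: since the key validation class is BytesType, pass the key as hex ('mykey' in ASCII bytes is 6d796b6579)
update keyspace.CF set '2' = '2' + 12 WHERE KEY = '6d796b6579';

-- cassandra-cli: keys are stored as raw bytes regardless of the declared class, so switching the
-- validation class should only change how keys are parsed and displayed, provided every existing
-- key is valid UTF-8
update column family CF with key_validation_class = 'UTF8Type';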
Re: Enable CQL3 from Astyanax
Hello Aaron, I created an issue on the Astyanax github for this problem. I added a fix to support CQL3 in the tool. See the link https://github.com/Netflix/astyanax/issues/75. Thierry Had a quick look, the current master does not appear to support it. Cheers
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Great stuff!!! On Tue, Jun 26, 2012 at 5:25 PM, Edward Capriolo edlinuxg...@gmail.comwrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward
Node crashing during read repair
Hi there, Today I found one node (running 1.1.1 in a 3 node cluster) dead for the third time this week. It died with the following message:

ERROR [ReadRepairStage:3] 2012-06-27 14:28:30,929 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ReadRepairStage:3,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:566)
        at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:439)
        at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:391)
        at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:372)
        at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:460)
        at org.apache.cassandra.service.RowRepairResolver.scheduleRepairs(RowRepairResolver.java:136)
        at org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:94)
        at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

Is this a common bug in 1.1.1, or did I hit a race condition?

-- With kind regards,
Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Hi Edward, Looking forward to your book. It's always interesting to read what others have to say about a certain subject, and hopefully even learn new things! 2012/6/27 Raj N raj.cassan...@gmail.com Great stuff!!! On Tue, Jun 26, 2012 at 5:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- With kind regards, Robin Verlangen *Software engineer* W http://www.robinverlangen.nl E ro...@us2.nl
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Hey Edward, I finally posted my (short) blog post on using Hector with Jruby: http://synfin.net/sock_stream/technology/code/cassandra-hector-jruby-awesome If you're interested in documenting that more in detail in your book, let me know and I can help you with that in your book if you'd like. -Aaron On Tue, Jun 26, 2012 at 2:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin carpe diem quam minimum credula postero
Re: Amazingly bad compaction performance
On Wed, Jun 27, 2012 at 1:42 AM, Derek Andree dand...@lacunasystems.com wrote: Last I heard only Oracle's JDK was officially supported with Cassandra, possibly nitpicky but is this still the case? On Jun 26, 2012, at 3:37 PM, Dustin Wenz wrote: (OpenJDK 7) was pegged at 200% CPU

Java 7 still hasn't been that thoroughly tested, and from your description of the problem, it sounds like that might indeed be the cause.

-- Tyler Hobbs DataStax http://datastax.com/
Re: bulk load problem
Thank you so much ! The problem was the RPC address, it was different than the listen address. I appreciate your help. Best, James On Wed, Jun 27, 2012 at 1:29 AM, Nury Redjepow nreje...@mail.ru wrote: What is your yaml setting for rpc and listen server on destination node? Nury Tue, 26 Jun 2012 17:07:49 -0700 от James Pirz james.p...@gmail.com: Dear all, I am trying to use sstableloader in cassandra 1.1.1, to bulk load some data into a single node cluster. I am running the following command: bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/ from another node (other than the node on which cassandra is running), while the data should be loaded into a keyspace named tpch. I made sure that the 2nd node, from which I run sstableloader, have the same copy of cassandra.yaml as the destination node. I have put tpch-cf0-hd-1-Data.db tpch-cf0-hd-1-Index.db under the path, I have passed to sstableloader. But I am getting the following error: Could not retrieve endpoint ranges: Any hint ? Thanks in advance, James
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Sounds good. One thing I'd like to see is more coverage on Cassandra Internals. Out of the box Cassandra's great but having a little inside knowledge can be very useful because it helps you design your applications to work with Cassandra; rather than having to later make endless optimizations that could probably have been avoided had you done your implementation slightly differently. Another thing that may be worth adding would be a recipe that showed an approach to evaluating Cassandra for your organization/use case. I realize that's going to vary on a case by case basis but one thing I've noticed is that some people dive in without really thinking through whether Cassandra is actually the right fit for what they're doing. It sort of becomes a hammer for anything that looks like a nail. On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo edlinuxg...@gmail.comwrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Courtney Robinson court...@crlog.info http://crlog.info 07535691628 (No private #s)
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson court...@crlog.info wrote: Sounds good. One thing I'd like to see is more coverage on Cassandra Internals. Out of the box Cassandra's great but having a little inside knowledge can be very useful because it helps you design your applications to work with Cassandra; rather than having to later make endless optimizations that could probably have been avoided had you done your implementation slightly differently. Another thing that may be worth adding would be a recipe that showed an approach to evaluating Cassandra for your organization/use case. I realize that's going to vary on a case by case basis but one thing I've noticed is that some people dive in without really thinking through whether Cassandra is actually the right fit for what they're doing. It sort of becomes a hammer for anything that looks like a nail. On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Courtney Robinson court...@crlog.info http://crlog.info 07535691628 (No private #s) Thanks for the comments. Yes the INTERNALS chapter was a bit tricky. The challenge of writing about internals is they go stale fairly quickly. I was considering writing a partitioner for the internals chapter but then I thought about it more: 1) Its hard 2) The APIs can change. (They work the same way across versions but they may have a different signature etc) 3) 99.99% of people should be using the random partitioner :) But I agree the external chapter can be made much stronger then it is. The recipe format strict. It naturally conflicts with the typical use case style. In a use case where you write a good amount of text talking about problem domain, previous solutions, bragging about company X. We can not do that with the recipe style, but we can do our best to make the recipes as real world as possible. I tried to do that throughout the text, you do not find many examples like 'writing foo records to bar column families'. However the format does not allow extensive text blocks mentioned above so it is difficult to set the stage for a complex and detailed real world problem. Still, I think for some examples we can take the next step and make the recipe more real world practical and more use-case like.
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
RE: API method signatures changing That triggers another thought... What terminology will you use in the book to describe the data model? CQL? When we wrote the RefCard on DZonehttp://refcardz.dzone.com/refcardz/apache-cassandra, we intentionally favored/used CQL terminology. On advisement from Jonathan and Kris Hahn, we wanted to start the process of sunsetting the legacy terms (keyspace, column family, etc.) in favor of the more familiar CQL terms (schema, table, etc.). I've gone on recordhttp://css.dzone.com/articles/new-refcard-apache-cassandrain favor of the switch, but it is probably something worth noting in the book since that terminology does not yet align with all the client APIs yet. (e.g. Hector, Astyanax, etc.) I'm not sure when the client APIs will catch up to the new terminology, but we may want to inquire as to future proof the recipes as much as possible. -brian On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo edlinuxg...@gmail.comwrote: On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson court...@crlog.info wrote: Sounds good. One thing I'd like to see is more coverage on Cassandra Internals. Out of the box Cassandra's great but having a little inside knowledge can be very useful because it helps you design your applications to work with Cassandra; rather than having to later make endless optimizations that could probably have been avoided had you done your implementation slightly differently. Another thing that may be worth adding would be a recipe that showed an approach to evaluating Cassandra for your organization/use case. I realize that's going to vary on a case by case basis but one thing I've noticed is that some people dive in without really thinking through whether Cassandra is actually the right fit for what they're doing. It sort of becomes a hammer for anything that looks like a nail. On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Courtney Robinson court...@crlog.info http://crlog.info 07535691628 (No private #s) Thanks for the comments. Yes the INTERNALS chapter was a bit tricky. The challenge of writing about internals is they go stale fairly quickly. I was considering writing a partitioner for the internals chapter but then I thought about it more: 1) Its hard 2) The APIs can change. (They work the same way across versions but they may have a different signature etc) 3) 99.99% of people should be using the random partitioner :) But I agree the external chapter can be made much stronger then it is. The recipe format strict. It naturally conflicts with the typical use case style. 
In a use case where you write a good amount of text talking about problem domain, previous solutions, bragging about company X. We can not do that with the recipe style, but we can do our best to make the recipes as real world as possible. I tried to do that throughout the text, you do not find many examples like 'writing foo records to bar column families'. However the format does not allow extensive text blocks mentioned above so it is difficult to set the stage for a complex and detailed real world problem. Still, I think for some examples we can take the next step and make the recipe more real world practical and more use-case like. -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/
Adding New Nodes to a Production Cluster
Hi, We have a production cluster with a few nodes in each data center. Each node is being contacted in each data center to serve front end requests. I have a question about the method of adding new nodes to the cluster (say, to improve RF or scalability). AFAIK, there are two methods to do this.

1. Bring up the node with no data in it and let it get the data from its peers. The issue with this is that requests coming into this node cannot be served until it gets all the data. This is a big impact for the front end.

2. Copy all the data into the new node from other nodes (we can be a little bit smart here about selecting which nodes to copy from), bring it up and run cleanup and repair. The issue with this approach is that all the data will be around 1TB, it will take a considerable amount of time to copy it all, and because of the rate of updates happening it is sometimes hard to hide the time taken to copy the data onto this node.

Is there an option in Cassandra where I can bring a node up, just like in #1, but ask peers not to send any read requests to it? Or, apart from all these, are there better options to handle new nodes? Also, I think I can use a similar method when a node goes down (say, disk or RAID failure) for a longer period of time.

Please help me to find an answer to this. Thanks, Eran Chinthaka Withana
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill b...@alumni.brown.edu wrote: RE: API method signatures changing That triggers another thought... What terminology will you use in the book to describe the data model? CQL? When we wrote the RefCard on DZone, we intentionally favored/used CQL terminology. On advisement from Jonathan and Kris Hahn, we wanted to start the process of sunsetting the legacy terms (keyspace, column family, etc.) in favor of the more familiar CQL terms (schema, table, etc.). I've gone on record in favor of the switch, but it is probably something worth noting in the book since that terminology does not yet align with all the client APIs yet. (e.g. Hector, Astyanax, etc.) I'm not sure when the client APIs will catch up to the new terminology, but we may want to inquire as to future proof the recipes as much as possible. -brian On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson court...@crlog.info wrote: Sounds good. One thing I'd like to see is more coverage on Cassandra Internals. Out of the box Cassandra's great but having a little inside knowledge can be very useful because it helps you design your applications to work with Cassandra; rather than having to later make endless optimizations that could probably have been avoided had you done your implementation slightly differently. Another thing that may be worth adding would be a recipe that showed an approach to evaluating Cassandra for your organization/use case. I realize that's going to vary on a case by case basis but one thing I've noticed is that some people dive in without really thinking through whether Cassandra is actually the right fit for what they're doing. It sort of becomes a hammer for anything that looks like a nail. On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per Column family memtable flush settings versus the new system with the global variable. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition such as Hector support. Nate has offered to help me in this area. Please feel contact me with any ideas and suggestions of recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Courtney Robinson court...@crlog.info http://crlog.info 07535691628 (No private #s) Thanks for the comments. Yes the INTERNALS chapter was a bit tricky. The challenge of writing about internals is they go stale fairly quickly. I was considering writing a partitioner for the internals chapter but then I thought about it more: 1) Its hard 2) The APIs can change. (They work the same way across versions but they may have a different signature etc) 3) 99.99% of people should be using the random partitioner :) But I agree the external chapter can be made much stronger then it is. The recipe format strict. It naturally conflicts with the typical use case style. 
In a use case where you write a good amount of text talking about problem domain, previous solutions, bragging about company X. We can not do that with the recipe style, but we can do our best to make the recipes as real world as possible. I tried to do that throughout the text, you do not find many examples like 'writing foo records to bar column families'. However the format does not allow extensive text blocks mentioned above so it is difficult to set the stage for a complex and detailed real world problem. Still, I think for some examples we can take the next step and make the recipe more real world practical and more use-case like. -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/ As for terminology, I guess you can consider me a hard-liner as I have a few problems with calling a column family a table. I might be in the minority, but I know I am not alone. On one hand aliases make the integration easier https://issues.apache.org/jira/browse/CASSANDRA-2743, but on the other hand if a user does not understand what a column family is they will likely use cassandra incorrectly. Maybe this is just a semantics debate because a table in a column oriented database is different then a table in a row oriented
Re: Adding New Nodes to a Production Cluster
Hi Eran, As far as I'm aware, a node will not serve requests until the bootstrap (which starts automatically these days) has been completed, so the problem in #1 is not really there. Solution #2 is not straightforward and it is easy to make mistakes. When you're concerned about read consistency, use a proper consistency level, like QUORUM or LOCAL_QUORUM. Cheers!

2012/6/27 Eran Chinthaka Withana eran.chinth...@gmail.com

Hi, We have a production cluster with few nodes in each data center. Each node is being contacted in each data center to serve front end requests. I have a question about the method adding new nodes to the cluster (say, to improve RF or scalability). AFAIK, there are two methods to do this. 1. Bring up the node, with no data in it, and let it get the data from the peers. But the issue with this problem is all the requests coming into this node can not be served until it gets all the data. This is a big impact for front end. 2. Copy all the data into the new node from other nodes (we can be little bit smart here selecting which nodes to copy from), bring it up and run cleanup and repair. The issue with this approach is all the data will be around 1TB and it will take a consider amount of time to copy them all and also because of the rate of updates happening its sometimes hard to shadow the time to copy the data in to this node. Is there an option in Cassandra, where I can bring a node up, just like in #1, but ask peers not to send any read request to it? Or may be apart from all these, are there better option to handle new nodes? Also, I think I can use a similar method when a node goes down (say, disk or RAID failure) for a longer period of time. Please help me to find an answer to this. Thanks, Eran Chinthaka Withana

-- With kind regards,
Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl
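A hedged sketch of the bootstrap flow Robin describes, for a 1.0/1.1 node; auto_bootstrap defaults to true and may not even appear in the shipped yaml, and the token and host names below are placeholders:

# cassandra.yaml on the new node
auto_bootstrap: true        # default - the node streams its ranges and only then joins for reads
initial_token: <token>      # pick explicitly for balanced ranges, or leave blank to let the node choose
# also make sure the new node is NOT in its own seed list - seed nodes do not bootstrap

# watch streaming progress while it joins
nodetool -h <new-node> netstats

# afterwards, drop the ranges the existing replicas no longer own
nodetool -h <existing-node> cleanup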
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
I'm looking forward to getting a few copies of this. Some areas that would be great to cover:
- Indexing strategies
- Configuring clients/env for sane timestamping
- Efficient CQL
- Top 8/10 perf issues/stacktraces and common resolutions
- Understanding nodetool tpstats/cfhistograms/cfstats and what they're actually saying
- Capacity sizing (disk/RAM overhead needed)
- Compaction choices/strategies for different kinds of workload
Bill

On 26/06/12 22:25, Edward Capriolo wrote: Hello all, It has not been very long since the first book was published, but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per-column-family memtable flush settings versus the new system with the global setting. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition, such as Hector support; Nate has offered to help me in this area. Please feel free to contact me with any ideas and suggestions for recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
Hi Edward, That's great news! One thing I'd like to see in the new edition is counters: known issues and how to avoid them:
- avoiding double counting (don't retry on failure, use write consistency level ONE, use a dedicated Hector connector?)
- deleting counters (tricky; reset to zero?)
- other tips and tricks
I personally had (and to some extent still have) problems with maintaining counter accuracy. Best, Rustam.

On 26/06/2012 22:25, Edward Capriolo wrote: Hello all, It has not been very long since the first book was published, but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per-column-family memtable flush settings versus the new system with the global setting. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition, such as Hector support; Nate has offered to help me in this area. Please feel free to contact me with any ideas and suggestions for recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward
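As a concrete illustration of the double-counting point, here is a small, hypothetical Hector sketch. The cluster, keyspace, column family, and key names are invented, and it assumes Hector's Mutator.incrementCounter convenience method; treat it as a sketch under those assumptions rather than a definitive recipe. The essential idea is that a timed-out increment may still have been applied on the server, so the client should not retry it blindly.

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class CounterIncrementExample {
    public static void main(String[] args) {
        // Hypothetical cluster name and contact point.
        Cluster cluster = HFactory.getOrCreateCluster("ProdCluster", "10.0.0.1:9160");

        // Counters are commonly written at ONE; the important part for accuracy
        // is the error handling below, not the level itself.
        ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
        policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);
        Keyspace keyspace = HFactory.createKeyspace("metrics", cluster, policy);

        Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());
        try {
            // Add 1 to the "page_views" counter column in row "home_page".
            mutator.incrementCounter("home_page", "page_counters", "page_views", 1L);
        } catch (Exception e) {
            // A timed-out increment may still have been applied server-side.
            // Retrying here could count the same event twice, so record the
            // failure for later reconciliation instead of retrying blindly.
            System.err.println("Increment in unknown state, not retrying: " + e);
        }
    }
}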
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
On Thu, Jun 28, 2012 at 7:32 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill b...@alumni.brown.edu wrote: RE: API method signatures changing That triggers another thought... What terminology will you use in the book to describe the data model? CQL? When we wrote the RefCard on DZone, we intentionally favored/used CQL terminology. On advisement from Jonathan and Kris Hahn, we wanted to start the process of sunsetting the legacy terms (keyspace, column family, etc.) in favor of the more familiar CQL terms (schema, table, etc.). I've gone on record in favor of the switch, but it is probably something worth noting in the book, since that terminology does not yet align with all the client APIs (e.g. Hector, Astyanax, etc.). I'm not sure when the client APIs will catch up to the new terminology, but we may want to inquire, so as to future-proof the recipes as much as possible. -brian

On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo edlinuxg...@gmail.com wrote: On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson court...@crlog.info wrote: Sounds good. One thing I'd like to see is more coverage of Cassandra internals. Out of the box Cassandra's great, but having a little inside knowledge can be very useful because it helps you design your applications to work with Cassandra, rather than having to make endless optimizations later that could probably have been avoided had you done your implementation slightly differently. Another thing that may be worth adding would be a recipe that shows an approach to evaluating Cassandra for your organization/use case. I realize that's going to vary on a case-by-case basis, but one thing I've noticed is that some people dive in without really thinking through whether Cassandra is actually the right fit for what they're doing. It sort of becomes a hammer for anything that looks like a nail. On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hello all, It has not been very long since the first book was published, but several things have been added to Cassandra and a few things have changed. I am putting together a list of changed content, for example features like the old per-column-family memtable flush settings versus the new system with the global setting. My editors have given me the green light to grow the second edition from ~200 pages currently up to 300 pages! This gives us the ability to add more items/sections to the text. Some things were missing from the first edition, such as Hector support; Nate has offered to help me in this area. Please feel free to contact me with any ideas and suggestions for recipes you would like to see in the book. Also get in touch if you want to write a recipe. Several people added content to the first edition and it would be great to see that type of participation again. Thank you, Edward -- Courtney Robinson court...@crlog.info http://crlog.info 07535691628 (No private #s)

Thanks for the comments. Yes, the internals chapter was a bit tricky. The challenge of writing about internals is that they go stale fairly quickly. I was considering writing a partitioner for the internals chapter, but then I thought about it more: 1) It's hard. 2) The APIs can change (they work the same way across versions, but they may have different signatures, etc.). 3) 99.99% of people should be using the random partitioner :) But I agree the internals chapter can be made much stronger than it is. The recipe format is strict. It naturally conflicts with the typical use-case style.
In a use case write-up you spend a good amount of text talking about the problem domain, previous solutions, and bragging about company X. We cannot do that with the recipe style, but we can do our best to make the recipes as real-world as possible. I tried to do that throughout the text; you do not find many examples like 'writing foo records to bar column families'. However, the format does not allow the extensive text blocks mentioned above, so it is difficult to set the stage for a complex and detailed real-world problem. Still, I think for some examples we can take the next step and make the recipes more real-world, practical, and use-case-like.

-- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile: 215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/

As for terminology, I guess you can consider me a hard-liner, as I have a few problems with calling a column family a table. I might be in the minority, but I know I am not alone. On one hand aliases make the integration easier (https://issues.apache.org/jira/browse/CASSANDRA-2743), but on the other hand, if a user does not understand what a column family is, they will likely use Cassandra incorrectly. Maybe this is just a semantics debate, because a table in a column-oriented database is different from a table in a row-oriented database.
Re: Ball is rolling on High Performance Cassandra Cookbook second edition
On Wed, Jun 27, 2012 at 1:34 PM, Brian O'Neill b...@alumni.brown.edu wrote: RE: API method signatures changing That triggers another thought... What terminology will you use in the book to describe the data model? CQL? When we wrote the RefCard on DZone, we intentionally favored/used CQL terminology. On advisement from Jonathan and Kris Hahn, we wanted to start the process of sunsetting the legacy terms (keyspace, column family, etc.) in favor of the more familiar CQL terms (schema, table, etc.). I've gone on record in favor of the switch, but it is probably something worth noting in the book, since that terminology does not yet align with all the client APIs (e.g. Hector, Astyanax, etc.). I'm not sure when the client APIs will catch up to the new terminology, but we may want to inquire, so as to future-proof the recipes as much as possible.

Not just client APIs but documentation as well. When I was a new user, yeah, the different terminology was a bit off-putting, but it was consistent and it didn't take long to realize a CF was like a SQL table, etc. Honestly, I think using the same terms as an RDBMS makes users think they're exactly the same thing and have the same properties... which is close enough in some cases, but dangerous in others. That said, while I found the first edition informative, I found the Java/Hector code examples hard to read. Part of that was because I don't know Java (I know enough other languages that I can follow along) and part of that is that Java is so verbose that it just doesn't fit on the printed page. I think CQL lends itself to making the book more readable to a wider audience, but I think there should be a chapter on Hector/pycassa/etc. Of course, you still need to write code around it, and if that's Java I'm not sure how much it matters.

-- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety." -- Benjamin Franklin carpe diem quam minimum credula postero
which high level Java client
Dear all, I am interested in using Cassandra 1.1.1 in a read-intensive scenario where more than 95% of my operations are get(). I have a cluster with ~10 nodes, around 15-20 GB of data on each, and in the extreme case I expect to have 20-40 concurrent clients. I am kind of confused about which high-level Java client I should use (which one is the best/fastest for concurrent read operations): Hector, Pelops, Astyanax, or something else? I browsed the mailing list, but I came across different arguments and conclusions about the various clients. Thanks in advance, James
Re: which high level Java client
Hello, We are using Hector and it is a perfect match for our use case. https://github.com/hector-client/hector
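For a read-heavy workload like the one described above, a Hector client mostly comes down to a host configurator with a sensibly sized connection pool plus simple column queries. The sketch below is only illustrative: the host addresses, cluster name, pool size, keyspace, column family, and key names are placeholders to be tuned for the actual ring, not recommendations.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class ReadHeavyHectorClient {
    public static void main(String[] args) {
        // Host list and pool size are placeholders; tune them for the real cluster.
        CassandraHostConfigurator hosts =
                new CassandraHostConfigurator("10.0.0.1:9160,10.0.0.2:9160");
        hosts.setMaxActive(50);           // connections per host, shared by all threads
        hosts.setAutoDiscoverHosts(true); // pick up the remaining nodes automatically

        Cluster cluster = HFactory.getOrCreateCluster("ProdCluster", hosts);
        Keyspace keyspace = HFactory.createKeyspace("myks", cluster);

        // A single-column get(); Hector pools connections and fails over between hosts.
        ColumnQuery<String, String, String> query = HFactory.createColumnQuery(
                keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
        query.setColumnFamily("users").setKey("user-42").setName("email");
        QueryResult<HColumn<String, String>> result = query.execute();
        System.out.println(result.get() == null ? "not found" : result.get().getValue());
    }
}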