RE: Supported Cassandra version for CentOS 5.5
I am running Cassandra 1.2.12 on CentOS 5.10. Was running 1.1.15 previously without any issues as well. -Arindam From: Donald Smith [mailto:donald.sm...@audiencescience.com] Sent: Tuesday, February 25, 2014 3:40 PM To: user@cassandra.apache.org Subject: RE: Supported Cassandra version for CentOS 5.5 I was unable to get Cassandra working with CentOS 5.X; I needed to use CentOS 6.2 or 6.4. Don From: Hari Rajendhran hari.rajendh...@tcs.com Sent: Tuesday, February 25, 2014 2:34 AM To: user@cassandra.apache.org Subject: Supported Cassandra version for CentOS 5.5 Hi, Currently I am using CentOS 5.5. I need a clarification on the latest Cassandra version (preferably 2.0.4) that my OS supports. Best Regards Hari Krishnan Rajendhran Hadoop Admin DESS-ABIM, Chennai BIGDATA Galaxy Tata Consultancy Services
RE: Bootstrap stuck: vnode enabled 1.2.12
As an update - finally got the node to join the ring. Restarting all the nodes in the cluster, followed by a clean bootstrap of the node that was stuck, did the trick. -Arindam From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Monday, February 24, 2014 5:04 PM To: user@cassandra.apache.org Subject: RE: Bootstrap stuck: vnode enabled 1.2.12 The host would not join the ring after more clean bootstrap attempts. Noticed that nodetool netstats, even though it doesn't report any active streams, does constantly report "Nothing streaming" from 3 specific hosts in the ring.

$ nodetool netstats
xss = -ea -d64 -javaagent:/usr/local/cassandra/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8043M -Xmx8043M -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Mode: JOINING
Not sending any streams.
Nothing streaming from /10.67.XXX.XXX
Nothing streaming from /10.67.XXX.XXX
Nothing streaming from /10.67.XXX.XXX

Today when I had to do some unrelated maintenance and attempted to drain the hosts mentioned above before restarting Cassandra, the drain would just hang. Other hosts in the ring did not have any issue. Also, the original host that is stuck in the joining state logged the following:

[24/02/2014:15:49:42 PST] GossipTasks:1: ERROR AbstractStreamSession.java (line 110) Stream failed because /10.67.XXX.XXX died or was restarted/removed (streams may still be active in background, but further streams won't be started)
[24/02/2014:15:49:42 PST] GossipTasks:1: WARN RangeStreamer.java (line 246) Streaming from /10.67.XXX.XXX failed

From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Tuesday, February 18, 2014 5:16 PM To: user@cassandra.apache.org Subject: RE: Bootstrap stuck: vnode enabled 1.2.12 I believe you are talking about CASSANDRA-6685, which was introduced in 1.2.15. I'm trying to add a node to a production ring. I have added nodes previously just fine. However, this node had hardware issues during a previous bootstrap, and now even a clean bootstrap seems to be having problems. Does the ring somehow remember this node, and if so, can I make it forget about it? Decommission/removenode does not work on a node that hasn't yet bootstrapped. From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Tuesday, February 18, 2014 12:30 PM To: user@cassandra.apache.org Subject: Re: Bootstrap stuck: vnode enabled 1.2.12 There is a bug where a node without schema can not bootstrap. Do you have schema? On Tue, Feb 18, 2014 at 1:29 PM, Arindam Barua aba...@247-inc.com wrote: The node is still out of the ring. Any suggestions on how to get it in will be very helpful. From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Friday, February 14, 2014 1:04 AM To: user@cassandra.apache.org Subject: Bootstrap stuck: vnode enabled 1.2.12 After our otherwise successful upgrade procedure to enable vnodes, when adding back new hosts to our cluster, one non-seed host ran into a hardware issue during bootstrap. By the time the hardware issue was fixed a week later, all other nodes had been added successfully, cleaned, and repaired. The disks on this node were untouched, and when the node was started back up, it detected an interrupted bootstrap and attempted to bootstrap. However, after ~24 hrs it was still stuck in the 'JOINING' state according to nodetool netstats on that node, even though no streams were flowing to/from it.
Also, it did not appear in nodetool status in any way/form (not even as JOINING). From a couple of observed thread dumps, the stack of the thread blocked during bootstrap is at [1]. Since the node wasn't making any progress, I ended up stopping Cassandra, cleaning up the data and commitlog directories, and attempted a fresh bootstrap. Nodetool netstats immediately reported a whole bunch of streams queued up, and data started streaming to the node. The data directory quickly grew to 18 GB (the other nodes had ~25 GB, but we have a lot of data with low TTLs). However, the node ended up in the earlier reported state, i.e. nodetool netstats doesn't have anything queued, but still reports the JOINING state, even though it's been 24 hrs. There are no other ERRORs in the logs, and new data being written to the cluster makes it to this node just fine, triggering compactions, etc. from time to time. Any help is appreciated. Thanks, Arindam

[1] Thread dump
Thread 3708: (state = BLOCKED)
- sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=156 (Interpreted frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=811 (Interpreted frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(int)
FW: Sporadic gossip exception on add node
Did not help finally. So I enabled logging at debug level. The log files tell me that the node being added is communicating with the other nodes (which are seed nodes). Still nothing seems to be returning to that node. The log files on the other nodes show the shadow request being detected, but no other information such as being unable to send something back. And as before, just restarting that node once more does the trick and bootstrap proceeds. Maybe the problem has something to do with the gossip state? My test case is: decommission node 10.164.8.93, restart a clean node 10.164.8.93 and let it bootstrap. In my test case, I see that 7 minutes before the node is added again to the ring, the other nodes are detecting the decommission of the node.

2014-02-26 10:23:21.443 6 elapsed, /10.164.8.93 gossip quarantine over
2014-02-26 10:23:21.444 Ignoring state change for dead or unknown endpoint: /10.164.8.93
2014-02-26 10:23:55.636 Forcing conviction of /10.164.8.93
2014-02-26 10:24:00.230 Reseting version for /10.164.8.93
2014-02-26 10:24:00.230 Reseting version for /10.164.8.93

At the time the node 10.164.8.93 is added, the log shows:

2014-02-26 10:30:03.015 Cassandra version: 2.0.5-SNAPSHOT
2014-02-26 10:30:03.016 Thrift API version: 19.39.0
2014-02-26 10:30:03.018 CQL supported versions: 2.0.0,3.1.4 (default: 3.1.4)
2014-02-26 10:30:03.029 Loading persisted ring state
2014-02-26 10:30:03.034 Starting shadow gossip round to check for endpoint collision
2014-02-26 10:30:03.034 Starting Messaging Service on port 9804
2014-02-26 10:30:03.046 attempting to connect to /10.164.8.249
2014-02-26 10:30:03.047 attempting to connect to /10.164.8.250
2014-02-26 10:30:03.048 attempting to connect to /10.164.8.92
2014-02-26 10:30:03.051 Handshaking version with /10.164.8.250
2014-02-26 10:30:03.052 Handshaking version with /10.164.8.249
2014-02-26 10:30:03.052 Handshaking version with /10.164.8.92
2014-02-26 10:30:34.059 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1173) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:424) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:615) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:583) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:482) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:348) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:465) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.startUpCassandra(CassandraStore.java:495) [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-92200]
at be.landc.services.search.server.db.baseserver.indexsearch.store.cassandra.CassandraStore$CassThread.run(CassandraStore.java:461) [landc-services-search-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT-92200]
2014-02-26 10:30:34.069 Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException: null
at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1250) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:550) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.0.5-SNAPSHOT.jar:2.0.5-SNAPSHOT]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_40]
2014-02-26 10:31:34.081 ShutDownHook requests shutdown on be.landc.framework.service.ipl.Boot@730e8516

And at that time 3 other nodes print log information:

2014-02-26 10:30:03.051 Connection version 7 from /10.164.8.93
2014-02-26 10:30:03.065 Upgrading incoming connection to be compressed
2014-02-26 10:30:03.130 Max version for /10.164.8.93 is 7
2014-02-26 10:30:03.130 Setting version 7 for /10.164.8.93
2014-02-26 10:30:03.131 set version for /10.164.8.93 to 7
2014-02-26 10:30:03.131 Shadow request received, adding all states

Any more information I can pass? Regards, Ignace From: Desimpel, Ignace
Cassandra nodetool status result after restoring snapshot
Hi, I have two separate clusters, each consisting of 4 nodes. One cluster is running on 1.2.12 and the other one on 2.0.5. I loaded data from the first cluster (1.2.12) into the second one (2.0.5) by copying snapshots between corresponding nodes. I removed the commitlogs, started the second cluster and ran nodetool upgradesstables. After this I expected that nodetool status would give me the same results in the Load column on both clusters. Unfortunately it is completely different:
- old cluster: [728.02 GB, 558.24 GB, 787.08 GB, 555.1 GB]
- new cluster: [14.63 GB, 35.98 GB, 18 GB, 38.39 GB]
When I briefly check the data on the new cluster it looks fine. But I'm worried about this difference. Do you have any idea what it means? Thanks, Michu
Re: CQL decimal encoding
You may need to bit shift if that is the case Sent from my iPhone On Feb 26, 2014, at 2:53 AM, Ben Hood 0x6e6...@gmail.com wrote: Hey Colin, On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower cblo...@barracuda.com wrote: It looks like you are trying to implement the Decimal type. You might want to start with implementing the Integer type. The Decimal type follows pretty easily from the Integer type. For example:
i = unmarshalInteger(data[4:])
s = decInt(data[0:4])
out = inf.newDec(i, s)
Thanks for the suggestion. This is pretty much what I've got already. I think the issue might be to do with the way that big.Int doesn't appear to use two's complement to encode the varint. Maybe what is happening is that the encoding is isomorphic across, say, Java, .NET, Python and Ruby, but that the big.Int library in Go is not encoding in the same way. Cheers, Ben
Re: CQL decimal encoding
go uses 'zig-zag' encoding, perhaps that is the difference? On Wed, Feb 26, 2014 at 6:52 AM, Peter Lin wool...@gmail.com wrote: You may need to bit shift if that is the case Sent from my iPhone On Feb 26, 2014, at 2:53 AM, Ben Hood 0x6e6...@gmail.com wrote: Hey Colin, On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower cblo...@barracuda.com wrote: It looks like you are trying to implement the Decimal type. You might want to start with implementing the Integer type. The Decimal type follows pretty easily from the Integer type. For example:
i = unmarshalInteger(data[4:])
s = decInt(data[0:4])
out = inf.newDec(i, s)
Thanks for the suggestion. This is pretty much what I've got already. I think the issue might be to do with the way that big.Int doesn't appear to use two's complement to encode the varint. Maybe what is happening is that the encoding is isomorphic across, say, Java, .NET, Python and Ruby, but that the big.Int library in Go is not encoding in the same way. Cheers, Ben
Re: Getting the most-recent version from time-series data
And one last clarification. Where I said stored procedure earlier, I meant prepared statement. Sorry for the confusion. Too much typing while tired. -Tupshin On Tue, Feb 25, 2014 at 10:36 PM, Tupshin Harper tups...@tupshin.com wrote: I failed to address the matter of not knowing the families in advance. I can't really recommend any solution to that other than storing the list of families in another structure that is readily queryable. I don't know how many families you are thinking of, but if it is in the millions or more, you might consider constructing another table such as:

CREATE TABLE families (
  key int,
  family text,
  PRIMARY KEY (key, family)
);

Store your families there, with a knowable set of keys (I suggest something like the last 3 digits of the md5 hash of the family). Then you could retrieve your families in nicely sized batches:

SELECT family FROM families WHERE key=0;

and then do the fan-out selects that I described previously. -Tupshin On Tue, Feb 25, 2014 at 10:15 PM, Tupshin Harper tups...@tupshin.com wrote: Hi Clint, What you are describing could actually be accomplished with the Thrift API and a multiget_slice with a slicerange having a count of 1. Initially I was thinking that this was an important feature gap between Thrift and CQL, and was going to suggest that it should be implemented (possible syntax is in https://issues.apache.org/jira/browse/CASSANDRA-6167, which is almost a superset of this feature). But then I was convinced by some colleagues that with a modern CQL driver that is token aware, you are actually better off (in terms of latency, throughput, and reliability) doing each query separately on the client. The reasoning is that if you did this with a single query, it would necessarily be sent to a coordinator that wouldn't own most of the data that you are looking for. That coordinator would then need to fan out the read to all the nodes owning the partitions you are looking for. Far better to just do it directly on the client. The token-aware client will send each request for a row straight to a node that owns it. With a separate connection open to each node, this is done in parallel from the get-go. Fewer hops. Less load on the coordinator. No bottlenecks. And with a prepared statement, very very little additional overhead to the client, server, or network. -Tupshin On Tue, Feb 25, 2014 at 7:48 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi everyone, Let's say that I have a table that looks like the following:

CREATE TABLE time_series_stuff (
  key text,
  family text,
  version int,
  val text,
  PRIMARY KEY (key, family, version)
) WITH CLUSTERING ORDER BY (family ASC, version DESC)
  AND bloom_filter_fp_chance=0.01
  AND caching='KEYS_ONLY'
  AND comment=''
  AND dclocal_read_repair_chance=0.00
  AND gc_grace_seconds=864000
  AND index_interval=128
  AND read_repair_chance=0.10
  AND replicate_on_write='true'
  AND populate_io_cache_on_flush='false'
  AND default_time_to_live=0
  AND speculative_retry='99.0PERCENTILE'
  AND memtable_flush_period_in_ms=0
  AND compaction={'class': 'SizeTieredCompactionStrategy'}
  AND compression={'sstable_compression': 'LZ4Compressor'};

cqlsh:fiddle> select * from time_series_stuff;

 key    | family  | version | val
--------+---------+---------+--------
 monday | revenue |       3 | $$
 monday | revenue |       2 | $$$
 monday | revenue |       1 | $$
 monday | revenue |       0 | $
 monday | traffic |       2 | medium
 monday | traffic |       1 | light
 monday | traffic |       0 | heavy

(7 rows)

Now let's say that I'd like to perform a query that gets me the most recent N versions of revenue and traffic. Is there a CQL query to do this? Let's say that N=1. Then I know that I can do:

cqlsh:fiddle> select * from time_series_stuff where key='monday' and family='revenue' limit 1;

 key    | family  | version | val
--------+---------+---------+-----
 monday | revenue |       3 | $$

(1 rows)

cqlsh:fiddle> select * from time_series_stuff where key='monday' and family='traffic' limit 1;

 key    | family  | version | val
--------+---------+---------+--------
 monday | traffic |       2 | medium

(1 rows)

But what if I have lots of families and I want to get the most recent N versions of all of them in a single CQL statement? Is that possible? Unfortunately I am working on something where the family names and the number of most-recent versions are not known a priori (I am porting some code that was designed for HBase). Best regards, Clint
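[Editor's illustration] Tupshin's fan-out approach is easy to sketch against the schema above. The following is a minimal, hedged example in Go with the gocql driver (the driver choice is an assumption for illustration; the thread itself concerns Java clients), querying each family's newest row in parallel with token-aware routing:

package main

import (
	"fmt"
	"log"
	"sync"

	"github.com/gocql/gocql"
)

type latest struct {
	family  string
	version int
	val     string
}

func main() {
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.Keyspace = "fiddle"
	// Route each query straight to a replica that owns the partition,
	// avoiding the coordinator fan-out described above (gocql API;
	// names may differ across driver versions).
	cluster.PoolConfig.HostSelectionPolicy =
		gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy())
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	families := []string{"revenue", "traffic"} // e.g. fetched from a families table
	results := make(chan latest, len(families))
	var wg sync.WaitGroup
	for _, fam := range families {
		wg.Add(1)
		go func(fam string) {
			defer wg.Done()
			l := latest{family: fam}
			// Clustering order (version DESC) makes LIMIT 1 the newest row.
			if err := session.Query(
				`SELECT version, val FROM time_series_stuff
				 WHERE key = ? AND family = ? LIMIT 1`,
				"monday", fam).Scan(&l.version, &l.val); err != nil {
				log.Printf("%s: %v", fam, err)
				return
			}
			results <- l
		}(fam)
	}
	wg.Wait()
	close(results)
	for l := range results {
		fmt.Printf("%s -> v%d %s\n", l.family, l.version, l.val)
	}
}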
Cassandra Java Client
Hi, is the DataStax Java Driver for Apache Cassandra ( https://github.com/datastax/java-driver) the official/recommended Java Client to use for accessing Cassandra from Java? Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any CQL clients? Thanks!
Re: Cassandra Java Client
Short answer: yes. Long answer: depending on whether you want to access Cassandra using Thrift or the native CQL3 protocol, different options are available. For Thrift access, lots of choices (Hector, Astyanax...). For CQL3, right now the only Java client so far is the one provided by DataStax. Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any CQL clients? No, the apache jars only ship the server-related components. On Wed, Feb 26, 2014 at 2:00 PM, Timmy Turner timm.t...@gmail.com wrote: Hi, is the DataStax Java Driver for Apache Cassandra ( https://github.com/datastax/java-driver) the official/recommended Java Client to use for accessing Cassandra from Java? Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any CQL clients? Thanks!
Re: Naive question about orphan rows
It is probably ok to have redundant songs in playlists; Cassandra is about denormalization. Dealing with this issue is going to be hard, since the only way to deal with this would be scanning through the first cf and producing counts, then using that information to delete in the second table. However, that information can change rapidly and will then fall out of sync fast. The only ways to handle this are 1) never delete songs 2) store copies of songs in playlists On Friday, February 21, 2014, Green, John M (HP Education) john.gr...@hp.com wrote: I'm very much a newbie so this may be a silly question but ... I have a situation similar to the music service example ( http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html) of songs and playlists. However, in my case, the songs would be considered orphans that should be deleted when no playlists refer to them. Relational databases have mechanisms to manage this relationship so that a song could be deleted as soon as the last playlist referencing it is deleted. While I do NOT need to manage this as an atomic transaction, I'm wondering what is the best way to delete orphaned rows (i.e., songs not referenced by any playlists) using Cassandra. I guess an alternative approach would be to store songs directly in the playlists but this could lead to many redundant copies of the same song, which is something I'm hoping to avoid. In my case the playlists could have thousands of entries and the songs might be blobs of 10s of Mbytes. Maybe I'm just having a hard time abandoning my relational roots? John -- Sorry this was sent from mobile. Will do less grammar and spell check than usual.
Re: Cassandra nodetool status result after restoring snapshot
Hello, That's a large variation between the old and new cluster. Are you sure you pulled over all the SSTables for your keyspaces? Also, did you run a repair after the data move? Do you have a lot of tombstone data in the old cluster that was removed during the migration process? Are you using OpsCenter? A quick comparison of cfstats between clusters may help you analyze your situation and help you pinpoint if you are missing any data for a particular keyspace, etc. as well. Thanks, Jonathan Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487 http://www.linkedin.com/in/jlacefield http://www.datastax.com/what-we-offer/products-services/training/virtual-training On Wed, Feb 26, 2014 at 6:07 AM, Ranking Lekarzy rankinglekarzy@gmail.com wrote: Hi, I have two separate clusters, each consisting of 4 nodes. One cluster is running on 1.2.12 and the other one on 2.0.5. I loaded data from the first cluster (1.2.12) into the second one (2.0.5) by copying snapshots between corresponding nodes. I removed the commitlogs, started the second cluster and ran nodetool upgradesstables. After this I expected that nodetool status would give me the same results in the Load column on both clusters. Unfortunately it is completely different: - old cluster: [728.02 GB, 558.24 GB, 787.08 GB, 555.1 GB] - new cluster: [14.63 GB, 35.98 GB, 18 GB, 38.39 GB] When I briefly check the data on the new cluster it looks fine. But I'm worried about this difference. Do you have any idea what it means? Thanks, Michu
Re: Cassandra Java Client
Kundera does support CQL3. Work to support the DataStax Java driver is under development. https://github.com/impetus-opensource/Kundera/issues/385 -Vivek On Wed, Feb 26, 2014 at 6:34 PM, DuyHai Doan doanduy...@gmail.com wrote: Short answer: yes. Long answer: depending on whether you want to access Cassandra using Thrift or the native CQL3 protocol, different options are available. For Thrift access, lots of choices (Hector, Astyanax...). For CQL3, right now the only Java client so far is the one provided by DataStax. Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any CQL clients? No, the apache jars only ship the server-related components. On Wed, Feb 26, 2014 at 2:00 PM, Timmy Turner timm.t...@gmail.com wrote: Hi, is the DataStax Java Driver for Apache Cassandra ( https://github.com/datastax/java-driver) the official/recommended Java Client to use for accessing Cassandra from Java? Does Cassandra itself (i.e. the apache-cassandra-* jars) not contain any CQL clients? Thanks!
Flushing after dropping a column family
Hi, I'm trying to truncate data on a single-node 2.0.5 instance and I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. So I was wondering what nodetool operation I should use to completely nuke the old data, short of dropping the entire keyspace. Cheers, Ben
Re: Flushing after dropping a column family
I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. -- What do you mean by underlying data? Are you talking about snapshots? If yes, you can wipe them using the nodetool clearsnapshot command. On Wed, Feb 26, 2014 at 4:14 PM, Ben Hood 0x6e6...@gmail.com wrote: Hi, I'm trying to truncate data on a single-node 2.0.5 instance and I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. So I was wondering what nodetool operation I should use to completely nuke the old data, short of dropping the entire keyspace. Cheers, Ben
RE: Naive question about orphan rows
Edward, Thanks for your insight. One other thought I had was to store a reference count with the song. When the last playlist referencing the song is deleted, the song will also be deleted because the reference count decrements to zero. However, this would create some nastiness when it comes to reliably maintaining reference counts. I'm not sure if it would help to split the reference count into two monotonically increasing counters (number of references added, and number of references deleted). In my case, users cannot browse a repository of songs to build a playlist from scratch. They can only import songs themselves or create references to songs other users have explicitly made available to them. Once a song is not referred to by any playlist it will never be re-discovered, so it should be deleted. This could be done in some sort of background data maintenance job that runs periodically. Even if it is a low-priority background job, it looks like it will create a lot of overhead (scanning and producing counts). John From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Wednesday, February 26, 2014 5:56 AM To: user@cassandra.apache.org Subject: Re: Naive question about orphan rows It is probably ok to have redundant songs in playlists; Cassandra is about denormalization. Dealing with this issue is going to be hard, since the only way to deal with this would be scanning through the first cf and producing counts, then using that information to delete in the second table. However, that information can change rapidly and will then fall out of sync fast. The only ways to handle this are 1) never delete songs 2) store copies of songs in playlists On Friday, February 21, 2014, Green, John M (HP Education) john.gr...@hp.com wrote: I'm very much a newbie so this may be a silly question but ... I have a situation similar to the music service example (http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html) of songs and playlists. However, in my case, the songs would be considered orphans that should be deleted when no playlists refer to them. Relational databases have mechanisms to manage this relationship so that a song could be deleted as soon as the last playlist referencing it is deleted. While I do NOT need to manage this as an atomic transaction, I'm wondering what is the best way to delete orphaned rows (i.e., songs not referenced by any playlists) using Cassandra. I guess an alternative approach would be to store songs directly in the playlists but this could lead to many redundant copies of the same song, which is something I'm hoping to avoid. In my case the playlists could have thousands of entries and the songs might be blobs of 10s of Mbytes. Maybe I'm just having a hard time abandoning my relational roots? John -- Sorry this was sent from mobile. Will do less grammar and spell check than usual.
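[Editor's illustration] John's two-counter idea maps onto a Cassandra counter table. A rough sketch in Go with gocql, against a hypothetical song_refs table (not part of the thread); note that counters can over- or under-count on retried writes, so added minus removed is best treated as a hint for the background reaper, not as ground truth:

package songs

import "github.com/gocql/gocql"

// Hypothetical schema:
//   CREATE TABLE song_refs (song text PRIMARY KEY,
//                           added counter, removed counter);

// bumpRef records one reference added to or removed from a song.
// All non-key columns in a counter table must be counters.
func bumpRef(s *gocql.Session, song string, added bool) error {
	col := "removed"
	if added {
		col = "added"
	}
	return s.Query(
		"UPDATE song_refs SET "+col+" = "+col+" + 1 WHERE song = ?",
		song).Exec()
}

A periodic job could then scan for rows where the two counters are equal and double-check the playlists before deleting the song.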
Re: Flushing after dropping a column family
On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote: I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. -- What do you mean by underlying data? Are you talking about snapshots? I was referring to all of the state related to the particular column family I want to set fire to, be it snapshots, parts of commit logs, sstables, key caches, row caches, or anything else on or off disk that relates to said column family. If yes, you can wipe them using the nodetool clearsnapshot command This is what I'm doing:

cqlsh:bar> drop table foo;
$ nodetool clearsnapshot bar
Requested clearing snapshot for: bar
cqlsh:bar> create table foo ();
cqlsh:bar> select * from foo limit 1;

This returns nothing (as you would expect). But if I re-run this again after about a minute, the data is back. I get the same behavior if I use nodetool cleanup, flush, compact or repair. It's as if there is either a background app process filling the table up again or the deletion hasn't taken place.
Re: Flushing after dropping a column family
Try truncate foo instead of drop table foo. About the nodetool clearsnapshot, I've experienced the same behavior before. Snapshot cleaning is not immediate. On Wed, Feb 26, 2014 at 4:53 PM, Ben Hood 0x6e6...@gmail.com wrote: On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote: I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. -- What do you mean by underlying data? Are you talking about snapshots? I was referring to all of the state related to the particular column family I want to set fire to, be it snapshots, parts of commit logs, sstables, key caches, row caches, or anything else on or off disk that relates to said column family. If yes, you can wipe them using the nodetool clearsnapshot command This is what I'm doing: cqlsh:bar> drop table foo; $ nodetool clearsnapshot bar Requested clearing snapshot for: bar cqlsh:bar> create table foo (); cqlsh:bar> select * from foo limit 1; This returns nothing (as you would expect). But if I re-run this again after about a minute, the data is back. I get the same behavior if I use nodetool cleanup, flush, compact or repair. It's as if there is either a background app process filling the table up again or the deletion hasn't taken place.
Re: Flushing after dropping a column family
This is a known issue that is fixed in 2.1beta1. https://issues.apache.org/jira/browse/CASSANDRA-5202 Until 2.1, we do not recommend relying on the recycling of tables through drop/create or truncate. However, on a single node cluster, I suspect that truncate will work far more reliably than drop/recreate. -Tupshin On Wed, Feb 26, 2014 at 10:53 AM, Ben Hood 0x6e6...@gmail.com wrote: On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote: I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. -- What do you mean by underlying data? Are you talking about snapshots? I was referring to all of the state related to the particular column family I want to set fire to, be it snapshots, parts of commit logs, sstables, key caches, row caches, or anything else on or off disk that relates to said column family. If yes, you can wipe them using the nodetool clearsnapshot command This is what I'm doing: cqlsh:bar> drop table foo; $ nodetool clearsnapshot bar Requested clearing snapshot for: bar cqlsh:bar> create table foo (); cqlsh:bar> select * from foo limit 1; This returns nothing (as you would expect). But if I re-run this again after about a minute, the data is back. I get the same behavior if I use nodetool cleanup, flush, compact or repair. It's as if there is either a background app process filling the table up again or the deletion hasn't taken place.
Re: Flushing after dropping a column family
I've played with it using a 2-node cluster with auto_snapshot = false in cassandra.yaml and by deactivating durable writes (no commitlog). In my case, truncating tables allows cleaning up data. With nodetool status, I can see the data payload decreasing from GB to some kbytes. On Wed, Feb 26, 2014 at 4:59 PM, Tupshin Harper tups...@tupshin.com wrote: This is a known issue that is fixed in 2.1beta1. https://issues.apache.org/jira/browse/CASSANDRA-5202 Until 2.1, we do not recommend relying on the recycling of tables through drop/create or truncate. However, on a single node cluster, I suspect that truncate will work far more reliably than drop/recreate. -Tupshin On Wed, Feb 26, 2014 at 10:53 AM, Ben Hood 0x6e6...@gmail.com wrote: On Wed, Feb 26, 2014 at 3:17 PM, DuyHai Doan doanduy...@gmail.com wrote: I'm noticing that using either TRUNCATE or DROP/CREATE in cqlsh appears to leave the underlying data behind. -- What do you mean by underlying data? Are you talking about snapshots? I was referring to all of the state related to the particular column family I want to set fire to, be it snapshots, parts of commit logs, sstables, key caches, row caches, or anything else on or off disk that relates to said column family. If yes, you can wipe them using the nodetool clearsnapshot command This is what I'm doing: cqlsh:bar> drop table foo; $ nodetool clearsnapshot bar Requested clearing snapshot for: bar cqlsh:bar> create table foo (); cqlsh:bar> select * from foo limit 1; This returns nothing (as you would expect). But if I re-run this again after about a minute, the data is back. I get the same behavior if I use nodetool cleanup, flush, compact or repair. It's as if there is either a background app process filling the table up again or the deletion hasn't taken place.
Re: Naive question about orphan rows
Right, the problem with building a list of counts in a batch is what happens if a song is added as you are building the counts. On Wed, Feb 26, 2014 at 10:32 AM, Green, John M (HP Education) john.gr...@hp.com wrote: Edward, Thanks for your insight. One other thought I had was to store a reference count with the song. When the last playlist referencing the song is deleted, the song will also be deleted because the reference count decrements to zero. However, this would create some nastiness when it comes to reliably maintaining reference counts. I'm not sure if it would help to split the reference count into two monotonically increasing counters (number of references added, and number of references deleted). In my case, users cannot browse a repository of songs to build a playlist from scratch. They can only import songs themselves or create references to songs other users have explicitly made available to them. Once a song is not referred to by any playlist it will never be re-discovered, so it should be deleted. This could be done in some sort of background data maintenance job that runs periodically. Even if it is a low-priority background job, it looks like it will create a lot of overhead (scanning and producing counts). John From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Wednesday, February 26, 2014 5:56 AM To: user@cassandra.apache.org Subject: Re: Naive question about orphan rows It is probably ok to have redundant songs in playlists; Cassandra is about denormalization. Dealing with this issue is going to be hard, since the only way to deal with this would be scanning through the first cf and producing counts, then using that information to delete in the second table. However, that information can change rapidly and will then fall out of sync fast. The only ways to handle this are 1) never delete songs 2) store copies of songs in playlists On Friday, February 21, 2014, Green, John M (HP Education) john.gr...@hp.com wrote: I'm very much a newbie so this may be a silly question but ... I have a situation similar to the music service example ( http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html) of songs and playlists. However, in my case, the songs would be considered orphans that should be deleted when no playlists refer to them. Relational databases have mechanisms to manage this relationship so that a song could be deleted as soon as the last playlist referencing it is deleted. While I do NOT need to manage this as an atomic transaction, I'm wondering what is the best way to delete orphaned rows (i.e., songs not referenced by any playlists) using Cassandra. I guess an alternative approach would be to store songs directly in the playlists but this could lead to many redundant copies of the same song, which is something I'm hoping to avoid. In my case the playlists could have thousands of entries and the songs might be blobs of 10s of Mbytes. Maybe I'm just having a hard time abandoning my relational roots? John -- Sorry this was sent from mobile. Will do less grammar and spell check than usual.
Re: Flushing after dropping a column family
On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan doanduy...@gmail.com wrote: Try truncate foo instead of drop table foo. About the nodetool clearsnapshot, I've experienced the same behavior before. Snapshot cleaning is not immediate. I get the same behavior with truncate as well.
Re: Flushing after dropping a column family
On Wed, Feb 26, 2014 at 3:59 PM, Tupshin Harper tups...@tupshin.com wrote: This is a known issue that is fixed in 2.1beta1. https://issues.apache.org/jira/browse/CASSANDRA-5202 Until 2.1, we do not recommend relying on the recycling of tables through drop/create or truncate. However, on a single node cluster, I suspect that truncate will work far more reliably than drop/recreate. Cool, thanks for the heads up. Short of using 2.1beta1, is there a more stable way of recycling tables?
RE: Supported Cassandra version for CentOS 5.5
Oh, I should add that I was trying to use Cassandra 2.0.X on CentOS and it needed CentOS 6.2+. Don From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Wednesday, February 26, 2014 1:52 AM To: user@cassandra.apache.org Subject: RE: Supported Cassandra version for CentOS 5.5 I am running Cassandra 1.2.12 on CentOS 5.10. Was running 1.1.15 previously without any issues as well. -Arindam From: Donald Smith [mailto:donald.sm...@audiencescience.com] Sent: Tuesday, February 25, 2014 3:40 PM To: user@cassandra.apache.org Subject: RE: Supported Cassandra version for CentOS 5.5 I was unable to get Cassandra working with CentOS 5.X; I needed to use CentOS 6.2 or 6.4. Don From: Hari Rajendhran hari.rajendh...@tcs.com Sent: Tuesday, February 25, 2014 2:34 AM To: user@cassandra.apache.org Subject: Supported Cassandra version for CentOS 5.5 Hi, Currently I am using CentOS 5.5. I need a clarification on the latest Cassandra version (preferably 2.0.4) that my OS supports. Best Regards Hari Krishnan Rajendhran Hadoop Admin DESS-ABIM, Chennai BIGDATA Galaxy Tata Consultancy Services
Re: Supported Cassandra version for CentOS 5.5
I am running Cassandra 2.0.5 on CentOS 5.9 without issue. Getting CassandraPDO to work with PHP... well that's another matter entirely. I haven't had any luck there at all. I may have to move to CentOS 6.x for that reason alone! Tim On Wed, Feb 26, 2014 at 11:55 AM, Donald Smith donald.sm...@audiencescience.com wrote: Oh, I should add that I was trying to use Cassandra 2.0.X on CentOS and it needed CentOS 6.2+. Don From: Arindam Barua [mailto:aba...@247-inc.com] Sent: Wednesday, February 26, 2014 1:52 AM To: user@cassandra.apache.org Subject: RE: Supported Cassandra version for CentOS 5.5 I am running Cassandra 1.2.12 on CentOS 5.10. Was running 1.1.15 previously without any issues as well. -Arindam From: Donald Smith [mailto:donald.sm...@audiencescience.com] Sent: Tuesday, February 25, 2014 3:40 PM To: user@cassandra.apache.org Subject: RE: Supported Cassandra version for CentOS 5.5 I was unable to get Cassandra working with CentOS 5.X; I needed to use CentOS 6.2 or 6.4. Don From: Hari Rajendhran hari.rajendh...@tcs.com Sent: Tuesday, February 25, 2014 2:34 AM To: user@cassandra.apache.org Subject: Supported Cassandra version for CentOS 5.5 Hi, Currently I am using CentOS 5.5. I need a clarification on the latest Cassandra version (preferably 2.0.4) that my OS supports. Best Regards Hari Krishnan Rajendhran Hadoop Admin DESS-ABIM, Chennai BIGDATA Galaxy Tata Consultancy Services -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: Flushing after dropping a column family
I use truncate between my test cases. Never had a problem with one test case inheriting the data from the previous one. I'm using a single node, so that may be why. On 2/26/14, 9:27 AM, Ben Hood 0x6e6...@gmail.com wrote: On Wed, Feb 26, 2014 at 3:58 PM, DuyHai Doan doanduy...@gmail.com wrote: Try truncate foo instead of drop table foo. About the nodetool clearsnapshot, I've experienced the same behavior before. Snapshot cleaning is not immediate. I get the same behavior with truncate as well.
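[Editor's illustration] A hedged sketch of that truncate-between-tests pattern in Go with gocql (the keyspace and table names are placeholders, not from the thread):

package store

import (
	"testing"

	"github.com/gocql/gocql"
)

func newTestSession(t *testing.T) *gocql.Session {
	cluster := gocql.NewCluster("127.0.0.1")
	cluster.Keyspace = "test_ks" // placeholder test keyspace
	s, err := cluster.CreateSession()
	if err != nil {
		t.Fatal(err)
	}
	return s
}

// truncateAll resets the tables a test touches. TRUNCATE keeps the schema
// in place, so, unlike DROP/CREATE, there is no schema change to wait for
// between test cases.
func truncateAll(t *testing.T, s *gocql.Session, tables ...string) {
	t.Helper()
	for _, tbl := range tables {
		if err := s.Query("TRUNCATE " + tbl).Exec(); err != nil {
			t.Fatalf("truncate %s: %v", tbl, err)
		}
	}
}

func TestReadAfterWrite(t *testing.T) {
	s := newTestSession(t)
	defer s.Close()
	truncateAll(t, s, "foo")
	// ... exercise code against an empty foo ...
}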
Commit logs building up
We're running 2.0.5, recently upgraded from 1.2.14. Sometimes we are seeing CommitLogs starting to build up. Is this a potential bug? Or a symptom of something else we can easily address? We have:
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1
commitlog_segment_size_in_mb: 512
Thanks, Chris
Re: Naive question about orphan rows
One way to handle this is for both tables to be de-normalized. Take these two: SongsAndPlaylists and PlaylistsAndSongs. In this way your client software is charged with keeping the data in sync. When you remove a song from PlaylistsAndSongs, you do a read for that song in SongsAndPlaylists. If the number of playlists with that song is now 0, the song can be removed. On Wed, Feb 26, 2014 at 11:17 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Right, the problem with building a list of counts in a batch is what happens if a song is added as you are building the counts. On Wed, Feb 26, 2014 at 10:32 AM, Green, John M (HP Education) john.gr...@hp.com wrote: Edward, Thanks for your insight. One other thought I had was to store a reference count with the song. When the last playlist referencing the song is deleted, the song will also be deleted because the reference count decrements to zero. However, this would create some nastiness when it comes to reliably maintaining reference counts. I'm not sure if it would help to split the reference count into two monotonically increasing counters (number of references added, and number of references deleted). In my case, users cannot browse a repository of songs to build a playlist from scratch. They can only import songs themselves or create references to songs other users have explicitly made available to them. Once a song is not referred to by any playlist it will never be re-discovered, so it should be deleted. This could be done in some sort of background data maintenance job that runs periodically. Even if it is a low-priority background job, it looks like it will create a lot of overhead (scanning and producing counts). John From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Wednesday, February 26, 2014 5:56 AM To: user@cassandra.apache.org Subject: Re: Naive question about orphan rows It is probably ok to have redundant songs in playlists; Cassandra is about denormalization. Dealing with this issue is going to be hard, since the only way to deal with this would be scanning through the first cf and producing counts, then using that information to delete in the second table. However, that information can change rapidly and will then fall out of sync fast. The only ways to handle this are 1) never delete songs 2) store copies of songs in playlists On Friday, February 21, 2014, Green, John M (HP Education) john.gr...@hp.com wrote: I'm very much a newbie so this may be a silly question but ... I have a situation similar to the music service example ( http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html) of songs and playlists. However, in my case, the songs would be considered orphans that should be deleted when no playlists refer to them. Relational databases have mechanisms to manage this relationship so that a song could be deleted as soon as the last playlist referencing it is deleted. While I do NOT need to manage this as an atomic transaction, I'm wondering what is the best way to delete orphaned rows (i.e., songs not referenced by any playlists) using Cassandra. I guess an alternative approach would be to store songs directly in the playlists but this could lead to many redundant copies of the same song, which is something I'm hoping to avoid. In my case the playlists could have thousands of entries and the songs might be blobs of 10s of Mbytes. Maybe I'm just having a hard time abandoning my relational roots? John -- Sorry this was sent from mobile. Will do less grammar and spell check than usual.
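[Editor's illustration] A sketch of Edward's read-then-delete flow in Go with gocql, assuming hypothetical tables playlists_and_songs(playlist, song), songs_and_playlists(song, playlist) and songs(song, data); as noted above, without an atomic check-and-delete this can race with a concurrent re-add of the song:

package songs

import "github.com/gocql/gocql"

// removeFromPlaylist deletes one playlist->song edge and, if the reverse
// index shows no remaining playlists, deletes the song itself.
func removeFromPlaylist(s *gocql.Session, playlist, song string) error {
	if err := s.Query(
		`DELETE FROM playlists_and_songs WHERE playlist = ? AND song = ?`,
		playlist, song).Exec(); err != nil {
		return err
	}
	if err := s.Query(
		`DELETE FROM songs_and_playlists WHERE song = ? AND playlist = ?`,
		song, playlist).Exec(); err != nil {
		return err
	}
	// Is any playlist still referencing the song?
	var remaining string
	err := s.Query(
		`SELECT playlist FROM songs_and_playlists WHERE song = ? LIMIT 1`,
		song).Scan(&remaining)
	if err == gocql.ErrNotFound {
		// No playlist references the song any more: drop the blob.
		return s.Query(`DELETE FROM songs WHERE song = ?`, song).Exec()
	}
	return err
}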
Re: Update multiple rows in a CQL lightweight transaction
Thanks for your help everyone. Sylvain, as I understand it, the scenario I described above is not resolved by CASSANDRA-6561, correct? (This scenario may not matter to most folks, which is totally fine, I just want to make sure that I understand.) Should I instead look into using the Thrift API to address this? Best regards, Clint On Tue, Feb 25, 2014 at 11:30 PM, Sylvain Lebresne sylv...@datastax.com wrote: Sorry to interject again here, but CASSANDRA-5633 will not be picked up because pretty much everything it was set to fix is fixed by CASSANDRA-6561; this is *not* a syntax problem anymore. On Wed, Feb 26, 2014 at 3:18 AM, Tupshin Harper tups...@tupshin.com wrote: Unfortunately there is no option to vote for a resolved ticket, but if you can propose a better syntax that people agree on, you could probably get some fresh traction on it. -Tupshin On Feb 25, 2014 7:20 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi Tupshin, Thanks for your help! Unfortunately in my case, I will need to do a compare-and-set in which the compare is against a value in a dynamic column. In general, I need to be able to do the following: - Check whether a given value exists in a dynamic column - If so, perform some number of insertions / deletions for dynamic columns in the same row (i.e., with the same partition key as the dynamic column used for the compare) I think you are correct that I need https://issues.apache.org/jira/browse/CASSANDRA-5633 to be implemented. Is there any way to vote for that to get picked up again? :) Best regards, Clint On Mon, Feb 24, 2014 at 2:32 PM, Tupshin Harper tups...@tupshin.com wrote: Hi Clint, That does appear to be an omission in CQL3. It would be possible to simulate it by doing:

BEGIN BATCH
  UPDATE foo SET z = 10 WHERE x = 'a' AND y = 1 IF t = 2 AND z = 10;
  UPDATE foo SET t = 5, z = 6 WHERE x = 'a' AND y = 4;
APPLY BATCH;

However, this does a redundant write to the first row if the condition holds, and I certainly wouldn't recommend doing that routinely. Alternatively, depending on your needs, you might be able to use a static column (coming with 2.0.6) as your conditional flag, as that column is shared by all rows in the partition. -Tupshin On Mon, Feb 24, 2014 at 3:57 PM, Clint Kelly clint.ke...@gmail.com wrote: Hi Tupshin, Thanks for your help; I appreciate it. Could I do something like the following? Given the same table you started with:

 x | y | t | z
---+---+---+----
 a | 1 | 2 | 10
 a | 2 | 2 | 20

I'd like to write a compare-and-set that does something like: If there is a row with (x,y,t,z) = (a,1,2,10), then update/insert a row with (x,y,t,z) = (a,3,4,5) and update/insert a row with (x,y,t,z) = (a,4,5,6). I don't see how I could do this with what you outlined above---just curious. It seems like what I describe above under the hood would be a compare-and-(batch)-set on a single wide row, so it may be possible with the Thrift API (I have to check). Thanks again! Best regards, Clint On Sat, Feb 22, 2014 at 11:38 AM, Tupshin Harper tups...@tupshin.com wrote: #5633 was actually closed because of the static columns feature (https://issues.apache.org/jira/browse/CASSANDRA-6561), which has been checked in to the 2.0 branch but is not yet part of a release (it will be in 2.0.6). That feature will let you update multiple rows within a single partition by doing a CAS write based on a static column shared by all rows within the partition. Example extracted from the ticket:

CREATE TABLE foo (
  x text,
  y bigint,
  t bigint static,
  z bigint,
  PRIMARY KEY (x, y)
);

insert into foo (x, y, t, z) values ('a', 1, 1, 10);
insert into foo (x, y, t, z) values ('a', 2, 2, 20);

select * from foo;

 x | y | t | z
---+---+---+----
 a | 1 | 2 | 10
 a | 2 | 2 | 20

(Note that both values of t are 2 because it is static.)

begin batch
  update foo set z = 1 where x = 'a' and y = 1;
  update foo set z = 2 where x = 'a' and y = 2 if t = 4;
apply batch;

 [applied] | x | y    | t
-----------+---+------+---
     False | a | null | 2

(Both updates failed to apply because there was an unmet conditional on one of them.)

select * from foo;

 x | y | t | z
---+---+---+----
 a | 1 | 2 | 10
 a | 2 | 2 | 20

begin batch
  update foo set z = 1 where x = 'a' and y = 1;
  update foo set z = 2 where x = 'a' and y = 2 if t = 2;
apply batch;

 [applied]
-----------
      True

(Both updates succeeded because the check on t succeeded.)

select * from foo;

 x | y | t | z
---+---+---+---
 a | 1 | 2 | 1
 a | 2 | 2 | 2

Hope this helps. -Tupshin On Fri, Feb 21, 2014 at 6:05 PM, DuyHai Doan doanduy...@gmail.com wrote: Hello Clint The Resolution status of the JIRA is set to Later; probably the implementation is not done yet. The JIRA was opened to discuss the impl strategy but
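[Editor's illustration] The same conditional batch can be driven from application code. A hedged sketch with Go's gocql driver (an assumed client; the thread's examples are cqlsh), using its batch-CAS helper to learn whether the conditions held:

package cas

import "github.com/gocql/gocql"

// applyIfT2 issues the second batch above from a client: both updates
// apply only when the static column t equals 2 for partition 'a'.
func applyIfT2(s *gocql.Session) (bool, error) {
	b := s.NewBatch(gocql.LoggedBatch)
	b.Query(`UPDATE foo SET z = 1 WHERE x = 'a' AND y = 1`)
	b.Query(`UPDATE foo SET z = 2 WHERE x = 'a' AND y = 2 IF t = 2`)
	// prev receives the existing column values when the condition fails,
	// mirroring the [applied] | x | y | t row that cqlsh prints.
	prev := make(map[string]interface{})
	applied, iter, err := s.MapExecuteBatchCAS(b, prev)
	if err != nil {
		return false, err
	}
	return applied, iter.Close()
}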
Combine multiple SELECT statements into one RPC?
Hi all, Is there any way to use the DataStax Java driver to combine multiple SELECT statements into a single RPC? I assume not (I could not find anything about this in the documentation), but I just wanted to check. Thanks! Best regards, Clint
Re: CQL decimal encoding
On Wed, Feb 26, 2014 at 12:10 PM, Laing, Michael michael.la...@nytimes.com wrote: go uses 'zig-zag' encoding, perhaps that is the difference? On Wed, Feb 26, 2014 at 6:52 AM, Peter Lin wool...@gmail.com wrote: You may need to bit shift if that is the case Thanks for everybody's help, I've managed to solve the issue: the unscaled part of the decimal needs to be encoded using two's complement. Neither the standard Go big.Rat type nor the more amenable replacement inf.Dec uses two's complement encoding, which is what Java BigDecimal and the other languages are doing. Ironically, the code to do the two's complement packing and unpacking is available in the asn1 module of the standard Go library. Unfortunately those functions are not exported outside the package scope, since they are designed for internal use only. So open source to the rescue. Hopefully the gocql team can code review this soon, and if that's good to go, we'll have another CQL driver that can deal with decimals.
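[Editor's illustration] Concretely, the packing Ben describes needs only math/big; the following is an illustrative sketch (not the gocql patch itself). A CQL decimal is a 4-byte big-endian scale followed by the unscaled value as a minimal big-endian two's-complement integer:

package main

import (
	"encoding/binary"
	"fmt"
	"math/big"
)

// encodeVarint writes n as a minimal-length big-endian two's-complement
// byte slice, matching the CQL varint/decimal unscaled encoding.
func encodeVarint(n *big.Int) []byte {
	switch n.Sign() {
	case 0:
		return []byte{0}
	case 1:
		b := n.Bytes()
		if b[0]&0x80 != 0 {
			b = append([]byte{0}, b...) // keep the sign bit clear
		}
		return b
	}
	// Negative: two's complement is the bitwise complement of (-n - 1).
	m := new(big.Int).Neg(n)
	m.Sub(m, big.NewInt(1))
	b := m.Bytes()
	if len(b) == 0 {
		b = []byte{0} // n == -1
	}
	for i := range b {
		b[i] = ^b[i]
	}
	if b[0]&0x80 == 0 {
		b = append([]byte{0xff}, b...) // keep the sign bit set
	}
	return b
}

// decodeVarint is the inverse: interpret b as big-endian two's complement.
func decodeVarint(b []byte) *big.Int {
	n := new(big.Int).SetBytes(b)
	if len(b) > 0 && b[0]&0x80 != 0 {
		// Negative: subtract 2^(8*len(b)).
		n.Sub(n, new(big.Int).Lsh(big.NewInt(1), uint(len(b))*8))
	}
	return n
}

// encodeDecimal prefixes the 4-byte big-endian scale.
func encodeDecimal(scale int32, unscaled *big.Int) []byte {
	out := make([]byte, 4)
	binary.BigEndian.PutUint32(out, uint32(scale))
	return append(out, encodeVarint(unscaled)...)
}

func main() {
	// 12.34 is unscaled 1234 with scale 2.
	fmt.Printf("% x\n", encodeDecimal(2, big.NewInt(1234))) // 00 00 00 02 04 d2
	fmt.Println(decodeVarint([]byte{0xff, 0x7f}))           // -129
}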
Re: CQL decimal encoding
On Thu, Feb 27, 2014 at 12:01 AM, Ben Hood 0x6e6...@gmail.com wrote: Hopefully the gocql team can code review this soon and if that's good to go, we'll have another CQL driver that can deal with decimals. BTW thanks and kudos go to Theo and Tyler (of the cql-rb and the datastax python drivers respectively) for publishing encoding test cases for the decimal type - that was quite helpful :-)
Re: CQL decimal encoding
On Thu, Feb 27, 2014 at 12:05 AM, Ben Hood 0x6e6...@gmail.com wrote: BTW thanks and kudos go to Theo and Tyler (of the cql-rb and the datastax python drivers respectively) for publishing encoding test cases for the decimal type - that was quite helpful :-) Sorry, I forgot to mention the inspiration gained from Paul's Perl encoding specs as well. And just generally thanks to everybody for chiming in :-)
Background flushing appears to peg CPU
Hi, Using Cassandra 2.0.5, we seem to be running into an issue with a continuous flush of a column family that has no current data ingress. After disconnecting all clients from the node, the Cassandra instance seems to be continuously flushing a specific column family, with this line appearing all over the logs:
INFO [OptionalTasks:1] 2014-02-27 07:36:39,366 MeteredFlusher.java (line 63) - flushing high-traffic column family CFS(Keyspace='bar', ColumnFamily='foo') (estimated 70078271 bytes)
Restarting the node didn't appear to change the situation. Does anybody know why this might be happening for a column family that appears not to be receiving any writes? Cheers, Ben