cassandra 1.0.8 memory usage
Hi guys, I am running a mini cluster with 6 nodes. Recently we have seen very frequent ParNew GCs on two nodes. They take 200 - 800 ms on average, and sometimes up to 5 seconds. As you know, ParNew GC is a stop-the-world GC, and our client throws SocketTimeoutException every 3 minutes. I checked the load; it seems well balanced, and the two nodes are running on the same hardware: 2 x 4-core Xeons with 16 GB RAM. We give Cassandra a 4 GB heap, including an 800 MB young generation. We did not see any swap usage during the GC. Any idea about this? Then I took a heap dump; it shows that 5 instances of JmxMBeanServer hold 500 MB of memory, and most of the referenced objects are JMX MBean related. It's kind of weird to me and looks like a memory leak. -- Thanks Regards, Daniel
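To quantify pauses like these, one option is to enable GC logging on the Cassandra JVM (e.g. -Xloggc:gc.log -XX:+PrintGCDetails) and scan for long ParNew pauses. A minimal sketch, assuming HotSpot's typical PrintGCDetails line format; the sample lines below are made up for illustration:

```python
import re

# Matches the "real=N.NN secs" timing on ParNew lines of a HotSpot GC log.
# The log format is an assumption about typical HotSpot output.
PAUSE_RE = re.compile(r'ParNew.*?real=(\d+\.\d+) secs')

def long_parnew_pauses(lines, threshold_secs=0.2):
    """Return ParNew pause durations (seconds) at or above threshold_secs."""
    pauses = []
    for line in lines:
        m = PAUSE_RE.search(line)
        if m:
            secs = float(m.group(1))
            if secs >= threshold_secs:
                pauses.append(secs)
    return pauses

# Hypothetical sample log lines:
sample = [
    '[GC [ParNew: 819200K->51200K(819200K), 0.0420 secs] [Times: user=0.30 sys=0.01, real=0.04 secs]',
    '[GC [ParNew: 819200K->81920K(819200K), 5.0210 secs] [Times: user=4.90 sys=0.10, real=5.02 secs]',
]
print(long_parnew_pauses(sample))  # -> [5.02]
```

Correlating the timestamps of the long pauses with the client's SocketTimeoutExceptions would confirm whether GC is really the cause.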
Re: unbalanced ring
Tamar, be careful: Datastax doesn't recommend major compactions in production environments. If I got it right, performing a major compaction will merge all your SSTables into one big one, substantially improving your read performance, at least for a while... The problem is that it will effectively disable minor compactions too (because of the size difference between this big SSTable and the new ones, if I remember well). So your read performance will decrease until your other SSTables reach the size of the big one you've created, or until you run another major compaction, turning major compaction into a normal maintenance process like repair is.

But, knowing that, I still don't know whether we both (Tamar and I) shouldn't run it anyway (in my case it would greatly decrease the size of my data, 133 GB -> 35 GB, and maybe load the cluster evenly...)

Alain

2012/10/10 B. Todd Burruss bto...@gmail.com

it should not have any other impact except increased usage of system resources. and i suppose, cleanup would not have an effect (over normal compaction) if all nodes contain the same data

On Wed, Oct 10, 2012 at 12:12 PM, Tamar Fraenkel ta...@tok-media.com wrote:

Hi! Apart from being a heavy load (the compaction), will it have other effects? Also, will cleanup help if I have replication factor = number of nodes? Thanks

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956

On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss bto...@gmail.com wrote:

major compaction in production is fine, however it is a heavy operation on the node and will take I/O and some CPU. the only time i have seen this happen is when i have changed the tokens in the ring, like nodetool movetoken. cassandra does not auto-delete data that it doesn't use anymore, just in case you want to move the tokens again or otherwise undo. try nodetool cleanup

On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi, Same thing here: 2 nodes, RF = 2, RCL = 1, WCL = 1. Like Tamar I never ran a major compaction, and repair runs once a week on each node.

10.59.21.241  eu-west  1b  Up  Normal  133.02 GB  50.00%  0
10.58.83.109  eu-west  1b  Up  Normal  98.12 GB   50.00%  85070591730234615865843651857942052864

What phenomena could explain the result above? By the way, I have copied the data and imported it into a one-node dev cluster. There I ran a major compaction and the size of my data was significantly reduced (to about 32 GB instead of 133 GB). How is that possible? Do you think that if I run a major compaction on both nodes it will balance the load evenly? Should I run major compaction in production?

2012/10/10 Tamar Fraenkel ta...@tok-media.com

Hi! I am re-posting this, now that I have more data and still an *unbalanced ring*: 3 nodes, RF=3, RCL=WCL=QUORUM

Address  DC       Rack  Status  State   Load      Owns    Token
x.x.x.x  us-east  1c    Up      Normal  24.02 GB  33.33%  0
y.y.y.y  us-east  1c    Up      Normal  33.45 GB  33.33%  56713727820156410577229101238628035242
z.z.z.z  us-east  1c    Up      Normal  29.85 GB  33.33%  113427455640312821154458202477256070485

repair runs weekly. I don't run nodetool compact, as I read that this may cause the regular minor compactions not to run, after which I would have to run compact manually. Is that right? Any idea if this means something is wrong, and if so, how to solve it? Thanks,

Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956

On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel ta...@tok-media.com wrote:

Thanks, I will wait and see as data accumulates.

On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen ro...@us2.nl wrote:

Cassandra is built to store tons and tons of data. In my opinion roughly ~6 MB per node is not enough data to allow it to become a fully balanced cluster.

2012/3/27 Tamar Fraenkel ta...@tok-media.com

This morning I have nodetool ring -h localhost

Address        DC       Rack  Status  State   Load     Owns    Token
10.34.158.33   us-east  1c    Up      Normal  5.78 MB  33.33%  0
10.38.175.131  us-east  1c    Up      Normal  7.23 MB  33.33%  56713727820156410577229101238628035242
10.116.83.10   us-east  1c    Up      Normal  5.02 MB  33.33%  113427455640312821154458202477256070485

Version is 1.0.8.

Tamar Fraenkel
Senior Software Engineer, TOK Media
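The tokens in these ring listings are evenly spaced RandomPartitioner tokens, i.e. i * 2**127 / N for an N-node cluster, which is why each node shows an equal Owns percentage even when the Load differs. A quick sketch to compute balanced initial tokens (illustrative helper, not a Cassandra tool):

```python
# Compute evenly spaced initial tokens for the RandomPartitioner,
# whose token range is [0, 2**127). The token values in the ring
# output above are exactly these for 2- and 3-node clusters.
def balanced_tokens(node_count):
    return [i * (2 ** 127) // node_count for i in range(node_count)]

print(balanced_tokens(2))
# -> [0, 85070591730234615865843651857942052864]
```

Equal token ranges only guarantee balanced ownership, not balanced on-disk load: row sizes, tombstones, and un-compacted SSTables can still make the Load column uneven, as seen in the thread.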
RE: unbalanced ring
To run, or not to run? All this depends on the use case. There are no problems running major compactions in one case (we do it nightly); there could be problems in another. You just need to understand how everything works.

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer
Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063, Mobile: +370 650 19588, Fax: +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Thursday, October 11, 2012 09:17
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

[...]
RE: Problem while streaming SSTables with BulkOutputFormat
Hello again, I noticed that this issue happens whenever a reduce task finishes (so the SSTable is generated) while an already-generated SSTable is being streamed to the cluster. I think the error is therefore caused because Cassandra cannot queue SSTables that are streamed to the cluster. Does that make sense? Cheers, Ralph

From: matgan...@hotmail.com To: user@cassandra.apache.org Subject: RE: Problem while streaming SSTables with BulkOutputFormat Date: Tue, 9 Oct 2012 22:29:41 +

Aaron, Thank you for your answer. I tried to move to Cassandra 1.1.5, but the error still occurs. When I set a single task or less per hadoop node, the error does not happen. However, when I have more than one task on any of the nodes (Hadoop-only node or Hadoop+Cassandra node), the error happens. When it happens, the task fails and is sent after a while to another node and completed. Ultimately, I get all my tasks done but it takes much more time. Is it possible that streaming multiple SSTables, generated from two different tasks done by the same node, to the Cassandra cluster is the cause of this issue? Cheers, Ralph

Subject: Re: Problem while streaming SSTables with BulkOutputFormat From: aa...@thelastpickle.com Date: Wed, 10 Oct 2012 10:05:13 +1300 To: user@cassandra.apache.org

Something, somewhere, at some point is breaking the connection. Sorry I cannot be of more help :) Something caused the streaming to fail, which started a retry, which failed because the pipe was broken. Are there any earlier errors in the logs? Did this happen on one of the nodes that has both a task tracker and cassandra? Cheers

On 9/10/2012, at 4:06 AM, Ralph Romanos matgan...@hotmail.com wrote: Hello, I am using BulkOutputFormat to load data from a .csv file into Cassandra. I am using Cassandra 1.1.3 and Hadoop 0.20.2. I have 7 hadoop nodes: 1 namenode/jobtracker and 6 datanodes/tasktrackers. Cassandra is installed on 4 of these 6 datanodes/tasktrackers.
The issue happens when I have more than 1 reducer. SSTables are generated on each node; however, I get the following error in the tasktracker's logs when they are streamed into the Cassandra cluster:

Exception in thread "Streaming to /172.16.110.79:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
Exception in thread "Streaming to /172.16.110.92:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
...
This is what I get in the logs of one of my Cassandra nodes:

ERROR 16:47:34,904 Sending retry message failed, closing session.
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(Unknown Source)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
    at sun.nio.ch.IOUtil.write(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
    at java.nio.channels.Channels.writeFullyImpl(Unknown Source)
    at java.nio.channels.Channels.writeFully(Unknown Source)
    at java.nio.channels.Channels.access$000(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
    at java.io.OutputStream.write(Unknown Source)
    at java.nio.channels.Channels$1.write(Unknown Source)
Create column family with Composite key column via thrift API
Hi, I know one way is to execute a CQL query via the Thrift client to create a column family with compound primary/composite columns. But is that the only way? It looks like I would end up creating my own CQLTranslator/wrapper to deal with compound primary/composite columns! (Or maybe something else in the near future.) The Thrift way of dealing with this is really different, as the column family metadata for such column families created via cqlsh is quite different over Thrift! I know I have started a little late and these are basic questions about how CQL/composite columns work. But is there anything I am missing/have misunderstood on this part? -Vivek
CQL Sets and Maps
I was reading Brian's post http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E in which he asks: "Any insight into why CQL puts that in the column name? Where does it store the metadata related to compound key interpretation? Wouldn't that be a better place for it, since it shouldn't change within a table?"

I have those same questions and would like to better understand how it stores stuff. For example, if PlayOrm has the following

User {
  @Embedded
  private List<Email> emails;
  @Embedded
  private List<SomethingElse> otherStuff;
  @OneToMany
  private List<Owner> owners;
}

it ends up storing

rowkey: userid
=> column=emails:email1Id:title, value=some email title
=> column=emails:email1Id:contents, value=some contents in email really really long
=> column=emails:email2Id:title, value=some other email
=> column=owners:ownerId29, value=null
=> column=owners:ownerId57, value=null

basically using "emails" as the prefix, since User can have other embedded objects, and using emailId as the next prefix so you can have many unique emails, and then having each email property. How is it actually stored when doing Sets and Maps in CQL?? Ideally, I would like PlayOrm to overlay on top of that. Thanks, Dean
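The colon-joined layout above can be sketched in a few lines. This is just an illustration of the PlayOrm naming scheme as described, not of how CQL encodes collections; the names and ids are hypothetical:

```python
# Illustrative sketch of the composite-column layout described above:
# each embedded object's fields become columns whose names are
# colon-joined paths (collection name, element id, field name).
def flatten(collections):
    """Flatten nested collections into a {column_name: value} map."""
    columns = {}
    for coll_name, elements in collections.items():
        for elem_id, fields in elements.items():
            if fields is None:
                # e.g. @OneToMany owners: the id alone forms the column name
                columns['%s:%s' % (coll_name, elem_id)] = None
            else:
                for field, value in fields.items():
                    columns['%s:%s:%s' % (coll_name, elem_id, field)] = value
    return columns

cols = flatten({
    'emails': {'email1Id': {'title': 'some email title',
                            'contents': 'some contents in email'}},
    'owners': {'ownerId29': None, 'ownerId57': None},
})
print(sorted(cols))
```

Because composite column names sort by component, all columns for one collection element stay contiguous, which is what makes prefix-based slicing of a single embedded object possible.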
cassandra + pig
I'm wondering how many people are using cassandra + pig out there? I recently went through the effort of validating things at a much higher level than I previously did (*), and found a few issues:

https://issues.apache.org/jira/browse/CASSANDRA-4748
https://issues.apache.org/jira/browse/CASSANDRA-4749
https://issues.apache.org/jira/browse/CASSANDRA-4789

In general, it seems like the widerow implementation still has rough edges. I'm concerned that I'm not understanding why other people aren't using the feature and thus finding these problems. Is everyone else just setting a high static limit? E.g. LOAD 'cassandra://KEYSPACE/CF?limit=X' where X >= the max size of any key? Is everyone else using data models that result in keys with # of columns always less than 1024? Do newer versions of hadoop consume the cassandra API in a way that works around these issues? I'm using CDH3 == hadoop 0.20.2, pig 0.8.1.

(*) I took a random subsample of 50,000 keys of my production data (approx 1M total key/value pairs, some keys having only a single value and some having 1000's). I then wrote both a pig script and a simple procedural version of the pig script. Then I compared the results. Obviously I started with differences, though after locally patching my code to fix the above 3 bugs (though really only two issues), I now (finally) get the same results.
Re: 1.1.1 is repair still needed ?
as of 1.0 (CASSANDRA-2034) hints are generated for nodes that timeout. On Thu, Oct 11, 2012 at 3:55 AM, Watanabe Maki watanabe.m...@gmail.com wrote: Even if HH works fine, HH will not be created until the failure detector marks the node is dead. HH will not be created for partially timeouted mutation request ( but meets CL ) also... In my understanding... On 2012/10/11, at 5:55, Rob Coli rc...@palominodb.com wrote: On Tue, Oct 9, 2012 at 12:56 PM, Oleg Dulin oleg.du...@gmail.com wrote: My understanding is that the repair has to happen within gc_grace period. [ snip ] So the question is, is this still needed ? Do we even need to run nodetool repair ? If Hinted Handoff works in your version of Cassandra, and that version is 1.0, you should not need to repair if no node has crashed or been down for longer than max_hint_window_in_ms. This is because after 1.0, any failed write to a remote replica results in a hint, so any DELETE should eventually be fully replicated. However hinted handoff is meaningfully broken between 1.1.0 and 1.1.6 (unreleased) so you cannot rely on the above heuristic for consistency. In these versions, you have to repair (or read repair 100% of keys) once every GCGraceSeconds to prevent the possibility of zombie data. If it were possible to repair on a per-columnfamily basis, you could get a significant win by only repairing columnfamilies which take DELETE traffic. https://issues.apache.org/jira/browse/CASSANDRA-4772 =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
Re: cassandra + pig
The Dachis Group (where I just came from; I'm now at DataStax) uses pig with cassandra for a lot of things. However, we weren't using the widerow implementation yet, since wide row support is new to 1.1.x and we were on 0.7, then 0.8, then 1.0.x. Since it's new to 1.1's hadoop support, it sounds like there are some rough edges like you say. But reproducible issues filed as tickets for any problems are much appreciated, and they will get addressed.

On Oct 11, 2012, at 10:43 AM, William Oberman ober...@civicscience.com wrote: [...]
Re: cassandra + pig
If you don't mind me asking, how are you handling the fact that pre-widerow you only get a static number of columns per key (default 1024)? Or am I not understanding the limit concept?

On Thu, Oct 11, 2012 at 11:25 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: [...]
unnecessary tombstone's transmission during repair process
Hi Guys, I have a question about merkle tree construction and the repair process. When the merkle tree is constructed it calculates hashes. For a DeletedColumn it calculates the hash using the value, and the value of a DeletedColumn is the serialized local deletion time. We know that the local deletion time can be different on different nodes for the same tombstone, so the hashes of the same tombstone on different nodes will be different. Is that true? I think that the local deletion time shouldn't be considered in the hash calculation.

We ran several tests (3 nodes, RF=2, CL=QUORUM, so we have strong consistency):
1. Populate data to all nodes. Run the repair process. No streams were transmitted. That is the expected behaviour.
2. Then we removed some columns for some rows. No nodes were down, and all writes were done successfully. We ran repair, and there were some streams. That is strange to me, because all the data should be consistent.

We created a patch and applied it:
1. The result of the first test is the same.
2. The result of the second test: there were no unnecessary streams, as I expected.

My question is: is the transmission of equal tombstones during the repair process a feature? :) or is it a bug? If it's a bug, I'll create a ticket and attach the patch to it.
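The effect described above can be sketched with a toy digest model (this is not Cassandra's actual MerkleTree code; field names and timestamps are hypothetical): include the replica-local deletion time in the hash and two replicas holding the "same" tombstone disagree, exclude it and they match.

```python
import hashlib

# Toy model: a tombstone is (name, markedForDeleteAt, localDeletionTime).
# localDeletionTime is replica-local wall-clock time, so it can differ
# between replicas for the same logical deletion.
def digest(tombstone, include_local_deletion_time):
    name, marked_for_delete_at, local_deletion_time = tombstone
    h = hashlib.md5()
    h.update(name.encode())
    h.update(str(marked_for_delete_at).encode())
    if include_local_deletion_time:
        h.update(str(local_deletion_time).encode())
    return h.hexdigest()

# Same logical tombstone, applied a second apart on two replicas:
on_node_a = ('col1', 1349900000000, 1349900000)
on_node_b = ('col1', 1349900000000, 1349900001)

print(digest(on_node_a, True) == digest(on_node_b, True))    # False: ranges mismatch, repair streams
print(digest(on_node_a, False) == digest(on_node_b, False))  # True: replicas agree
```

In the first case the merkle tree ranges covering these tombstones compare as different, so repair streams data even though the replicas are logically consistent.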
Re: can't get cqlsh running
It looks like easy_install is only recognizing python2.4 on your system, and is installing the cql module for that version. The cqlsh script explicitly looks for and runs with python2.6, since 2.4 isn't supported. I believe you can run 'python2.6 easy_install cql' to force it to use that python install. -Nick

On Thu, Oct 11, 2012 at 10:45 AM, Tim Dunphy bluethu...@gmail.com wrote:

Hey guys, I'm on cassandra 1.1.5 on a centos 5.8 machine. I have the cassandra bin directory on my path so that I can simply type 'cassandra-cli' from anywhere to get into the cassandra command line environment. It's great! But I'd like to start using the cql shell (cqlsh), and apparently I don't know enough about python to get this working. This is what happens when I try to run cqlsh:

[root@beta:~] #cqlsh
Python CQL driver not installed, or not on PYTHONPATH.
You might try easy_install cql.
Python: /usr/bin/python2.6
Module load path: ['/usr/local/apache-cassandra-1.1.5/bin/../pylib', '/usr/local/apache-cassandra-1.1.5/bin', '/usr/local/apache-cassandra-1.1.5/bin', '/usr/lib64/python26.zip', '/usr/lib64/python2.6', '/usr/lib64/python2.6/plat-linux2', '/usr/lib64/python2.6/lib-tk', '/usr/lib64/python2.6/lib-old', '/usr/lib64/python2.6/lib-dynload', '/usr/lib64/python2.6/site-packages', '/usr/lib/python2.6/site-packages', '/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg-info']
Error: No module named cql

But easy_install claims that it's already installed:

[root@beta:~] #easy_install cql
Searching for cql
Best match: cql 1.0.10
Processing cql-1.0.10-py2.4.egg
cql 1.0.10 is already the active version in easy-install.pth
Using /usr/lib/python2.4/site-packages/cql-1.0.10-py2.4.egg
Processing dependencies for cql

I'm thinking that I just don't know how to set the PYTHONPATH variable or where to point it to. Can someone give me a pointer here? Thanks, Tim

-- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: cassandra + pig
Thanks Jeremy! Maybe figuring out how to do paging in pig would have been easier, but I found the widerow setting first, which led me to where I am today. I don't mind helping to blaze trails, or contributing back when doing so, but I usually try to follow rather than lead when it comes to the tools/software I choose to use. I didn't realize how close to the edge I was getting in this case :-)

On Thu, Oct 11, 2012 at 1:03 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:

For our use case, we had a lot of narrow column families, and for the couple of column families that had wide rows, we did our own paging through them. I don't recall if we did the paging in pig or mapreduce, but you should be able to do it in both, since pig allows you to specify the slice start.

On Oct 11, 2012, at 11:28 AM, William Oberman ober...@civicscience.com wrote: [...]
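The manual paging Jeremy describes (repeated slices, restarting each slice just past the last column seen) can be sketched against an in-memory stand-in for a wide row. Here fetch_slice is a hypothetical substitute for a real client call (a get_slice with a start column and a count), so treat this as the pattern rather than working pig or client code:

```python
# fetch_slice stands in for a real slice query: return up to `count`
# columns with names strictly greater than `start_after` (None = start
# at the beginning). Real clients take a start column and a count.
def fetch_slice(columns, start_after, count):
    names = sorted(columns)
    if start_after is not None:
        names = [n for n in names if n > start_after]
    return [(n, columns[n]) for n in names[:count]]

def page_through(columns, page_size=1024):
    """Yield every column by repeatedly slicing past the last name seen."""
    last = None
    while True:
        page = fetch_slice(columns, last, page_size)
        if not page:
            break
        for name, value in page:
            yield name, value
        last = page[-1][0]  # resume just past the last column of this page

# A hypothetical wide row with more columns than the default 1024 limit:
wide_row = {'col%04d' % i: i for i in range(2500)}
print(sum(1 for _ in page_through(wide_row, page_size=1024)))  # -> 2500
```

A static limit of 1024 would silently drop the remaining 1476 columns; the paging loop recovers them all, which is essentially what the widerow implementation tries to do internally.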
Re: unnecessary tombstone's transmission during repair process
On Thu, Oct 11, 2012 at 8:41 AM, Alexey Zotov azo...@griddynamics.com wrote: Value of DeletedColumn is a serialized local deletion time. We know that local deletion time can be different on different nodes for the same tombstone. So hashes of the same tombstone on different nodes will be different. Is it true? Yes, this seems correct based on my understanding of the process of writing tombstones. I think that local deletion time shouldn't be considered in hash's calculation. I think you are correct; the only thing that matters is whether the tombstone exists or not. There may be something I am missing about why the very-unlikely-to-be-identical value should be considered a merkle tree failure. https://issues.apache.org/jira/browse/CASSANDRA-2279 Seems related to this issue, fwiw. Is transmission of the equals tombstones during repair process a feature? :) or is it a bug? I think it's a bug. If it's a bug, I'll create ticket and attach patch to it. Yay! =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
Re: unnecessary tombstone's transmission during repair process
I have a question about merkle tree construction and repair process. When the merkle tree is constructed it calculates hashes. For a DeletedColumn it calculates the hash using the value. The value of a DeletedColumn is the serialized local deletion time.

The deletion time is not local to each replica; it's computed only once, by the coordinator node that received the deletion initially.

We know that local deletion time can be different on different nodes for the same tombstone.

Given the above, no, it cannot. -- Sylvain
Re: cassandra 1.0.8 memory usage
what jvm version?

On Thu, Oct 11, 2012 at 2:04 PM, Daniel Woo daniel.y@gmail.com wrote: [...]
Re: cassandra 1.0.8 memory usage
On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo daniel.y@gmail.com wrote: I am running a mini cluster with 6 nodes; recently we have seen very frequent ParNew GC on two nodes. It takes 200 - 800 ms on average, sometimes it takes 5 seconds. As you know, ParNew GC is a stop-the-world GC, and our client throws SocketTimeoutException every 3 minutes. What version of Cassandra? What JVM? Are JNA and Jamm working? I checked the load; it seems well balanced, and the two nodes are running on the same hardware: 2 * 4-core Xeons with 16G RAM; we give Cassandra a 4G heap, including an 800MB young generation. We did not see any swap usage during the GC. Any idea about this? It sounds like the two nodes that are pathological right now have exhausted the perm gen with actual non-garbage, probably mostly the Bloom filters and the JMX MBeans. Then I took a heap dump; it shows that 5 instances of JmxMBeanServer hold 500MB of memory and most of the referenced objects are JMX MBean related. It's kind of weird to me and looks like a memory leak. Do you have a large number of ColumnFamilies? How large is the data stored per node? =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
Re: cassandra 1.0.8 memory usage
On Thu, Oct 11, 2012 at 11:02 AM, Rob Coli rc...@palominodb.com wrote: On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo daniel.y@gmail.com wrote: We did not see any swap usage during the GC, any idea about this? As an aside.. you shouldn't have swap enabled on a Cassandra node, generally. As a simple example, if you have swap enabled and use the off-heap row cache, the kernel might swap your row cache. =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
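As a quick way to verify "no swap usage" claims like the one in the thread, the swap counters in /proc/meminfo can be parsed directly. A Linux-only sketch (the parser is generic; the sample string stands in for the real file):

```python
def swap_in_use_kb(meminfo_text):
    """Parse SwapTotal/SwapFree (in kB) out of /proc/meminfo-style text
    and return how much swap is currently in use."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)

# On a real node you would read the file:
#   with open("/proc/meminfo") as f: print(swap_in_use_kb(f.read()))
sample = "SwapTotal: 2097148 kB\nSwapFree: 2097148 kB\n"
print(swap_in_use_kb(sample))  # 0 -> no swap in use
```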
Perlcassa - Perl Cassandra 'Client'
Hi- A few months back I wrote a Perl client for Cassandra and I realized I never sent it out to this list. While I realize that Perl is not the language du jour, hopefully this will help someone else out. :) Code is periodically thrown up on CPAN, but look at http://github.com/mkjellman/perlcassa for the most current version. It supports serialization/deserialization of validation classes, composite columns, as well as connection pooling (without using ResourcePool, which fails miserably when run under mod_perl). Best, michael 'Like' us on Facebook for exclusive content and other resources on all Barracuda Networks solutions. Visit http://barracudanetworks.com/facebook
Re: [problem with OOM in nodes]
Splitting one report to multiple rows is uncomfortable

WHY? Reading from N disks is way faster than reading from 1 disk. I think in terms of PlayOrm, so let me explain the model in objects first:

Report {
    String uniqueId
    String reportName; //may be indexable and queryable
    String description;
    CursorToMany<ReportRow> rows;
}

ReportRow {
    String uniqueId;
    String somedata;
    String someMoreData;
}

Each row in Report in PlayOrm is backed by two rows in the database in this special case of using CursorToMany:

ReportRow - reportName=somename, description=some desc
CursorToManyRow in index table - reportRowKey56, reportRowKey78, reportRowKey89 (there are NO values in this row, and this row can hold less than 10 million values… if your report is beyond 10 million, let me know and I have a different design).

Then each report row is basically the same structure as above. You can then:
1. Read in the report
2. As you read from CursorToMany, it does a BATCH slice into the CursorToManyRow AND then does a MULTIGET in parallel to fetch the report rows (i.e. it is all in parallel, so you get lots of rows from many disks really fast)
3. Print the rows out

If you have more than 10 million rows in a report, let me know. You can do what PlayOrm does yourself of course ;). Later, Dean

On 9/23/12 11:14 PM, Denis Gabaydulin gaba...@gmail.com wrote: On Sun, Sep 23, 2012 at 10:41 PM, aaron morton aa...@thelastpickle.com wrote: /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -oE "[0-9]+ bytes" | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50 Is it a bad signal? Sorry, I do not know what this is outputting. This is outputting the size of big rows which cassandra had compacted before. As I can see in cfstats, compacted row maximum size: 386857368 ! Yes. Having rows in the 100's of MB will cause problems. Doubly so if they are large super columns. What exactly is the problem with big rows?
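Dean's step 2 — slice the index row, then fetch the report rows with a parallel multiget — can be sketched roughly as below. This is an illustration of the pattern only; fetch_report_row is a hypothetical stand-in for a real Cassandra client call:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_report_row(row_key):
    # Stand-in for a real multiget against the report-row column family.
    return {"key": row_key, "data": "..."}

def read_report(index_row_keys, parallelism=8):
    """The index row (CursorToManyRow) yields the report-row keys; those
    rows are then fetched in parallel so reads spread across disks/nodes."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map preserves the order of the input keys
        return list(pool.map(fetch_report_row, index_row_keys))

rows = read_report(["reportRowKey56", "reportRowKey78", "reportRowKey89"])
```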
And, how should we place our data in this case (see the schema in the previous replies)? Splitting one report to multiple rows is uncomfortable :-( Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/09/2012, at 5:07 AM, Denis Gabaydulin gaba...@gmail.com wrote: And some stuff from the log: /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -oE "[0-9]+ bytes" | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50 3821.55MB 3337.85MB 1221.64MB 1128.67MB 930.666MB 916.4MB 861.114MB 843.325MB 711.813MB 706.992MB 674.282MB 673.861MB 658.305MB 557.756MB 531.577MB 493.112MB 492.513MB 492.291MB 484.484MB 479.908MB 465.742MB 464.015MB 459.95MB 454.472MB 441.248MB 428.763MB 424.028MB 416.663MB 416.191MB 409.341MB 406.895MB 397.314MB 388.27MB 376.714MB 371.298MB 368.819MB 366.92MB 361.371MB 360.509MB 356.168MB 355.012MB 354.897MB 354.759MB 347.986MB 344.109MB 335.546MB 329.529MB 326.857MB 326.252MB 326.237MB Is it a bad signal? On Fri, Sep 21, 2012 at 8:22 PM, Denis Gabaydulin gaba...@gmail.com wrote: Found one more interesting fact. As I can see in cfstats, compacted row maximum size: 386857368 ! On Fri, Sep 21, 2012 at 12:50 PM, Denis Gabaydulin gaba...@gmail.com wrote: Reports is a SuperColumnFamily. Each report has a unique identifier (report_id). This is the key of the SuperColumnFamily, and a report is saved in a separate row. A report consists of report rows (may vary between 1 and 50, but most are small). Each report row is saved in a separate super column. Hector based code: superCfMutator.addInsertion( report_id, Reports, HFactory.createSuperColumn( report_row_id, mapper.convertObject(object), columnDefinition.getTopSerializer(), columnDefinition.getSubSerializer(), inferringSerializer ) ); We have two frequent operations: 1. count report rows by report_id (calculate the number of super columns in the row). 2.
get report rows by report_id and range predicate (get super columns from the row with a range predicate). I can't see big super columns here :-( On Fri, Sep 21, 2012 at 3:10 AM, Tyler Hobbs ty...@datastax.com wrote: I'm not 100% sure that I understand your data model and read patterns correctly, but it sounds like you have large supercolumns and are requesting some of the subcolumns from individual super columns. If that's the case, the issue is that Cassandra must deserialize the entire supercolumn in memory whenever you read *any* of the subcolumns. This is one of the reasons why composite columns are recommended over supercolumns. On Thu, Sep 20, 2012 at 6:45 AM, Denis Gabaydulin gaba...@gmail.com wrote: p.s. Cassandra 1.1.4 On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin gaba...@gmail.com wrote: Hi, all! We have a cluster with 7 virtual nodes (disk storage is connected to the nodes with iSCSI).
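Tyler's composite-column suggestion can be illustrated with a rough sketch. The length-prefixed packing below only loosely imitates how composite comparators lay out multi-part column names (an assumption for illustration, not the exact wire format); the point is that all columns for one report row sort together, so a range slice fetches one row's fields without deserializing a whole supercolumn:

```python
import struct

def composite_name(report_row_id, field_name):
    """Hypothetical packing of a two-component composite column name
    (report_row_id, field_name): each component is length-prefixed and
    terminated, so bytewise comparison groups components lexically."""
    out = b""
    for part in (report_row_id, field_name):
        out += struct.pack(">H", len(part)) + part + b"\x00"
    return out

# Columns for the same report row sort adjacently, so a slice on
# (row_id, *) reads just that row's fields.
a = composite_name(b"row001", b"somedata")
b_ = composite_name(b"row001", b"someMoreData")
c = composite_name(b"row002", b"somedata")
assert a < c and b_ < c  # all of row001's columns precede row002's
```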
CRM:0267190
unsubscribe Joseph Heinzen Senior VP, UCC Sales Tel. 571-297-4162 | Mobile. 703-463-7145 Fax. 703-891-1073 | jhein...@microtech.net | www.MicroTech.net The Fastest Growing Hispanic-Owned Business in the Nation (2009, 2010, 2011) 8330 Boone Blvd, Suite 600, Vienna, VA 22182 A Service-Disabled Veteran-Owned Business DISCLAIMER: The information in this email, and any attached document(s), is MicroTech proprietary data and intended only for recipient(s) addressed above. If you are not an intended recipient, you are requested to notify the sender above and delete any copies of this transmission. Thank you in advance for your cooperation. If you have any questions, please contact the sender at (703) 891-1073.
Re: can't get cqlsh running
I believe you can run 'python2.6 easy_install cql' to force it to use that python install. Well initially I tried going: [root@beta:~] #python2.6 easy_install python2.6: can't open file 'easy_install': [Errno 2] No such file or directory But when I used the full paths of each: /usr/bin/python2.6 /usr/bin/easy_install cql It worked like a charm! [root@beta:~] #cqlsh Connected to Test Cluster at beta.jokefire.com:9160. [cqlsh 2.0.0 | Cassandra unknown | CQL spec unknown | Thrift protocol 19.20.0] Use HELP for help. cqlsh So, thanks for your advice! That really did the trick! Tim On Thu, Oct 11, 2012 at 11:56 AM, Nick Bailey n...@datastax.com wrote: It looks like easy_install is only recognizing python2.4 on your system. It is installing the cql module for that version. The cqlsh script explicitly looks for and runs with python2.6 since 2.4 isn't supported. I believe you can run 'python2.6 easy_install cql' to force it to use that python install. -Nick On Thu, Oct 11, 2012 at 10:45 AM, Tim Dunphy bluethu...@gmail.com wrote: Hey guys, I'm on cassandra 1.1.5 on a centos 5.8 machine. I have the cassandra bin directory on my path so that i can simply type 'cassandra-cli' from anywhere on my path to get into the cassandra command line environment. It's great! But I'd like to start using the cql shell (cqlsh) but apparently I don't know enough about python to get this working. This is what happens when I try to run cqlsh: [root@beta:~] #cqlsh Python CQL driver not installed, or not on PYTHONPATH. You might try easy_install cql. 
Python: /usr/bin/python2.6 Module load path: ['/usr/local/apache-cassandra-1.1.5/bin/../pylib', '/usr/local/apache-cassandra-1.1.5/bin', '/usr/local/apache-cassandra-1.1.5/bin', '/usr/lib64/python26.zip', '/usr/lib64/python2.6', '/usr/lib64/python2.6/plat-linux2', '/usr/lib64/python2.6/lib-tk', '/usr/lib64/python2.6/lib-old', '/usr/lib64/python2.6/lib-dynload', '/usr/lib64/python2.6/site-packages', '/usr/lib/python2.6/site-packages', '/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg-info'] Error: No module named cql But easy_install claims that it's already installed: [root@beta:~] #easy_install cql Searching for cql Best match: cql 1.0.10 Processing cql-1.0.10-py2.4.egg cql 1.0.10 is already the active version in easy-install.pth Using /usr/lib/python2.4/site-packages/cql-1.0.10-py2.4.egg Processing dependencies for cql I'm thinking that I just don't know how to set the PYTHONPATH variable or where to point it to. Can someone give me a pointer here? Thanks Tim -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
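The mismatch above — cql installed under Python 2.4 while cqlsh insists on 2.6 — can be diagnosed by asking each interpreter where it would load a module from. A small sketch in modern Python (the module names are illustrative; in the thread you would check "cql" under each interpreter):

```python
import sys

def where_is(module_name):
    """Report which file a module would load from under the *current*
    interpreter -- handy when easy_install put a package under one
    Python (2.4 here) while a script runs another (2.6)."""
    try:
        mod = __import__(module_name)
        return getattr(mod, "__file__", "<built-in>")
    except ImportError:
        return None

print(sys.executable)          # which interpreter is actually running
print(where_is("json"))        # stdlib demo; in the thread, check "cql"
print(where_is("no_such_module_xyz"))  # None -> not on this interpreter's path
```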
Re: CRM:0267190
On 10/11/2012 02:37 PM, Joseph Heinzen wrote: unsubscribe http://wiki.apache.org/cassandra/FAQ#unsubscribe *Joseph Heinzen* Senior VP, UCC Sales Tel. 571-297-4162 | Mobile. 703-463-7145 Fax. 703-891-1073 | jhein...@microtech.net | www.MicroTech.net *The Fastest Growing Hispanic-Owned Business in the Nation (2009, 2010, 2011) 8330 Boone Blvd, Suite 600, Vienna, VA 22182 A Service-Disabled Veteran-Owned Business http://www.ietf.org/rfc/rfc1855.txt (re: subject; signature) /DISCLAIMER: The information in this email, and any attached document(s), is MicroTech proprietary data and intended only for recipient(s) addressed above. If you are not an intended recipient, you are requested to notify the sender above and delete any copies of this transmission. Thank you in advance for your cooperation. If you have any questions, please contact the sender at (703) 891-1073./ and finally, http://www.pbandjelly.org/2011/03/to-whom-it-may-concern/ -- Kind regards, Michael
Re: unsubscribe
http://wiki.apache.org/cassandra/FAQ#unsubscribe On Thu, Oct 11, 2012 at 12:41 PM, Siddiqui, Akmal akmal.siddi...@broadvision.com wrote: unsubscribe -- Tyler Hobbs DataStax http://datastax.com/
Re: Option for ordering columns by timestamp in CF
Without thinking too deeply about it, this is basically equivalent to disabling timestamps for a column family and using timestamps for column names, though in a very indirect (and potentially confusing) manner. So, if you want to open a ticket, I would suggest framing it as "make column timestamps optional". On Wed, Oct 10, 2012 at 4:44 AM, Ertio Lew ertio...@gmail.com wrote: I think Cassandra should provide a configurable option on a per-column-family basis to sort columns by timestamp rather than column names. This would be really helpful to maintain time-sorted columns without using up the column name as a timestamp, which might otherwise be used to store more relevant column names useful for retrievals. Very frequently we need to store data sorted in time order, so I think this may be a very general requirement not specific to just my use-case alone. Does it make sense to create an issue for this? On Fri, Mar 25, 2011 at 2:38 AM, aaron morton aa...@thelastpickle.com wrote: If you mean order by the column timestamp (as passed by the client), that is not possible. Can you use your own timestamps as the column name and store them as long values? Aaron On 25 Mar 2011, at 09:30, Narendra Sharma wrote: Cassandra 0.7.4 Column names in my CF are of type byte[] but I want to order columns by timestamp. What is the best way to achieve this? Does it make sense for Cassandra to support ordering of columns by timestamp as an option for a column family, irrespective of the column name type? Thanks, Naren -- Tyler Hobbs DataStax http://datastax.com/
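Aaron's workaround — using the timestamp itself as a long column name — works because a fixed-width big-endian encoding sorts bytewise in chronological order. A small sketch (assumes non-negative millisecond timestamps):

```python
import struct

def ts_column_name(millis):
    """Pack a millisecond timestamp as an 8-byte big-endian long: for
    non-negative values, bytewise order then matches chronological
    order, so a normal column slice returns columns in time order."""
    return struct.pack(">q", millis)

older = ts_column_name(1349900000000)
newer = ts_column_name(1349900001000)
assert older < newer  # byte comparison agrees with time order
```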
RE: unsubscribe
http://wiki.apache.org/cassandra/FAQ#unsubscribe -Original Message- From: Chris Favero [mailto:chris.fav...@tricast.com] Sent: Thursday, October 11, 2012 7:37 AM To: user@cassandra.apache.org Subject: unsubscribe unsubscribe
RE: unsubscribe
thanks From: Tyler Hobbs [mailto:ty...@datastax.com] Sent: Thursday, October 11, 2012 1:42 PM To: user@cassandra.apache.org Subject: Re: unsubscribe http://wiki.apache.org/cassandra/FAQ#unsubscribe On Thu, Oct 11, 2012 at 12:41 PM, Siddiqui, Akmal akmal.siddi...@broadvision.com wrote: unsubscribe -- Tyler Hobbs DataStax http://datastax.com/
Repair Failing due to bad network
I'm trying to bring up a new Datacenter - while I probably could have brought things up in another way, I've now got a DC that has a ready Cassandra with keys allocated. The problem is that I cannot get a repair to complete, since it appears that some part of my network decides to restart all connections twice a day (6am and 2pm - ok, 5 minutes before). So when I start a repair job, it usually gets a ways into things before one of the nodes goes DOWN, then back up. What I don't see is the repair restarting, it just stops. Is there a workaround for this case, or is there something else I could be doing? --david
Re: 1.1.1 is repair still needed ?
Oh sorry. It's pretty nice to know that. On 2012/10/12, at 0:18, B. Todd Burruss bto...@gmail.com wrote: as of 1.0 (CASSANDRA-2034) hints are generated for nodes that time out. On Thu, Oct 11, 2012 at 3:55 AM, Watanabe Maki watanabe.m...@gmail.com wrote: Even if HH works fine, HH will not be created until the failure detector marks the node as dead. HH will not be created for a partially timed-out mutation request (one that still meets the CL) either... In my understanding... On 2012/10/11, at 5:55, Rob Coli rc...@palominodb.com wrote: On Tue, Oct 9, 2012 at 12:56 PM, Oleg Dulin oleg.du...@gmail.com wrote: My understanding is that the repair has to happen within the gc_grace period. [ snip ] So the question is, is this still needed ? Do we even need to run nodetool repair ? If Hinted Handoff works in your version of Cassandra, and that version is 1.0, you should not need to repair if no node has crashed or been down for longer than max_hint_window_in_ms. This is because after 1.0, any failed write to a remote replica results in a hint, so any DELETE should eventually be fully replicated. However hinted handoff is meaningfully broken between 1.1.0 and 1.1.6 (unreleased) so you cannot rely on the above heuristic for consistency. In these versions, you have to repair (or read repair 100% of keys) once every GCGraceSeconds to prevent the possibility of zombie data. If it were possible to repair on a per-columnfamily basis, you could get a significant win by only repairing columnfamilies which take DELETE traffic. https://issues.apache.org/jira/browse/CASSANDRA-4772 =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb