sstable loader
Hi All, Can we use sstableloader for loading an external flat file or CSV file? If yes, kindly share the steps or a manual. I need to put 40 million records into a table of around 70 columns. Regards: Rahul Bhardwaj
Re: Java Driver 2.1 reading counter values from row
Hi All, This is possible with cassandra-driver-core-2.1.5, with row.getLong("sum"). Thanks

On Fri, Mar 27, 2015 at 2:51 PM, Amila Paranawithana amila1...@gmail.com wrote: In Apache Cassandra Java Driver 2.1, how do I read counter-type values from a row when iterating over a result set? E.g. if I have a counter table 'countertable' with a key and a counter column 'sum', how can I read the value of the counter column using the Java driver? If I say row.getInt("sum"), this gives the following error: com.datastax.driver.core.exceptions.InvalidTypeException: Value sum is of type counter

Code:

ResultSet results = session.execute("SELECT * FROM simplex.countertable");
for (Row row : results) {
    System.out.println(row.getString("key") + "," + row.getInt("sum"));
}

Thanks, Amila
Re: Replication to second data center with different number of nodes
I would recommend you utilise Cassandra's vnodes config and let it manage this itself. This means it will create and manage the tokens all on its own, and allows quick and easy scaling and bootstrapping.

From: Björn Hachmann bjoern.hachm...@metrigo.de
Reply-To: user@cassandra.apache.org
Date: Friday, 27 March 2015 10:40
To: user user@cassandra.apache.org
Subject: Replication to second data center with different number of nodes

Hi, we currently plan to add a second data center to our Cassandra cluster. I have read about this procedure in the documentation (e.g. https://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html), but at least one question remains: Do I have to provide appropriate values for num_tokens dependent on the number of nodes per data center, or is this handled somehow by the NetworkTopologyStrategy? Example: We currently have 12 nodes, each covering 256 tokens. Our second datacenter will have three nodes only. Do I have to set num_tokens to 1024 (12*256/3) for the nodes in that DC? Thank you very much for your valuable input! Kind regards Björn Hachmann
Re: upgrade from 1.0.12 to 1.1.12
Rob, the cluster is now upgraded to Cassandra 1.0.12 (default 'hd' version, in Descriptor.java), and I ensured all sstables in the current cluster are 'hd' version before upgrading to Cassandra 1.1. I have also checked that in Cassandra 1.1.12 the sstable version is 'hf'. So I guess nodetool upgradesstables is needed? Why not scrub? When you run the command nodetool upgradesstables, is it actually scrubbing the data? Can you explain? Jason

On Fri, Mar 27, 2015 at 7:21 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad j...@jonhaddad.com wrote: There's no downside to running upgradesstables. I recommend always doing it on upgrade just to be safe.

For the record, and just my opinion: I recommend against paying this fixed cost when you don't need to. It is basically trivial to ascertain whether there is a new version of the SSTable format in your new version, without even relying on the canonical NEWS.txt. Type nodetool flush and look at the filename of the table that was just flushed. If the version component is different from all the other SSTables, you definitely need to run upgradesstables. If it isn't, you definitely don't. If you're going to run something which unnecessarily rewrites all SSTables, why not scrub? That'll check the files for corruption while also upgrading them as they are written out 1:1... =Rob
Replication to second data center with different number of nodes
Hi, we currently plan to add a second data center to our Cassandra cluster. I have read about this procedure in the documentation (e.g. https://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html), but at least one question remains: Do I have to provide appropriate values for num_tokens dependent on the number of nodes per data center, or is this handled somehow by the NetworkTopologyStrategy? Example: We currently have 12 nodes, each covering 256 tokens. Our second datacenter will have three nodes only. Do I have to set num_tokens to 1024 (12*256/3) for the nodes in that DC? Thank you very much for your valuable input! Kind regards Björn Hachmann
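The arithmetic behind the question can be sketched as follows. This is only a sketch of the reasoning in the question (the function name is hypothetical, not from any Cassandra tool), and note that replies in this thread recommend simply keeping the default num_tokens of 256 rather than scaling it:

```python
# Sketch of the question's arithmetic: if each of the 3 nodes in the new
# DC were to carry the same total token count as the 12 existing nodes,
# num_tokens would scale inversely with node count.

def tokens_per_node(existing_nodes, existing_num_tokens, new_dc_nodes):
    """Total token count of the existing DC, divided evenly over the new DC."""
    total = existing_nodes * existing_num_tokens
    return total // new_dc_nodes

print(tokens_per_node(12, 256, 3))  # 1024, as computed in the question
```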
High latencies for simple queries
I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either INSERT or SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than Postgresql (even if INSERTs are wrapped in transactions). I know that the nature of Cassandra compared to Postgresql is different, but for some scenarios this difference can matter. The question is: is it normal for Cassandra to have a minimum latency of 1 millisecond? I'm using Cassandra 2.1.2, python-driver.
Re: Arbitrary nested tree hierarchy data model
Hi Robert, We're trying to do something similar to the OP and finding it a bit difficult. Would it be possible to provide more details about how you're doing it? Thanks.

On Fri, Mar 27, 2015 at 3:15 AM, Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than 10. I am able to page through children and siblings. It works really well. Doesn't sound like it's exactly what you're looking for, but if you want any pointers on how I went about implementing mine, I'd be happy to share.

On Mar 26, 2015, at 3:05 PM, List l...@airstreamcomm.net wrote: Not sure if this is the right place to ask, but we are trying to model a user-generated tree hierarchy in which they create child objects of a root node, and can create an arbitrary number of children (and children of children, and so on). So far we have looked at storing each tree structure as a single document in JSON format and reading/writing it out in its entirety; doing materialized paths, where we store the root id with every child and the tree structure above the child as a map; and some form of an adjacency list (which does not appear to be very viable, as looking up the entire tree would be ridiculous). The hope is to end up with a data model that allows us to display the entire tree quickly, as well as see the entire path to a leaf when selecting that leaf. If anyone has some suggestions/experience on how to model such a tree hierarchy we would greatly appreciate your input.

-- Fabian Siddiqi, Software Engineer
Java Driver 2.1 reading counter values from row
In Apache Cassandra Java Driver 2.1, how do I read counter-type values from a row when iterating over a result set? E.g. if I have a counter table 'countertable' with a key and a counter column 'sum', how can I read the value of the counter column using the Java driver? If I say row.getInt("sum"), this gives the following error: com.datastax.driver.core.exceptions.InvalidTypeException: Value sum is of type counter

Code:

ResultSet results = session.execute("SELECT * FROM simplex.countertable");
for (Row row : results) {
    System.out.println(row.getString("key") + "," + row.getInt("sum"));
}

Thanks, Amila
Re: sstable loader
Hi, This post [1] may be useful. But note that it was done with an older Cassandra version, so there may be a newer way to do this. Thanks,

[1] http://amilaparanawithana.blogspot.com/2012/06/bulk-loading-external-data-to-cassandra.html

On Fri, Mar 27, 2015 at 11:40 AM, Rahul Bhardwaj rahul.bhard...@indiamart.com wrote: Hi All, Can we use sstableloader for loading an external flat file or CSV file? If yes, kindly share the steps or a manual. I need to put 40 million records into a table of around 70 columns. Regards: Rahul Bhardwaj
Re: Replication to second data center with different number of nodes
http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens

So go with the default of 256, and leave initial_token empty:

num_tokens: 256
# initial_token:

Cassandra will always give each node the same number of tokens; the only time you might want to vary this is if your instances are of different sizing/capability, which is also a bad scenario.

From: Björn Hachmann bjoern.hachm...@metrigo.de
Reply-To: user@cassandra.apache.org
Date: Friday, 27 March 2015 12:11
To: user user@cassandra.apache.org
Subject: Re: Replication to second data center with different number of nodes

2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com: Cassandra's vnodes config

Thank you. Yes, we are using vnodes! The num_tokens parameter controls the number of vnodes assigned to a specific node. Maybe I am seeing problems where there are none. Let me rephrase my question: How does Cassandra know it has to replicate 1/3 of all keys to each single node in the second DC? I can see two ways: 1. It has to be configured explicitly. 2. It is derived from the number of nodes available in the data center at the time `nodetool rebuild` is started. Kind regards Björn
('Unable to complete the operation against any hosts', {})
Hi All, We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three nodes with RF 2). I need to load around 40 million records into a Cassandra table. I have created batches of 1 million records (a batch of 1 record also gives the same error) in CSV format. When I use the COPY command to import, I get this error, which is causing problems:

cqlsh:mesh_glusr> copy glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area) from 'gl_a' with delimiter = '\t' and QUOTE = '';
Processed 36000 rows; Write: 1769.07 rows/s
Record has the wrong number of fields (9 instead of 10). Aborting import at record #36769. Previously-inserted values still present.
36669 rows imported in 20.571 seconds.

cqlsh:mesh_glusr> copy glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area) from 'gl_a' with delimiter = '\t' and QUOTE = '';
Processed 185000 rows; Write: 1800.91 rows/s
Record has the wrong number of fields (9 instead of 10). Aborting import at record #185607. Previously-inserted values still present.
185507 rows imported in 1 minute and 43.428 seconds.

[cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> use mesh_glusr ;
cqlsh:mesh_glusr> copy glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area) from 'gl_a1' with delimiter = '\t' and QUOTE = '';
Processed 373000 rows; Write: 1741.23 rows/s
('Unable to complete the operation against any hosts', {})
Aborting import at record #373269. Previously-inserted values still present.

When we remove the already-inserted records from the file and start the command again for the remaining data, it inserts a few more records and then gives the same error without any specific cause. Please help if anyone has some idea about this error. Regards: Rahul Bhardwaj
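The first two failures above are malformed input rows rather than server errors, and those can be found before running COPY at all. A minimal pre-flight check, sketched below; the function name and the sample data are illustrative, not from the thread:

```python
# Scan a delimited file for records whose field count does not match the
# expected column count -- the cause of cqlsh COPY's "Record has the wrong
# number of fields" abort -- and report their 1-based record numbers.
import csv

def find_bad_records(lines, expected_fields, delimiter="\t"):
    """Return (record_number, field_count) for every malformed record."""
    bad = []
    for i, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if len(row) != expected_fields:
            bad.append((i, len(row)))
    return bad

# With a real file you would pass open("gl_a", newline="") instead.
sample = ["1\tuser\tpass\n", "2\tuser\n", "3\tuser\tpass\n"]
print(find_bad_records(sample, 3))  # [(2, 2)]
```

Fixing (or removing) the reported records first avoids the abort-midway, resume, abort-again cycle described above.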
Re: Replication to second data center with different number of nodes
2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com: Cassandra's vnodes config

Thank you. Yes, we are using vnodes! The num_tokens parameter controls the number of vnodes assigned to a specific node. Maybe I am seeing problems where there are none. Let me rephrase my question: How does Cassandra know it has to replicate 1/3 of all keys to each single node in the second DC? I can see two ways: 1. It has to be configured explicitly. 2. It is derived from the number of nodes available in the data center at the time `nodetool rebuild` is started. Kind regards Björn
Re: Delayed events processing / queue (anti-)pattern
Would it help here to not actually issue a delete statement but instead use date-based compaction and a dynamically calculated TTL that is some safe distance in the future from your key?

I'm not sure about the *date-based compaction* part; do you mean DateTieredCompactionStrategy? Anyway, we achieved something like that without this strategy, with a TTL plus a date-in-the-partition-key approach. The thing to watch, however, is the size of the partition (one should avoid overly long partitions, i.e. wide rows in Thrift terms), so care must be taken that the date increment is correctly adjusted. -- Brice

On Thu, Mar 26, 2015 at 5:23 PM, Robin Verlangen ro...@us2.nl wrote: Interesting thought, that should work indeed. I'll evaluate both options and provide an update here once I have results. Best regards, Robin Verlangen, Chief Data Architect

On Thu, Mar 26, 2015 at 7:09 AM, Thunder Stumpges thunder.stump...@gmail.com wrote: Would it help here to not actually issue a delete statement but instead use date-based compaction and a dynamically calculated TTL that is some safe distance in the future from your key? Just a thought. -Thunder

On Mar 25, 2015 11:07 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 12:45 AM, Robin Verlangen ro...@us2.nl wrote: @Robert: can you elaborate a bit more on the not ideal parts? In my case I will be throwing away the rows (thus the points in time that are now in the past), which will create tombstones which are compacted away.

Not ideal is what I mean... Cassandra has immutable data files; use cases which do DELETE pay an obvious penalty. Some percentage of tombstones will exist continuously, and you have to store them and seek past them. =Rob
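The "dynamically calculated TTL instead of DELETE" idea from this thread can be sketched as below. The function name and the one-day safety margin are assumptions for illustration; the point is only that each row expires on its own some safe distance after its scheduled time, so no tombstone-creating delete statement is ever issued:

```python
# Compute a per-row TTL (for CQL's "USING TTL <n>") so the row expires a
# safe margin after its scheduled processing time instead of being DELETEd.
from datetime import datetime, timedelta

def ttl_seconds(scheduled_at, now, safety_margin=timedelta(days=1)):
    """Seconds until the row may safely expire; 0 if already past expiry."""
    expiry = scheduled_at + safety_margin
    return max(int((expiry - now).total_seconds()), 0)

now = datetime(2015, 3, 27, 12, 0, 0)
event_time = datetime(2015, 3, 28, 12, 0, 0)
print(ttl_seconds(event_time, now))  # 172800 -- one day out, plus the margin
```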
Re: Arbitrary nested tree hierarchy data model
On 3/26/15 10:15 PM, Robert Wille wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than 10. I am able to page through children and siblings. It works really well. Doesn't sound like it's exactly what you're looking for, but if you want any pointers on how I went about implementing mine, I'd be happy to share.

On Mar 26, 2015, at 3:05 PM, List l...@airstreamcomm.net wrote: Not sure if this is the right place to ask, but we are trying to model a user-generated tree hierarchy in which they create child objects of a root node, and can create an arbitrary number of children (and children of children, and so on). So far we have looked at storing each tree structure as a single document in JSON format and reading/writing it out in its entirety; doing materialized paths, where we store the root id with every child and the tree structure above the child as a map; and some form of an adjacency list (which does not appear to be very viable, as looking up the entire tree would be ridiculous). The hope is to end up with a data model that allows us to display the entire tree quickly, as well as see the entire path to a leaf when selecting that leaf. If anyone has some suggestions/experience on how to model such a tree hierarchy we would greatly appreciate your input.

Robert, this certainly sounds like a step in the right direction, so yes, please do share! Thank you.
Re: Arbitrary nested tree hierarchy data model
I'd be interested to see that data model. I think the entire list would benefit! On Thu, Mar 26, 2015 at 8:16 PM, Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than 10. I am able to page through children and siblings. It works really well. Doesn't sound like it's exactly what you're looking for, but if you want any pointers on how I went about implementing mine, I'd be happy to share.

On Mar 26, 2015, at 3:05 PM, List l...@airstreamcomm.net wrote: Not sure if this is the right place to ask, but we are trying to model a user-generated tree hierarchy in which they create child objects of a root node, and can create an arbitrary number of children (and children of children, and so on). So far we have looked at storing each tree structure as a single document in JSON format and reading/writing it out in its entirety; doing materialized paths, where we store the root id with every child and the tree structure above the child as a map; and some form of an adjacency list (which does not appear to be very viable, as looking up the entire tree would be ridiculous). The hope is to end up with a data model that allows us to display the entire tree quickly, as well as see the entire path to a leaf when selecting that leaf. If anyone has some suggestions/experience on how to model such a tree hierarchy we would greatly appreciate your input.
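One way to read the "materialized path" option discussed in this thread: key each node by its full ancestor path, so the entire path to a leaf is available from the key itself and a subtree is a contiguous, prefix-addressable slice. The sketch below is an in-memory illustration of that idea, not the posters' actual Cassandra schema:

```python
# Materialized-path sketch: a node's key is the tuple of ids from the root
# down to it, so path-to-leaf and whole-subtree reads are both key lookups.

def add_node(tree, parent_path, node_id):
    """Insert node_id under parent_path; return the child's full path."""
    path = parent_path + (node_id,)
    tree[path] = node_id
    return path

def subtree(tree, prefix):
    """All paths under prefix, in depth-first (sorted-key) order."""
    return sorted(p for p in tree if p[:len(prefix)] == prefix)

tree = {}
root = add_node(tree, (), "root")
a = add_node(tree, root, "a")
add_node(tree, a, "a1")
add_node(tree, root, "b")
print(subtree(tree, root))
# [('root',), ('root', 'a'), ('root', 'a', 'a1'), ('root', 'b')]
```

In Cassandra terms, the path tuple would map naturally onto partition and clustering columns; arbitrary depth is the awkward part, which is presumably why the thread asks for Robert's details.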
Re: upgrade from 1.0.12 to 1.1.12
Running upgradesstables is a no-op if the tables don't need to be upgraded. I consider the cost of this to be less than the cost of missing a needed upgrade. On Thu, Mar 26, 2015 at 4:23 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad j...@jonhaddad.com wrote: There's no downside to running upgradesstables. I recommend always doing it on upgrade just to be safe.

For the record, and just my opinion: I recommend against paying this fixed cost when you don't need to. It is basically trivial to ascertain whether there is a new version of the SSTable format in your new version, without even relying on the canonical NEWS.txt. Type nodetool flush and look at the filename of the table that was just flushed. If the version component is different from all the other SSTables, you definitely need to run upgradesstables. If it isn't, you definitely don't. If you're going to run something which unnecessarily rewrites all SSTables, why not scrub? That'll check the files for corruption while also upgrading them as they are written out 1:1... =Rob
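Rob's flush-and-compare check can be scripted. The sketch below assumes 1.x-era SSTable names of the form <table>-<version>-<generation>-Data.db (matching the 'hd'/'hf' versions discussed in this thread); the exact filename layout varies across Cassandra versions, so treat the parsing as an assumption:

```python
# Compare the version component across SSTable filenames: if the newest
# (just-flushed) table's version differs from the rest, upgradesstables
# is needed; if all versions match, it isn't.

def sstable_version(filename):
    """Extract the format version from a 1.x-style SSTable file name."""
    return filename.split("-")[1]

def needs_upgrade(filenames):
    """True if more than one SSTable format version is present."""
    return len({sstable_version(f) for f in filenames}) > 1

old = ["users-hd-1-Data.db", "users-hd-2-Data.db"]
print(needs_upgrade(old))                            # False
print(needs_upgrade(old + ["users-hf-3-Data.db"]))   # True
```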
Re: Delayed events processing / queue (anti-)pattern
Yeah, that's the one :) Sorry, I was on my phone and didn't want to look up the exact name. Cheers, Thunder

On Mar 27, 2015 6:17 AM, Brice Dutheil brice.duth...@gmail.com wrote: Would it help here to not actually issue a delete statement but instead use date-based compaction and a dynamically calculated TTL that is some safe distance in the future from your key?

I'm not sure about the *date-based compaction* part; do you mean DateTieredCompactionStrategy? Anyway, we achieved something like that without this strategy, with a TTL plus a date-in-the-partition-key approach. The thing to watch, however, is the size of the partition (one should avoid overly long partitions, i.e. wide rows in Thrift terms), so care must be taken that the date increment is correctly adjusted. -- Brice

On Thu, Mar 26, 2015 at 5:23 PM, Robin Verlangen ro...@us2.nl wrote: Interesting thought, that should work indeed. I'll evaluate both options and provide an update here once I have results. Best regards, Robin Verlangen, Chief Data Architect

On Thu, Mar 26, 2015 at 7:09 AM, Thunder Stumpges thunder.stump...@gmail.com wrote: Would it help here to not actually issue a delete statement but instead use date-based compaction and a dynamically calculated TTL that is some safe distance in the future from your key? Just a thought. -Thunder

On Mar 25, 2015 11:07 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 12:45 AM, Robin Verlangen ro...@us2.nl wrote: @Robert: can you elaborate a bit more on the not ideal parts? In my case I will be throwing away the rows (thus the points in time that are now in the past), which will create tombstones which are compacted away.

Not ideal is what I mean... Cassandra has immutable data files; use cases which do DELETE pay an obvious penalty. Some percentage of tombstones will exist continuously, and you have to store them and seek past them. =Rob
Re: High latencies for simple queries
Just to check, are you concerned about minimizing that latency or maximizing throughput? I'll assume that latency is what you're actually concerned about.

A fair amount of that latency is probably happening in the python driver. Although it can easily execute ~8k operations per second (using CPython), in some scenarios it can be difficult to guarantee sub-ms latency for an individual query due to how some of the internals work. In particular, it uses Python's Conditions for cross-thread signalling (from the event loop thread to the application thread). Unfortunately, Python's Condition implementation includes a loop with a minimum sleep of 1ms if the Condition isn't already set when you start the wait() call. This is why, with a single application thread, you will typically see a minimum of 1ms latency.

Another source of similar latencies for the python driver is the Asyncore event loop, which is used when libev isn't available. I would make sure that you can use the LibevConnection class with the driver to avoid this.

On Fri, Mar 27, 2015 at 6:24 AM, Artur Siekielski a...@vhex.net wrote: I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either INSERT or SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than Postgresql (even if INSERTs are wrapped in transactions). I know that the nature of Cassandra compared to Postgresql is different, but for some scenarios this difference can matter. The question is: is it normal for Cassandra to have a minimum latency of 1 millisecond? I'm using Cassandra 2.1.2, python-driver.

-- Tyler Hobbs, DataStax http://datastax.com/
Re: High latencies for simple queries
Yes, I'm concerned about the latency. Throughput can be high even when using Python: http://datastax.github.io/python-driver/performance.html. But in my scenarios I need to run queries sequentially, so latencies matter. And Cassandra requires issuing more queries than SQL databases, so these latencies can add up to a significant amount.

I was running the Asyncore event loop, because it looks like libev isn't supported for PyPy, which I'm using. I switched to CPython and LibevConnection for a moment and I don't think I noticed a major speedup; the minimum latency is still 1ms.

Overall, it looks to me that the issue is not that important, because using multi-master, multi-DC databases always involves higher and somewhat unpredictable latencies, so relying on sub-millisecond latencies on production clusters is not very realistic.

On 03/27/2015 04:28 PM, Tyler Hobbs wrote: Just to check, are you concerned about minimizing that latency or maximizing throughput? I'll assume that latency is what you're actually concerned about. A fair amount of that latency is probably happening in the python driver. Although it can easily execute ~8k operations per second (using CPython), in some scenarios it can be difficult to guarantee sub-ms latency for an individual query due to how some of the internals work. In particular, it uses Python's Conditions for cross-thread signalling (from the event loop thread to the application thread). Unfortunately, Python's Condition implementation includes a loop with a minimum sleep of 1ms if the Condition isn't already set when you start the wait() call. This is why, with a single application thread, you will typically see a minimum of 1ms latency. Another source of similar latencies for the python driver is the Asyncore event loop, which is used when libev isn't available. I would make sure that you can use the LibevConnection class with the driver to avoid this.

On Fri, Mar 27, 2015 at 6:24 AM, Artur Siekielski a...@vhex.net wrote: I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either INSERT or SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than Postgresql (even if INSERTs are wrapped in transactions). I know that the nature of Cassandra compared to Postgresql is different, but for some scenarios this difference can matter. The question is: is it normal for Cassandra to have a minimum latency of 1 millisecond? I'm using Cassandra 2.1.2, python-driver.
Re: High latencies for simple queries
I think that in your example Postgres spends most of its time waiting for fsync() to complete. On Linux, with a battery-backed RAID controller, it's safe to mount an ext4 filesystem with the barrier=0 option, which improves fsync() performance a lot. I have partitions mounted with this option; I ran a test from Python using the psycopg2 driver and got the following latencies, in milliseconds:

- INSERT without COMMIT: 0.04
- INSERT with COMMIT: 0.12
- SELECT: 0.05

I'm also repeating benchmark runs multiple times (I'm using Python's timeit module).

On 03/27/2015 07:58 PM, Ben Bromhead wrote: Latency can be so variable, even when testing things locally. I quickly fired up Postgres and did the following with psql:

ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT INTO foo VALUES(3, 'yay');
INSERT 0 1
Time: 1.108 ms

I then fired up a local copy of Cassandra (2.0.12):

cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE foo;
cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
cqlsh:foo> TRACING ON;
Now tracing requests.
cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');
Re: High latencies for simple queries
Since you're executing queries sequentially, you may want to look into using callback chaining to avoid the cross-thread signaling that results in the 1ms latencies. Basically, just use session.execute_async() and attach a callback to the returned future that will execute your next query. The callback is executed on the event loop thread. The main downsides to this are that you need to be careful to avoid blocking the event loop thread (including executing session.execute() or prepare()) and you need to ensure that all exceptions raised in the callback are handled by your application code.

On Fri, Mar 27, 2015 at 3:11 PM, Artur Siekielski a...@vhex.net wrote: I think that in your example Postgres spends most of its time waiting for fsync() to complete. On Linux, with a battery-backed RAID controller, it's safe to mount an ext4 filesystem with the barrier=0 option, which improves fsync() performance a lot. I have partitions mounted with this option, and in a test from Python using the psycopg2 driver I got the following latencies, in milliseconds:

- INSERT without COMMIT: 0.04
- INSERT with COMMIT: 0.12
- SELECT: 0.05

I'm also repeating benchmark runs multiple times (I'm using Python's timeit module).

On 03/27/2015 07:58 PM, Ben Bromhead wrote: Latency can be so variable even when testing things locally. I quickly fired up postgres and did the following with psql:

ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT INTO foo VALUES(3, 'yay');
INSERT 0 1
Time: 1.108 ms

I then fired up a local copy of Cassandra (2.0.12):

cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE foo;
cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
cqlsh:foo> TRACING ON;
Now tracing requests.
cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');

-- Tyler Hobbs DataStax http://datastax.com/
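The chaining suggestion can be sketched as a small helper (my own sketch; `session` is assumed to be a connected python-driver Session, whose execute_async() returns a future with add_callback()/add_errback()):

```python
# Chain sequential queries on the event loop thread: each statement is
# submitted from the previous statement's callback, so no cross-thread
# Condition wait (and its ~1ms floor) is involved.

def run_chain(session, statements, on_done, on_error):
    """Execute `statements` one after another without blocking the caller."""
    statements = iter(statements)

    def submit(_previous_result=None):
        try:
            stmt = next(statements)
        except StopIteration:
            on_done()
            return
        future = session.execute_async(stmt)
        # Callbacks fire on the event loop thread: never call
        # session.execute() or session.prepare() from inside them,
        # and make sure on_error handles every exception.
        future.add_callback(submit)
        future.add_errback(on_error)

    submit()
```

A caller would hand in an iterable of prepared or simple statements plus done/error handlers; the main thread is free (or can wait on its own Event) until on_done fires.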
Re: High latencies for simple queries
Latency can be so variable even when testing things locally. I quickly fired up postgres and did the following with psql:

ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT INTO foo VALUES(3, 'yay');
INSERT 0 1
Time: 1.108 ms

I then fired up a local copy of Cassandra (2.0.12):

cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE foo;
cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
cqlsh:foo> TRACING ON;
Now tracing requests.
cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');

Tracing session: 7a7dced0-d4b2-11e4-b950-85c3c9bd91a0

 activity                                          | timestamp    | source    | source_elapsed
---------------------------------------------------+--------------+-----------+----------------
 execute_cql3_query                                | 11:52:55,229 | 127.0.0.1 |              0
 Parsing INSERT INTO foo (i, j) VALUES (1, 'yay'); | 11:52:55,229 | 127.0.0.1 |             43
 Preparing statement                               | 11:52:55,229 | 127.0.0.1 |            141
 Determining replicas for mutation                 | 11:52:55,229 | 127.0.0.1 |            291
 Acquiring switchLock read lock                    | 11:52:55,229 | 127.0.0.1 |            403
 Appending to commitlog                            | 11:52:55,229 | 127.0.0.1 |            413
 Adding to foo memtable                            | 11:52:55,229 | 127.0.0.1 |            432
 Request complete                                  | 11:52:55,229 | 127.0.0.1 |            541

All this on a MacBook Pro with 16GB of memory and an SSD. So YMMV?

On 27 March 2015 at 08:28, Tyler Hobbs ty...@datastax.com wrote: Just to check, are you concerned about minimizing that latency or maximizing throughput? I'll assume that latency is what you're actually concerned about. A fair amount of that latency is probably happening in the Python driver. Although it can easily execute ~8k operations per second (using CPython), in some scenarios it can be difficult to guarantee sub-ms latency for an individual query due to how some of the internals work. In particular, it uses Python's Conditions for cross-thread signalling (from the event loop thread to the application thread).
Unfortunately, Python's Condition implementation includes a loop with a minimum sleep of 1ms if the Condition isn't already set when you start the wait() call. This is why, with a single application thread, you will typically see a minimum of 1ms latency. Another source of similar latencies for the Python driver is the Asyncore event loop, which is used when libev isn't available. I would make sure that you can use the LibevConnection class with the driver to avoid this.

On Fri, Mar 27, 2015 at 6:24 AM, Artur Siekielski a...@vhex.net wrote: I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either an INSERT or a SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than PostgreSQL (even if INSERTs are wrapped in transactions). I know that the nature of Cassandra compared to PostgreSQL is different, but for some scenarios this difference can matter. The question is: is it normal for Cassandra to have a minimum latency of 1 millisecond? I'm using Cassandra 2.1.2, python-driver.

-- Tyler Hobbs DataStax http://datastax.com/

-- Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | (650) 284 9692
Re: cassandra source code
Hi, I have run the Cassandra source in Eclipse Juno by following this document: http://brianoneill.blogspot.in/2015/03/getting-started-with-cassandra.html. But I'm getting the exceptions below. Please help me solve this.

INFO 17:43:40 Node localhost/127.0.0.1 state jump to normal
INFO 17:43:41 Netty using Java NIO event loop
INFO 17:43:41 Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c]
INFO 17:43:41 Starting listening for CQL clients on localhost/127.0.0.1:9042...
Exception (java.lang.IllegalStateException) encountered during startup: Failed to bind port 9042 on 127.0.0.1.
java.lang.IllegalStateException: Failed to bind port 9042 on 127.0.0.1.
ERROR 17:43:41 Exception encountered during startup
java.lang.IllegalStateException: Failed to bind port 9042 on 127.0.0.1.
    at org.apache.cassandra.transport.Server.run(Server.java:179) ~[main/:na]
    at org.apache.cassandra.transport.Server.start(Server.java:119) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:428) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:505) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:599) [main/:na]
ERROR 17:43:41 Exception encountered during startup
java.lang.IllegalStateException: Failed to bind port 9042 on 127.0.0.1.
    at org.apache.cassandra.transport.Server.run(Server.java:179) ~[main/:na]
    at org.apache.cassandra.transport.Server.start(Server.java:119) ~[main/:na]
    at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:428) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:505) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:599) [main/:na]
    at org.apache.cassandra.transport.Server.run(Server.java:179)
    at org.apache.cassandra.transport.Server.start(Server.java:119)
    at org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:428)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:505)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:599)
INFO 17:43:41 Announcing shutdown
INFO 17:43:43 Waiting for messaging service to quiesce
INFO 17:43:43 MessagingService has terminated the accept() thread

From: Divya Divs divya.divi2...@gmail.com
Sent: Tuesday, March 24, 2015 10:59 AM
To: user@cassandra.apache.org; Jason Wee; Eric Stevens
Subject: cassandra source code

Hi, I'm Divya. I'm trying to run the source code of Cassandra in Eclipse, taking the source from GitHub. I'm using 64-bit Windows and following the instructions from this website: http://runningcassandraineclipse.blogspot.in/. In the GitHub cassandra-trunk, the conf/log4j-server.properties file and the org.apache.cassandra.thrift.CassandraDaemon main class are not there. Please point me to a document for running the Cassandra source, and kindly help me to proceed. Please reply as soon as possible. Thank you.
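For what it's worth, "Failed to bind port 9042" almost always means another process (often a Cassandra instance already running on the machine) holds the CQL port. A quick check can be sketched like this (my own sketch, not from the thread; host and port are the defaults from the log above):

```python
import socket

def port_in_use(host: str, port: int) -> bool:
    """Return True if something is already accepting connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising
        return s.connect_ex((host, port)) == 0

# If this prints True, stop the other Cassandra instance (or change
# native_transport_port in cassandra.yaml) before starting from Eclipse.
print(port_in_use("127.0.0.1", 9042))
```

On Linux/macOS, `lsof -i :9042` or `netstat -an | grep 9042` answers the same question from a shell.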
Re: High latencies for simple queries
I use callback chaining with the python driver and can confirm that it is very fast. You can chain the chains together to perform sequential processing. I do this when retrieving metadata and then the referenced payload for example, when the metadata has been inverted and the payload is larger than we want to invert. And you can be running multiple chains of chains asynchronously - cascade state by employing the userdata of the future. We also multiprocess, for more parallelism, and we distribute work to multiple multiprocessing instances using a message broker for yet more parallel activity, as well as reliability. ml On Fri, Mar 27, 2015 at 4:28 PM, Tyler Hobbs ty...@datastax.com wrote: Since you're executing queries sequentially, you may want to look into using callback chaining to avoid the cross-thread signaling that results in the 1ms latencies. Basically, just use session.execute_async() and attach a callback to the returned future that will execute your next query. The callback is executed on the event loop thread. The main downsides to this are that you need to be careful to avoid blocking the event loop thread (including executing session.execute() or prepare()) and you need to ensure that all exceptions raised in the callback are handled by your application code. On Fri, Mar 27, 2015 at 3:11 PM, Artur Siekielski a...@vhex.net wrote: I think that in your example Postgres spends most time on waiting for fsync() to complete. On Linux, for a battery-backed raid controller, it's safe to mount ext4 filesystem with barrier=0 option which improves fsync() performance a lot. I have partitions mounted with this option and I did a test from Python, using psycopg2 driver, and I got the following latencies, in milliseconds: - INSERT without COMMIT: 0.04 - INSERT with COMMIT: 0.12 - SELECT: 0.05 I'm also repeating benchmark runs multiple times (I'm using Python's timeit module). 
On 03/27/2015 07:58 PM, Ben Bromhead wrote: Latency can be so variable even when testing things locally. I quickly fired up postgres and did the following with psql:

ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT INTO foo VALUES(3, 'yay');
INSERT 0 1
Time: 1.108 ms

I then fired up a local copy of Cassandra (2.0.12):

cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE foo;
cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
cqlsh:foo> TRACING ON;
Now tracing requests.
cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');

-- Tyler Hobbs DataStax http://datastax.com/
Re: Arbitrary nested tree hierarchy data model
+1 would love to see how you do it

On 27 March 2015 at 07:18, Jonathan Haddad j...@jonhaddad.com wrote: I'd be interested to see that data model. I think the entire list would benefit!

On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than 10. I am able to page through children and siblings. It works really well. Doesn't sound like it's exactly what you're looking for, but if you want any pointers on how I went about implementing mine, I'd be happy to share.

On Mar 26, 2015, at 3:05 PM, List l...@airstreamcomm.net wrote: Not sure if this is the right place to ask, but we are trying to model a user-generated tree hierarchy in which users create child objects of a root node, and can create an arbitrary number of children (and children of children, and so on). So far we have looked at storing each tree structure as a single document in JSON format and reading/writing it out in its entirety; doing materialized paths, where we store the root id with every child and the tree structure above the child as a map; and some form of an adjacency list (which does not appear to be very viable, as looking up the entire tree would be ridiculous). The hope is to end up with a data model that allows us to display the entire tree quickly, as well as see the entire path to a leaf when selecting that leaf. If anyone has suggestions/experience on how to model such a tree hierarchy we would greatly appreciate your input.

-- Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | (650) 284 9692
Re: Arbitrary nested tree hierarchy data model
Hmmm... If you serialize the tree properly in a partition, you could always read an entire sub-tree as a single slice (consecutive CQL rows). Is there much more to it?

-- Jack Krupansky

On Fri, Mar 27, 2015 at 7:35 PM, Ben Bromhead b...@instaclustr.com wrote: +1 would love to see how you do it

On 27 March 2015 at 07:18, Jonathan Haddad j...@jonhaddad.com wrote: I'd be interested to see that data model. I think the entire list would benefit!

On Thu, Mar 26, 2015 at 8:16 PM Robert Wille rwi...@fold3.com wrote: I have a cluster which stores tree structures. I keep several hundred unrelated trees. The largest has about 180 million nodes, and the smallest has 1 node. The largest fanout is almost 400K. Depth is arbitrary, but in practice is probably less than 10. I am able to page through children and siblings. It works really well. Doesn't sound like it's exactly what you're looking for, but if you want any pointers on how I went about implementing mine, I'd be happy to share.

On Mar 26, 2015, at 3:05 PM, List l...@airstreamcomm.net wrote: Not sure if this is the right place to ask, but we are trying to model a user-generated tree hierarchy in which users create child objects of a root node, and can create an arbitrary number of children (and children of children, and so on). So far we have looked at storing each tree structure as a single document in JSON format and reading/writing it out in its entirety; doing materialized paths, where we store the root id with every child and the tree structure above the child as a map; and some form of an adjacency list (which does not appear to be very viable, as looking up the entire tree would be ridiculous). The hope is to end up with a data model that allows us to display the entire tree quickly, as well as see the entire path to a leaf when selecting that leaf. If anyone has suggestions/experience on how to model such a tree hierarchy we would greatly appreciate your input.
-- Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | (650) 284 9692
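The single-slice observation can be illustrated without Cassandra: if nodes are clustered by a materialized path, lexicographic order places every subtree in one consecutive run, so reading a subtree is one range over the clustering column. A sketch (my own; assumes '/'-separated path strings, which sort in tree order because '/' sorts below letters and digits):

```python
# Nodes keyed by materialized path; sorting the paths groups each
# subtree into a consecutive run, the in-memory analogue of a CQL slice.

def subtree_slice(sorted_paths, root):
    """Return the consecutive run of paths forming the subtree under `root`."""
    prefix = root + "/"
    return [p for p in sorted_paths if p == root or p.startswith(prefix)]

paths = sorted([
    "a", "a/b", "a/b/x", "a/c", "d", "d/e",
])
print(subtree_slice(paths, "a/b"))   # -> ['a/b', 'a/b/x']
```

In CQL terms the equivalent would be a clustering-column range query within the tree's partition, rather than the Python filter shown here.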
Re: Arbitrary nested tree hierarchy data model
Okay, this is going to be a pretty long post, but I think it's an interesting data model, and hopefully someone will find it worth going through.

First, I think it will be easier to understand the modeling choices I made if you see the end product. Go to http://www.fold3.com/browse.php#249|hzUkLqDmI. What you see looks like one big tree, but it is actually a combination of trees spliced together. There is one tree in a relational database that forms what I call the top-level browse. The top-level browse is used to navigate through categories until you arrive at a publication. When you drill down into a publication, you are then viewing data stored in Cassandra. The link provided above points to the root of a publication (in this case, maps from the Civil War), so to the left is the top-level browse coming from MySQL, and to the right is the Cassandra browse. Each publication has an independent tree in Cassandra, with all trees stored in the same set of tables (I do not dynamically create tables for each publication; I personally think that's a bad practice). We currently have 458 publications, and collectively they have about half a billion nodes and consume about 400 GB (RF=3).

My trees are immutable. When there are changes to a publication (e.g. adding new documents), it is very difficult to know what changes need to be made to the tree to edit it in place. Also, it would be impossible to maintain navigational consistency while a tree is in the process of being restructured. So, when a publication changes, I create a completely new tree. Once the new tree is built, I change a reference to point to the new tree. I have a process that nightly pages through the tables and deletes records that belong to obsolete trees. This process takes about five hours. If it were much longer than that, I would probably run it weekly. My database has quite a bit of churn, which is fairly unusual for a Cassandra-based application.
Most nights build two or three trees, generally resulting in a few tens of millions of new records and a slightly fewer number of deletions. Size-tiered compaction is a bad choice for churn, so I use leveled compaction. Most publications are at most a few million nodes, and generally build in less than 20 minutes. Since any modeling exercise requires knowing the queries, I should describe that before getting into the model. Here are the features I need to support. For browsing the tree, I need to be able to get the children of a node (paginated), the siblings of a node (also paginated), and the ancestors of a node. The leaves of each tree are images and form a filmstrip. You can use the filmstrip to navigate through all the images in a publication in the tree’s natural order. If you go to my browse page and keep drilling down, you’ll eventually get to an image. The filmstrip appears at the bottom of the image viewer. Before I discuss the schema, I should discuss a couple of other non-obvious things that are relevant to the data model. One very common operation is to retrieve a node and all of its ancestors in order to display a path. Denormalization would suggest that I store the data for each node, along with that of all of its ancestors. That would mean that in my biggest tree, I would store the root node 180 million times. I didn’t consider that kind of bloat to be acceptable, so I do not denormalize ancestors. I also wanted to retrieve a node and its ancestors in constant time, rather than O(n) as would be typical for tree traversal. In order to accomplish this, I use a pretty unique idea for a node's primary key. I create a hash from information in the node, and then append it to the hash of its parent. So, the primary key is really a path. When I need to retrieve a node and its ancestors, I tokenize the path and issue queries in parallel to get all the nodes in the ancestry at the same time. 
In keeping with this pattern of not denormalizing, my auxiliary tables do not have node data in them, but instead provide a means of getting hash paths, which I then tokenize and make parallel requests with. Most requests that use an auxiliary table can generally just make a query to the auxiliary table to get the hash path, and then retrieve the node and its ancestors from the node table. Three or fewer trips to Cassandra are sufficient for all my APIs. Without further ado, here's my schema (with commentary):

CREATE TABLE tree (
    tree INT,
    pub INT,
    rhpath VARCHAR,
    atime TIMESTAMP,
    ccount INT,
    ncount INT,
    PRIMARY KEY (tree)
) WITH gc_grace_seconds = 864000;

This table maintains the references to the root nodes for each tree. pub is the primary key for the publication table in my relational database. There is usually just one record for each publication. When a tree is being built (and until the old one is retired), a publication may have more than one tree. This table is small (458 records), and I cache
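The hash-path key described above can be sketched like this (my own illustration; the helper names, the sha1 choice, and the 8-character component width are assumptions, not from the post). Each node's key is its parent's path plus a fixed-width hash of its own content, so tokenizing a key yields the keys of every ancestor, which can then be fetched with parallel point queries in constant time:

```python
import hashlib

HASH_LEN = 8  # assumed width of one path component

def node_hash(payload: str) -> str:
    """Fixed-width hash of a node's identifying content."""
    return hashlib.sha1(payload.encode()).hexdigest()[:HASH_LEN]

def child_path(parent_path: str, payload: str) -> str:
    """A child's key is the parent's full path plus its own hash."""
    return parent_path + node_hash(payload)

def ancestor_paths(hpath: str) -> list:
    """Tokenize a hash path into the keys of the node and all its ancestors,
    root first; each key supports one point query, issued in parallel."""
    return [hpath[:i] for i in range(HASH_LEN, len(hpath) + 1, HASH_LEN)]

root = node_hash("root")
leaf = child_path(child_path(root, "folder"), "image")
print(ancestor_paths(leaf))  # three keys: root, folder, image
```

With the real driver, each of the returned keys would go into one execute_async() call, giving the whole ancestry in a single parallel round trip.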