Re: Difference in retrieving data from cassandra
Hey Jonathan, thanks for your reply. I created the schema in this manner:

CREATE SCHEMA schemaname WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

and the tables according to our requirements. I didn't use a node structure. Could that be the reason for the performance difference? And can you also tell me the difference between the structure I used and a node structure? Regards, Umang Shah BI-ETL Developer

On Thu, Sep 25, 2014 at 4:48 PM, Jonathan Haddad j...@jonhaddad.com wrote: You'll need to provide a bit more information. To start, a query trace would be helpful: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/tracing_r.html (self promo) You may also want to read my blog post on diagnosing problems in production; it covers diagnosing slow queries: http://rustyrazorblade.com/2014/09/cassandra-summit-recap-diagnosing-problems-in-production/

On Thu, Sep 25, 2014 at 4:21 AM, Umang Shah shahuma...@gmail.com wrote: Hi all, I am using Cassandra with Pentaho PDI (Kettle). I have installed Cassandra on an Amazon EC2 instance and on my local machine. When I retrieve data from the local machine using Pentaho PDI it takes a few seconds (not more than 20), but the same query against the production database takes almost 3 minutes for the same amount of data, which is a huge difference. Can anybody suggest what I should check, or how I can narrow down this difference? The local machine and the production server have the same RAM. The local machine runs Windows and production runs Linux. -- Regards, Umang V.Shah BI-ETL Developer

-- Jon Haddad http://www.rustyrazorblade.com twitter: rustyrazorblade

-- Regards, Umang V.Shah +919886829019
DevCenter and Cassandra 2.1
Hi all, I notice that DevCenter 1.1.1 doesn't support user-defined types (as far as I can see). Is it just a matter of importing a template, or will we need to wait for full 2.1 support in DevCenter? Andy The University of Dundee is a registered Scottish Charity, No: SC015096
Re: DevCenter and Cassandra 2.1
Hi Andrew, DevCenter has a complete CQL parser built in, which is what powers the offline validations and suggestions. So the bad news is that it requires a new release for every CQL grammar change. The good news is that the wait is not going to be too long (I can't talk about a specific release date yet, but it's getting there).

On Fri, Sep 26, 2014 at 2:13 AM, Andrew Cobley a.e.cob...@dundee.ac.uk wrote: Hi all, I notice that DevCenter 1.1.1 doesn't support user-defined types (as far as I can see). Is it just a matter of importing a template, or will we need to wait for full 2.1 support in DevCenter? Andy

-- Alex Popescu, Sen. Product Manager @ DataStax, @al3xandru
How to setup Cassandra client-to-node encryption
Hi all, I use the following configuration (in cassandra.yaml) to enable client-to-node encryption:

client_encryption_options:
    enabled: true
    keystore: path-to-keystore-file
    keystore_password: some-password
    truststore: path-to-truststore-file
    truststore_password: some-password

But when Cassandra starts, I get the following error:

Caused by: org.apache.thrift.transport.TTransportException: Could not bind to port 9160
    at org.apache.thrift.transport.TSSLTransportFactory.createServer(TSSLTransportFactory.java:117)
    at org.apache.thrift.transport.TSSLTransportFactory.getServerSocket(TSSLTransportFactory.java:103)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$Factory.buildTServer(CustomTThreadPoolServer.java:253)
    ... 6 more
Caused by: java.lang.IllegalArgumentException: Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers
    at sun.security.ssl.CipherSuiteList.init(CipherSuiteList.java:92)
    at sun.security.ssl.SSLServerSocketImpl.setEnabledCipherSuites(SSLServerSocketImpl.java:191)
    at org.apache.thrift.transport.TSSLTransportFactory.createServer(TSSLTransportFactory.java:113)
    ... 8 more

Does anyone know the root cause? Thanks a lot. Boying
Re: How to setup Cassandra client-to-node encryption
Hi, you need to install the JCE Unlimited Strength policy files - http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html Bulat

On Sep 26, 2014, at 7:58, Lu, Boying boying...@emc.com wrote: Hi all, I use the following configuration (in cassandra.yaml) to enable client-to-node encryption, but when Cassandra starts I get: Caused by: java.lang.IllegalArgumentException: Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers
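Before (or after) installing the policy files, you can confirm whether the restricted default policy is the culprit directly from the JVM that runs Cassandra. This is a small sketch (the class name is mine) using the standard javax.crypto API: AES-256 cipher suites such as TLS_RSA_WITH_AES_256_CBC_SHA can only be negotiated when the reported maximum AES key length is at least 256.

```java
import javax.crypto.Cipher;

// Quick check for the JCE Unlimited Strength policy: with only the
// default (export-restricted) policy, AES is capped at 128 bits, which
// is why TLS_RSA_WITH_AES_256_CBC_SHA cannot be negotiated.
public class JcePolicyCheck {

    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (java.security.NoSuchAlgorithmException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int max = maxAesKeyLength();
        System.out.println("Max AES key length: " + max);
        System.out.println(max >= 256
                ? "Unlimited policy installed - AES-256 suites should work"
                : "Restricted policy - install the JCE policy files");
    }
}
```

Run it with the same JRE Cassandra uses; if the second line reports the restricted policy, the download above should fix the bind error.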
RE: How to setup Cassandra client-to-node encryption
Thanks a lot. I'll try it.

From: Bulat Shakirzyanov [mailto:mallluh...@gmail.com]
Sent: September 26, 2014 23:58
To: user@cassandra.apache.org
Subject: Re: How to setup Cassandra client-to-node encryption

Hi, you need to install JCE - http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html Bulat
Re: using dynamic cell names in CQL 3
I'm not sure I understand "for example column.name would be event_name(temperature)" correctly. What I gather, however, is that you have multiple events that may or may not have certain properties; in your example I believe you want a CF for events with a type event_name that contains a column temperature. You can model it like this:

CREATE TABLE events (
    name text,
    metric text,
    value text,
    PRIMARY KEY (name, metric)
);

Where:
- name is the row key, one per kind (or name) of event
- metric is the column name, aka the clustering key

For example, when inserting:

INSERT INTO events (name, metric, value) VALUES ('captor', 'temperature', '25 ºC');
INSERT INTO events (name, metric, value) VALUES ('captor', 'wind', '5 km/h');
INSERT INTO events (name, metric, value) VALUES ('captor', 'atmosphere', '1013 millibars');
INSERT INTO events (name, metric, value) VALUES ('cpu', 'temperature', '70 ºC');
INSERT INTO events (name, metric, value) VALUES ('cpu', 'frequency', '1015,7 MHz');

You will have something like this:

          temperature   atmosphere       wind     frequency
captor    25 ºC         1013 millibars   5 km/h
cpu       70 ºC                                   1015,7 MHz

CQLSH represents each clustering key as a row, which is not how the column family is stored. The model I give is just an example, as you may want to model things differently according to your use cases. Time is probably part of them, and will probably be in the clustering key too. Note that if you create wide rows and you have *a lot* of data, you may want to bucket the CF per time period (month / week / day / etc). HTH

— Brice

On Thu, Sep 25, 2014 at 3:13 PM, shahab shahab.mok...@gmail.com wrote: Thanks. It seems that I was not clear in my question: I would like to store values in the column name; for example, column.name would be event_name (temperature) and the column content would be the respective value (e.g. 40.5).
And I need to know how the schema should best look in CQL 3. best, /Shahab

On Wed, Sep 24, 2014 at 1:49 PM, DuyHai Doan doanduy...@gmail.com wrote: Dynamic thing in Thrift ≈ clustering columns in CQL. Can you give more details about your data model?

On Wed, Sep 24, 2014 at 1:11 PM, shahab shahab.mok...@gmail.com wrote: Hi, I would like to define a schema for a table where the column (cell) names are defined dynamically. Apparently there is a way to do this in Thrift (http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows), but I couldn't find how to do the same using CQL. Any resource/example that I can look at? best, /Shahab
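Brice's closing suggestion above — bucketing the CF per time period — usually just means folding a truncated timestamp into the partition key so no single partition grows without bound. A hedged sketch of deriving a per-day bucket in Java (the class, method, and key layout in the comment are illustrative, not from the thread):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Derive a per-day bucket from an event timestamp. The bucket would become
// part of a composite partition key, e.g.
//   PRIMARY KEY ((name, bucket), metric, event_time)
// so each (name, day) pair gets its own bounded wide row.
public class TimeBucket {

    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    static String dayBucket(Instant eventTime) {
        return DAY.format(eventTime);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2014-09-25T13:13:00Z");
        System.out.println("bucket = " + dayBucket(t));  // bucket = 2014-09-25
    }
}
```

Readers then query one bucket at a time (or a small, known set of buckets) instead of scanning one ever-growing partition.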
Repair taking long time
I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and 4 in another. Running a repair on a large column family seems to be moving much slower than I expect. nodetool compactionstats indicates the Validation phase is running and that the total bytes is 4.5 TB (4505336278756). This is a very large CF. The process has been running for 2.5 hours and has processed 71 GB (71950433062). That rate is about 28.4 GB per hour. At this rate it will take 158 hours, just shy of 1 week. Is this reasonable? This is my first large repair and I am wondering if this is normal for a CF of this size. It seems like a long time to me. Is it possible to tune this process to speed it up? Is there something in my configuration that could be causing this slow performance? I am running HDDs, not SSDs, in a JBOD configuration. Gene Robichaux Manager, Database Operations Match.com 8300 Douglas Avenue I Suite 800 I Dallas, TX 75225 Phone: 214-576-3273
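The back-of-the-envelope estimate above can be reproduced from the two byte counters nodetool reports. A quick sketch (the figures are copied from the message; the class is mine), which lands in the same ballpark as the ~158 hours quoted:

```java
// Extrapolate the validation-phase duration from nodetool's byte counters.
public class RepairEstimate {

    static double projectedTotalHours(long totalBytes, long doneBytes, double elapsedHours) {
        double bytesPerHour = doneBytes / elapsedHours;   // observed throughput
        return totalBytes / bytesPerHour;                 // projected total duration
    }

    public static void main(String[] args) {
        long total = 4505336278756L;  // ~4.5 TB to validate
        long done  = 71950433062L;    // ~72 GB processed in 2.5 hours
        double hours = projectedTotalHours(total, done, 2.5);
        System.out.printf("~%.1f GB/h observed, ~%.0f hours (~%.1f days) projected%n",
                done / 2.5 / 1e9, hours, hours / 24.0);
    }
}
```

Whether you use decimal GB or binary GiB shifts the rate slightly, but the projection comes out just under a week either way.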
Re: Repair taking long time
Are you using Cassandra 2.0 vnodes? If so, repair takes forever. This problem is addressed in 2.1.

On Fri, Sep 26, 2014 at 9:52 AM, Gene Robichaux gene.robich...@match.com wrote: I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and 4 in another. Running a repair on a large column family seems to be moving much slower than I expect.

-- Jon Haddad http://www.rustyrazorblade.com twitter: rustyrazorblade
Re: Repair taking long time
Unfortunately DSE 4.5.0 is still on 2.0.x

-- Brice

On Fri, Sep 26, 2014 at 7:40 PM, Jonathan Haddad j...@jonhaddad.com wrote: Are you using Cassandra 2.0 vnodes? If so, repair takes forever. This problem is addressed in 2.1.
Re: Repair taking long time
With a 4.5 TB table and just 4 nodes, repair will likely take forever for any version. -Bryan

On Fri, Sep 26, 2014 at 10:40 AM, Jonathan Haddad j...@jonhaddad.com wrote: Are you using Cassandra 2.0 vnodes? If so, repair takes forever. This problem is addressed in 2.1.
RE: Repair taking long time
I am on DSE 4.0.3, which is 2.0.7. If 4.5.1 is NOT 2.1, I guess an upgrade will not buy me much….. The bad thing is that table is not our largest….. :(

Gene Robichaux Manager, Database Operations Match.com 8300 Douglas Avenue I Suite 800 I Dallas, TX 75225 Phone: 214-576-3273

From: Brice Dutheil [mailto:brice.duth...@gmail.com]
Sent: Friday, September 26, 2014 12:47 PM
To: user@cassandra.apache.org
Subject: Re: Repair taking long time

Unfortunately DSE 4.5.0 is still on 2.0.x -- Brice
Re: Repair taking long time
If you're using DSE you might want to contact DataStax support, rather than the ML.

On Fri, Sep 26, 2014 at 10:52 AM, Gene Robichaux gene.robich...@match.com wrote: I am on DSE 4.0.3, which is 2.0.7. If 4.5.1 is NOT 2.1, I guess an upgrade will not buy me much….. The bad thing is that table is not our largest….. :(

-- Jon Haddad http://www.rustyrazorblade.com twitter: rustyrazorblade
RE: Repair taking long time
Using their community edition..no support (yet!) :(

Gene Robichaux Manager, Database Operations Match.com 8300 Douglas Avenue I Suite 800 I Dallas, TX 75225 Phone: 214-576-3273

-----Original Message-----
From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf Of Jonathan Haddad
Sent: Friday, September 26, 2014 12:58 PM
To: user@cassandra.apache.org
Subject: Re: Repair taking long time

If you're using DSE you might want to contact DataStax support, rather than the ML.
Re: Repair taking long time
Well, in that case, you may want to roll your own script for doing constant repairs of your cluster, and extend your gc grace seconds so you can repair the whole cluster before the tombstones are cleared.

On Fri, Sep 26, 2014 at 11:15 AM, Gene Robichaux gene.robich...@match.com wrote: Using their community edition..no support (yet!) :(

-- Jon Haddad http://www.rustyrazorblade.com twitter: rustyrazorblade
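The constraint behind "extend your gc grace seconds" is simply that gc_grace_seconds must outlast one full repair cycle, otherwise tombstones can be purged before every replica has seen them and deleted data can reappear. A small sketch of the arithmetic (the 1.5x safety margin is my assumption, not a recommendation from the thread):

```java
import java.util.concurrent.TimeUnit;

// gc_grace_seconds must exceed the time it takes to repair the whole
// cluster; otherwise tombstones may be collected before they have been
// propagated to every replica, resurrecting deleted rows.
public class GcGrace {

    static long minGcGraceSeconds(double repairCycleHours, double safetyFactor) {
        return (long) (TimeUnit.HOURS.toSeconds(1) * repairCycleHours * safetyFactor);
    }

    public static void main(String[] args) {
        // e.g. a ~158 hour repair cycle with a 1.5x margin
        long secs = minGcGraceSeconds(158, 1.5);
        System.out.println("gc_grace_seconds >= " + secs
                + " (~" + secs / 86400 + " days)");
    }
}
```

With the ~158 hour repair from this thread, that already exceeds the 10-day default (864000 seconds) once more than one large table needs repairing in the same cycle.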
Apache Cassandra 2.1.0 : cassandra-stress performance discrepancy between SSD and SATA drive
Hi, I have run cassandra-stress write and cassandra-stress read on my office PC and on my home PC.

Office PC: Intel Core i7-4479, 8 virtual cores, 16G RAM, 500G SSD
Home PC: Intel Xeon E3-1230V3, 8 virtual cores, 8G RAM, 500G SATA disk

From the cassandra-stress results (please see below), it seems Cassandra is more than 100% faster on my home PC than on the office PC. I was expecting the other way around, as my office PC has much better hardware.

Office (SSD):

cauchy:~/installed/cassandra/tools/bin ./cassandra-stress write
Running with 8 threadCount
Results:
  op rate : 11264
  partition rate : 11264
  row rate : 11264
  latency mean : 0.7
  latency median : 0.4
  latency 95th percentile : 0.9
  latency 99th percentile : 1.6
  latency 99.9th percentile : 5.3
  latency max : 325.3
  Total operation time : 00:02:40

cauchy:~/installed/cassandra/tools/bin ./cassandra-stress read
Running with 8 threadCount
Results:
  op rate : 13702
  partition rate : 13702
  row rate : 13702
  latency mean : 0.5
  latency median : 0.5
  latency 95th percentile : 0.8
  latency 99th percentile : 1.4
  latency 99.9th percentile : 3.4
  latency max : 67.1
  Total operation time : 00:00:30

Home (SATA):

matmsh@gauss:~/installed/cassandra/tools/bin ./cassandra-stress write
Running with 8 threadCount
Results:
  op rate : 25181
  partition rate : 25181
  row rate : 25181
  latency mean : 0.3
  latency median : 0.2
  latency 95th percentile : 0.3
  latency 99th percentile : 0.5
  latency 99.9th percentile : 16.7
  latency max : 331.0
  Total operation time : 00:03:24

gauss:~/installed/cassandra/tools/bin ./cassandra-stress read
Results:
  op rate : 35338
  partition rate : 35338
  row rate : 35338
  latency mean : 0.2
  latency median : 0.2
  latency 95th percentile : 0.3
  latency 99th percentile : 0.4
  latency 99.9th percentile : 1.1
  latency max : 17.7
  Total operation time : 00:00:30

Is the above result expected? Thanks in advance for any suggestions! Shing
simple map / table scans without hadoop?
I have a requirement to periodically run full table scans on our data. It's mostly for repair tasks or making bulk UPDATEs… but I'd prefer to do it in Java because what I need is fairly trivial. Pig / Hadoop / etc. are overkill for this; I don't want or need a whole Hadoop or HDFS setup. For example: a full table scan, and if a field matches a regex, set another column based on that value. Seems like this wouldn't be too hard. Just write a daemon that looks at the key distribution and runs a scan on the data closest to it. It would be ideal if it were a separate daemon, so that you couldn't accidentally read all that data into memory and OOM the Cassandra daemon. Does this already exist? -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com
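One common Hadoop-free pattern for this kind of scan is to split the token ring into contiguous ranges and issue `SELECT ... WHERE token(pk) > ? AND token(pk) <= ?` per range, ideally from workers placed near the replicas that own each range. A sketch of the range-splitting arithmetic, assuming the default Murmur3Partitioner ring of [-2^63, 2^63-1] (the class and split count are illustrative):

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Split the Murmur3 token ring into n contiguous (start, end] ranges that
// can each be scanned independently with:
//   SELECT ... WHERE token(pk) > start AND token(pk) <= end
public class TokenRanges {

    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    static List<long[]> split(int n) {
        BigInteger span = MAX.subtract(MIN);              // 2^64 - 1 tokens
        List<long[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 1; i <= n; i++) {
            BigInteger end = (i == n)
                    ? MAX                                 // last range ends exactly at MAX
                    : MIN.add(span.multiply(BigInteger.valueOf(i))
                                 .divide(BigInteger.valueOf(n)));
            ranges.add(new long[] { start.longValue(), end.longValue() });
            start = end;                                  // ranges stay contiguous
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(4)) {
            System.out.println("token > " + r[0] + " AND token <= " + r[1]);
        }
    }
}
```

Each range can then be paged through and updated independently, which keeps the working set per worker small and avoids loading the whole table into one process.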
Re: Apache Cassandra 2.1.0 : cassandra-stress performance discrepancy between SSD and SATA drive
What SSD was it? There is a lot of variability in SSD performance.

1. Is it a new or an old SSD? Old SSDs can become slower if they're really worn out.
2. Was the office SSD near capacity, holding other data?
3. What models were they? SSD != SSD… there is a massive amount of performance variability out there.

… also … more data is needed. Are the JDK versions the same? The Cassandra versions? What about the config?

On Fri, Sep 26, 2014 at 2:39 PM, Shing Hing Man mat...@yahoo.com wrote: Hi, I have run cassandra-stress write and cassandra-stress read on my office PC and on my home PC. Office PC: Intel Core i7-4479, 8 virtual cores, 16G RAM, 500G SSD. Home PC: Intel Xeon E3-1230V3, 8 virtual cores, 8G RAM, 500G SATA disk. From the cassandra-stress results, it seems Cassandra is more than 100% faster on my home PC than on the office PC. I was expecting the other way around, as my office PC has much better hardware. Shing

-- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com