Re: How to prevent writing to a Keyspace?
Create a different user and assign roles and privileges: create a user such as guest and grant SELECT only to that user. That way the user cannot modify data in the specific keyspace or column family. http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/grant_r.html -Vivek

On Mon, Jul 21, 2014 at 7:57 AM, Lu, Boying boying...@emc.com wrote: Thanks a lot ☺ But I think authorization and authentication alone do little to help here. Once we allow a user to read the keyspace, how can we prevent him from writing to the DB without Cassandra's help? Is there any way to make a keyspace 'read-only' in Cassandra, e.g. by setting some specific strategy? Boying

From: Vivek Mishra [mailto:mishra.v...@gmail.com] Sent: 2014-07-17 18:35 To: user@cassandra.apache.org Subject: Re: How to prevent writing to a Keyspace? Think about managing it via authorization and authentication support.

On Thu, Jul 17, 2014 at 4:00 PM, Lu, Boying boying...@emc.com wrote: Hi, All, I need to make a Cassandra keyspace read-only. Does anyone know how to do that? Thanks, Boying
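Vivek's suggestion, sketched in CQL. This assumes `authenticator: PasswordAuthenticator` and `authorizer: CassandraAuthorizer` are enabled in cassandra.yaml; the user, password, and keyspace names are placeholders:

```cql
-- Placeholder names: 'guest', 'my_keyspace'. Requires password
-- authentication and the CassandraAuthorizer to be enabled first.
CREATE USER guest WITH PASSWORD 'guest_password' NOSUPERUSER;
GRANT SELECT ON KEYSPACE my_keyspace TO guest;
```

Applications that should only read then connect as guest; any INSERT, UPDATE, DELETE, or TRUNCATE from that session is rejected with an unauthorized error.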
Re: Which way to Cassandraville?
Having said that, what Java clients should I be looking at? Are there any reasonably mature POJO mapping techs for Cassandra, analogous to Hibernate? The Java Driver offers a basic object mapper in its mapper module. If you are looking for something more full-featured, have a look at http://doanduyhai.github.io/Achilles/
Re: ghost table is breaking compactions and won't go away… even during a drop.
In my experience, an SSTable FileNotFoundException (caused not only by recreating a table but also by other operations, or even bugs) cannot be solved by any nodetool command. However, restarting the node more than once can make this exception disappear. I don't know the reason, but it does work... Thanks, Philo Yang

2014-07-17 10:32 GMT+08:00 Kevin Burton bur...@spinn3r.com: you rock… glad it's fixed in 2.1… :)

On Wed, Jul 16, 2014 at 7:05 PM, graham sanderson gra...@vast.com wrote: Known issue: deleting and recreating a CF with the same name, fixed in 2.1 (manifests in lots of ways): https://issues.apache.org/jira/browse/CASSANDRA-5202

On Jul 16, 2014, at 8:53 PM, Kevin Burton bur...@spinn3r.com wrote: Looks like a restart of Cassandra and a nodetool compact fixed this…

On Wed, Jul 16, 2014 at 6:45 PM, Kevin Burton bur...@spinn3r.com wrote: This is really troubling… I have a ghost table. I dropped it, but it's not going away. (Cassandra 2.0.8, btw.) I ran a 'drop table' on it, and a 'describe tables' then shows that it's not there. However, when I recreated it with a new schema, all operations on it failed. Looking at why… it seems that Cassandra had some old SSTables that I imagine are no longer being used but are now in an inconsistent state? This is popping up in the system.log:

Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /d0/cassandra/data/blogindex/content_idx_source_hashcode/blogindex-content_idx_source_hashcode-jb-1447-Data.db (No such file or directory)

So I think what happened is that the original drop table failed and left things in an inconsistent state. I tried a nodetool repair and a nodetool compact… those fail with the same java.io.FileNotFoundException. I moved the directories out of the way; same failure. Any advice on resolving this?
-- Founder/CEO Spinn3r.com http://spinn3r.com/ Location: San Francisco, CA blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote: Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576. Nice work! Finally we have a proper solution to this issue, so well done to you.
RE: How to prevent writing to a Keyspace?
I see. Thanks a lot ☺

From: Vivek Mishra [mailto:mishra.v...@gmail.com] Sent: 2014-07-21 14:16 To: user@cassandra.apache.org Subject: Re: How to prevent writing to a Keyspace?

Create a different user and assign roles and privileges: create a user such as guest and grant SELECT only to that user. That way the user cannot modify data in the specific keyspace or column family. http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/grant_r.html -Vivek
Re: horizontal query scaling issues follow on
Hello,

Here is the documentation for cfhistograms, which reports in microseconds: http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFhisto.html

Your question about setting timeouts is subjective, but you have set your timeout limits to 4 mins, which seems excessive. The default timeout values should be appropriate for a well-sized and well-operating cluster. Increasing timeouts to achieve stability isn't a recommended practice. Your VMs are undersized, so it is recommended that you reduce your workload or add nodes until stability is achieved.

The goal of your exercise is to prove out linear scalability, correct? Then it is recommended to find the load your small nodes/cluster can handle without increasing timeout values, i.e. a load at which your cluster remains stable. Once you have found the sweet spot for load on your cluster, increase load by X% while increasing cluster size by X%. Do this for a few iterations so you can see that the processing capability of your cluster increases proportionally, and linearly, with the amount of load you put on it. Note that with small VMs you will not get production-like performance from individual nodes.

Also, what type of storage do you have under the VMs? It's not recommended to use shared storage. Shared storage will, more than likely, not allow you to achieve linear scalability, because your hardware will not be scaling linearly all the way through the stack.

Hope this helps.

Jonathan

On Sun, Jul 20, 2014 at 9:12 PM, Diane Griffith dfgriff...@gmail.com wrote: I am running tests again across different numbers of client threads and numbers of nodes, but this time I tweaked some of the timeouts configured for the nodes in the cluster. I was able to get better performance on the nodes at 10 client threads by upping 4 timeout values in cassandra.yaml to 24:

- read_request_timeout_in_ms
- range_request_timeout_in_ms
- write_request_timeout_in_ms
- request_timeout_in_ms

I did this because of my interpretation of the cfhistograms output on one of the nodes. So 3 questions come to mind:

1. Did I interpret the histogram information correctly in the Cassandra 2.0.6 nodetool output? That is, in the 2-column read latency output, the offset or left column is the time in milliseconds and the right column is the number of requests that fell into that bucket range?
2. Was it reasonable for me to boost those 4 timeouts, and just those?
3. What are reasonable timeout values for smaller VM sizes (i.e. 8GB RAM, 4 CPUs)?

If anyone has any insight it would be appreciated.

Thanks, Diane

On Fri, Jul 18, 2014 at 2:23 PM, Tyler Hobbs ty...@datastax.com wrote: On Fri, Jul 18, 2014 at 8:01 AM, Diane Griffith dfgriff...@gmail.com wrote: Partition Size (bytes) 1109 bytes: 1800 Cell Count per Partition 8 cells: 1800 meaning I can't glean anything about how it partitioned, or whether it broke a key across partitions, from this, right? Does it mean, for the 1800 unique keys, that each has 8 cells? Yes, your interpretation is correct. Each of your 1800 partitions has 8 cells (taking up 1109 bytes). -- Tyler Hobbs DataStax http://datastax.com/

-- Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487 http://www.linkedin.com/in/jlacefield http://www.datastax.com/cassandrasummit14
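On question 1: per the documentation linked above, the left column of cfhistograms is the latency bucket offset in microseconds (not milliseconds), and the right column is the number of requests that fell into that bucket. A small sketch of how to derive a percentile from such (offset, count) pairs; the bucket data here is made up for illustration:

```python
def histogram_percentile(buckets, pct):
    """buckets: list of (offset_us, count) pairs as printed by nodetool
    cfhistograms. Returns the smallest bucket offset at which the
    cumulative count reaches `pct` percent of all requests."""
    total = sum(count for _, count in buckets)
    threshold = total * pct / 100.0
    running = 0
    for offset_us, count in buckets:
        running += count
        if running >= threshold:
            return offset_us
    return buckets[-1][0]

# Hypothetical read-latency buckets: (offset in microseconds, request count)
read_latency = [(103, 120), (124, 340), (149, 510), (179, 200), (215, 30)]
p99 = histogram_percentile(read_latency, 99)  # 99th percentile bucket: 215 us
```

A p99 in the hundreds of microseconds would be nowhere near the default 5000-10000 ms timeouts, which is why raising timeouts to multi-minute values is a sign of a different problem rather than a tuning fix.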
Re: TTransportException (java.net.SocketException: Broken pipe)
I have not seen the issue after changing the commit log segment size to 1024MB. tpstats output:

Pool Name                  Active   Pending   Completed   Blocked  All time blocked
ReadStage                       0         0           0         0                 0
RequestResponseStage            0         0           0         0                 0
MutationStage                  32        40     2526143         0                 0
ReadRepairStage                 0         0           0         0                 0
ReplicateOnWriteStage           0         0           0         0                 0
GossipStage                     0         0           0         0                 0
AntiEntropyStage                0         0           0         0                 0
MigrationStage                  0         0           3         0                 0
MemoryMeter                     0         0       24752         0                 0
MemtablePostFlusher             1        19       12939         0                 0
FlushWriter                     6        10       12442         1              2940
MiscStage                       0         0           0         0                 0
PendingRangeCalculator          0         0           1         0                 0
commitlog_archiver              0         0           0         0                 0
InternalResponseStage           0         0           0         0                 0
HintedHandoff                   0         0           0         0                 0

Message type       Dropped
RANGE_SLICE              0
READ_REPAIR              0
PAGED_RANGE              0
BINARY                   0
READ                     0
MUTATION                 0
_TRACE                   0
REQUEST_RESPONSE         0
COUNTER_MUTATION         0

On Saturday, 19 July 2014 1:32 AM, Robert Coli rc...@eventbrite.com wrote: On Mon, Jul 7, 2014 at 9:30 PM, Bhaskar Singhal bhaskarsing...@yahoo.com wrote: I am using Cassandra 2.0.7 (with default settings and a 16GB heap on a quad-core Ubuntu server with 32GB RAM). 16GB of heap will lead to significant GC pauses, and probably will not improve total performance versus an 8GB heap. I continue to maintain that your problem is that you are writing faster than you can flush. Paste the output of nodetool tpstats? =Rob
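The change the poster describes appears to correspond to this cassandra.yaml setting (the 2.0 default is 32 MB; the value below is what the poster reports using, not a recommendation):

```yaml
# cassandra.yaml
# Default: 32. Raising it to 1024 only packs the same commitlog volume
# into fewer, larger segment files; it does not reduce the volume itself.
commitlog_segment_size_in_mb: 1024
```

As the reply downthread notes, this masks rather than fixes a cluster that accepts writes faster than it can flush.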
Re: estimated row count for a pk range
Thank you for the reply; I was hoping for something with a bit less overhead than the first solution; the second is not really an option for me.

On Monday, 21 July 2014, DuyHai Doan doanduy...@gmail.com wrote: 1) Use a separate counter to count the number of entries in each column family, but it will require you to manage the counting manually. 2) SELECT DISTINCT partitionKey FROM ... Normally this query is optimized and is much faster than a SELECT *. However, if you have a very big number of distinct partitions it can be slow.

On Sun, Jul 20, 2014 at 6:48 PM, tommaso barbugli tbarbu...@gmail.com wrote: Hello, Lately I collapsed several (around 1k) column families into a bunch (100) of column families. To keep the data separated I have added an extra column (family) which is part of the PK. While the previous approach allowed me to always have a clear picture of every column family's size, now I have no option other than to select all the rows and make some estimation to guess the overall size used by one of the grouped datasets in these CFs, e.g. SELECT * FROM cf_shard1 WHERE family = '1'; Of course this does not work really well when cf_shard1 has some data in it; is there some way, perhaps, to get an estimated count of the rows matching this query? Thanks, Tommaso

-- sent from iphone (sorry for the typos)
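The two suggestions, sketched in CQL. Table and column names are hypothetical; SELECT DISTINCT only works over the full set of partition key columns, and the client still counts the returned keys itself:

```cql
-- Option 2: walk only partition keys (much cheaper than SELECT *),
-- then count client-side. Assumes 'family' and 'id' together form
-- the partition key of cf_shard1.
SELECT DISTINCT family, id FROM cf_shard1;

-- Option 1: a manually maintained counter, bumped on every insert.
CREATE TABLE family_counts (
    family text PRIMARY KEY,
    rows   counter
);
UPDATE family_counts SET rows = rows + 1 WHERE family = '1';
SELECT rows FROM family_counts WHERE family = '1';
```

The counter table gives O(1) reads but the count drifts if inserts are retried or rows are deleted without decrementing, which is the manual-management cost DuyHai mentions.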
map reduce for Cassandra
Hi,

I need to execute a map/reduce job to identify data stored in Cassandra before indexing this data in Elasticsearch. I have already used ColumnFamilyInputFormat (before starting to use CQL) to write Hadoop jobs to do that, but I used to have a lot of trouble with tuning, as Hadoop depends on how map tasks are split in order to successfully execute things in parallel, for IO-bound processes. First question: am I the only one having problems with that? Is anyone else using Hadoop jobs that read from Cassandra in production?

Second question is about the alternatives. I saw the new version of Spark will have Cassandra support, but using CqlPagingInputFormat, from Hadoop. I tried to use Hive with Cassandra community edition, but it seems it only works with Cassandra Enterprise and doesn't do more than FB Presto (http://prestodb.io/), which we have been using to read from Cassandra, and so far it has been great for SQL-like queries. For custom map/reduce jobs, however, it is not enough. Does anyone know some other tool that performs MR on Cassandra? My impression is that most tools were created to work on top of HDFS, and reading from a NoSQL DB is some kind of workaround.

Third question is about how these tools work. Most of them write mapped data to an intermediate storage, then the data is shuffled and sorted, then it is reduced. Even when using CqlPagingInputFormat, if you are using Hadoop it will write files to HDFS after the mapping phase, shuffle and sort this data, and then reduce it. I wonder if a tool supporting Cassandra out of the box wouldn't be smarter: is it faster to write all your data to a file and then sort it, or to batch-insert the data so it is indexed as it arrives, as happens when you store data in a Cassandra CF? I didn't do the calculations to check the complexity of each one, and one should consider that such an index in Cassandra would be really large, as the maximum index size will always depend on the maximum capacity of a single host; but my guess is that a map/reduce tool written specifically for Cassandra, from the beginning, could perform much better than a tool written for HDFS and adapted. I hear people saying Map/Reduce on Cassandra/HBase is usually 30% slower than M/R on HDFS. Does that really make sense? Should we expect a result like this?

Final question: do you think writing a new M/R tool as described would be reinventing the wheel? Or does it make sense?

Thanks in advance. Any opinions on this subject will be much appreciated.

Best regards, Marcelo Valle.
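The complexity comparison in the third question can be made concrete with a toy sketch (not Cassandra code, just the two strategies being contrasted): inserting each item into a sorted structure as it arrives versus writing everything unsorted and sorting once at the end. Both end up O(N log N) in comparisons:

```python
import bisect
import random

def insert_sorted(items):
    """Insert items one at a time into a sorted list, roughly the
    'index on write' strategy: each insert does an O(log n) binary
    search. (A Python list then pays an O(n) shift; a real memtable
    uses a skip list, so its insert stays O(log n).)"""
    table = []
    for item in items:
        bisect.insort(table, item)
    return table

def sort_at_end(items):
    """Write everything unsorted and sort once, roughly what the
    Hadoop shuffle/sort phase does: a single O(N log N) sort."""
    return sorted(items)

random.seed(7)
data = [random.randrange(10**6) for _ in range(1000)]
assert insert_sorted(data) == sort_at_end(data)  # same final order
```

The asymptotics match, so the practical difference is constant factors and IO: sort-at-end streams sequentially through intermediate files, while index-on-write pays per-item structure maintenance but hands the reducer already-sorted data. That trade-off, not big-O, is what the quoted 30% figure would be measuring.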
Re: map reduce for Cassandra
Hey Marcelo,

You should check out Spark. It intelligently deals with a lot of the issues you're mentioning. Al Tobey did a walkthrough of how to set up the OSS side of things here: http://tobert.github.io/post/2014-07-15-installing-cassandra-spark-stack.html

It'll be less work than writing a M/R framework from scratch :)

Jon

-- Jon Haddad http://www.rustyrazorblade.com skype: rustyrazorblade
Re: map reduce for Cassandra
Hi Jonathan,

Do you know if this RDD can be used with Python? AFAIK, Python + Cassandra will be supported just in the next version, but I would like to be wrong...

Best regards, Marcelo Valle.
Re: map reduce for Cassandra
I haven't tried pyspark yet, but it's part of the distribution. My main language is Python too, so I intend on getting deep into it.

On Mon, Jul 21, 2014 at 9:38 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Hi Jonathan, Do you know if this RDD can be used with Python? AFAIK, Python + Cassandra will be supported just in the next version, but I would like to be wrong... Best regards, Marcelo Valle.

-- Jon Haddad http://www.rustyrazorblade.com skype: rustyrazorblade
Re: map reduce for Cassandra
Jonathan,

From what I have read in the docs, the Python API still has some limitations: it is not yet possible to use arbitrary Hadoop binary input formats. The Python example for Cassandra is only in the master branch: https://github.com/apache/spark/blob/master/examples/src/main/python/cassandra_inputformat.py

I may be lacking knowledge of Spark, but if I understood correctly, access to Cassandra data is still made through CqlPagingInputFormat, from the Hadoop integration. Here is where I ask: even if Spark supports Cassandra, will it be fast enough?

My understanding (please someone correct me if I am wrong) is that when you insert N items into a Cassandra CF, you execute N binary searches to insert each item already indexed by a key. When you read the data, it's already sorted. So you take O(N * log(N)) (binary-search complexity) to insert all the data already sorted. However, by using a fast sort algorithm, you also take O(N * log(N)) to sort the data after it was inserted, but then using more IO.

If I write a job in Spark / Java with Cassandra, how will the mapped data be stored and sorted? Will it be stored in Cassandra too? Will Spark run a sort after the mapping?

Best regards, Marcelo.

2014-07-21 14:06 GMT-03:00 Jonathan Haddad j...@jonhaddad.com: I haven't tried pyspark yet, but it's part of the distribution. My main language is Python too, so I intend on getting deep into it.
Authentication exception
I routinely get this exception from cqlsh on one of my clusters: cql.cassandra.ttypes.AuthenticationException: AuthenticationException(why='org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 2 responses.') The system_auth keyspace is set to replicate X times given X nodes in each datacenter, and at the time of the exception all nodes are reporting as online and healthy. After a short period (i.e. 30 minutes), it will let me in again. What could be the cause of this?
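One commonly cited cause, sketched in CQL (assumption: the login in question uses the default cassandra superuser, whose auth reads are performed at QUORUM across system_auth replicas, so a single briefly slow node can time the login out even though everything reports healthy):

```cql
-- Create and use a different user so logins take the cheaper
-- (non-QUORUM) auth read path. Name and password are placeholders.
CREATE USER opsadmin WITH PASSWORD 'choose_a_password' SUPERUSER;

-- If system_auth replication was changed, also run a repair on it
-- afterwards:  nodetool repair system_auth
```

This is a sketch of one hypothesis, not a diagnosis; tracing the timed-out read on a node would confirm which consistency level the auth query is using.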
Re: map reduce for Cassandra
On Mon, Jul 21, 2014 at 10:54 AM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: My understanding (please some correct me if I am wrong) is that when you insert N items in a Cassandra CF, you are executing N binary searches to insert the item already indexed by a key. When you read the data, it's already sorted. So you take O(N * log(N)) (binary search complexity to insert all data already sorted. You're wrong, unless you're talking about insertion into a memtable, which you probably aren't and which probably doesn't actually work that way enough to be meaningful. On disk, Cassandra has immutable datafiles, from which row fragments are merged into a row at read time. I'm pretty sure the rest of the stuff you said doesn't make any sense in light of this? =Rob
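Rob's point about immutable datafiles can be sketched as a minimal last-write-wins merge of row fragments at read time. The column names and timestamps below are made up, and real reconciliation also handles tombstones, TTLs, and counters:

```python
def merge_row(fragments):
    """fragments: list of dicts mapping column name -> (timestamp, value),
    one dict per SSTable/memtable that holds a piece of the row. The read
    path reconciles them cell by cell, keeping the highest timestamp."""
    merged = {}
    for frag in fragments:
        for col, (ts, val) in frag.items():
            if col not in merged or ts > merged[col][0]:
                merged[col] = (ts, val)
    return {col: val for col, (ts, val) in merged.items()}

# Two hypothetical fragments of the same row, from two SSTables:
older = {"name": (100, "alice"), "city": (100, "sf")}
newer = {"city": (200, "nyc")}
row = merge_row([older, newer])
# row == {"name": "alice", "city": "nyc"}
```

This is why the "N binary searches into one big sorted index" mental model doesn't hold: writes are appended to a memtable and flushed to new immutable files, and the sorting/merging cost is paid partly at flush, partly at compaction, and partly at read time.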
Re: TTransportException (java.net.SocketException: Broken pipe)
On Mon, Jul 21, 2014 at 8:07 AM, Bhaskar Singhal bhaskarsing...@yahoo.com wrote: I have not seen the issue after changing the commit log segment size to 1024MB.

Yes... your insanely over-huge commitlog will be contained in fewer files if you increase the size of the segments. That will not make it any less of an insanely over-huge commitlog, which indicates systemic failure in your application's use of Cassandra. Congratulations on masking your actual issue with your configuration change.

Pool Name       Active   Pending   Completed   Blocked  All time blocked
FlushWriter          6        10       12442         1              2940

1/4 of flush attempts blocked waiting for resources, and you have 6 active flushes and 10 pending, because YOU'RE WRITING TOO FAST.

As a meta aside, I am unlikely to respond to further questions of yours which do not engage with what I have now told you three or four times: YOU'RE WRITING TOO FAST.

=Rob
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
On Mon, Jul 21, 2014 at 1:58 AM, Ben Hood 0x6e6...@gmail.com wrote: On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote: Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576. Nice work! Finally we have a proper solution to this issue, so well done to you. For reference, I consider this issue of sufficient severity to recommend against upgrading to any version of 2.0 before 2.0.10, unless you are certain you have no such schema. I'm pretty sure reversed comparator timestamps are a common type of schema, given that there are blog posts recommending their use, so I struggle to understand how this was not detected by unit tests. Does your fix add unit tests which would catch this case on upgrade? =Rob
Re: DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9
I did not include unit tests in my patch. I think many people did not run into this issue because many Cassandra clients handle DateType, when encountered, as a CUSTOM type. -Karl On Jul 21, 2014, at 8:26 PM, Robert Coli rc...@eventbrite.com wrote: On Mon, Jul 21, 2014 at 1:58 AM, Ben Hood 0x6e6...@gmail.com wrote: On Sat, Jul 19, 2014 at 7:35 PM, Karl Rieb karl.r...@gmail.com wrote: Can now be followed at: https://issues.apache.org/jira/browse/CASSANDRA-7576. Nice work! Finally we have a proper solution to this issue, so well done to you. For reference, I consider this issue of sufficient severity to recommend against upgrading to any version of 2.0 before 2.0.10, unless you are certain you have no such schema. I'm pretty sure reversed comparator timestamps are a common type of schema, given that there are blog posts recommending their use, so I struggle to understand how this was not detected by unit tests. Does your fix add unit tests which would catch this case on upgrade? =Rob
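Karl's point about clients masking the bug comes down to a lenient decoding pattern: a driver that maps any unknown protocol type ID to a generic CUSTOM type keeps working even when the server advertises a wrong ID. A minimal sketch of that fallback (the IDs follow the native-protocol option codes, but treat them as illustrative):

```python
# Subset of native-protocol option IDs (0x0000 = CUSTOM, which carries a class name).
KNOWN_TYPE_IDS = {
    0x0000: "custom",
    0x000B: "timestamp",
    0x000D: "varchar",
}

def decode_type(type_id):
    """Fall back to 'custom' for any unrecognized ID -- the lenient behavior
    that hid the mis-assigned DateType ID from many clients."""
    return KNOWN_TYPE_IDS.get(type_id, "custom")

print(decode_type(0x000B))  # timestamp
print(decode_type(0x7FFF))  # custom (unknown ID handled gracefully)
```

A strict client that raised on an unknown ID would have surfaced the regression immediately, which is why only some users hit it.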
Re: map reduce for Cassandra
Hi Robert, First of all, thanks for answering. 2014-07-21 20:18 GMT-03:00 Robert Coli rc...@eventbrite.com: You're wrong, unless you're talking about insertion into a memtable, which you probably aren't and which probably doesn't actually work that way enough to be meaningful. On disk, Cassandra has immutable datafiles, from which row fragments are merged into a row at read time. I'm pretty sure the rest of the stuff you said doesn't make any sense in light of this? Although several sstables (disk fragments) may have the same row key, inside a single sstable row keys and column keys are indexed, right? Otherwise, doing a GET in Cassandra would take some time. From the M/R perspective, I was referring to the memtable, as I am trying to compare the time to insert in Cassandra against the time of sorting in Hadoop. To make it more clear: Hadoop has its own partitioner, which is used after the map phase. The map output is written locally on each Hadoop node, then it's shuffled from one node to the other (see slide 17 in this presentation: http://pt.slideshare.net/j_singh/nosql-and-mapreduce). In other words, you may read Cassandra data in Hadoop, but the intermediate results are still stored in HDFS. Instead of using the Hadoop partitioner, I would like to store the intermediate results in a Cassandra CF, so the map output would go directly to an intermediate column family via batch inserts, instead of being written to a local disk first and then shuffled to the right node. Therefore, the mapper would write its output the same way all data enters Cassandra: first to a memtable, then flushed to an sstable, then read during the reduce phase. Shouldn't it be faster than storing intermediate results in HDFS? Best regards, Marcelo.
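Marcelo's cost comparison (binary-search insert into a sorted memtable vs. an explicit sort phase) can be sketched like this; it is a toy model only, nothing like Cassandra's actual concurrent skip-list memtable:

```python
import bisect
import random

# Toy "memtable": keep writes sorted as they arrive, so a flush just
# streams the structure out in order. Each insert costs O(log N)
# comparisons (the Python list shift is O(N), though; a skip list avoids that).
memtable = []
for _ in range(1000):
    bisect.insort(memtable, random.randrange(10**6))

# Flush order is already sorted -- no separate sort phase needed.
assert memtable == sorted(memtable)
```

N such inserts cost O(N log N) comparisons in total, the same asymptotic bound as sorting the batch up front; the practical difference is in constants, I/O, and where the work happens, which is exactly the trade-off being debated in this thread.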
Re: map reduce for Cassandra
On Mon, Jul 21, 2014 at 5:45 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Although several sstables (disk fragments) may have the same row key, inside a single sstable row keys and column keys are indexed, right? Otherwise, doing a GET in Cassandra would take some time. From the M/R perspective, I was referring to the memtable, as I am trying to compare the time to insert in Cassandra against the time of sorting in Hadoop. I was confused, because unless you are using the new in-memory column families, which I believe are only available in DSE, there is no way to ensure that any given row stays in a memtable. Very rarely is there a view of the function of a memtable that only cares about its properties and not the closely related properties of SSTables. However, yours is one of them; I see now why your question makes sense: you only care about the memtable for how quickly it sorts. But if you are only relying on memtables to sort writes, that seems like a pretty heavyweight reason to use Cassandra? I'm certainly not an expert in this area of Cassandra... but Cassandra, as a datastore with immutable data files, is not typically a good choice for short-lived intermediate result sets... are you planning to use DSE? =Rob
Re: map reduce for Cassandra
Hi, But if you are only relying on memtables to sort writes, that seems like a pretty heavyweight reason to use Cassandra? Actually, it's not a reason to use Cassandra. I already use Cassandra and I need to map reduce data from it. I am trying to see a reason to use the conventional M/R tools or to build a tool specific to Cassandra. but Cassandra, as a datastore with immutable data files, is not typically a good choice for short-lived intermediate result sets... Indeed, but so far I am seeing it as the best option. If storing these intermediate files in HDFS is better, then I agree there is no reason to consider Cassandra for it. are you planning to use DSE? Our company will probably hire DSE support when it reaches some size, but DSE as a product doesn't seem interesting for our case so far. The only tool that would help me at this moment would be Hive, but honestly I didn't like the way DSE supports Hive and I don't want to use a solution not available to DSC (see http://stackoverflow.com/questions/23959169/problems-using-hive-cassandra-community for details). []s 2014-07-21 22:09 GMT-03:00 Robert Coli rc...@eventbrite.com: On Mon, Jul 21, 2014 at 5:45 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Although several sstables (disk fragments) may have the same row key, inside a single sstable row keys and column keys are indexed, right? Otherwise, doing a GET in Cassandra would take some time. From the M/R perspective, I was referring to the memtable, as I am trying to compare the time to insert in Cassandra against the time of sorting in Hadoop. I was confused, because unless you are using the new in-memory column families, which I believe are only available in DSE, there is no way to ensure that any given row stays in a memtable. Very rarely is there a view of the function of a memtable that only cares about its properties and not the closely related properties of SSTables. 
However, yours is one of them; I see now why your question makes sense: you only care about the memtable for how quickly it sorts. But if you are only relying on memtables to sort writes, that seems like a pretty heavyweight reason to use Cassandra? I'm certainly not an expert in this area of Cassandra... but Cassandra, as a datastore with immutable data files, is not typically a good choice for short-lived intermediate result sets... are you planning to use DSE? =Rob
Re: horizontal query scaling issues follow on
So I appreciate all the help so far. Upfront, it is possible the schema and data query pattern could be contributing to the problem. The schema was born out of certain design requirements. If it proves to be part of what makes the scalability crumble, then I hope it will help shape the design requirements. Anyway, the premise of the question was my struggle where scalability metrics fell apart going from 2 nodes to 4 nodes for the current schema and query access pattern being modeled:

- 1 node was producing acceptable response times, which seemed to be the consensus
- 2 nodes showed marked improvement to the response times for the query scenario being modeled, which was welcome news
- 4 nodes showed a decrease in performance, and it was not clear why going from 2 to 4 nodes triggered the decrease

Two more items also contributed to the question:

- cassandra-env.sh, where the comments for HEAP_NEWSIZE state that it assumes a modern 8-core machine for pause times
- a wiki article I had found (and am trying to relocate) where a person set up very small nodes for the developers on that team and talked through all the parameters that had to be changed from the defaults to get good throughput; it sort of implied the defaults maybe were based on a certain-sized VM

That was the main driver for those questions. I agree it does not seem correct to boost the values, let alone so high, to minimize impact in some respects (i.e. not trigger the reads to time out and start over given the retry policy). So the question really was: are the defaults sized with the assumption of a certain minimal VM size (i.e. the comment in cassandra-env.sh)? Does that explain where I am coming from better? My question, despite being naive and ignoring other impacts, still stands: is there a minimal VM size that is more of the sweet spot for Cassandra and the defaults? I get the point that a column family schema, as it relates to the desired queries, can and does impact that answer. 
I guess what bothered me was it didn't impact that answer going from 1 node to 2 nodes, but started showing up going from 2 nodes to 4 nodes. I'm building whatever facts I can to support whether the schema and query pattern scales or not. If it does not, then I am trying to pull information from metrics output by nodetool or log statements in the Cassandra log files to support a case to change the design requirements. Thanks, Diane On Mon, Jul 21, 2014 at 8:15 PM, Robert Coli rc...@eventbrite.com wrote: On Sun, Jul 20, 2014 at 6:12 PM, Diane Griffith dfgriff...@gmail.com wrote: I am running tests again across different numbers of client threads and numbers of nodes, but this time I tweaked some of the timeouts configured for the nodes in the cluster. I was able to get better performance on the nodes at 10 client threads by upping 4 timeout values in cassandra.yaml to 24: If you have to tune these timeout values, you have probably modeled data in such a way that each of your requests is quite large or quite slow. This is usually, but not always, an indicator that you are Doing It Wrong. Massively multithreaded things don't generally like their threads to be long-lived, for what should hopefully be obvious reasons. I did this because of my interpretation of the cfhistograms output on one of the nodes. Could you be more specific? So 3 questions come to mind: 1. Did I interpret the histogram information correctly in the Cassandra 2.0.6 nodetool output? That is, in the 2-column read latency output, the offset or left column is the time in milliseconds and the right column is the number of requests that fell into that bucket range. 2. Was it reasonable for me to boost those 4 timeouts and just those? Not really. In 5 years of operating Cassandra, I've never had a problem whose solution was to increase these timeouts from their default. 3. What are reasonable timeout values for smaller VM sizes (i.e. 8GB RAM, 4 CPUs)? As above, I question the premise of this question. =Rob
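To make the cfhistograms question concrete: each latency row is an (offset, count) bucket pair, and a percentile can be estimated by walking the cumulative counts. A rough sketch (the bucket values below are made up, and you should check your version's docs for whether the latency offsets are micro- or milliseconds):

```python
def approx_percentile(buckets, pct):
    """Estimate a latency percentile from (offset, count) rows as printed
    by `nodetool cfhistograms`; offset is the bucket's upper bound."""
    total = sum(count for _, count in buckets)
    threshold = total * pct
    running = 0
    for offset, count in buckets:
        running += count
        if running >= threshold:
            return offset
    return buckets[-1][0]

# Hypothetical read-latency rows: (offset, number of reads in that bucket).
rows = [(103, 5000), (124, 3000), (179, 1500), (1597, 400), (24601, 100)]
print(approx_percentile(rows, 0.99))  # 1597
```

Looking at a high percentile this way, rather than the raw bucket list, is usually a better basis for deciding whether a timeout is being hit by a few slow outliers or by the bulk of requests.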
Re: Authentication exception
Could you perhaps check your NTP? On Tue, Jul 22, 2014 at 3:35 AM, Jeremy Jongsma jer...@barchart.com wrote: I routinely get this exception from cqlsh on one of my clusters: cql.cassandra.ttypes.AuthenticationException: AuthenticationException(why='org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 2 responses.') The system_auth keyspace is set to replicate X times given X nodes in each datacenter, and at the time of the exception all nodes are reporting as online and healthy. After a short period (i.e. 30 minutes), it will let me in again. What could be the cause of this?