Re: Inconsistent behavior during read
Hi,

If this problem were caused by data inconsistencies, it should be very rare. However, I am seeing it happen very often (almost 50% of the time). Statistically, this should be very unlikely if the number of replication failures is small.

On Thu, Jun 25, 2015 at 11:55 PM, Tyler Hobbs ty...@datastax.com wrote:

On Thu, Jun 25, 2015 at 1:00 PM, Robert Coli rc...@eventbrite.com wrote:

> [1] or read repair set to 100% combined with a full scan of all data... which no one does...

And this is only true if "full scan" means reading every partition individually. Reads of partition ranges (or a "range slice", in old Thrift terms) don't do read repair.

--
Tyler Hobbs
DataStax
http://datastax.com/

--
Aditya Shetty
Lead Engineer
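A minimal cqlsh sketch of the distinction Tyler describes; the keyspace, table, and id column here are hypothetical, not from this thread. read_repair_chance only applies to single-partition reads, so if consistent results are needed regardless of read path, raising the consistency level is the reliable option:

    cqlsh> -- illustrative names; only affects single-partition reads
    cqlsh> ALTER TABLE ks.tbl WITH read_repair_chance = 1.0;
    cqlsh> -- read at QUORUM so the coordinator reconciles replicas on the
    cqlsh> -- read path itself, independent of read repair
    cqlsh> CONSISTENCY QUORUM;
    cqlsh> SELECT * FROM ks.tbl WHERE id = 42;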
Re: Restore Snapshots
Good morning Alain, thank you so much. This is exactly what I needed. In my test I had a node whose data directory had, for whatever reason, become corrupted. I keep my snapshots in a separate folder. Here are the steps I took to recover my sick node:

0) Cassandra is stopped on my sick node.
1) I wiped out my data directory. My snapshots were kept outside this directory.
2) I modified my cassandra.yaml, adding auto_bootstrap: false. This is to make sure that my node does not sync with the others.
3) I restarted Cassandra. This step created a basic structure for my new data directory.
4) I ran the command nodetool resetlocalschema. This recreated all the folders for my column families.
5) I stopped Cassandra on my node.
6) I copied my snapshots to the right location. I actually hard-linked them, which is very fast.
7) I restarted Cassandra.

That's it. Thank you SO MUCH Alain for your support. You really helped me a lot.

On 25 Jun 2015, at 18:37, Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi Jean,

Answers inline, to be sure to be exhaustive:

- how can I restore the data directory structure in order to copy my snapshots to the right position?
--> Make a script to do it and test it, I would say. Basically, under any table directory you have a snapshots/snapshot_name directory (snapshot_name is a timestamp if not specified, off the top of my head) and then your SSTables.

- is it possible to recreate the schema on one node?
--> The easiest way that comes to my mind is to set auto_bootstrap: false on a node not already in the ring. If you have trouble with the schema of a node in the ring, run nodetool resetlocalschema.

- how can I prevent the node from streaming from the other nodes?
--> See above (auto_bootstrap: false). By the way, the option might not be present at all; just add it.

- must I also have a snapshot of the system tables in order to restore a node from only the snapshots of my tables?
--> Just your user tables. Yet remember that a snapshot is per node, and as such you will only have the part of the data this node used to hold, meaning that if the new node has different tokens, there will be unused data plus missing data for sure.

Basically, when a node is down I usually remove it, repair the cluster, and bootstrap a replacement (auto_bootstrap: true). Streams are part of Cassandra; I accept that. Another solution would be to replace the node --> http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html

C*heers,

Alain

2015-06-25 17:07 GMT+02:00 Jean Tremblay jean.tremb...@zen-innovations.com:

Hi,

I am testing snapshot restore procedures in case of a major catastrophe on our cluster. I'm using Cassandra 2.1.7 with RF 3. The scenario that I am trying to solve is how to quickly get one node back to work after its disk failed and lost all its data, assuming that the only thing I have is its snapshots. The procedure that I'm following is the one explained here: http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html

Making a snapshot is straightforward. My problem is in restoring the snapshot. If I restart Cassandra with an empty data directory, the node will bootstrap. Bootstrap is very nice, since it recreates the schema and reloads the data from the node's neighbours, but this causes quite heavy traffic and is quite a slow process.

My questions are:
- how can I restore the data directory structure in order to copy my snapshots to the right position?
- is it possible to recreate the schema on one node?
- how can I prevent the node from streaming from the other nodes?
- must I also have a snapshot of the system tables in order to restore a node from only the snapshots of my tables?

Thanks for your comments.

Jean
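For reference, Jean's steps condensed into a shell sketch. The paths, service commands, and keyspace/table names are placeholders, not taken from the thread; adjust to your install, and note that hard links (ln) only work when the snapshot folder and the data directory are on the same filesystem:

    # step 0: make sure Cassandra is stopped on the sick node
    sudo service cassandra stop
    # step 1: wipe the data directory (snapshots are stored elsewhere)
    rm -rf /var/lib/cassandra/data/*
    # step 2: prevent the node from syncing/streaming on startup
    echo "auto_bootstrap: false" | sudo tee -a /etc/cassandra/cassandra.yaml
    # step 3: first restart recreates the basic data directory structure
    sudo service cassandra start
    # step 4: pull the schema from the ring, recreating the per-table directories
    nodetool resetlocalschema
    # steps 5-6: stop again, then hard-link the snapshot SSTables into place
    sudo service cassandra stop
    ln /backup/snapshots/my_keyspace/my_table/* /var/lib/cassandra/data/my_keyspace/my_table-*/
    # step 7: final restart
    sudo service cassandra start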
Re: Restore Snapshots
Hi Jean,

Glad to hear it worked this way. Some other people provided (and continue to provide) similar help to me; I am just trying to give back to the community as much as I received from it.

See you around,

Alain
Cassandra stuck at DataSink running on cluster
Hi,

I am trying to write into Cassandra via the CqlBulkOutputFormat from an Apache Flink program. The program succeeds in writing to a Cassandra cluster while running locally on my PC. However, when I run the program on the cluster, it seems to get stuck at SSTableSimpleUnsortedWriter.put(), waiting for the DiskWriter thread, which is no longer running. I am using Cassandra version 1.5 and Apache Flink version 0.9.0. Attached is the full stack trace.

Thanks in advance,
Susanne

2015-06-26 11:15:35 Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode):

"JMX server connection timeout 68" - Thread t@68
   java.lang.Thread.State: TIMED_WAITING
        at java.lang.Object.wait(Native Method)
        - waiting on 117d0002 (a [I)
        at com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(ServerCommunicatorAdmin.java:168)
        at java.lang.Thread.run(Thread.java:745)
   Locked ownable synchronizers:
        - None

"RMI TCP Connection(4)-172.16.30.87" - Thread t@67
   java.lang.Thread.State: RUNNABLE
        at sun.management.ThreadImpl.dumpThreads0(Native Method)
        at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:446)
        at sun.reflect.GeneratedMethodAccessor62.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
        at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
        at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
        at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
        at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
        at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
        at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
        at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
        at javax.management.StandardMBean.invoke(StandardMBean.java:405)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
        at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
        at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
        at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
        at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
        at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
        at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
        at sun.rmi.transport.Transport$1.run(Transport.java:177)
        at sun.rmi.transport.Transport$1.run(Transport.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
   Locked ownable synchronizers:
        - locked 4f733919 (a java.util.concurrent.ThreadPoolExecutor$Worker)

"RMI Scheduler(0)" - Thread t@66
   java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for 258b8c46 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
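Not from the thread, but for anyone trying to reproduce this diagnosis on the cluster: the standard JDK tools can capture the same kind of dump from the running Flink task manager. The grep pattern is only a guess at the writer thread's name, purely illustrative:

    # find the Flink task manager PID (name varies by deployment)
    jps -l | grep -i taskmanager
    # dump all threads of that JVM to a file
    jstack <pid> > threads.txt
    # look for the SSTable writer's disk thread and what it is blocked on
    grep -A 20 -i "diskwriter" threads.txt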
Re: Slow reads on C* 2.0.15 using Spark Cassandra
> We notice incredibly slow reads, 600 MB in an hour, using reads at LOCAL_ONE consistency. The load_one of Cassandra increases from 1 to 60! There is no CPU wait, only user and nice time.

Without seeing the code and query it's hard to tell, but I noticed something similar when we had a client incorrectly using the 'take' method for a result count, like so:

val resultCount = query.take(count).length

'take' can call limit under the hood. The docs for the latter are interesting: "The limit will be applied for each created Spark partition. In other words, unless the data are fetched from a single Cassandra partition the number of results is unpredictable." [0]

Removing that line (it wasn't necessary for the use case) and just relying on a simple myRDD.select("my_col").toArray.foreach(...) got performance back to where it should be. Per the docs, limit (and therefore take) works fine as long as the partition key is used as a predicate in the WHERE clause (WHERE test_id = 'somevalue' in your example).

[0] https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector/src/main/scala/com/datastax/spark/connector/rdd/CassandraRDD.scala#L92-L101

--
Nate McCall
Austin, TX
@zznate
Co-Founder, Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
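To see the distinction in plain CQL terms, a cqlsh sketch against a hypothetical schema loosely modeled on the example above (the table definition is an assumption): with the partition key fixed, LIMIT applies within a single partition and the count is predictable; without it, the connector applies the limit per Spark partition and the total becomes unpredictable.

    cqlsh> -- hypothetical schema for illustration
    cqlsh> CREATE TABLE ks.test (test_id text, ts timestamp, event text,
       ...     PRIMARY KEY (test_id, ts));
    cqlsh> -- single-partition read: LIMIT behaves predictably
    cqlsh> SELECT * FROM ks.test WHERE test_id = 'somevalue' LIMIT 10;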
Re: sstableloader Could not retrieve endpoint ranges
I want to follow up on this thread to describe what I was able to get working. My goal was to switch a cluster to vnodes, in the process preserving the data for a single table, endpoints.endpoint_messages. Otherwise, I could afford to start from a clean slate. As should be apparent, I could also afford to do this within a maintenance window where the cluster was down. In other words, I had the luxury of not having to add a new data center to a live cluster per DataStax's documented procedure for enabling vnodes:
http://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configVnodesProduction_t.html
http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configVnodesProduction_t.html

What I got working relies on the nodetool snapshot command to create SSTable snapshots under endpoints/endpoint_messages/snapshots/SNAPSHOT_NAME. The snapshots are what is backed up and restored from; the backup and restore does not work literally against the original SSTables in the various endpoints/endpoint_messages/ directories.

- endpoints/endpoint_messages/snapshots/SNAPSHOT_NAME/: these SSTables are copied off and restored from.
- endpoints/endpoint_messages/: these SSTables are obviously the source of the snapshots but are not copied off or restored from.

Instead of using sstableloader to load the snapshots into the re-initialized Cassandra cluster, I used the JMX StorageService.bulkLoad command after establishing a JConsole session to each node. I copied the snapshots to load into a directory path that ends with endpoints/endpoint_messages/ to give the bulk-loader a path it expects. The directory that is the destination for nodetool snapshot and the source for StorageService.bulkLoad is on the same host as the Cassandra node but outside the purview of the Cassandra node.

This procedure can be summarized as follows:
1. For each node, create a snapshot of the endpoint_messages table as a backup.
2. Stop the cluster.
3. On each node, wipe all the data, i.e. the contents of data_file_directories, commitlog, and saved_caches.
4. Deploy the cassandra.yaml configuration that makes the switch to vnodes and restart the cluster to apply the vnodes change.
5. Re-create the endpoints keyspace.
6. On each node, bulk-load the snapshots for that particular node.

This summary can be reduced even further:
1. On each node, export the data to preserve.
2. On each node, wipe the data.
3. On all nodes, switch to vnodes.
4. On each node, import back the exported data.

I'm sure this process could have been streamlined. One caveat for anyone looking to emulate this: our situation might have been a little easier to reason about because our original endpoint_messages table had a replication factor of 1. We used the vnodes switch as an opportunity to raise the RF to 3.

I can only speculate as to why what I was originally attempting wasn't working. But what I was originally attempting wasn't precisely the use case I care about; what I'm following up with now was.

On Fri, Jun 19, 2015 at 8:22 PM, Mitch Gitman mgit...@gmail.com wrote:

I checked the system.log for the Cassandra node that I did the JConsole JMX session against and which had the data to load. Lots of log output indicating that it's busy loading the files; lots of stack traces indicating a broken pipe. I have no reason to believe there are connectivity issues between the nodes, but verifying that is beyond my expertise.
What's indicative is this last bit of log output:

INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,441 StreamReplyVerbHandler.java (line 44) Successfully sent /srv/cas-snapshot-06-17-2015/endpoints/endpoint_messages/endpoints-endpoint_messages-ic-34-Data.db to /10.205.55.101
INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,457 OutputHandler.java (line 42) Streaming session to /10.205.55.101 failed
ERROR [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,458 CassandraDaemon.java (line 253) Exception in thread Thread[Streaming to /10.205.55.101:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:433)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
        at org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:93)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
        at
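A sketch of the export half of Mitch's procedure in shell form. The snapshot tag, staging path, and data directory are placeholders; the essential detail, as described above, is that the staging directory ends in endpoints/endpoint_messages/ so that the JMX StorageService.bulkLoad operation sees the layout it expects. Check nodetool help snapshot for the exact flags on your version:

    # 1. snapshot just the endpoint_messages table on each node (tag is illustrative)
    nodetool snapshot -t pre-vnodes -cf endpoint_messages endpoints
    # 2. copy the snapshot SSTables to a staging path ending in keyspace/table/
    mkdir -p /srv/cas-snapshot/endpoints/endpoint_messages
    cp /var/lib/cassandra/data/endpoints/endpoint_messages/snapshots/pre-vnodes/* \
       /srv/cas-snapshot/endpoints/endpoint_messages/
    # later, after the vnodes switch and keyspace re-creation, point the JMX
    # operation StorageService.bulkLoad (e.g. via JConsole) at
    # /srv/cas-snapshot/endpoints/endpoint_messages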
Mixing incremental repair with sequential
Dear colleagues,

We are using incremental repair and have noticed that every few repairs, the cluster experiences pauses. We run the repair with the following command:

nodetool repair -par -inc

I have tried to run it non-parallel (without -par), but get the following error: "It is not possible to mix sequential repair and incremental repairs."

Does anyone have any suggestions?

Many thanks in advance,
Carl
Re: Cassandra stuck at DataSink running on cluster
I strongly disagree with recommending version 2.1.x. It only very recently became more or less stable; anything before 2.1.5 was unusable. You might be better off with a recent 2.0.x version.

Best regards,
Nathan
Re: [MASSMAIL]Cassandra stuck at DataSink running on cluster
Hi,

I am using Java 7. The Cassandra version I use is actually 2.1.5, not 1.5 - sorry for the confusion. I also tried Cassandra 2.1.6, but the problem stays the same.

Best regards,
Susanne
Re: Mixing incremental repair with sequential
> It is not possible to mix sequential repair and incremental repairs.

I guess that is a system limitation, even if I am not sure of it (I have not used C* 2.1 yet).

I would focus on tuning your repair by:
- monitoring performance / logs (see why the cluster hangs)
- using range repairs (as a workaround to the Merkle tree 32K limit), or at least running it per table (http://www.datastax.com/dev/blog/advanced-repair-techniques)

Without knowing the root issue that makes your cluster hang, it is hard to help you further:
- If CPU is the limit, then some tuning around compactions or GC might be needed (or a few other things).
- If you have disk I/O limitations, you might want to add machines or tune compaction throughput.
- If your network is the issue, there are commands to tune the bandwidth used by streams.

You need to troubleshoot this and give us more information. I hope you have a monitoring tool up and running and an easy way to detect errors in your logs.

C*heers,

Alain
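A shell sketch of the two workarounds Alain mentions. The token values and throughput number are placeholders, not recommendations; subrange repair via -st/-et and nodetool setcompactionthroughput exist in Cassandra 2.1, but check nodetool help on your version, and note that combining subrange flags with -inc may not be supported:

    # repair one token subrange at a time (placeholder tokens) to keep Merkle trees small
    nodetool repair -par -st -9223372036854775808 -et -4611686018427387904 my_keyspace
    # or run the incremental repair one table at a time
    nodetool repair -par -inc my_keyspace my_table
    # throttle compaction (MB/s) at runtime, without a restart, if disk I/O is the bottleneck
    nodetool setcompactionthroughput 10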
Re: Slow reads on C* 2.0.15 using Spark Cassandra
Thanks for the suggestion, will take a look. Our code looks like this:

val rdd = sc.cassandraTable[EventV0](keyspace, "test")
val transformed = rdd.map { e => EventV1(e.testId, e.ts, e.channel, e.groups, e.event) }
transformed.saveToCassandra(keyspace, "test_v1")

Not sure if this code might translate to limits. The total data in this table is +/- 2 GB on disk; total data for each node is around 290 GB.
Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?
Hi Sid,

I would recommend either c3 or m3 instances for OpsCenter; for the Cassandra nodes it depends on your use case. You can go with either c3 or i2 instances for Cassandra, but I would recommend running performance tests before selecting the instance type. If your use case requires more CPU, I would recommend c3s.

--
Arun
Senior Hadoop/Cassandra Engineer, Cloudwick
Slow reads on C* 2.0.15 using Spark Cassandra
We are using the Spark Cassandra driver, version 1.2.0 (Spark 1.2.1), connecting to a 6-node bare-metal Cassandra cluster (16 GB RAM, Xeon E3-1270 (8 cores), 4x 7.2k SATA disks per node). Spark runs on a separate Mesos cluster.

We are running a transformation job, where we read the complete contents of a table into Spark, do some transformations, and write them back to C*; we are using Spark to do a data migration in C*. Before we execute, the load on Cassandra is very low.

We notice incredibly slow reads, 600 MB in an hour, using reads at LOCAL_ONE consistency. The load_one of Cassandra increases from 1 to 60! There is no CPU wait, only user and nice time.

The table definition and cassandra.yaml: https://gist.github.com/nathan-gs/908a48aed8a0eb3c3183

Anyone any idea?

Thanks,
Nathan
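A few standard commands that can help localize a bottleneck like this while the job runs; nothing here is specific to the connector, and the 5-second iostat interval is arbitrary:

    # on a Cassandra node during the Spark job:
    nodetool tpstats          # look for pending/blocked ReadStage tasks
    nodetool compactionstats  # check whether compactions are competing for disk
    iostat -x 5               # per-disk utilization; high %util on the SATA disks would explain the load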
Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?
It doesn't need to be the same size. It's not part of the cluster.
Re: Mixing incremental repair with sequential
Thank you, Alain, for the response. We're using 2.1 indeed. I've lowered the compaction throughput from 18 to 10 MB/s; will see what happens.

> I hope you have a monitoring tool up and running and an easy way to detect errors in your logs.

We do not have this. What do you use for this?

Thank you,
Carl
Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?
Hello,

I haven’t been able to find any documentation on best practices for this… is it okay to set up OpsCenter on a smaller node than the rest of the cluster? For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1 t2.micro node for OpsCenter?
Re: Mixing incremental repair with sequential
Here is something I wrote some time ago: http://planetcassandra.org/blog/interview/video-advertising-platform-teads-chose-cassandra-spm-and-opscenter-to-monitor-a-personalized-ad-experience/

Monitoring is absolutely necessary to understand what is happening in the system. There is no magic there: if you find bottlenecks, you can think about how to alleviate them. I would say it matters at least as much as the design of your data models.

> I've lowered the compaction throughput from 18 to 10 MB/s. Will see what happens.

If you have no SSDs and compactions are creating a bottleneck at the disk, this looks reasonable as long as the compactions-pending metric remains low enough. If it is a CPU issue and you have many cores, I would advise you to try lowering the concurrent_compactors number (by default, one compactor per core). Once again, it will depend on where the pressure is.

Anyway, you might want to try anything on one node only first. Also, change one option at a time (or a couple that you believe have a synergy), and monitor the evolution.

C*heers,

Alain
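A small shell sketch for keeping an eye on the "compactions pending" metric Alain mentions; both commands are standard nodetool, and the 30-second watch interval is arbitrary:

    # pending compactions should stay low after lowering the throughput cap
    watch -n 30 nodetool compactionstats
    # tpstats also shows pending/blocked stages if the pressure moves elsewhere
    nodetool tpstats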
Re: Mixing incremental repair with sequential
Alain,

The reduction of compaction throughput is having a significant impact in lowering response time for us, especially at the 90th percentile. For the record, we are using AWS's i2.2xlarge instance types (these have SSDs). We were running compaction_throughput_mb_per_sec at 18; now we are running at 10. Latency variation for reads is hugely reduced. This is very promising. Thanks, Alain.

Best,
Carl
Re: Is it okay to use a small t2.micro instance for OpsCenter and use m3.medium instances for the actual Cassandra nodes?
On Fri, Jun 26, 2015 at 1:20 PM, Sid Tantia sid.tan...@baseboxsoftware.com wrote:

> For instance, on AWS can I have 3 m3.medium nodes for Cassandra and 1 t2.micro node for OpsCenter?

m3.medium is below the minimum size I would use for Cassandra doing anything meaningful, for the record.

=Rob
Re: [MASSMAIL]Cassandra stuck at DataSink running on cluster
Regards, Susanne. Which version of Java are you using here? Have you tested this with more recent versions of Cassandra? These newer versions have a lot of improvements related to SSTable reading and writing, and much more. I recommend using at least a 2.1.x version.

Best,
--
Marcos Ortiz (http://about.me/marcosortiz), Sr. Product Manager (Data Infrastructure) at UCI
@marcosluis2186 (http://twitter.com/marcosluis2186)