Re: Corrupted sstables
maybe print out the value into the logfile, and that should lead to some clue where the problem might be?

On Tue, May 7, 2019 at 4:58 PM Paul Chandler wrote:
> Roy, we spent a long time trying to fix it but didn't find a solution. It was a test cluster, so we ended up rebuilding the cluster rather than spending any more time trying to fix the corruption. We worked out what had caused it, so we were happy it wasn't going to occur in production. Sorry that is not much help, but I am not even sure it is the same issue you have.
>
> Paul
>
> On 7 May 2019, at 07:14, Roy Burstein wrote:
> > I can say that it happens now as well; currently no node has been added/removed. The corrupted sstables are usually the index files, and on some machines the sstable does not even exist on the filesystem. On one machine I was able to dump the sstable to a dump file without any issue. Any idea how to tackle this issue?
> >
> > On Tue, May 7, 2019 at 12:32 AM Paul Chandler wrote:
> >> Roy,
> >>
> >> I have seen this exception before when a column had been dropped and then re-added with the same name but a different type. In particular, we dropped a column and re-created it as static, then got this exception from the old sstables created prior to the DDL change.
> >>
> >> Not sure if this applies in your case.
> >>
> >> Thanks
> >>
> >> Paul
> >>
> >> On 6 May 2019, at 21:52, Nitan Kainth wrote:
> >> Can the disk have bad sectors? fsck or something similar can help.
> >>
> >> Long shot: a repair or some other operation conflicting. Would leave that to others.
> >>
> >> On Mon, May 6, 2019 at 3:50 PM Roy Burstein wrote:
> >>> It happens on the same column families and they have the same DDL (as already posted). I did not check it after cleanup.
> >>>
> >>> On Mon, May 6, 2019, 23:43 Nitan Kainth wrote:
> >>> This is strange, never saw this. Does it happen to the same column family? Does it happen after cleanup?
> >>>
> >>> On Mon, May 6, 2019 at 3:41 PM Roy Burstein wrote:
> >>> > Yes.
> On Mon, May 6, 2019, 23:23 Nitan Kainth wrote:
> > Roy,
> >
> > You mean all nodes show corruption when you add a node to the cluster?
> >
> > Regards,
> > Nitan
> > Cell: 510 449 9629
> >
> > On May 6, 2019, at 2:48 PM, Roy Burstein wrote:
> > It has happened on all the servers in the cluster every time I have added a node. This is a new cluster, nothing was upgraded here; we have a similar cluster running on C* 2.1.15 with no issues. We are aware of the scrub utility, it is just that this reproduces every time we add a node to the cluster.
> >
> > We have many tables there.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
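The triage loop in this thread (work out which sstables the corruption errors point at, then decide between nodetool scrub, offline sstablescrub, or rebuilding the node) is easy to script. A minimal sketch, assuming a system.log whose ERROR lines end with the sstable path; the log excerpt and paths below are made up for illustration:

```python
import re
from collections import Counter

# Made-up log excerpt; in practice read your node's system.log instead.
SAMPLE_LOG = """\
ERROR [ReadStage-2] 2019-05-07 07:10:01,123 - org.apache.cassandra.io.sstable.CorruptSSTableException: /data/ks/t1/md-31-big-Index.db
ERROR [ReadStage-5] 2019-05-07 07:11:44,456 - org.apache.cassandra.io.sstable.CorruptSSTableException: /data/ks/t1/md-31-big-Index.db
ERROR [ReadStage-1] 2019-05-07 07:12:09,789 - org.apache.cassandra.io.sstable.CorruptSSTableException: /data/ks/t2/md-17-big-Data.db
"""

PATTERN = re.compile(r"CorruptSSTableException: (\S+)")

def parse_corruption(log_text: str) -> Counter:
    """Return a Counter mapping sstable path -> number of corruption errors."""
    return Counter(m.group(1) for m in PATTERN.finditer(log_text))

if __name__ == "__main__":
    # Most frequently reported sstables first.
    for path, n in parse_corruption(SAMPLE_LOG).most_common():
        print(f"{n:3d}  {path}")
```

Tallying per file like this also makes Roy's observation checkable: whether the errors cluster on index files, and whether the reported paths actually exist on disk.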
Re: JOB | Permanent Java Development Manager (Montreal, Canada)
Hi Eric, I agree with you.

On Wed, Aug 1, 2018 at 11:15 PM, Eric Evans wrote:
> On Tue, Jul 31, 2018 at 11:42 PM James Tobin wrote:
> > Hello, I'm working with an employer that is looking to hire (for their Montreal office) a permanent development manager with extensive hands-on Java coding experience. Consequently I had hoped that some members of this mailing list might like to discuss further with me off-list. I can be reached using "JamesBTobin (at) Gmail (dot) Com". Kind regards, James
>
> I don't think this is appropriate. An employer looking for Cassandra experience seems OK, but this just looks like recruiter spam to me.
>
> I'll let others on the list chime in if they disagree, but I'm inclined to unsubscribe/ban anyone making posts like this.
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
Re: Corrupt SSTABLE over and over
Is cassandra running on a virtual server (vmware)?

> I tried sstablescrub but it crashed with hs-err-pid-...

Maybe try with a larger heap allocated to sstablescrub.

I ran into this sstable corruption as well (on cassandra 1.2). First I tried nodetool scrub: it still persisted. Then offline sstablescrub: still persisted. I wiped the node and it happened again. Then I changed the hardware (disk and mem) and things went good.

hth

jason

On Fri, Aug 12, 2016 at 9:20 AM, Alaa Zubaidi (PDF) wrote:
> Hi,
>
> I have a 16 node cluster, Cassandra 2.2.1 on Windows, local installation (NOT on the cloud), and I am getting:
>
> ERROR [CompactionExecutor:2] 2016-08-12 06:51:52,983 CassandraDaemon.java:183 - Exception in thread Thread[CompactionExecutor:2,1,main]
> org.apache.cassandra.io.FSReadError: org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.io.compress.CorruptBlockException: (E:\\la-4886-big-Data.db): corruption detected, chunk at 4969092 of length 10208.
> at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:357) ~[apache-cassandra-2.2.1.jar:2.2.1]
>
> ERROR [CompactionExecutor:2] ... FileUtils.java:463 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
>
> I tried sstablescrub but it crashed with hs-err-pid-...
> I removed the corrupted file and started the node again; after one day the corruption came back. I removed the files and restarted Cassandra, and it worked for a few days. Then I ran "nodetool repair"; after it finished, Cassandra failed again, but with commitlog corruption. After removing the commitlog files, it failed again with another sstable corruption.
>
> I was also checking the HW, file system, and memory; the VMware logs showed no HW error, and the HW management logs showed no problems or issues.
> I also checked the Windows logs (Application and System); the only thing I found in the system logs is "Cassandra Service terminated with service-specific error Cannot create another system semaphore."
>
> I could not find anything regarding that error; all comments point to the application log.
>
> Any help is appreciated..
>
> --
>
> Alaa Zubaidi
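The disk failure policy "stop" mentioned in the error above is the `disk_failure_policy` knob in cassandra.yaml. A reminder of what the values trade off, roughly following the 2.2-era file (check your own cassandra.yaml for the exact wording available in your version):

```yaml
# cassandra.yaml (excerpt) -- policy for handling file system errors.
# "stop" is why the node above exited: on an FS error Cassandra shuts
# down gossip and client transports, leaving the node effectively dead.
# Other values shift the trade-off:
#   die           - shut down gossip/transports and kill the JVM
#   stop_paranoid - like stop, but also triggered by single-sstable errors
#   best_effort   - stop using the failed disk, answer from what remains
#   ignore        - ignore fatal errors and let requests fail
disk_failure_policy: stop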
Re: too many full gc in one node of the cluster
I used to manage/develop for cassandra 1.0.8 for quite some time. Although 1.0 was rock stable, we encountered various problems as load per node grew beyond 500GB. Upgrading is one solution, though it may not be the solution for you, but I strongly recommend you upgrade to 1.1 or 1.2. We upgraded java on the cassandra nodes and cassandra to 1.1, and a lot of problems went away.

As for your use cases, a quick solution would probably be to just add nodes, or study the client reading pattern so you are not hammering a hot row on one node (the hash of the key), or the client configuration in your application and/or the keyspace replication.

hth,

jason

On Fri, Nov 13, 2015 at 2:35 PM, Shuo Chen wrote:
> Hi,
>
> We have a small cassandra cluster with 4 nodes for production. All the nodes have similar hardware configuration and similar data load. The C* version is 1.0.7 (pretty old).
>
> One of the nodes has much higher cpu usage than the others and a high full gc frequency, but the io of this node is not high and its data load is even lower. So I have several questions:
>
> 1. Is it normal that one node has much higher full gc with the same jvm configuration?
> 2. Does this node need special gc tuning, and how?
> 3. How do I find the cause of the full gc?
>
> Thank you guys!
>
> The heap size is 8G and max heap size is 16G.
> The gc config of cassandra-env.sh is default:
>
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
>
> ---------------------------------------------------------
> I print instances in the gc log:
>
>  num   #instances     #bytes  class name
> ---------------------------------------------------------
>    1:    2982796   238731200  [B
>    2:    3889672   186704256  java.nio.HeapByteBuffer
>    3:    1749589    55986848  org.apache.cassandra.db.Column
>    4:    1803900    43293600  java.util.concurrent.ConcurrentSkipListMap$Node
>    5:     859496    20627904  java.util.concurrent.ConcurrentSkipListMap$Index
>    6:       5568    18827912  [J
>    7:     162630     6505200  java.math.BigInteger
>    8:     167572     5716976  [I
>    9:     141698     4534336  java.util.concurrent.ConcurrentHashMap$HashEntry
>   10:     141505     4528160  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
>   11:      31491     4376976
>   12:      31491     4291992
>   13:     171695     4120680  org.apache.cassandra.db.DecoratedKey
>   14:       3157     3436120
>   15:     141784     3402816  java.lang.Long
>   16:     141624     3398976  org.apache.cassandra.utils.Pair
>   17:     141505     3396120  com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$WeightedValue
>   18:      49604     2675352
>   19:     162254     2596064  org.apache.cassandra.dht.BigIntegerToken
> ..
> Total   13337798   641834360
>
> ---------------------------------------------------------
> The gc part and thread status of the system log:
>
> INFO [ScheduledTasks:1] 2015-11-13 14:22:08,681 GCInspector.java (line 123) GC for ParNew: 1015 ms for 2 collections, 3886753520 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:09,683 GCInspector.java (line 123) GC for ParNew: 500 ms for 1 collections, 4956287408 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:10,685 GCInspector.java (line 123) GC for ParNew: 627 ms for 1 collections, 5615882296 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:12,015 GCInspector.java (line 123) GC for ParNew: 988 ms for 2 collections, 4943363480 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:13,016 GCInspector.java (line 123) GC for ParNew: 373 ms for 1 collections, 5978572832 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:14,020 GCInspector.java (line 123) GC for ParNew: 486 ms for 1 collections, 6209638280 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:15,412 GCInspector.java (line 123) GC for ParNew: 898 ms for 2 collections, 6045603728 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:16,413 GCInspector.java (line 123) GC for ParNew: 503 ms for 1 collections, 6991263984 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:17,416 GCInspector.java (line 123) GC for ParNew: 746 ms for 1 collections, 7073467384 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:33,363 GCInspector.java (line 123) GC for ConcurrentMarkSweep: 843 ms for 2 collections, 1130423160 used; max is 8231321600
> INFO [ScheduledTasks:1] 2015-11-13 14:22:33,364 MessagingService.java (line 603) 4198 READ messages dropped
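The GCInspector lines carry enough data to answer question 3 quantitatively: each one gives pause time plus heap used vs. max, so you can see the heap climbing toward the 8G ceiling between ParNew pauses. A small sketch that turns those lines into numbers; the regex assumes the 1.0-era log format shown above and will need adjusting for other versions:

```python
import re

# Parse GCInspector lines like those in the thread and report pause time
# and heap occupancy at the moment of the collection.
LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<n>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def summarize(log_text):
    """Yield (collector, pause_ms, heap_used_fraction) per GC log line."""
    for m in LINE.finditer(log_text):
        yield m["collector"], int(m["ms"]), int(m["used"]) / int(m["max"])

sample = (
    "INFO [ScheduledTasks:1] 2015-11-13 14:22:17,416 GCInspector.java (line 123) "
    "GC for ParNew: 746 ms for 1 collections, 7073467384 used; max is 8231321600"
)

if __name__ == "__main__":
    for collector, ms, frac in summarize(sample):
        print(f"{collector}: {ms} ms pause, heap {frac:.0%} full")
```

Run over the excerpt above, this shows the heap at roughly 85% of max right before the ConcurrentMarkSweep kicks in, which is consistent with CMSInitiatingOccupancyFraction=75 plus ongoing allocation.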
Re: High CPU load
Just a guess: gc?

On Mon, Jul 20, 2015 at 3:15 PM, Marcin Pietraszek mpietras...@opera.com wrote:

Hello!

I've noticed strange CPU utilisation patterns on machines in our cluster. After a C* daemon restart it behaves in a normal way; a few weeks after the restart, CPU usage starts to rise. Currently on one of the nodes (screenshots attached) cpu load is ~4. Shortly before a restart, load rises to ~15 (our cassandra machines have 16 cpus).

In that cluster we're using bulkloading from a hadoop cluster with 1400 reducers (200 parallel bulkloading tasks). After such a session of heavy bulkloading the number of pending compactions is quite high, but the node is able to clear them before the next bulkloading session. We're also tracking the number of pending compactions, and during most of the time it's 0.

On our machines we do have a few gigs of free memory, ~7GB (17GB used), and it also seems we aren't IO bound.

Screenshots from our zabbix with CPU utilisation graphs:

http://i60.tinypic.com/xas8q8.jpg
http://i58.tinypic.com/24pifcy.jpg

Do you guys know what could be causing such high load?

--
mp
Re: Experiencing Timeouts on one node
3. How do we rebuild the System keyspace?

Wipe this node and start it all over.

hth

jason

On Tue, Jul 7, 2015 at 12:16 AM, Shashi Yachavaram shashi...@gmail.com wrote:

When we reboot the problematic node, we see the following errors in system.log.

1. Does this mean the hints column family is corrupted?
2. Can we scrub the system column family on the problematic node and its replication partners?
3. How do we rebuild the System keyspace?

==
ERROR [CompactionExecutor:950] 2015-06-27 20:11:44,595 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:950,1,main]
java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
ERROR [HintedHandoff:552] 2015-06-27 20:11:44,595 CassandraDaemon.java (line 191) Exception in thread Thread[HintedHandoff:552,1,main]
java.lang.RuntimeException:
java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:436)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
    at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
    at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:502)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:432)
    ...
6 more
Caused by: java.lang.AssertionError: originally calculated column size of 8684 but now it is 15725
    at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
==

On Wed, Jul 1, 2015 at 11:59 AM, Shashi Yachavaram shashi...@gmail.com wrote:

We have a 28 node cluster, out of which only one node is experiencing timeouts. We thought it was the raid, but there are two other nodes on the same raid without any problem. Also, the problem goes away if we reboot the node, and then reappears after seven days.

The following hinted hand-off timeouts are seen on the node experiencing the timeouts. We did not notice any gossip errors. I was wondering if anyone has seen this issue and how they resolved it.

Cassandra Version: 1.2.15.1
OS: Linux cm 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19
Re: Experiencing Timeouts on one node
You should check the network connectivity for this node and also its system load average.

Is that a typo or literally what it is: cassandra 1.2.15.*1* and java 6 update *85*?

On Thu, Jul 2, 2015 at 12:59 AM, Shashi Yachavaram shashi...@gmail.com wrote:

We have a 28 node cluster, out of which only one node is experiencing timeouts. We thought it was the raid, but there are two other nodes on the same raid without any problem. Also, the problem goes away if we reboot the node, and then reappears after seven days.

The following hinted hand-off timeouts are seen on the node experiencing the timeouts. We did not notice any gossip errors. I was wondering if anyone has seen this issue and how they resolved it.

Cassandra Version: 1.2.15.1
OS: Linux cm 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_85

INFO [HintedHandoff:2] 2015-06-17 22:52:08,130 HintedHandOffManager.java (line 296) Started hinted handoff for host: 4fe86051-6bca-4c28-b09c-1b0f073c1588 with IP: /192.168.1.122
INFO [HintedHandoff:1] 2015-06-17 22:52:08,131 HintedHandOffManager.java (line 296) Started hinted handoff for host: bbf0878b-b405-4518-b649-f6cf7c9a6550 with IP: /192.168.1.119
INFO [HintedHandoff:2] 2015-06-17 22:52:17,634 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.122; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:17,635 HintedHandOffManager.java (line 296) Started hinted handoff for host: f7b7ab10-4d42-4f0c-af92-2934a075bee3 with IP: /192.168.1.108
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.119; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:17,643 HintedHandOffManager.java (line 296) Started hinted handoff for host: ddb79f35-3e2b-4be8-84d8-7942086e2b73 with IP: /192.168.1.104
INFO [HintedHandoff:2] 2015-06-17 22:52:27,143 HintedHandOffManager.java (line 422) Timed out replaying hints to
/192.168.1.108; aborting (0 delivered)
INFO [HintedHandoff:2] 2015-06-17 22:52:27,144 HintedHandOffManager.java (line 296) Started hinted handoff for host: 6a2fa431-4a51-44cb-af19-1991c960e075 with IP: /192.168.1.117
INFO [HintedHandoff:1] 2015-06-17 22:52:27,153 HintedHandOffManager.java (line 422) Timed out replaying hints to /192.168.1.104; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-06-17 22:52:27,154 HintedHandOffManager.java (line 296) Started hinted handoff for host: cf03174a-533c-44d6-a679-e70090ad2bc5 with IP: /192.168.1.107

Thanks
-shashi..
Re: Cassandra leap second
Same here too: on branch 1.1 and have not seen any high cpu usage.

On Wed, Jul 1, 2015 at 2:52 PM, John Wong gokoproj...@gmail.com wrote:

Which version are you running and what's your kernel version? We are still running on the 1.2 branch but we have not seen any high cpu usage yet...

On Tue, Jun 30, 2015 at 11:10 PM, snair123 . nair...@outlook.com wrote:

reboot of the machine worked

--------------------------------------------
From: nair...@outlook.com
To: user@cassandra.apache.org
Subject: Cassandra leap second
Date: Wed, 1 Jul 2015 02:54:53 +

Is it ok to run this https://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/

Seeing high cpu consumption for cassandra process

--
Sent from Jeff Dean's printf() mobile console
Re: Error while adding a new node.
nodetool cfstats?

On Wed, Jul 1, 2015 at 8:08 PM, Neha Trivedi nehajtriv...@gmail.com wrote:

Hey..
nodetool compactionstats
pending tasks: 0
No pending tasks.

Don't have opscenter. How do I monitor sstables?

On Wed, Jul 1, 2015 at 4:28 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

You also might want to check if you have compactions pending (Opscenter / nodetool compactionstats). Also, you can monitor the number of sstables.

C*heers
Alain

2015-07-01 11:53 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com:

Thanks, I will check it out. I increased the ulimit to 10, but I am still getting the same error, just after a while.

regards
Neha

On Wed, Jul 1, 2015 at 2:22 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

Just check the process owner to be sure (top, htop, ps, ...)

http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html#reference_ds_sxl_gf3_2k__user-resource-limits

C*heers,
Alain

2015-07-01 7:33 GMT+02:00 Neha Trivedi nehajtriv...@gmail.com:

Arun,
I am logging on to the server as root and running (sudo service cassandra start)

regards
Neha

On Wed, Jul 1, 2015 at 11:00 AM, Neha Trivedi nehajtriv...@gmail.com wrote:

Thanks Arun! I will try and get back!

On Wed, Jul 1, 2015 at 10:32 AM, Arun arunsi...@gmail.com wrote:

Looks like you have a too-many-open-files issue. Increase the ulimit for the user. If you are starting the cassandra daemon as user cassandra, increase the ulimit for that user.

On Jun 30, 2015, at 21:16, Neha Trivedi nehajtriv...@gmail.com wrote:

Hello,
I have a 4 node cluster with SimpleSnitch.
Cassandra: Cassandra 2.1.3

I am trying to add a new node (cassandra 2.1.7) and I get the following error:

ERROR [STREAM-IN-] 2015-06-30 05:13:48,516 JVMStabilityInspector.java:94 - JVM state determined to be unstable.
Exiting forcefully due to: java.io.FileNotFoundException: /var/lib/cassandra/data/-Index.db (Too many open files)

I increased the MAX_HEAP_SIZE, then I get:

ERROR [CompactionExecutor:9] 2015-06-30 23:31:44,792 CassandraDaemon.java:223 - Exception in thread Thread[CompactionExecutor:9,1,main]
java.lang.RuntimeException: java.io.FileNotFoundException: /var/lib/cassandra/data/-Data.db (Too many open files)
    at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:52) ~[apache-cassandra-2.1.7.jar:2.1.7]

Is it because of the different versions of Cassandra (2.1.3 and 2.1.7)?

regards
N
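For reference, the fix Arun describes is usually applied in the PAM limits file rather than an interactive `ulimit` call, so it survives reboots and applies to the service user. A sketch of the relevant lines, assuming the daemon runs as user `cassandra`; the values are illustrative defaults along the lines of the DataStax recommended-settings page linked earlier in the thread, so check that page for your version:

```conf
# /etc/security/limits.conf (excerpt) -- raise per-user resource limits
# for the cassandra service account. Illustrative values; verify against
# the recommended-settings documentation for your Cassandra version.
cassandra - memlock unlimited
cassandra - nofile  100000
cassandra - nproc   32768
```

After editing, the effective limit can be checked from the running process (e.g. `cat /proc/<pid>/limits`), which also confirms which user the daemon actually runs as, the point Alain raises above.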
Re: After Restart Nodes had lost data
On the node 192.168.2.100, did you run repair after its status was UN?

On Wed, Jun 24, 2015 at 2:46 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:

Dear Alain,

Thank you for your reply.

Ok, yes, I did not drain. The cluster was loaded with tons of records, and no new records had been added for a few weeks. Each node had a load of about 160 GB as seen in "nodetool status". I killed the cassandra daemon and restarted it. After cassandra was restarted, I could see in "nodetool status" a load of 5 GB!!

I don't use counters. I use RF 3 on 5 nodes. I did not change the replication factor. I have two types of read queries: one uses QUORUM for read, and the other uses ONE for consistency level. I did not change the topology.

"Are you sure this node had data before you restarted it?"

Actually the full story is:

- I stopped node0 (192.168.2.100), and I restarted it.
- I stopped node1 (192.168.2.101).
- I did a nodetool status and noticed that node0 was UN and had a load of 5 GB. I found this really weird because all the other ones had about 160 GB. I also saw that node1 was DN with a load of about 160 GB.
- I restarted node1.
- I did a nodetool status and noticed that node1 was UN and also had a load of 5 GB; it previously had a load of about 160 GB. That I'm sure of.
- Then my program could no longer query C*. Neither the QUORUM nor the ONE consistency level statements could read data.

"What does a nodetool status mykeyspace output?"

I cannot try this anymore. I flushed the whole cluster, and I am currently reloading everything. I was too much in a hurry. I have a demo tomorrow, and I will manage to have it back before tomorrow. After my bad decision of flushing the cluster, I realised that I could have bootstrapped my two nodes again. Learning by doing.

"'It's like the whole cluster is paralysed' -- what does that mean? Please be more precise on this."
"You should tell us the actions that were taken before this occurred, and what is not working now, since a C* cluster in this state could perfectly well keep running. No SPOF."

What did I do before? Well, this cluster was basically idling. I was only making lots of selects on it. It had been loaded for a few weeks. But what I noticed when I restarted node0 is the following:

INFO [InternalResponseStage:1] 2015-06-23 11:45:32,723 ColumnFamilyStore.java:882 - Enqueuing flush of schema_columnfamilies: 131587 (0%) on-heap, 0 (0%) off-heap
INFO [MemtableFlushWriter:2] 2015-06-23 11:45:32,723 Memtable.java:346 - Writing Memtable-schema_columnfamilies@917967643(34850 serialized bytes, 585 ops, 0%/0% of on/off-heap limit)
WARN [GossipTasks:1] 2015-06-23 11:45:33,459 FailureDetector.java:251 - Not marking nodes down due to local pause of 25509152054 50
INFO [MemtableFlushWriter:1] 2015-06-23 11:45:33,982 Memtable.java:385 - Completed flushing /home/maia/apache-cassandra-DATA/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-11-Data.db (5274 bytes) for commitlog position ReplayPosition(segmentId=1435052707645, position=144120)
INFO [GossipStage:1] 2015-06-23 11:45:33,985 StorageService.java:1642 - Node /192.168.2.101 state jump to normal
INFO [GossipStage:1] 2015-06-23 11:45:33,991 Gossiper.java:987 - Node /192.168.2.102 has restarted, now UP
INFO [SharedPool-Worker-1] 2015-06-23 11:45:33,992 Gossiper.java:954 - InetAddress /192.168.2.102 is now UP
INFO [HANDSHAKE-/192.168.2.102] 2015-06-23 11:45:33,993 OutboundTcpConnection.java:485 - Handshaking version with /192.168.2.102
INFO [GossipStage:1] 2015-06-23 11:45:33,993 StorageService.java:1642 - Node /192.168.2.102 state jump to normal
INFO [GossipStage:1] 2015-06-23 11:45:33,999 Gossiper.java:987 - Node /192.168.2.103 has restarted, now UP
INFO [SharedPool-Worker-1] 2015-06-23 11:45:33,999 Gossiper.java:954 - InetAddress /192.168.2.103 is now UP
INFO [GossipStage:1] 2015-06-23 11:45:34,001 StorageService.java:1642 - Node /192.168.2.103
state jump to normal
INFO [HANDSHAKE-/192.168.2.103] 2015-06-23 11:45:34,020 OutboundTcpConnection.java:485 - Handshaking version with /192.168.2.103
INFO [main] 2015-06-23 11:45:34,021 StorageService.java:1642 - Node zennode0/192.168.2.100 state jump to normal
INFO [GossipStage:1] 2015-06-23 11:45:34,028 StorageService.java:1642 - Node /192.168.2.104 state jump to normal
INFO [main] 2015-06-23 11:45:34,038 CassandraDaemon.java:583 - Waiting for gossip to settle before accepting client requests...
INFO [GossipStage:1] 2015-06-23 11:45:34,039 StorageService.java:1642 - Node /192.168.2.101 state jump to normal
INFO [GossipStage:1] 2015-06-23 11:45:34,047 StorageService.java:1642 - Node /192.168.2.103 state jump to normal
INFO [GossipStage:1] 2015-06-23 11:45:34,055 StorageService.java:1642 - Node /192.168.2.102 state jump to normal
INFO [CompactionExecutor:1] 2015-06-23 11:45:34,062 CompactionTask.java:270 - Compacted 1 sstables to
Re: system-hints compaction all the time
What's your question?

On Mon, Jun 22, 2015 at 12:05 AM, 曹志富 cao.zh...@gmail.com wrote:

The log looks like this:

INFO [CompactionExecutor:501] 2015-06-21 21:42:36,306 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/ant/apache-cassandra-2.1.6/bin/../data/data/system/hints/system-hints-ka-365-Data.db')]
INFO [CompactionExecutor:501] 2015-06-21 21:42:37,782 CompactionTask.java:270 - Compacted 1 sstables to [bin/../data/data/system/hints/system-hints-ka-366,]. 18,710,207 bytes to 18,710,207 (~100% of original) in 1,476ms = 12.089054MB/s. 11 total partitions merged to 11. Partition merge counts were {1:11, }
INFO [CompactionExecutor:502] 2015-06-21 21:52:37,784 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/ant/apache-cassandra-2.1.6/bin/../data/data/system/hints/system-hints-ka-366-Data.db')]
INFO [CompactionExecutor:502] 2015-06-21 21:52:39,223 CompactionTask.java:270 - Compacted 1 sstables to [bin/../data/data/system/hints/system-hints-ka-367,]. 18,710,207 bytes to 18,710,207 (~100% of original) in 1,438ms = 12.408515MB/s. 11 total partitions merged to 11. Partition merge counts were {1:11, }
INFO [CompactionExecutor:503] 2015-06-21 22:02:39,224 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/ant/apache-cassandra-2.1.6/bin/../data/data/system/hints/system-hints-ka-367-Data.db')]
INFO [CompactionExecutor:503] 2015-06-21 22:02:40,742 CompactionTask.java:270 - Compacted 1 sstables to [bin/../data/data/system/hints/system-hints-ka-368,]. 18,710,207 bytes to 18,710,207 (~100% of original) in 1,517ms = 11.762323MB/s. 11 total partitions merged to 11.
Partition merge counts were {1:11, }
INFO [CompactionExecutor:504] 2015-06-21 22:12:40,743 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/ant/apache-cassandra-2.1.6/bin/../data/data/system/hints/system-hints-ka-368-Data.db')]
INFO [CompactionExecutor:504] 2015-06-21 22:12:42,262 CompactionTask.java:270 - Compacted 1 sstables to [bin/../data/data/system/hints/system-hints-ka-369,]. 18,710,207 bytes to 18,710,207 (~100% of original) in 1,518ms = 11.754574MB/s. 11 total partitions merged to 11. Partition merge counts were {1:11, }
INFO [CompactionExecutor:505] 2015-06-21 22:22:42,264 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/ant/apache-cassandra-2.1.6/bin/../data/data/system/hints/system-hints-ka-369-Data.db')]
INFO [CompactionExecutor:505] 2015-06-21 22:22:43,750 CompactionTask.java:270 - Compacted 1 sstables to [bin/../data/data/system/hints/system-hints-ka-370,]. 18,710,207 bytes to 18,710,207 (~100% of original) in 1,486ms = 12.007701MB/s. 11 total partitions merged to 11. Partition merge counts were {1:11, }

C* 2.1.6

--
Ranger Tsao
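What the log above shows, numerically, is the same 18,710,207-byte hints sstable being rewritten to exactly the same size every ten minutes, i.e. compaction churning without reclaiming anything. The MB/s figures in those lines can be reproduced with simple arithmetic, which is a useful sanity check when reading compaction logs:

```python
# Reproduce the throughput figure a CompactionTask log line reports:
# bytes compacted divided by elapsed time, with 1 MB = 1024*1024 bytes.

def compaction_rate_mb_s(bytes_compacted: int, elapsed_ms: int) -> float:
    """Throughput in MB/s as reported by the 'Compacted 1 sstables' line."""
    return bytes_compacted / (elapsed_ms / 1000) / (1024 * 1024)

if __name__ == "__main__":
    # Figures from the first "Compacted 1 sstables" line above:
    # 18,710,207 bytes in 1,476 ms, logged as 12.089054 MB/s.
    print(f"{compaction_rate_mb_s(18_710_207, 1_476):.6f} MB/s")
```

The fact that input and output sizes are identical (~100% of original) for five consecutive compactions is the symptom worth asking about here: the 11 hint partitions are not being purged.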
Re: Garbage collector launched on all nodes at once
Okay, iirc the memtable has been moved off heap; googling turned up this: http://www.datastax.com/dev/blog/off-heap-memtables-in-Cassandra-2-1

Apparently there are still some references on heap.

On Thu, Jun 18, 2015 at 1:11 PM, Marcus Eriksson krum...@gmail.com wrote:

It is probably this: https://issues.apache.org/jira/browse/CASSANDRA-9549

On Wed, Jun 17, 2015 at 7:37 PM, Michał Łowicki mlowi...@gmail.com wrote:

It looks like memtable heap size is growing rapidly on some nodes (https://www.dropbox.com/s/3brloiy3fqang1r/Screenshot%202015-06-17%2019.21.49.png?dl=0). The drops are the places where nodes have been restarted.

On Wed, Jun 17, 2015 at 6:53 PM, Michał Łowicki mlowi...@gmail.com wrote:

Hi,

Two datacenters with 6 nodes (2.1.6) each. In each DC, garbage collection is launched at the same time on every node (see [1] for total GC duration per 5 seconds). RF is set to 3. Any ideas?

[1] https://www.dropbox.com/s/bsbyew1jxbe3dgo/Screenshot%202015-06-17%2018.49.48.png?dl=0

--
BR, Michał Łowicki

--
BR, Michał Łowicki
Re: Nodetool ring and Replicas after 1.2 upgrade
Maybe check the system.log to see if there is any exception and/or error? Check as well whether the nodes have a consistent schema for the keyspace.

hth

jason

On Tue, Jun 16, 2015 at 7:17 AM, Michael Theroux mthero...@yahoo.com wrote:

Hello,

We (finally) have just upgraded from Cassandra 1.1 to Cassandra 1.2.19. Everything appears to be up and running normally; however, we have noticed unusual output from nodetool ring. There is a new (to us) field "Replicas" in the nodetool output, and this field, seemingly at random, is changing from 2 to 3 and back to 2.

We are using the byte-ordered partitioner (we hash our own keys) and have a replication factor of 3. We are also on AWS and utilize the Ec2Snitch on a single datacenter.

Other calls appear to be normal. nodetool getEndpoints returns the proper endpoints when querying various keys; nodetool ring and status report that all nodes appear healthy.

Anyone have any hints on what may be happening, or whether this is a problem we should be concerned about?

Thanks,
-Mike
Re: How to minimize Cassandra memory usage for test environment?
for a start, maybe you can see the setting use by raspberry pi project, for instance http://ac31004.blogspot.com/2012/05/apache-cassandra-on-raspberry-pi.html you can look at these two files, to tune down the settings for test environment. cassandra-env.sh cassandra.yaml hth jason On Tue, Jun 9, 2015 at 3:59 PM, Eax Melanhovich m...@eax.me wrote: Hello. We are running integration tests, using real Cassandra (not a mock) under Vagrant. MAX_HEAP_SIZE is set to 500M. As I discovered, lower value causes 'out of memory' after some time. Could memory usage be decreased somehow? Developers don't usually have a lot of free RAM and performance obviously is not an issue in this case. -- Best regards, Eax Melanhovich http://eax.me/
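For the cassandra-env.sh side, the memory knobs look roughly like this (values are illustrative for a tiny test VM, not recommendations; pinning both disables the script's auto-sizing from system RAM):

```shell
# conf/cassandra-env.sh -- fix the heap instead of letting it auto-size
# from system RAM; 500M total heap with a 100M young generation.
MAX_HEAP_SIZE="500M"
HEAP_NEWSIZE="100M"
```

In cassandra.yaml, shrinking caches (e.g. key_cache_size_in_mb) can also help, but which knobs exist depends on your version.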
Re: ERROR Compaction Interrupted
looks like it is graciously handle in the code, should be okay. if (ci.isStopRequested()) throw new CompactionInterruptedException(ci.getCompactionInfo()); https://github.com/apache/cassandra/blob/cassandra-2.0.9/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L156-L157 jason On Tue, Jun 2, 2015 at 2:31 AM, Aiman Parvaiz ai...@flipagram.com wrote: Hi everyone, I am running C* 2.0.9 without vnodes and RF=2. Recently while repairing, rebalancing the cluster I encountered one instance of this(just one on one node): ERROR CompactionExecutor: https://logentries.com/app/9f95dbd4#55472 CassandraDaemon.uncaughtException - Exception in thread Thread[ CompactionExecutor: https://logentries.com/app/9f95dbd4#55472,1,main] May 30 19:31:09 cass-prod4.localdomain cassandra: 2015-05-30 19:31:09,991 ERROR CompactionExecutor:55472 CassandraDaemon.uncaughtException - Exception in thread Thread[CompactionExecutor:55472,1,main] May 30 19:31:09 cass-prod4.localdomain org.apache.cassandra.db.compaction.CompactionInterruptedException: Compaction interrupted: Compaction@1b0b43e5-bef5-34f9-af08-405a7b58c71f(flipagram, home_feed_entry_index, 218409618/450008574)bytes May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:157) May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) May 30 19:31:09 
cass-prod4.localdomain at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.FutureTask.run(FutureTask.java:262) May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) May 30 19:31:09 cass-prod4.localdomain at java.lang.Thread.run(Thread.java:745) After looking up a bit on the mailing list archives etc I understand that this might mean data corruption and I plan to take the node offline and replace it with a new one but still wanted to see if anyone can throw some light here about me missing out on something. Also, if this is a case of corrupted SST should I be concerned about it getting replicated and take care of it on the replication too. Thanks
Re: How to set datastax-agent connect with jmx an
the error in the log output looks similar to this http://serverfault.com/questions/614810/opscenter-4-1-4-authentication-failing , in the opscenter 5.1.2 , do you configure the username/password same with the agent and cassandra node too? jason On Wed, Jun 3, 2015 at 11:13 AM, 贺伟平 wolai...@hotmail.com wrote: I am using opscenter 5.1.2 and just enabled JMX username/password authentication on my Cassandra cluster. I think I've updated all my opscenter configs correctly to force the agents to use JMX auth, but it is not working. I've updated the config under /etc/opscenter/Clusters/[cluster-name].conf with the following jmx properties [jmx] username=username password=password port=7199 I then restarted opscenter and opscenter agents, but see the following error in the opscenter agent logs: INFO [main] 2015-06-03 10:55:53,910 Loading conf files: ./conf/address.yaml INFO [main] 2015-06-03 10:55:53,953 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51 INFO [main] 2015-06-03 10:55:53,953 DataStax Agent version: 5.1.2 INFO [main] 2015-06-03 10:55:54,010 Default config values: {:cassandra_port 9042, :rollups300_ttl 2419200, :settings_cf settings, :agent_rpc_interface localhost, :restore_req_update_period 60, :my_channel_prefix /agent, :poll_period 60, :jmx_username heweiping, :thrift_conn_timeout 1, :rollups60_ttl 604800, :stomp_port 61620, :shorttime_interval 10, :longtime_interval 300, :max-seconds-to-sleep 25, :private-conf-props [initial_token listen_address broadcast_address rpc_address broadcast_rpc_address], :thrift_port 9160, :async_retry_timeout 5, :agent-conf-group global-cluster-agent-group, :jmx_host 127.0.0.1, :ec2_metadata_api_host 169.254.169.254, :metrics_enabled 1, :async_queue_size 5000, :backup_staging_dir nil, :read-buffer-size 1000, :remote_verify_max 30, :disk_usage_update_period 60, :throttle-bytes-per-second 50, :rollups7200_ttl 31536000, :agent_rpc_broadcast_address localhost, :remote_backup_retries 3, :ssl_keystore nil, 
:rollup_snapshot_period 300, :is_package false, :monitor_command /usr/share/datastax-agent/bin/datastax_agent_monitor, :thrift_socket_timeout 5000, :remote_verify_initial_delay 1000, :cassandra_log_location /var/log/cassandra/system.log, :max-pending-repairs 5, :remote_backup_region us-west-1, :restore_on_transfer_failure false, :tmp_dir /var/lib/datastax-agent/tmp/, :config_md5 nil, :jmx_port 7299, :write-buffer-size 10, :jmx_metrics_threadpool_size 4, :use_ssl 0, :rollups86400_ttl 0, :nodedetails_threadpool_size 3, :api_port 61621, :kerberos_service nil, :backup_file_queue_max 1, :jmx_thread_pool_size 5, :production 1, :runs_sudo 1, :max_file_transfer_attempts 30, :jmx_password eefung, :stomp_interface 172.19.104.123, :storage_keyspace OpsCenter, :hosts [127.0.0.1], :rollup_snapshot_threshold 300, :jmx_retry_timeout 30, :unthrottled-default 100, :remote_backup_retry_delay 5000, :remote_backup_timeout 1000, :seconds-to-read-kill-channel 0.005, :realtime_interval 5, :pdps_ttl 259200} INFO [main] 2015-06-03 10:55:54,174 Waiting for the config from OpsCenter INFO [main] 2015-06-03 10:55:54,175 Attempting to determine Cassandra's broadcast address through JMX INFO [main] 2015-06-03 10:55:54,176 Starting Stomp INFO [main] 2015-06-03 10:55:54,176 Starting up agent communcation with OpsCenter. INFO [Initialization] 2015-06-03 10:55:54,180 New JMX connection ( 127.0.0.1:7299) WARN [Initialization] 2015-06-03 10:55:54,409 Error when trying to match our local token: java.lang.SecurityException: Authentication failed! 
Credentials required INFO [main] 2015-06-03 10:55:59,412 Reconnecting to a backup OpsCenter instance INFO [main] 2015-06-03 10:55:59,413 SSL communication is disabled INFO [main] 2015-06-03 10:55:59,413 Creating stomp connection to 172.19.104.123:61620 INFO [Initialization] 2015-06-03 10:55:59,418 Sleeping for 2s before trying to determine IP over JMX again WARN [clojure-agent-send-off-pool-0] 2015-06-03 10:55:59,422 Tried to send message while not connected: /conf-request [[172.19.104.123,0:0:0:0:0:0:0:1%1,fe80:0:0:0:225:90ff:fe6a:d35c%2,127.0.0.1],[5.1.2,\/437054467\/conf]] INFO [StompConnection receiver] 2015-06-03 10:55:59,423 Reconnecting in 0s. INFO [StompConnection receiver] 2015-06-03 10:55:59,424 Connected to 172.19.104.123:61620 INFO [main] 2015-06-03 10:55:59,432 Starting Jetty server: {:join? false, :ssl? false, :host localhost, :port 61621} Checks with other jmx based tools (nodetool, jmxtrans) confirm that the jmx setup is correct. Any ideals ? Thank you very much! 发自 Windows 邮件
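One thing the agent log above does show: the config dump resolved :jmx_username heweiping and a :jmx_password, so the agent is sending credentials — "Authentication failed!" then usually means they don't match what the Cassandra JVM expects. The JVM reads the file named by -Dcom.sun.management.jmxremote.password.file in cassandra-env.sh; a sketch with a placeholder password (file path and password are examples, not from the thread):

```shell
# JMX password file format is one "user password" pair per line;
# the username must match what the agent/opscenter sends.
cat > jmxremote.password <<'EOF'
heweiping secretpassword
EOF
# the JVM refuses to use a password file that is group/world readable
chmod 400 jmxremote.password
```

On a real node this file lives wherever the JVM flag points (e.g. under /etc/cassandra/), and must be owned by the user running Cassandra.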
Re: How to interpret some GC logs
can you tell what jvm is that? jason On Mon, Jun 1, 2015 at 5:46 PM, Michał Łowicki mlowi...@gmail.com wrote: Hi, Normally I get logs like: 2015-06-01T09:19:50.610+: 4736.314: [GC 6505591K->4895804K(8178944K), 0.0494560 secs] which is fine and understandable but occasionally I see something like: 2015-06-01T09:19:50.661+: 4736.365: [GC 4901600K(8178944K), 0.0049600 secs] How to interpret it? Does it miss only part before - so memory occupied before GC cycle? -- BR, Michał Łowicki
Re: what does this error mean
why it happened? from the code, it looks like this condition is not null https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/io/sstable/SSTableReader.java#L921 or you can quickly fix this by upgrading to 2.1.5, i noticed there is code change for this class https://github.com/apache/cassandra/blob/cassandra-2.1.5/src/java/org/apache/cassandra/io/sstable/SSTableReader.java#L921 hth jason On Fri, May 29, 2015 at 9:39 AM, 曹志富 cao.zh...@gmail.com wrote: I have a 25 noedes C* cluster with C* 2.1.3. These days a node occur split brain many times。 check the log I found this: INFO [MemtableFlushWriter:118] 2015-05-29 08:07:39,176 Memtable.java:378 - Completed flushing /home/ant/apache-cassandra-2.1.3/bin/../data/data/system/sstable_activity-5a1ff2 67ace03f128563cfae6103c65e/system-sstable_activity-ka-4371-Data.db (8187 bytes) for commitlog position ReplayPosition(segmentId=1432775133526, position=16684949) ERROR [IndexSummaryManager:1] 2015-05-29 08:10:30,209 CassandraDaemon.java:167 - Exception in thread Thread[IndexSummaryManager:1,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.SSTableReader.cloneWithNewSummarySamplingLevel(SSTableReader.java:921) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.IndexSummaryManager.adjustSamplingLevels(IndexSummaryManager.java:410) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:288) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:238) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:139) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.3.jar:2.1.3] at 
org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[apache-cassandra-2. 1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_71] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] I want to know why this and how to fix this Thanks all -- Ranger Tsao
Re: Start with single node, move to 3-node cluster
hmm..i supposed you start with rf = 1 and then when 3n arrived, just add into the cluster and later decomission this one node? http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_remove_node_t.html hth jason On Tue, May 26, 2015 at 10:02 PM, Matthew Johnson matt.john...@algomi.com wrote: Hi Jason, When the 3N cluster is up and running, I need to get the data from SN into the 3N cluster and then give the SN server back. So I need to keep the data, but on completely new servers – just trying to work out what the best way of doing that is. The volume of data that needs migrating won’t be huge, probably about 30G, but it is data that I definitely need to keep (for historical analysis, audit etc). Thanks! Matthew *From:* Jason Wee [mailto:peich...@gmail.com] *Sent:* 26 May 2015 14:38 *To:* user@cassandra.apache.org *Subject:* Re: Start with single node, move to 3-node cluster will you add this lent one node into the 3N to form a cluster? but really , if you are just started, you could use this one node for your learning by installing multiple instances for experiments or development purposes only. imho, in the long run, this proove to be very valuable, as least for me. with this single node, you can easily simulate like c* upgrade. for instance, c* right now is at 2.1.5, when 2.2 went stable, you can test using your multiple instances on this single node to simulate your production environment safely. hth jason On Tue, May 26, 2015 at 9:24 PM, Matthew Johnson matt.john...@algomi.com wrote: Hi gurus, We have ordered some hardware for a 3-node cluster, but its ETA is 6 to 8 weeks. In the meantime, I have been lent a single server that I can use. I am wondering what the best way is to set up my single node (SN), so I can then move to the 3-node cluster (3N) when the hardware arrives. Do I: 1. Create my keyspaces on SN with RF=1, and when 3N is up and running migrate all the data manually (either through Spark, dump-and-load, or write a small script)? 2. 
Create my keyspaces on SN with RF=3, bootstrap the 3N nodes into a 4-node cluster when they’re ready, then remove SN from the cluster? 3. Use SN as normal, and when 3N hardware arrives, physically move the data folder and commit log folder onto one of the nodes in 3N and start it up as a seed? 4. Any other recommended solutions? I’m not even sure what the impact would be of running a single node with RF=3 – would this even work? Any ideas would be much appreciated. Thanks! Matthew
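On option 2 specifically: a keyspace's replication factor isn't fixed at creation, so a variant worth considering is starting SN at RF=1, joining the three new nodes, raising RF, repairing, and only then decommissioning SN. The RF change itself is a single statement (keyspace name invented here):

```sql
-- raise the replication factor once the new nodes have joined
ALTER KEYSPACE myks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```

After this, run nodetool repair on each node so the newly responsible replicas actually receive the existing data.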
Re: Leveled Compaction Strategy with a really intensive delete workload
help second guess my a decision a bit less :) Cheers, Stefano On Mon, May 25, 2015 at 9:52 AM, Jason Wee peich...@gmail.com wrote: , due to a really intensive delete workloads, the SSTable is promoted to t.. Is cassandra design for *delete* workloads? doubt so. Perhaps looking at some other alternative like ttl? jason On Mon, May 25, 2015 at 10:12 AM, Manoj Khangaonkar khangaon...@gmail.com wrote: Hi, For a delete intensive workload ( translate to write intensive), is there any reason to use leveled compaction ? The recommendation seems to be that leveled compaction is suited for read intensive workloads. Depending on your use case, you might better of with data tiered or size tiered strategy. regards regards On Sun, May 24, 2015 at 10:50 AM, Stefano Ortolani ostef...@gmail.com wrote: Hi all, I have a question re leveled compaction strategy that has been bugging me quite a lot lately. Based on what I understood, a compaction takes place when the SSTable gets to a specific size (10 times the size of its previous generation). My question is about an edge case where, due to a really intensive delete workloads, the SSTable is promoted to the next level (say L1) and its size, because of the many evicted tombstones, fall back to 1/10 of its size (hence to a size compatible to the previous generation, L0). What happens in this case? If the next major compaction is set to happen when the SSTable is promoted to L2, well, that might take too long and too many tobmstones could then appear in the meanwhile (and queries might subsequently fail). Wouldn't be more correct to flag the SStable's generation to its previous value (namely, not changing it even if a major compaction took place)? Regards, Stefano Ortolani -- http://khangaonkar.blogspot.com/
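For reference, jason's TTL alternative is set at write time (table and values invented); note that TTL-expired cells still surface as tombstones at compaction, so it changes when and how deletes happen rather than eliminating them:

```sql
-- expire the row automatically after 7 days instead of deleting it later
INSERT INTO events (id, payload) VALUES (42, 'data') USING TTL 604800;
```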
Re: Nodetool on 2.1.5
yeah, you can confirm in the log such as the one below. WARN [main] 2015-05-22 11:23:25,584 CassandraDaemon.java:81 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info. we are running c* with ipv6, cqlsh works superb but not on local link. $ nodetool -h fe80::224:1ff:fed7:82ea cfstats system.hints; nodetool: Failed to connect to 'fe80::224:1ff:fed7:82ea:7199' - ConnectException: 'Connection refused'. On Fri, May 22, 2015 at 12:39 AM, Yuki Morishita mor.y...@gmail.com wrote: For security reason, Cassandra changes JMX to listen localhost only since version 2.0.14/2.1.4. From NEWS.txt: The default JMX config now listens to localhost only. You must enable the other JMX flags in cassandra-env.sh manually. On Thu, May 21, 2015 at 11:05 AM, Walsh, Stephen stephen.wa...@aspect.com wrote: Just wondering if anyone else is seeing this issue on the nodetool after installing 2.1.5 This works nodetool -h 127.0.0.1 cfstats keyspace.table This works nodetool -h localhost cfstats keyspace.table This works nodetool cfstats keyspace.table This doesn’t work nodetool -h 192.168.1.10 cfstats keyspace.table nodetool: Failed to connect to ‘192.168.1.10:7199' - ConnectException: 'Connection refused'. Where 192.168.1.10 is the machine IP, All firewalls are disabled and it worked fine on version 2.0.13 This has happened on both of our upgraded clusters. Also no longer able to view the “CF: Total MemTable Size” “flushes pending” in Ops Center 5.1.1, related issue? This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain information that is confidential. If you have received this message in error, please do not read, copy or forward this message. Please notify the sender immediately, delete it from your system and destroy any copies. You may not further disclose or distribute this email or its attachments. -- Yuki Morishita t:yukim (http://twitter.com/yukim)
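The switch Yuki refers to lives in cassandra-env.sh; in the stock 2.1 file it looks roughly like this (verify against your own copy — re-enabling remote JMX without also configuring authentication is what the flags below guard against):

```shell
# conf/cassandra-env.sh
LOCAL_JMX=no    # default "yes" binds JMX to localhost only (the 2.0.14/2.1.4 change)
# with LOCAL_JMX=no the stock script adds flags such as:
#   -Dcom.sun.management.jmxremote.authenticate=true
#   -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password
```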
Re: C*1.2 on JVM 1.7
Running java7u72 with c* 1.1 with no issues.. yet (hope not) :) Jason On Sat, May 16, 2015 at 2:39 AM, sean_r_dur...@homedepot.com wrote: I have run plenty of 1.2.x Cassandra versions on the Oracle JVM 1.7. I have used both 1.7.0_40 and 1.7.0_72 with no issues. Also have 3.2.7 DSE running on 1.7.0_72 in PR with no issues. Sean Durity – Cassandra Admin, Big Data Team To engage the team, create a request https://portal.homedepot.com/sites/bigdata/SitePages/Big%20Data%20Engagement%20Request.aspx *From:* cass savy [mailto:casss...@gmail.com] *Sent:* Friday, May 15, 2015 1:22 PM *To:* user@cassandra.apache.org *Subject:* C*1.2 on JVM 1.7 Has anybody run DSE 3.2.6(C*1.2.16) on JRE 1.7. I know its recommended that we have to get to JVM version 7 to get to C*2.0 and higher. We are experiencing latency issues during rolling upgrade from 1.2 to 2.0. Hence cannot get o C*1.2, but plan to just upgrade JRE from 1.6 to 1.7. I wanted to know if anyof you have C*1.2 on JRE 1.7 in PROD and have run into issues. -- The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. 
The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.
Re: java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
try different jvm version or find out why is that happening? hth jason On Mon, May 4, 2015 at 10:10 AM, 曹志富 cao.zh...@gmail.com wrote: Hi guys: I havle C* 2.1.3 cluster,25 nodes ,running in JDK_1.7.0_71, CentOS 2.6.32-220.el6.x86_64,4 Core,32GB RAM. Today one of the nodes,has some error like this: java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code at org.apache.cassandra.io.util.AbstractDataInput.readUnsignedShort(AbstractDataInput.java:312) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:317) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:327) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1425) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.io.sstable.SSTableReader.getPosition(SSTableReader.java:1350) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:53) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:89) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:129) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1915) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1748) ~[apache-cassandra-2.1.3.jar:2.1.3] at 
org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:342) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:53) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_71] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.3.jar:2.1.3] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] I have foud an issue CASSANDRA-5737 https://issues.apache.org/jira/browse/CASSANDRA-5737. So I want to ask what can i do with this error? Thank u all!!! -- Ranger Tsao
Re: Help understanding aftermath of death by GC
Hey Robert, you might want to start by looking into the statistics of cassandra, either exposed via nodetool or if you have monitoring system monitor the important metrics. I have read this article moment ago and I hope it help you http://aryanet.com/blog/cassandra-garbage-collector-tuning to begin to understand where and how to determine the root cause. jason On Tue, Mar 31, 2015 at 8:22 PM, Robert Wille rwi...@fold3.com wrote: I moved my site over to Cassandra a few months ago, and everything has been just peachy until a few hours ago (yes, it would be in the middle of the night) when my entire cluster suffered death by GC. By death by GC, I mean this: [rwille@cas031 cassandra]$ grep GC system.log | head -5 INFO [ScheduledTasks:1] 2015-03-31 02:49:57,480 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 30219 ms for 1 collections, 7664429440 used; max is 8329887744 INFO [ScheduledTasks:1] 2015-03-31 02:50:32,180 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 30673 ms for 1 collections, 7707488712 used; max is 8329887744 INFO [ScheduledTasks:1] 2015-03-31 02:51:05,108 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 30453 ms for 1 collections, 7693634672 used; max is 8329887744 INFO [ScheduledTasks:1] 2015-03-31 02:51:38,787 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 30691 ms for 1 collections, 7686028472 used; max is 8329887744 INFO [ScheduledTasks:1] 2015-03-31 02:52:12,452 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 30346 ms for 1 collections, 7701401200 used; max is 8329887744 I’m pretty sure I know what triggered it. When I first started developing to Cassandra, I found the IN clause to be supremely useful, and I used it a lot. Later I figured out it was a bad thing and repented and fixed my code, but I missed one spot. A maintenance task spent a couple of hours repeatedly issuing queries with IN clauses with 1000 items in the clause and the whole system went belly up. 
I get that my bad queries caused Cassandra to require more heap than was available, but here’s what I don’t understand. When the crap hit the fan, the maintenance task died due to a timeout error, but the cluster never recovered. I would have expected that when I was no longer issuing the bad queries, that the heap would get cleaned up and life would resume to normal. Can anybody help me understand why Cassandra wouldn’t recover? How is it that GC pressure will cause heap to be permanently uncollectable? This makes me pretty worried. I can fix my code, but I don’t really have control over spikes. If memory pressure spikes, I can tolerate some timeouts and errors, but if it can’t come back when the pressure is gone, that seems pretty bad. Any insights would be greatly appreciated Robert
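A quick way to read those GCInspector lines is to compare "used" against "max is" after each CMS collection (the awk field positions below match the quoted log format):

```shell
# ~92% of an 8GB heap still live *after* a 30-second CMS collection: the
# old gen is effectively full, so CMS runs back to back ("death by GC").
line='GC for ConcurrentMarkSweep: 30219 ms for 1 collections, 7664429440 used; max is 8329887744'
printf '%s\n' "$line" | awk '{ printf "%.0f%%\n", 100 * $9 / $13 }'
```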
Re: upgrade from 1.0.12 to 1.1.12
Rob, the cluster now upgraded to cassandra 1.0.12 (default hd version, in Descriptor.java) and I ensure all sstables in current cluster are hd version before upgrade to cassandra 1.1. I have also checked in cassandra 1.1.12 , the sstable is version hf version. so i guess, nodetool upgradesstables is needed? Why not scrub? when you run command nodetool upgradesstables , it is actually scrubing the data? Can you explain ? Jason On Fri, Mar 27, 2015 at 7:21 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad j...@jonhaddad.com wrote: There's no downside to running upgradesstables. I recommend always doing it on upgrade just to be safe. For the record and just my opinion : I recommend against paying this fixed cost when you don't need to. It is basically trivial to ascertain whether there is a new version of the SSTable format in your new version, without even relying on the canonical NEWS.txt. Type nodetool flush and look at the filename of the table that was just flushed. If the version component is different from all the other SSTables, you definitely need to run upgradesstables. If it isn't, you definitely don't. If you're going to run something which unnecessarily rewrites all SSTables, why not scrub? That'll check the files for corruption while also upgrading them as they are written out 1:1... =Rob
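Rob's flush-and-look check can be scripted against the data directory; a sketch using invented filenames (the version is the two-letter component in the name — the thread establishes "hd" for 1.0.12 and "hf" for 1.1.12):

```shell
# sstable data files carry the format version in their name, e.g.
# Standard1-hd-58-Data.db; after "nodetool flush", more than one distinct
# version here means upgradesstables still has work to do.
files='Standard1-hd-58-Data.db
Standard1-hf-59-Data.db'
printf '%s\n' "$files" \
  | sed -nE 's/^.*-(h[a-z])-[0-9]+-Data\.db$/\1/p' \
  | sort -u
```

On a node you would replace the printf with ls over the keyspace's data directory.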
Re: upgrade from 1.0.12 to 1.1.12
hmm... okay. one more question https://github.com/apache/cassandra/blob/cassandra-1.1.12/NEWS.txt I upgraded directly to 1.1.12 , do I need to run nodetool upgradesstables as stipulated in version 1.1.3 ? jason On Wed, Mar 25, 2015 at 1:04 AM, Jonathan Haddad j...@jonhaddad.com wrote: Streaming is repair, adding removing nodes. In general it's a bad idea to do any streaming op when you've got an upgrade in progress. On Tue, Mar 24, 2015 at 3:14 AM Jason Wee peich...@gmail.com wrote: Hello, Reading this documentation http://www.datastax.com/docs/1.1/install/upgrading If you are upgrading to Cassandra 1.1.9 from a version earlier than 1.1.7, all nodes must be upgraded before any streaming can take place. Until you upgrade all nodes, you cannot add version 1.1.7 nodes or later to a 1.1.7 or earlier cluster. Does this apply for upgrade to cassandra 1.1.12 ? What is cassandra streaming ? Is repair (nodetool or background), hinted handoff, antientropy consider streaming? if yes, how do we prevent streaming after a node is upgraded to 1.1.12 in a 1.0.12 cluster environment? Thanks. Jason
Re: upgrade from 1.0.12 to 1.1.12
Sean, thanks and I will keep that in mind for this upgrade. Jason On Thu, Mar 26, 2015 at 1:23 AM, sean_r_dur...@homedepot.com wrote: Yes, run upgradesstables on all nodes - unless you already force major compactions on all tables. I run them on a few nodes at a time to minimize impact to performance. The upgrade is not complete until upgradesstables completes on all nodes. Then you are safe to resume any streaming operations (repairs and bootstraps). Sean Durity – Cassandra Admin, Big Data Team To engage the team, create a request -Original Message- From: Jason Wee [mailto:peich...@gmail.com] Sent: Wednesday, March 25, 2015 10:59 AM To: user@cassandra.apache.org Subject: Re: upgrade from 1.0.12 to 1.1.12 hmm... okay. one more question https://github.com/apache/cassandra/blob/cassandra-1.1.12/NEWS.txt I upgraded directly to 1.1.12 , do I need to run nodetool upgradesstables as stipulated in version 1.1.3 ? jason On Wed, Mar 25, 2015 at 1:04 AM, Jonathan Haddad j...@jonhaddad.com wrote: Streaming is repair, adding removing nodes. In general it's a bad idea to do any streaming op when you've got an upgrade in progress. On Tue, Mar 24, 2015 at 3:14 AM Jason Wee peich...@gmail.com wrote: Hello, Reading this documentation http://www.datastax.com/docs/1.1/install/upgrading If you are upgrading to Cassandra 1.1.9 from a version earlier than 1.1.7, all nodes must be upgraded before any streaming can take place. Until you upgrade all nodes, you cannot add version 1.1.7 nodes or later to a 1.1.7 or earlier cluster. Does this apply for upgrade to cassandra 1.1.12 ? What is cassandra streaming ? Is repair (nodetool or background), hinted handoff, antientropy consider streaming? if yes, how do we prevent streaming after a node is upgraded to 1.1.12 in a 1.0.12 cluster environment? Thanks. Jason The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. 
If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.
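Sean's batching advice above can be sketched as a small wrapper script. Everything here is hypothetical: the host names, passwordless ssh, and nodetool being on each remote PATH are assumptions, and the script only echoes unless RUN_FOR_REAL is set, so it is safe to run as-is.

```shell
#!/bin/sh
# Sketch: roll `nodetool upgradesstables` across the ring a few nodes at
# a time, waiting for each batch to finish before starting the next.
# Dry-run by default; set RUN_FOR_REAL=1 to actually invoke nodetool.
upgrade_batch() {
    for h in "$@"; do
        echo "upgradesstables: $h"
        if [ -n "${RUN_FOR_REAL:-}" ]; then
            ssh "$h" nodetool upgradesstables   # assumes passwordless ssh
        fi
    done
}

upgrade_batch node1 node2 node3   # first batch
upgrade_batch node4 node5 node6   # next batch, after the first completes
echo "all batches done: safe to resume repairs and bootstraps"
```

Only after every node reports done should streaming operations (repairs, bootstraps) resume, per Sean's note above.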
upgrade from 1.0.12 to 1.1.12
Hello, Reading this documentation http://www.datastax.com/docs/1.1/install/upgrading: If you are upgrading to Cassandra 1.1.9 from a version earlier than 1.1.7, all nodes must be upgraded before any streaming can take place. Until you upgrade all nodes, you cannot add version 1.1.7 nodes or later to a 1.1.7 or earlier cluster. Does this apply to an upgrade to Cassandra 1.1.12? What is Cassandra streaming? Are repair (nodetool or background), hinted handoff, and anti-entropy considered streaming? If yes, how do we prevent streaming after a node is upgraded to 1.1.12 in a 1.0.12 cluster environment? Thanks. Jason
Re: cassandra triggers
okay, if you leave a comment on the blog about what is breaking and which Cassandra version, I can take a look at the code when I get the time. :-) jason On Mon, Mar 23, 2015 at 8:15 PM, Asit KAUSHIK asitkaushikno...@gmail.com wrote: Attached is the code. Follow the process for compiling and using the code. If anything more is required please let me know. The jar file has to be put into /usr/share/cassandra/conf/triggers. Hope this helps. Regards, Asit On Mon, Mar 23, 2015 at 3:20 PM, Rahul Bhardwaj rahul.bhard...@indiamart.com wrote: Yes Asit, you can share it with me; let's see if we can implement it for our requirement. Regards: Rahul Bhardwaj On Mon, Mar 23, 2015 at 1:43 PM, Asit KAUSHIK asitkaushikno...@gmail.com wrote: Hi Rahul, I have created a trigger which inserts a default value into the table. But everyone is against using it, as it is external code which may be incompatible with future releases. It was a challenge, as all the examples are for the old 2.0.x versions, where the RowMutation class is used, which was removed in the later releases. If you still want the code I can give it to you. The approach is the same as on the sites I used for my reference, linked below, but again that code is for an older release and would not work: http://noflex.org/learn-experiment-cassandra-trigger/ On Mon, Mar 23, 2015 at 11:01 AM, Rahul Bhardwaj rahul.bhard...@indiamart.com wrote: Hi All, I want to use triggers in Cassandra. Is there any tutorial on creating triggers in Cassandra? Also I am not good at Java. Please help!
Regards: Rahul Bhardwaj
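The deployment Asit describes (jar into the triggers directory, then register it) can be sketched as below. The jar name, trigger name, keyspace/table, and trigger class are hypothetical placeholders; only the triggers directory path comes from the thread. The script is dry-run unless RUN_FOR_REAL is set.

```shell
#!/bin/sh
# Sketch of the trigger deploy steps described above; jar name, trigger
# name, table, and class are placeholders. Dry-run unless RUN_FOR_REAL=1.
TRIGGER_JAR=mytrigger.jar                       # hypothetical jar name
TRIGGER_DIR=/usr/share/cassandra/conf/triggers  # path given in the thread

deploy_trigger() {
    echo "copy $TRIGGER_JAR to $TRIGGER_DIR"
    if [ -n "${RUN_FOR_REAL:-}" ]; then
        cp "$TRIGGER_JAR" "$TRIGGER_DIR/"
        # After the jar is on every node, register the trigger, e.g.:
        #   cqlsh -e "CREATE TRIGGER my_trigger ON my_ks.my_table USING 'com.example.MyTrigger'"
    fi
}

deploy_trigger
```

The copy has to be repeated on every node before CREATE TRIGGER is issued, since each node loads the class locally.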
Re: cassandra node jvm stall intermittently
heh, in the midst of upgrading, Rob ;-) Jason On Tue, Mar 10, 2015 at 2:04 AM, Robert Coli rc...@eventbrite.com wrote: On Sat, Mar 7, 2015 at 1:44 AM, Jason Wee peich...@gmail.com wrote: hey Ali, 1.0.8 On Sat, Mar 7, 2015 at 5:20 PM, Ali Akhtar ali.rac...@gmail.com wrote: What version are you running? Upgrade your very old version to at least 1.2.x (via 1.1.x) ASAP. =Rob
Re: cassandra node jvm stall intermittently
Hi Jan, thanks for your time to prepare the questions; answers below. - How many nodes do you have on the ring? 12 - What is the activity when this occurs - reads / writes / compactions? This cluster has a lot of writes and reads. During off-peak periods, OpsCenter shows cluster writes at about 5k/sec and reads at about 1k/sec; during peak periods, writes can reach 22k/sec and reads about 10k/sec. This particular node hangs constantly, regardless of whether it is peak or off-peak, or whether compaction is running. - Is there anything that is unique about this node that makes it different from the other nodes? Our nodes are the same in terms of operating system (CentOS 6) and Cassandra configuration settings. Other than that, there are no other resource-intensive applications running on the Cassandra nodes. - Is this a periodic occurrence OR a single occurrence - I am trying to determine a pattern about when this shows up. It *always* happens and, in fact, it is happening now. - What is the load distribution on the ring (ie: is this node carrying more load than the others).
As of this moment:

Address  DC       Rack  Status  State   Load       Owns   Token
                                                          155962751505430129087380028406227096910
node1    us-east  1e    Up      Normal  498.66 GB  8.33%  0
node2    us-east  1e    Up      Normal  503.36 GB  8.33%  14178431955039102644307275309657008810
node3    us-east  1e    Up      Normal  492.08 GB  8.33%  28356863910078205288614550619314017619
node4    us-east  1e    Up      Normal  499.54 GB  8.33%  42535295865117307932921825928971026430
node5    us-east  1e    Up      Normal  523.76 GB  8.33%  56713727820156407428984779325531226109
node6    us-east  1e    Up      Normal  515.36 GB  8.33%  70892159775195513221536376548285044050
node7    us-east  1e    Up      Normal  588.93 GB  8.33%  85070591730234615865843651857942052860
node8    us-east  1e    Up      Normal  498.51 GB  8.33%  99249023685273718510150927167599061670
node9    us-east  1e    Up      Normal  531.81 GB  8.33%  113427455640312814857969558651062452221
node10   us-east  1e    Up      Normal  501.85 GB  8.33%  127605887595351923798765477786913079290
node11   us-east  1e    Up      Normal  501.13 GB  8.33%  141784319550391026443072753096570088100
node12   us-east  1e    Up      Normal  508.45 GB  8.33%  155962751505430129087380028406227096910

That one node is node5. In this ring output, yeah, it is the second highest in the ring, but that is unlikely to be the cause. Jason On Sat, Mar 7, 2015 at 3:35 PM, Jan cne...@yahoo.com wrote: HI Jason; The single node showing the anomaly is a hint that the problem is probably local to a node (as you suspected). - How many nodes do you have on the ring? - What is the activity when this occurs - reads / writes / compactions? - Is there anything that is unique about this node that makes it different from the other nodes? - Is this a periodic occurrence OR a single occurrence - I am trying to determine a pattern about when this shows up. - What is the load distribution on the ring (ie: is this node carrying more load than the others). The system.log should have more info about it.
hope this helps Jan/ On Friday, March 6, 2015 4:50 AM, Jason Wee peich...@gmail.com wrote: well, StatusLogger.java started shown in cassandra system.log, MessagingService.java also shown some stage (e.g. read, mutation) dropped. It's strange it only happen in this node but this type of message does not shown in other node log file at the same time... Jason On Thu, Mar 5, 2015 at 4:26 AM, Jan cne...@yahoo.com wrote: HI Jason; Whats in the log files at the moment jstat shows 100%. What is the activity on the cluster the node at the specific point in time (reads/ writes/ joins etc) Jan/ On Wednesday, March 4, 2015 5:59 AM, Jason Wee peich...@gmail.com wrote: Hi, our cassandra node using java 7 update 72 and we ran jstat on one of the node, and notice some strange behaviour as indicated by output below. any idea why when eden space stay the same for few seconds like 100% and 18.02% for few seconds? we suspect such stalling cause timeout to our cluster. any idea what happened, what went wrong and what could cause this? $ jstat -gcutil 32276 1s 0.00 5.78 91.21 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07
Re: cassandra node jvm stall intermittently
hey Ali, 1.0.8 On Sat, Mar 7, 2015 at 5:20 PM, Ali Akhtar ali.rac...@gmail.com wrote: What version are you running?
Re: cassandra node jvm stall intermittently
well, StatusLogger.java output started showing in the cassandra system.log, and MessagingService.java also showed some stages (e.g. read, mutation) dropped. It's strange that it only happens on this node; this type of message does not show in the other nodes' log files at the same time... Jason On Thu, Mar 5, 2015 at 4:26 AM, Jan cne...@yahoo.com wrote: HI Jason; What's in the log files at the moment jstat shows 100%? What is the activity on the cluster and the node at that specific point in time (reads/writes/joins etc)? Jan/ On Wednesday, March 4, 2015 5:59 AM, Jason Wee peich...@gmail.com wrote: Hi, our cassandra node is using java 7 update 72 and we ran jstat on one of the nodes, and noticed some strange behaviour as indicated by the output below. any idea why the eden space stays the same for a few seconds, like 100% and 18.02%? we suspect such stalling causes timeouts to our cluster. any idea what happened, what went wrong and what could cause this? $ jstat -gcutil 32276 1s
cassandra node jvm stall intermittently
Hi, our cassandra node using java 7 update 72 and we ran jstat on one of the node, and notice some strange behaviour as indicated by output below. any idea why when eden space stay the same for few seconds like 100% and 18.02% for few seconds? we suspect such stalling cause timeout to our cluster. any idea what happened, what went wrong and what could cause this? $ jstat -gcutil 32276 1s 0.00 5.78 91.21 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493 0.00 4.65 29.66 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 70.88 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 71.58 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 72.15 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 72.33 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 72.73 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 73.20 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 73.71 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 73.84 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 73.91 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.18 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 
60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 4.65 74.29 71.00 60.07 2659 73.488 40.056 73.544 0.00 5.43 12.64 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 18.02 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 69.24 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 78.05 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 78.97 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 79.07 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 79.18 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 80.09 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 80.36 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 80.51 71.09 60.07 2661 73.534 40.056 73.590 0.00 5.43 80.70 71.09
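The flattened dump above is `jstat -gcutil` output; on Java 7 its columns are S0 S1 E O P YGC YGCT FGC FGCT GCT, though the trailing FGC/FGCT fields appear fused together in this archive. The stall Jason describes shows up as Eden (E, third field) pinned at one value across consecutive one-second samples while the young-GC count (YGC, sixth field) does not advance. A small awk sketch (sample rows abridged from the thread; column positions assumed from stock jstat) can flag such runs:

```shell
#!/bin/sh
# Sketch: flag jstat -gcutil sample runs where Eden utilisation (field 3)
# is unchanged across consecutive 1s samples while the young-GC count
# (field 6) also stays flat -- the stall pattern described in the thread.
detect_stalls() {
    awk 'prev_e == $3 && prev_ygc == $6 { stuck++ }
         prev_e != $3 || prev_ygc != $6 {
             if (stuck >= 3) printf "eden stuck at %s%% for %d samples\n", prev_e, stuck + 1
             stuck = 0
         }
         { prev_e = $3; prev_ygc = $6 }
         END { if (stuck >= 3) printf "eden stuck at %s%% for %d samples\n", prev_e, stuck + 1 }'
}

# Abridged sample from the thread's output:
detect_stalls <<'EOF'
0.00 5.78 91.21 70.94 60.07 2657 73.437 40.056 73.493
0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493
0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493
0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493
0.00 5.78 100.00 70.94 60.07 2657 73.437 40.056 73.493
0.00 4.65 29.66 71.00 60.07 2659 73.488 40.056 73.544
EOF
# prints: eden stuck at 100.00% for 4 samples
```

In practice one would pipe live `jstat -gcutil <pid> 1s` output into `detect_stalls` and correlate flagged windows with the dropped-message log entries.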
Re: sstables remain after compaction
Off topic for this discussion, but yes, we are in the midst of upgrading... 1.0.8 to 1.0.12, then to 1.1.0, then to the latest 1.1.x, then to 1.2. Fingers crossed for a safe upgrade of such a big cluster... We hope that with Cassandra moving some components off heap in 1.1 and 1.2 the cluster will perform better; at least I will not need to run user-defined compaction regularly. thanks guys, jason On Tue, Mar 3, 2015 at 3:59 AM, sean_r_dur...@homedepot.com wrote: In my experience, you do not want to stay on 1.1 very long. 1.0.8 was very stable. 1.1 can get bad in a hurry. 1.2 (with many things moved off-heap) is very much better. Sean Durity – Cassandra Admin, Big Data Team *From:* Robert Coli [mailto:rc...@eventbrite.com] *Sent:* Monday, March 02, 2015 2:01 PM *To:* user@cassandra.apache.org *Subject:* Re: sstables remain after compaction On Sat, Feb 28, 2015 at 5:39 PM, Jason Wee peich...@gmail.com wrote: Hi Rob, sorry for the late response, festive season here. cassandra version is 1.0.8 and thank you, I will read on the READ_STAGE threads. 1.0.8 is pretty seriously old in 2015. I would upgrade to at least 1.2.x (via 1.1.x) ASAP. Your cluster will be much happier, in general. =Rob
Re: sstables remain after compaction
noted Tyler... and many thanks. Well, I read the Cassandra JIRA issues and just followed one of the comments: https://issues.apache.org/jira/browse/CASSANDRA-5740 In general, I thought we always advised to upgrade through the 'major' revs, 1.0 - 1.1 - 1.2. Or, at least, I think that's the advice now. But I will read carefully through the diff between 1.0.12 and the latest 1.1.x... I think there should be many changes. jason On Wed, Mar 4, 2015 at 7:38 AM, Robert Coli rc...@eventbrite.com wrote: On Tue, Mar 3, 2015 at 2:34 PM, Tyler Hobbs ty...@datastax.com wrote: I'm not aware of any good reason to put 1.1.0 in the middle there. I would go straight from 1.0.12 to the latest 1.1.x. +1 =Rob
Re: sstables remain after compaction
Hi Rob, sorry for the late response, festive season here. cassandra version is 1.0.8 and thank you, I will read on the READ_STAGE threads. Jason On Wed, Feb 18, 2015 at 3:33 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, Feb 13, 2015 at 7:45 PM, Jason Wee peich...@gmail.com wrote: I trigger user defined compaction to big sstables (big as in the size per sstable reach more than 50GB, some 100GB). Occasionally, after user defined compaction, I see some sstables remain, even after 12 hours elapsed. That is unexpected. What version of Cassandra? You mentioned a thread, could you tell what threads are those or perhaps highlight in the code? I'd presume READ_STAGE threads. =Rob
Re: how many rows can one partition key hold?
you might want to read here: http://wiki.apache.org/cassandra/CassandraLimitations jason On Fri, Feb 27, 2015 at 2:44 PM, wateray wate...@163.com wrote: Hi all, My team is using Cassandra as our database. We have a question, below. As we know, rows with the same partition key will be stored on the same node. But how many rows can one partition key hold? What does it depend on? The node's volume, the partition data size, or the number of rows in the partition? When one partition's data is extremely large, will writes and reads become slow? Can anyone show me some existing use cases? thanks!
sstables remain after compaction
Hello, Pre Cassandra 1.0, after sstables are compacted, the old sstables remain until the first GC kicks in. As of Cassandra 1.0, the sstables are removed once compaction is done. Is it possible for old sstables to remain for whatever reason (e.g. a read still referencing them)? Thank you. Jason
Re: sstables remain after compaction
Thanks Rob, I trigger user-defined compaction on big sstables (big as in the size per sstable reaches more than 50GB, some 100GB). Occasionally, after user-defined compaction, I see some sstables remain, even after 12 hours have elapsed. You mentioned a thread; could you tell which threads those are, or perhaps point to them in the code? Jason On Sat, Feb 14, 2015 at 3:58 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, Feb 13, 2015 at 1:35 AM, Jason Wee peich...@gmail.com wrote: Pre Cassandra 1.0, after sstables are compacted, the old sstables remain until the first GC kicks in. As of Cassandra 1.0, the sstables are removed once compaction is done. Is it possible for old sstables to remain for whatever reason (e.g. a read still referencing them)? If I understand your question properly, the answer is no, or not for longer than the duration of a running thread. If compaction is working properly in a post-needs-the-java-GC-to-delete-files version of Cassandra, the input files should be deleted ASAP. If a thread is actively accessing that file, I would imagine it blocks for that long, but that's not likely to be very long. =Rob
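A hedged sketch of how one might pick candidates for the user-defined compaction Jason describes: list Data.db files over a size threshold. The data-directory layout, the `-Data.db` naming pattern, and the threshold are assumptions; the example exercises the helper against a throwaway directory rather than a live data directory.

```shell
#!/bin/sh
# Sketch: list sstable data files above a size threshold, as candidates
# for user-defined compaction. Directory and threshold are placeholders.
list_big_sstables() {
    dir=$1; min_mb=$2
    find "$dir" -name '*-Data.db' -size +"${min_mb}M" 2>/dev/null | sort
}

# Example against a throwaway directory (file names are hypothetical):
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/ks-cf-hd-1-Data.db" bs=1M count=3 2>/dev/null
dd if=/dev/zero of="$tmp/ks-cf-hd-2-Data.db" bs=1M count=1 2>/dev/null
list_big_sstables "$tmp" 2     # only the 3MB file exceeds the 2MB cutoff
rm -rf "$tmp"
```

On a real 1.0-era node the resulting file names would then be fed, one at a time, to the user-defined compaction hook over JMX.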
Re: SSTables can't compact automatically
Did you disable auto compaction through nodetool? disableautocompaction - Disable autocompaction for the given keyspace and column family Jason On Mon, Jan 26, 2015 at 11:34 AM, 曹志富 cao.zh...@gmail.com wrote: Hi everybody: I have 18 nodes running Cassandra 2.1.2. Every node has 4 cores, 32 GB RAM, and a 2T hard disk; the OS is CentOS release 6.2 (Final). I have followed the recommended production settings to configure my system, such as disabling swap and unlimited memlock... My heap size is: MAX_HEAP_SIZE=8G MIN_HEAP_SIZE=8G HEAP_NEWSIZE=2G I use STCS; other config is default, using DataStax Java Driver 2.1.2, with a BatchStatement of 100 keys committed per batch. When I run my cluster and insert data from Kafka (1 keys/s), after 2 days every node can't compact; there are too many sstables. I tried a major compaction to compact the sstables; it takes a long, long time. Also the new sstables can't compact automatically. I traced the log; CMS GC runs too often, almost once every 30 minutes. Could someone help me solve this problem? -- 曹志富 Mobile: 18611121927 Email: caozf.zh...@gmail.com Weibo: http://weibo.com/boliza/
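To see whether compaction is keeping up, one simple check is to count Data.db files per table directory over time; steadily growing counts mean sstables are accumulating faster than STCS can merge them. The layout (data/&lt;keyspace&gt;/&lt;table&gt;/) and naming are assumed from a stock 2.1 install; this is a sketch, not an official tool, and the example below runs against a throwaway tree.

```shell
#!/bin/sh
# Sketch: count live -Data.db files under each table directory, largest
# counts first. DATA_DIR layout assumed from a stock 2.1 install.
count_sstables() {
    dir=$1
    for cf in "$dir"/*/*/; do
        n=$(find "$cf" -maxdepth 1 -name '*-Data.db' | wc -l)
        echo "$n $cf"
    done | sort -rn
}

# Example against a throwaway tree (keyspace/table names hypothetical):
tmp=$(mktemp -d)
mkdir -p "$tmp/ks1/cf1" "$tmp/ks1/cf2"
touch "$tmp/ks1/cf1/ks1-cf1-ka-1-Data.db" "$tmp/ks1/cf1/ks1-cf1-ka-2-Data.db"
touch "$tmp/ks1/cf2/ks1-cf2-ka-1-Data.db"
count_sstables "$tmp"   # cf1 has 2 data files, cf2 has 1
rm -rf "$tmp"
```

Run periodically (e.g. from cron) this gives a cheap trend line to correlate with `nodetool compactionstats` and the CMS GC churn described above.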
Re: keyspace not exists?
Thanks Rob, we'll keep this in mind on our learning journey. Jason On Wed, Jan 21, 2015 at 6:45 AM, Robert Coli rc...@eventbrite.com wrote: On Sun, Jan 18, 2015 at 8:55 PM, Jason Wee peich...@gmail.com wrote: two nodes running cassandra 2.1.2 and one running cassandra 2.1.1 For the record, this is an unsupported persistent configuration. You are only supposed to have split minor versions during an upgrade. I have no idea if it is causing the problem you are having. =Rob
Re: keyspace not exists?
The log does not show anything fishy. Because it is a just-for-fun cluster, we can actually wipe our 3-node cluster's Cassandra dirs (data, saved_caches, commitlog) and start all over, and we encounter the same problem. Two nodes are running Cassandra 2.1.2 and one is running Cassandra 2.1.1. I took a look at the issue in the link Tyler gave, patched my cqlsh, and it works; more information below, and thank you. Actually I am doing the tutorial from this blog: http://www.datastax.com/dev/blog/thrift-to-cql3

$ cqlsh 192.168.0.2 9042
Warning: schema version mismatch detected; check the schema versions of your nodes in system.local and system.peers.
Connected to just4fun at 192.168.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> DESCRIBE KEYSPACES;
system_traces  jw_schema1  system
cqlsh> use jw_schema1;
cqlsh:jw_schema1> desc tables;
user_profiles
cqlsh:jw_schema1> quit;
$ cassandra-cli -h 192.168.0.2 -p 9160
Connected to: just4fun on 192.168.0.2/9160
Welcome to Cassandra CLI version 2.1.1
The CLI is deprecated and will be removed in Cassandra 3.0. Consider migrating to cqlsh. CQL is fully backwards compatible with Thrift data; see http://www.datastax.com/dev/blog/thrift-to-cql3
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown] show keyspaces;
WARNING: CQL3 tables are intentionally omitted from 'show keyspaces' output. See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.
Keyspace: jw_schema1:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:3]
  Column Families:
    ColumnFamily: user_profiles
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.0
      DC Local Read repair chance: 0.1
      Caching: KEYS_ONLY
      Default time to live: 0
      Bloom Filter FP chance: 0.01
      Index interval: default
      Speculative Retry: 99.0PERCENTILE
      Built indexes: []
      Column Metadata:
        Column Name: first_name
          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
        Column Name: year_of_birth
          Validation Class: org.apache.cassandra.db.marshal.Int32Type
        Column Name: last_name
          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.LZ4Compressor
Keyspace: system:
  .. .. ..
Keyspace: system_traces:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:2]
  Column Families:
[default@unknown] use jw_schema1;
Authenticated to keyspace: jw_schema1
[default@jw_schema1] list user_profiles;
Using default limit of 100
Using default cell limit of 100
0 Row Returned.
Elapsed time: 728 msec(s).
[default@jw_schema1]

On Sat, Jan 17, 2015 at 6:41 AM, Tyler Hobbs ty...@datastax.com wrote: This might be https://issues.apache.org/jira/browse/CASSANDRA-8512 if your cluster has a schema disagreement. You can apply the patch on that ticket with patch -p1 < 8512-2.1.txt from the top-level cassandra directory and see if it helps. On Fri, Jan 16, 2015 at 11:58 AM, Julien Anguenot jul...@anguenot.org wrote: Hey Jason, Your RF=3, do you have 3 nodes up and running in this DC?
We have seen this issue with 2.1.x and cqlsh, where schema changes would trigger the keyspace not found error in cqlsh if not all nodes were up and running when altering the KS schema in a DC with NetworkTopologyStrategy and RF=3. For us, bringing all the nodes up to meet RF would then fix the problem. As well, you might want to restart the node and see if the keyspace not found still occurs: same here, since 2.1.x we've had cases where a restart was required for cqlsh and/or drivers to see the schema changes. J. On Fri, Jan 16, 2015 at 3:56 AM, Jason Wee peich...@gmail.com wrote:

$ cqlsh 192.168.0.2 9042
Connected to just4fun at 192.168.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> DESCRIBE KEYSPACES
<empty>
cqlsh> create keyspace foobar with replication = {'class':'SimpleStrategy', 'replication_factor':3};
errors={}, last_host=192.168.0.2
cqlsh> DESCRIBE KEYSPACES;
<empty>
cqlsh> use foobar;
cqlsh:foobar> DESCRIBE TABLES;
Keyspace 'foobar' not found.

Just trying Cassandra 2.1 and encountering the above error; can anyone explain why this is and where to even begin troubleshooting? Jason -- Tyler Hobbs DataStax http://datastax.com/
Re: keyspace not exists?
Hi, immediately after a repair, I execute cqlsh and still get the schema mismatch:

[2015-01-19 13:50:49,979] Repair session 19c67350-9f9f-11e4-8b56-a322c40b8b81 for range (-725731847063341791,-718486959589605925] finished
[2015-01-19 13:50:49,980] Repair session 1a612cb0-9f9f-11e4-8b56-a322c40b8b81 for range (-5366440687164990017,-5357952536457207248] finished
[2015-01-19 13:50:49,980] Repair session 1afcd070-9f9f-11e4-8b56-a322c40b8b81 for range (-2871651679602006497,-2860883420245139806] finished
[2015-01-19 13:50:49,980] Repair session 1b99acb0-9f9f-11e4-8b56-a322c40b8b81 for range (-394095345040964045,-391878264832686281] finished
[2015-01-19 13:50:49,981] Repair session 1c352960-9f9f-11e4-8b56-a322c40b8b81 for range (8830377476646048271,8848086816619852308] finished
[2015-01-19 13:50:49,981] Repair session 1cd1de90-9f9f-11e4-8b56-a322c40b8b81 for range (4538653889569069241,4549572313549299652] finished
[2015-01-19 13:50:49,985] Repair session 1d6ebad0-9f9f-11e4-8b56-a322c40b8b81 for range (6052068628404624993,6058413940102734921] finished
[2015-01-19 13:50:49,986] Repair command #1 finished

jason@localhost:~$ cqlsh 192.168.0.2 9042
Warning: schema version mismatch detected; check the schema versions of your nodes in system.local and system.peers.
Connected to just4fun at 192.168.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> desc keyspaces;

system_traces  jw_schema1  system

cqlsh> use jw_schema1;
cqlsh:jw_schema1> desc tables;

user_profiles

cqlsh:system> select host_id,schema_version from system.peers;

 host_id                              | schema_version
--------------------------------------+--------------------------------------
 d21e3d11-5bfb-4888-97cd-62af90e83f56 | b5291c1d-6635-3627-928f-f5a0f0c27ec1
 d21e3d11-5bfb-4888-97cd-62af90e83f56 | c7a2ebda-89f7-36f0-a735-a0dffc400124
 69bd2306-c919-411b-83f3-341b4f7f54b4 | f6f3835e-ed12-34f4-9f4b-f2a72bb57c30
 e1444216-4412-45d5-9703-a463ee50aec2 | f6f3835e-ed12-34f4-9f4b-f2a72bb57c30

(4 rows)

cqlsh:system> select host_id,schema_version from system.local;

 host_id                              | schema_version
--------------------------------------+--------------------------------------
 d21e3d11-5bfb-4888-97cd-62af90e83f56 | f6f3835e-ed12-34f4-9f4b-f2a72bb57c30

(1 rows)

On Mon, Jan 19, 2015 at 12:55 PM, Jason Wee peich...@gmail.com wrote: The log does not show anything fishy. Because it is a just-for-fun cluster, we can actually wipe the cassandra dirs (data, saved_caches, commitlog) on our 3-node cluster and start all over, and we encounter the same problem. Two nodes are running cassandra 2.1.2 and one is running cassandra 2.1.1. I took a look at the issue linked by Tyler, patched my cqlsh, and have given more information below; thank you, it works. I am actually doing the tutorial from this blog http://www.datastax.com/dev/blog/thrift-to-cql3

$ cqlsh 192.168.0.2 9042
Warning: schema version mismatch detected; check the schema versions of your nodes in system.local and system.peers.
Connected to just4fun at 192.168.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> DESCRIBE KEYSPACES;

system_traces  jw_schema1  system

cqlsh> use jw_schema1;
cqlsh:jw_schema1> desc tables;

user_profiles

cqlsh:jw_schema1> quit;

$ cassandra-cli -h 192.168.0.2 -p 9160
Connected to: just4fun on 192.168.0.2/9160
Welcome to Cassandra CLI version 2.1.1

The CLI is deprecated and will be removed in Cassandra 3.0. Consider migrating to cqlsh.
CQL is fully backwards compatible with Thrift data; see http://www.datastax.com/dev/blog/thrift-to-cql3
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] show keyspaces;
WARNING: CQL3 tables are intentionally omitted from 'show keyspaces' output.
See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.

Keyspace: jw_schema1:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:3]
  Column Families:
    ColumnFamily: user_profiles
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.0
      DC Local Read repair chance: 0.1
      Caching: KEYS_ONLY
      Default time to live: 0
      Bloom Filter FP chance: 0.01
      Index interval: default
      Speculative Retry: 99.0PERCENTILE
      Built indexes: []
      Column Metadata:
        Column Name: first_name
          Validation Class: org.apache.cassandra.db.marshal.UTF8Type
        Column Name: year_of_birth
          Validation Class: org.apache.cassandra.db.marshal.Int32Type
        Column Name: last_name
          Validation Class
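A quick way to see the disagreement in the system.peers / system.local output above is to group hosts by schema_version: more than one group means the cluster disagrees on schema. This is an illustrative helper, not part of any Cassandra tooling (host ids are abbreviated in the usage note below).

```python
from collections import defaultdict

def schema_versions(rows):
    """Group hosts by schema_version.

    rows: iterable of (host_id, schema_version) tuples, e.g. the result of
    'select host_id, schema_version from system.peers' combined with the
    single row from system.local. More than one key in the returned dict
    means the cluster has a schema disagreement, which is exactly what the
    cqlsh warning above is reporting.
    """
    by_version = defaultdict(set)
    for host_id, version in rows:
        by_version[version].add(host_id)
    return dict(by_version)
```

Feeding it the five (host_id, schema_version) pairs from the output above yields three distinct schema versions, confirming the disagreement.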
keyspace not exists?
$ cqlsh 192.168.0.2 9042
Connected to just4fun at 192.168.0.2:9042.
[cqlsh 5.0.1 | Cassandra 2.1.1 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> DESCRIBE KEYSPACES

empty

cqlsh> create keyspace foobar with replication = {'class':'SimpleStrategy', 'replication_factor':3};
errors={}, last_host=192.168.0.2
cqlsh> DESCRIBE KEYSPACES;

empty

cqlsh> use foobar;
cqlsh:foobar> DESCRIBE TABLES;
Keyspace 'foobar' not found.

Just trying cassandra 2.1 and encountered the above error; can anyone explain why this happens and where to even begin troubleshooting? Jason
Re: diff cassandra.yaml 1.2 -- 2.1
https://issues.apache.org/jira/browse/CASSANDRA-3534 On Mon, Dec 29, 2014 at 6:58 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi guys, I am looking at the options added and dropped in Cassandra between 1.2.18 and 2.0.11 and this makes me wonder: why has the index_interval option been removed from cassandra.yaml? I know we can also define it on a per table basis, yet this global option was quite useful to tune memory usage. I also know that this index is now kept off-heap, but I can not see when and why this option was removed, any pointer? Also, it seems this option is still usable even if not present by default in cassandra.yaml, but it is marked as deprecated ( https://github.com/apache/cassandra/blob/cassandra-2.0.11/src/java/org/apache/cassandra/config/Config.java#L165). Is this option deprecated on the table schema definition too? Same kind of questions around the heap emergency pressure valve: flush_largest_memtables_at, reduce_cache_sizes_at and reduce_cache_capacity_to, except that those params seem to have been dropped directly. Why? Is there no more need for them, or has some other mechanism replaced them, improving things? Hope this wasn't already discussed; I was unable to find information about it anyway. C*heers !
Re: diff cassandra.yaml 1.2 -- 2.1
What you are asking may be answered at the code level and is pretty deep stuff, at least from a user's (like me) point of view. But to quote Jonathan in CASSANDRA-3534, "Then you will be able to say use X amount of memory for memtables, Y amount for the cache (and monitor Z amount for the bloom filters)", which explains why the old pressure valve code is obsolete and was removed. There is also another related issue you might find worth reading: https://issues.apache.org/jira/browse/CASSANDRA-3143 If I may ask, are you doing a cassandra upgrade from 1.2 to 2.1? Jason On Mon, Dec 29, 2014 at 10:54 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: Thanks for the pointer Jason. Yet, I thought that caches and memtables went off-heap only in version 2.1 and not 2.0 ("As of Cassandra 2.0, there are two major pieces of the storage engine that still depend on the JVM heap: memtables and the key cache." -- http://www.datastax.com/dev/blog/off-heap-memtables-in-cassandra-2-1). So this clean up makes sense to me, but for the new 2.1 version of Cassandra. I also read on the same blog that we might have the choice of on/off heap for memtables (or more precisely just get memtable buffers off-heap). If this is true, flush_largest_memtables_at still makes sense. About caches, isn't the key cache still in the heap, even in 2.1? The removal of these options looks a bit radical and premature to me. I guess I am missing something in my reasoning but can't figure out what exactly. C*heers, Alain 2014-12-29 14:52 GMT+01:00 Jason Wee peich...@gmail.com: https://issues.apache.org/jira/browse/CASSANDRA-3534 On Mon, Dec 29, 2014 at 6:58 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi guys, I am looking at the options added and dropped in Cassandra between 1.2.18 and 2.0.11 and this makes me wonder: why has the index_interval option been removed from cassandra.yaml? I know we can also define it on a per table basis, yet this global option was quite useful to tune memory usage.
I also know that this index is now kept off-heap, but I can not see when and why this option has been removed, any pointer ? Also it seems this option still usable even if not present by default on cassandra.yaml, but it is marked as deprecated ( https://github.com/apache/cassandra/blob/cassandra-2.0.11/src/java/org/apache/cassandra/config/Config.java#L165). Is this option deprecated on the table schema definition too ? Same kind of questions around the heap emergency pressure valve -- flush_largest_memtables_at, reduce_cache_sizes_at and reduce_cache_capacity_to, except that those params seems to have been dropped directly. Why, is there no more need of it, has some other mechanism replaced it, improving things ? Hope this wasn't already discussed,I was unable to find information about it anyway. C*heers !
Re: Keyspace and table/cf limits
+1 well said Jack! On Sun, Dec 7, 2014 at 6:13 AM, Jack Krupansky j...@basetechnology.com wrote: Generally, limit a Cassandra cluster to the low hundreds of tables, regardless of the number of keyspaces. Beyond the low hundreds is certainly an “expert” feature and requires great care. Sure, maybe you can have 500 or 750 or maybe even 1,000 tables in a cluster, but don’t be surprised if you start running into memory and performance issues. There is an undocumented method to reduce the table overhead to support more tables, but... if you are not expert enough to find it on your own, then you are definitely not expert enough to be using it. -- Jack Krupansky *From:* Raj N raj.cassan...@gmail.com *Sent:* Tuesday, November 25, 2014 12:07 PM *To:* user@cassandra.apache.org *Subject:* Keyspace and table/cf limits What's the latest on the maximum number of keyspaces and/or tables that one can have in Cassandra 2.1.x? -Raj
Re: Using Cassandra for session tokens
Hi Phil, just my 2 cents: watch out for issues like the counter type (replicate on write), compaction (when node load grows huge) and cassandra instance gc. These issues exist in 1.x; perhaps they have been resolved in 2.x. hth jason On Mon, Dec 1, 2014 at 10:44 PM, Matt Brown m...@mattnworb.com wrote: This sounds like a good use case for http://www.datastax.com/dev/blog/datetieredcompactionstrategy On Dec 1, 2014, at 3:07 AM, Phil Wise p...@advancedtelematic.com wrote: We're considering switching from Redis to Cassandra to store short lived (~1 hour) session tokens, in order to reduce the number of data storage engines we have to manage. Can anyone foresee any problems with the following approach:

1) Use the TTL functionality in Cassandra to remove old tokens.

2) Store the tokens in a table like:

CREATE TABLE tokens (
    id uuid,
    username text,
    // (other session information)
    PRIMARY KEY (id)
);

3) Perform ~100 writes/sec like:

INSERT INTO tokens (id, username) VALUES (468e0d69-1ebe-4477-8565-00a4cb6fa9f2, 'bob') USING TTL 3600;

4) Perform ~1000 reads/sec like:

SELECT * FROM tokens WHERE ID=468e0d69-1ebe-4477-8565-00a4cb6fa9f2;

The tokens will be about 100 bytes each, and we will grant 100 per second on a small 3 node cluster. Therefore there will be about 360k tokens alive at any time, with a total size of 36MB before database overhead. My biggest worry at the moment is that this kind of workload will stress compaction in an unusual way. Are there any metrics I should keep an eye on to make sure it is working fine? I read over the following links, but they mostly talk about DELETE-ing and tombstones. Am I right in thinking that as soon as a node performs a compaction then the rows with an expired TTL will be thrown away, regardless of gc_grace_seconds? https://issues.apache.org/jira/browse/CASSANDRA-7534 http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets https://issues.apache.org/jira/browse/CASSANDRA-6654 Thank you Phil
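Phil's sizing estimate checks out; a quick back-of-the-envelope script with the numbers from his post:

```python
# Back-of-the-envelope check of the numbers in the post above:
# ~100 writes/sec, 1 hour TTL, ~100 bytes per token.
writes_per_sec = 100
ttl_seconds = 3600
token_bytes = 100

# A token written now stays live for ttl_seconds, so the steady-state
# count of live tokens is rate x lifetime.
live_tokens = writes_per_sec * ttl_seconds

# Raw payload size before sstable/replication overhead, in MB.
raw_size_mb = live_tokens * token_bytes / 1e6

print(live_tokens, raw_size_mb)  # 360000 tokens, 36.0 MB
```

This matches the "about 360k tokens alive at any time, with a total size of 36MB" figure in the message; replication factor and sstable overhead would multiply the on-disk footprint.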
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
ack and many thanks for the tips and help.. jason On Wed, Dec 3, 2014 at 4:49 AM, Robert Coli rc...@eventbrite.com wrote: On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote: Hi Rob, any recommended documentation on describing explanation/configuration of the JVM heap and permanent generation ? We stucked in this same situation too. :( The archives of this list are chock full of explorations of various cases. Your best bet is to look for a good Aaron Morton reference where he breaks down the math between generations. I swear there was a blog post of his on this subject, but the best I can find is this slidedeck : http://www.slideshare.net/aaronmorton/cassandra-tk-2014-large-nodes =Rob
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
Hi Rob, any recommended documentation describing the explanation/configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :( Jason On Tue, Dec 2, 2014 at 3:42 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, Nov 28, 2014 at 12:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote: We restart the whole cluster every 1 or 2 months, to avoid machines getting into this crazy state. We tried tuning GC size and parameters, and different cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again. ... You can clearly notice some memory is actually reclaimed during GC in healthy nodes, while in sick machines very little memory is reclaimed. Also, since GC is executed more frequently in sick machines, it uses about 2x more CPU than non-sick nodes. Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue? The specific combo of symptoms does in fact sound like a combination of being close to heap exhaustion with working set and then fragmentation putting you over the top. I would probably start by increasing your heap, which will help avoid the pre-fail condition from your working set. But for tuning, examine the contents of each generation when the JVM gets into this state. You are probably exhausting permanent generation, but depending on what that says, you could change the relative sizing of the generations. =Rob
Re: open source cassandra and hadoop
There are two examples of hadoop with cassandra in the example code: https://github.com/apache/cassandra/tree/trunk/examples/hadoop_word_count https://github.com/apache/cassandra/tree/trunk/examples/hadoop_cql3_word_count Do these help? Jason On Sat, Nov 29, 2014 at 2:30 AM, Tim Dunphy bluethu...@gmail.com wrote: Hey all, I have a 3 node Cassandra cluster I would like to hook into hadoop for processing the information in the Cassandra DB. I know that the Datastax version of Cassandra includes support for Hadoop right out of the box. But I've been googling around and I don't see any good information on how to do this. The Cassandra wiki does mention that there is a way to do this: http://wiki.apache.org/cassandra/HadoopSupport But the information is old. It only covers version 0.7. And there's still not a lot of information to go on in that wiki page. So I was wondering if anyone has ever heard of someone connecting a recent version of the community edition of Cassandra to Hadoop. And does anybody know of a guide I can use to do this? Thanks Tim -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: Question: How to monitor the QPS in Cassandra local node or cluster
Hello, have you tried getting statistics from JMX? On Thu, Nov 20, 2014 at 10:55 AM, luolee.me luolee...@gmail.com wrote: Hi everyone, I want to monitor the Cassandra cluster using Zabbix, but I have no idea how to monitor the QPS on a local Cassandra node. I searched the internet but haven't found any result about how to get the QPS. Anyone have any idea? Thanks!
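Cassandra's read/write counts (exposed over JMX and visible via nodetool) are monotonically increasing counters, so QPS can be derived as the delta between two samples divided by the sampling interval, which is how a Zabbix item would typically compute it. A minimal sketch of that calculation (the counter values are made up for illustration):

```python
def qps(prev_count, curr_count, interval_s):
    """Ops/sec from two samples of a monotonically increasing operation
    counter, e.g. a read or write count attribute scraped over JMX.
    Zabbix can do the same with a 'delta (speed per second)' item."""
    return (curr_count - prev_count) / interval_s
```

For example, two samples of a read counter taken 60 seconds apart, 1000 then 1600 operations, give 10 reads/sec.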
What actually causing java.lang.OutOfMemoryError: unable to create new native thread
Hello people, below is an extraction from the cassandra system log.

ERROR [Thread-273] 2012-04-10 16:33:18,328 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-273,5,main]
java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:640)
	at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
	at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:104)
	at org.apache.cassandra.thrift.CassandraDaemon$ThriftServer.run(CassandraDaemon.java:214)

I investigated the call down to the JVM native code, http://hg.openjdk.java.net/jdk7/jdk7/hotspot/file/tip/src/share/vm/prims/jvm.cpp#l2698

    if (native_thread->osthread() == NULL) {
      // No one should hold a reference to the 'native_thread'.
      delete native_thread;
      if (JvmtiExport::should_post_resource_exhausted()) {
        JvmtiExport::post_resource_exhausted(
          JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR | JVMTI_RESOURCE_EXHAUSTED_THREADS,
          "unable to create new native thread");
      }
      THROW_MSG(vmSymbols::java_lang_OutOfMemoryError(),
                "unable to create new native thread");
    }

Question: is that out of memory error due to native OS memory or the Java heap? The stack size passed to the JVM is -Xss128k. Operating system limits: max user processes 26, open files capped at 65536. Can any java/cpp expert pin point what JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR and JVMTI_RESOURCE_EXHAUSTED_THREADS mean too? Thank you. Jason
Re: What actually causing java.lang.OutOfMemoryError: unable to create new native thread
Hi, thank you for the response. Using 64bit and kernel 2.6.32-358.18.1.el6.x86_64.

# cat /proc/13405/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             26                   26                   processes
Max open files            65536                65536                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       255762               255762               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

Are the stack size and/or max open files sufficient? The rest of the limits are pretty much unlimited. Jason

On Tue, Nov 11, 2014 at 4:09 AM, Chris Lohfink clohfin...@gmail.com wrote: If you're using 64 bit, check the output of: cat /proc/{cassandra pid}/limits Some older linux kernels won't work with the above, so if it doesn't exist check the ulimit -a output for the cassandra user. Max processes per user may be your issue as well. --- Chris Lohfink On Mon, Nov 10, 2014 at 11:21 AM, graham sanderson gra...@vast.com wrote: First question: are you running 32bit or 64bit… on 32bit you can easily run out of virtual address space for thread stacks. On Nov 10, 2014, at 8:25 AM, Jason Wee peich...@gmail.com wrote: Hello people, below is an extraction from cassandra system log.
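The limits output in this thread points at the likely cause: a per-user process/thread cap of 26, not Java heap exhaustion, since "unable to create new native thread" means the kernel refused to create the thread. On Linux the same limit can be read programmatically; a minimal sketch using Python's stdlib resource module (Unix-only):

```python
import resource

# RLIMIT_NPROC caps the number of processes *and* threads a user may have.
# A soft limit as low as 26 makes the JVM's pthread_create fail with
# "unable to create new native thread" long before the heap is exhausted.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("max user processes/threads: soft=%s hard=%s" % (soft, hard))
```

A value of resource.RLIM_INFINITY (-1) means unlimited; anything in the low hundreds is suspect for a Cassandra node, which routinely runs hundreds of threads.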
Re: Hector latency related configuration
Hi, what version of Hector are you using? Probably start with a different consistency level? Are your cluster nodes under memory pressure (you can check in the cassandra system log)? What is the average load per node currently? Also check concurrent_writes in cassandra.yaml to see if you can increase it. You can also use nodetool cfstats to read the write latency. Thanks. Jason On Mon, Oct 27, 2014 at 8:45 PM, Or Sher or.sh...@gmail.com wrote: Hi all, We're using Hector in one of our older use cases with C* 1.0.9. We suspect it increases our total round trip write latency to Cassandra. C* metrics show low latency so we assume the problem is somewhere else. What are the configuration parameters you would recommend to investigate/change in order to decrease latency? -- Or Sher
Re: Why is the cassandra documentation such poor quality?
I agree with the people here already sharing their ways to access documentation. If you are a starter, you should spend time searching for documentation (like using google) or hours reading, and then start asking specific questions. Coming here to complain about the poor quality of the documentation just does not cut it. If you find documentation that is outdated, you can email the people in charge and tell them what is wrong and what you think would improve it. Some documentation is left there so that we can read and understand the history of where it came from, and some people may still use old versions of cassandra. On Wed, Jul 23, 2014 at 7:49 PM, Jack Krupansky j...@basetechnology.com wrote: And the simplest and easiest thing to do is simply email this list when you see something wrong or missing in the DataStax Cassandra doc, or for anything that is not adequately covered anywhere. I work with the doc people there, so I can make sure they see corrections and improvements. And simply sharing knowledge on this list is always a big step forward. -- Jack Krupansky *From:* spa...@gmail.com *Sent:* Wednesday, July 23, 2014 4:25 AM *To:* user@cassandra.apache.org *Subject:* Re: Why is the cassandra documentation such poor quality? I would like to help out with the documentation of C*. How do I start? On Wed, Jul 23, 2014 at 12:46 PM, Robert Stupp sn...@snazy.de wrote: Just a note: If you have suggestions how to improve documentation on the datastax website, write them an email to d...@datastax.com. They appreciate proposals :) Am 23.07.2014 um 09:10 schrieb Mark Reddy mark.re...@boxever.com: Hi Kevin, The difference here is that the Apache Cassandra site is maintained by the community whereas the DataStax site is maintained by paid employees with a vested interest in producing documentation. With DataStax having some comprehensive docs, I guess the desire for people to maintain the Apache site has dwindled.
However, if you are interested in contributing to it and bringing it back up to standard you can, thus is the freedom of open source. Mark On Wed, Jul 23, 2014 at 2:54 AM, Kevin Burton bur...@spinn3r.com wrote: This document: https://wiki.apache.org/cassandra/Operations … for example. Is extremely out dated… does NOT reflect 2.x releases certainly. Mentions commands that are long since removed/deprecated. Instead of giving bad documentation, maybe remove this and mark it as obsolete. The datastax documentation… is … acceptable I guess. My main criticism there is that a lot of it it is in their blog. Kevin -- Founder/CEO Spinn3r.com http://spinn3r.com/ Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com/ -- http://spawgi.wordpress.com We can do it and do it better.
Re: Unable to complete request: one or more nodes were unavailable.
hmm.. I got similar output to yours yesterday when trying to truncate a table in a 3 node cluster where one of the nodes went offline. The alternative I have is, instead of truncate, to just drop the table and recreate it. Using cassandra version 2.0.6, by the way. On Wed, Apr 16, 2014 at 3:52 AM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I am trying Cassandra light weight transaction support with Cassandra 2.0.4

cqlsh:twitter> create table user(user_id text primary key, namef text);
cqlsh:twitter> insert into user(user_id,namef) values('v','ff') if not exists;
*Unable to complete request: one or more nodes were unavailable.*

Any suggestions? -Vivek
Re: Long GC due to promotion failures
SSTable count: 365. Your sstable count is too high... I don't know what the best count should be, but in my experience anything below 20 is good. Is your compaction running? I read a few blogs on how to read cfhistograms, but never fully understood it. Anyone care to explain using the OP's attached cfhistograms? Taking a wild shot: perhaps try a different build, oracle jdk 1.6u25 perhaps? HTH Jason On Tue, Jan 21, 2014 at 4:02 PM, John Watson j...@disqus.com wrote: Pretty reliably, at some point, nodes will have super long GCs. Followed by https://issues.apache.org/jira/browse/CASSANDRA-6592 Lovely log messages:

9030.798: [ParNew (0: promotion failure size = 4194306) (2: promotion failure size = 4194306) (4: promotion failure size = 4194306) (promotion failed)
Total time for which application threads were stopped: 23.2659990 seconds

Full gc.log until just before restarting the node (see another 32s GC near the end): https://gist.github.com/dctrwatson/f04896c215fa2418b1d9 Here's a graph of GC time, where we can see an increase 30 minutes prior (an indicator that the issue will happen soon): http://dl.dropboxusercontent.com/s/q4dr7dle023w9ih/render.png Graph of various Heap usage: http://dl.dropboxusercontent.com/s/e8kd8go25ihbmkl/download.png Running compactions in the same time frame: http://dl.dropboxusercontent.com/s/li9tggk4r2l3u4b/render%20(1).png CPU, IO, ops and latencies: https://dl.dropboxusercontent.com/s/yh9osm9urplikb7/2014-01-20%20at%2011.46%20PM%202x.png cfhistograms/cfstats: https://gist.github.com/dctrwatson/9a08b38d0258ae434b15

Cassandra 1.2.13
Oracle JDK 1.6u45
JVM opts: MAX_HEAP_SIZE=8G HEAP_NEW_SIZE=1536M
Tried HEAP_NEW_SIZE of 768M, 800M, 1000M and 1600M
Tried default -XX:SurvivorRatio=8 and -XX:SurvivorRatio=4
Tried default -XX:MaxTenuringThreshold=1 and -XX:MaxTenuringThreshold=2

All still eventually ran into long GC.
Hardware for all 3 nodes: (2) E5520 @ 2.27Ghz (8 cores w/ HT) [16 cores] (6) 4GB RAM [24G RAM] (1) 500GB 7.2k for commitlog (2) 400G SSD for data (configured as separate data directories)
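For the JVM opts John lists above, the survivor-space arithmetic is worth spelling out, since the ParNew promotion failures happen when objects can't move out of the young generation: with -XX:SurvivorRatio=N, HotSpot splits the young generation into N parts eden plus two survivor spaces. A quick sketch of that calculation:

```python
def survivor_space_mb(new_gen_mb, survivor_ratio):
    # With -XX:SurvivorRatio=N the young generation is divided into
    # N parts eden plus 2 survivor spaces, so each survivor space
    # receives new_gen / (N + 2) of the memory.
    return new_gen_mb / (survivor_ratio + 2)

# HEAP_NEW_SIZE=1536M with the default -XX:SurvivorRatio=8
print(survivor_space_mb(1536, 8))
```

So with the settings above each survivor space is about 154 MB; dropping SurvivorRatio to 4, as tried in the thread, grows it to 256 MB, giving large short-lived allocations (like the 4 MB promotion-failure sizes in the log) more room before they must be promoted.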
Re: Exception in thread main java.lang.NoClassDefFoundError
NoClassDefFoundError: org/apache/cassandra/service/CassandraDaemon states very clearly that the class is not found in the classpath. Obviously you are not using the cassandra package for your distribution, so you need to find which jar contains this class and check whether that jar is included in your classpath. Jason On Tue, Jan 21, 2014 at 9:23 AM, Le Xu sharonx...@gmail.com wrote: Hello! I got this error while trying out Cassandra 1.2.13. The error message looks like:

Exception in thread main java.lang.NoClassDefFoundError: org/apache/cassandra/service/CassandraDaemon
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.service.CassandraDaemon
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: org.apache.cassandra.service.CassandraDaemon. Program will exit.

I checked JAVA_HOME and CASSANDRA_HOME and they are both set, but I still got the error.
However, based on Brian's reply in this thread: http://mail-archives.apache.org/mod_mbox/cassandra-user/201307.mbox/%3CCAJHHpg3Lf9tyxwgZNEN3cKH=p9xwms0w4rzqbpt8oriaq9r...@mail.gmail.com%3E I followed the step and printed out the $CLASSPATH variable and got : /home/lexu1/scale/apache-cassandra-1.2.13-src//conf:/home/lexu1/scale/apache-cassandra-1.2.13-src//build/classes/main:/home/lexu1/scale/apache-cassandra-1.2.13-src//build/classes/thrift:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/antlr-3.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/avro-1.4.0-fixes.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/avro-1.4.0-sources-fixes.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/commons-cli-1.1.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/commons-codec-1.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/commons-lang-2.6.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/compress-lzf-0.8.4.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/concurrentlinkedhashmap-lru-1.3.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/guava-13.0.1.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/high-scale-lib-1.1.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/jackson-core-asl-1.9.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/jackson-mapper-asl-1.9.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/jamm-0.2.5.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/jbcrypt-0.3m.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/jline-1.0.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/json-simple-1.1.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/libthrift-0.7.0.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/log4j-1.2.16.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/lz4-1.1.0.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/metrics-core-2.2.0.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/netty-3.6.6.Final.jar:/home/lexu1/scale/apache-cassandra-1.2
.13-src//lib/servlet-api-2.5-20081211.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/slf4j-api-1.7.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/slf4j-log4j12-1.7.2.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/snakeyaml-1.6.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/snappy-java-1.0.5.jar:/home/lexu1/scale/apache-cassandra-1.2.13-src//lib/snaptree-0.1.jar It includes apache-cassandra-1.2.13-src//build/classes/thrift but not service. Does the location of CassandraDaemon seem to be the problem? If it is, how do I fix it? Thanks! Le
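To find out which classpath entry (if any) actually provides CassandraDaemon, you can scan each jar's entry list for the class file. This is an illustrative helper, not part of Cassandra's tooling; the jar paths in a real run would be whatever your $CLASSPATH contains:

```python
import zipfile

def jar_contains(jar_path, class_name):
    """Return True if the jar at jar_path contains the given class,
    expressed as a dotted name, e.g.
    'org.apache.cassandra.service.CassandraDaemon'."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jf:
        return entry in jf.namelist()
```

Looping this over every `.jar` on the classpath (and checking directories like `build/classes/main` for the `.class` file directly) pinpoints whether the class is genuinely missing or the classpath is simply incomplete.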
Re: ./cqlsh not working
It just means it cannot find the required library. Why don't you install the cassandra package for your distribution? http://rpm.datastax.com/community/noarch/cassandra12-1.2.13-1.noarch.rpm http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/install/installRHEL_t.html?pagename=docsversion=1.2file=install/install_rpm That should save you a lot of trouble. Jason On Thu, Jan 23, 2014 at 12:02 PM, Chamila Wijayarathna cdwijayarat...@gmail.com wrote: Hi all, I downloaded the 1.2.13 version and ran ./cqlsh inside the bin folder, but it says bash: ./cqlsh: Permission denied; when I ran it with sudo it says Command not found. When I ran chmod u+x cqlsh and then tried ./cqlsh, it now says Can't locate transport factory function cqlshlib.tfactory.regular_transport_factory. What is the problem here? Thank You. -- *Chamila Dilshan Wijayarathna,* SMIEEE, SMIESL, Undergraduate, Department of Computer Science and Engineering, University of Moratuwa.
Re: using cssandra cql with php
Hi, the operating system should not matter, right? You just need to download a Cassandra client and use it to access a Cassandra node. For PHP, see http://wiki.apache.org/cassandra/ClientOptions; perhaps you can package the Cassandra PDO driver into an RPM? Jason On Mon, Jan 13, 2014 at 3:02 PM, Tim Dunphy bluethu...@gmail.com wrote: Hey all, I'd like to be able to make calls to the cassandra database using PHP. I've taken a look around but I've only found solutions out there for Ubuntu and other distros. But my environment is CentOS. Are there any packages out there I can install that would allow me to use CQL in my PHP code? Thanks Tim -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: Can't start service with error: java.lang.IllegalStateException: Unable to contact any seeds
Hi, did you configure the IP addresses in the seeds: setting in cassandra.yaml? Jason On Fri, Jan 10, 2014 at 1:20 PM, Francisco Dalla Rosa Soares dallar...@gmail.com wrote: Hello everyone, I've tried to google all I could and also asked at ServerFault first, but as I got no answer I decided to come to the list. I have 6 machines that I want to use to make a cluster using Cassandra 2.0. Cassandra starts on the machines; however, it dies after a while with the error java.lang.IllegalStateException: Unable to contact any seeds! I've set the hostname in cassandra-env.sh and set the ip address of the machine in cassandra.yaml (rpc_address, listen_address) but still nothing. Weirdly enough, I have one machine that seems to stay alive; however, I can't connect (not even with telnet) to the service or to port 7000. There's no firewall up. For the logs and environment details please check: http://serverfault.com/questions/566044/cassandra-cant-start-service-with-error-java-lang-illegalstateexception-unab
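For reference, the relevant section of cassandra.yaml looks roughly like this (the addresses are placeholders; every node should list the same reachable seed IPs, and listen_address must be set on each node):

```yaml
# cassandra.yaml -- seed addresses below are illustrative placeholders
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"
```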
Re: massive spikes in read latency
/** * Verbs it's okay to drop if the request has been queued longer than the request timeout. These * all correspond to client requests or something triggered by them; we don't want to * drop internal messages like bootstrap or repair notifications. */ public static final EnumSet<Verb> DROPPABLE_VERBS = EnumSet.of(Verb.BINARY, Verb._TRACE, Verb.MUTATION, Verb.READ_REPAIR, Verb.READ, Verb.RANGE_SLICE, Verb.PAGED_RANGE, Verb.REQUEST_RESPONSE); The short-term solution would probably be to increase the timeout in your yaml file, but I suggest you get the monitoring graphs (internode ping, block I/O) ready so they give a better indication of what the exact problem might be. Jason On Tue, Jan 7, 2014 at 2:30 AM, Blake Eggleston bl...@shift.com wrote: That’s a good point. CPU steal time is very low, but I haven’t observed internode ping times during one of the peaks, I’ll have to check that out. Another thing I’ve noticed is that cassandra starts dropping read messages during the spikes, as reported by tpstats. This indicates that there are too many queries for cassandra to handle. However, as I mentioned earlier, the spikes aren’t correlated to an increase in reads. On Jan 5, 2014, at 3:28 PM, Blake Eggleston bl...@shift.com wrote: Hi, I’ve been having a problem with 3 neighboring nodes in our cluster having their read latencies jump up to 9000ms - 18000ms for a few minutes (as reported by opscenter), then come back down. We’re running a 6 node cluster, on AWS hi1.4xlarge instances, with cassandra reading and writing to 2 raided ssds. I’ve added 2 nodes to the struggling part of the cluster, and aside from the latency spikes shifting onto the new nodes, it has had no effect. I suspect that a single key that lives on the first stressed node may be being read from heavily. The spikes in latency don’t seem to be correlated to an increase in reads.
The cluster is usually handling a maximum workload of 4200 reads/sec per node, with writes being significantly less, at ~200/sec per node. Usually it will be fine with this, with read latencies at around 3.5-10 ms/read, but once or twice an hour the latencies on the 3 nodes will shoot through the roof. The disks aren’t showing serious use, with read and write rates on the ssd volume at around 1350 kBps and 3218 kBps, respectively. Each cassandra process is maintaining 1000-1100 open connections. GC logs aren’t showing any serious gc pauses. Any ideas on what might be causing this? Thanks, Blake
Re: massive spikes in read latency
Hi, could it be due to having noisy neighbour? Do you have graphs statistics ping between nodes? Jason On Mon, Jan 6, 2014 at 7:28 AM, Blake Eggleston bl...@shift.com wrote: Hi, I’ve been having a problem with 3 neighboring nodes in our cluster having their read latencies jump up to 9000ms - 18000ms for a few minutes (as reported by opscenter), then come back down. We’re running a 6 node cluster, on AWS hi1.4xlarge instances, with cassandra reading and writing to 2 raided ssds. I’ve added 2 nodes to the struggling part of the cluster, and aside from the latency spikes shifting onto the new nodes, it has had no effect. I suspect that a single key that lives on the first stressed node may be being read from heavily. The spikes in latency don’t seem to be correlated to an increase in reads. The cluster’s workload is usually handling a maximum workload of 4200 reads/sec per node, with writes being significantly less, at ~200/sec per node. Usually it will be fine with this, with read latencies at around 3.5-10 ms/read, but once or twice an hour the latencies on the 3 nodes will shoot through the roof. The disks aren’t showing serious use, with read and write rates on the ssd volume at around 1350 kBps and 3218 kBps, respectively. Each cassandra process is maintaining 1000-1100 open connections. GC logs aren’t showing any serious gc pauses. Any ideas on what might be causing this? Thanks, Blake
Re: offheap component
Solely by the Cassandra version. Are you asking about a particular component not mentioned above? Not exactly. From version to version, a Cassandra component may move between on-heap and off-heap, so is there a way to determine, for a given Cassandra version, which components reside on heap and which off heap? On Wed, Jan 1, 2014 at 1:25 AM, Tyler Hobbs ty...@datastax.com wrote: On Tue, Dec 31, 2013 at 7:35 AM, Jason Wee peich...@gmail.com wrote: In Cassandra 1.2 and later, the Bloom filter and compression offset map that store this metadata reside off-heap, greatly increasing the capacity per node of data that Cassandra can handle efficiently. In Cassandra 2.0, the partition summary also resides off-heap. How do we determine if a cassandra component is on heap or offheap? Solely by the Cassandra version. Are you asking about a particular component not mentioned above? By off-heap, it means that the object is stored *not* in the allocated heap in jvm but in native memory? That's correct. -- Tyler Hobbs DataStax http://datastax.com/
Re: Replication Latency between cross data centers
Hi, how about streaming metrics https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/metrics/StreamingMetrics.java ? Jason On Tue, Dec 31, 2013 at 6:38 AM, Senthil, Athinanthny X. -ND athinanthny.x.senthil@disney.com wrote: I want to determine data replication latency between data centers. Is there any metrics that is available to capture it in JConsole or other ways?
offheap component
Excerpts from http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/operations/ops_tune_jvm_c.html In Cassandra 1.2 and later, the Bloom filter and compression offset map that store this metadata reside off-heap, greatly increasing the capacity per node of data that Cassandra can handle efficiently. In Cassandra 2.0, the partition summary also resides off-heap. How do we determine if a cassandra component is on heap or off-heap? By off-heap, does it mean that the object is stored *not* in the JVM's allocated heap but in native memory? And is the off-heap context the same as the JVM's non-heap memory? http://www.yourkit.com/docs/kb/sizes.jsp Thank you. Jason
Re: org.apache.thrift.TApplicationException: get_range_slices failed: out of sequence response
Hi Aaron, thank you for the response and advice. Yes, the problem was due to the code using the wrong method calls; I spent some time reading the code and voila... problem found! =) thank you again. /Jason On Tue, Dec 24, 2013 at 4:16 AM, Aaron Morton aa...@thelastpickle.com wrote: AFAIK it means the response from the server is for a different request, probably something going wrong with the threading or trying to do async IO with thrift. I know it is ancient cassandra but just wanna learn it. Looking at anything other than 2.0 will be wasting your time. Cheers - Aaron Morton New Zealand @aaronmorton Co-Founder Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com On 19/12/2013, at 11:38 pm, Jason Wee peich...@gmail.com wrote: Hi, In regards to recv_get_range_slices(), my cassandra client code always throws new org.apache.thrift.TApplicationException(org.apache.thrift.TApplicationException.BAD_SEQUENCE_ID, get_range_slices failed: out of sequence response); Full source code here https://raw.github.com/apache/cassandra/cassandra-1.0.12/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java Can anybody give some clue as to why the exception is always thrown? My cassandra in use is version 0.7.10, and the code checking msg.seqid remained until version 1.0.12. I know it is ancient cassandra but just wanna learn it. I have checked the cassandra 1.1 branch; the checking condition if (msg.seqid != seqid_) { was removed. Thank you, and any clue or indication will be great. /Jason
org.apache.thrift.TApplicationException: get_range_slices failed: out of sequence response
Hi, In regards to recv_get_range_slices(), in my cassandra client code, it always throw new org.apache.thrift.TApplicationException(org.apache.thrift.TApplicationException.BAD_SEQUENCE_ID, get_range_slices failed: out of sequence response); Full source code here https://raw.github.com/apache/cassandra/cassandra-1.0.12/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java Can anybody give some clue as to why the exception is always thrown? My cassandra use is version 0.7.10 and the code checking msg.seqid remain until version 1.0.12. I know it is ancient cassandra but just wanna learn it. I have checked cassandra 1.1 branch, the checking condition if (msg.seqid != seqid_) { was removed. Thank you and any clue or indication will be great. /Jason
Re: Best way to measure write throughput...
Hello, you could probably also do it in your application: just sample over an interval of time, and that should give some indication of throughput. HTH /Jason On Thu, Dec 19, 2013 at 12:11 AM, Krishna Chaitanya bnsk1990r...@gmail.com wrote: Hello, Could you please suggest to me the best way to measure write throughput in Cassandra? I basically have an application that stores network packets to a Cassandra cluster. Which is the best way to measure write performance, especially write throughput, in terms of the number of packets stored into Cassandra per second or something similar? Can I measure this using nodetool? Thanks. -- Regards, BNSK
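The sampling idea can be sketched as follows: read a cumulative write counter twice (from the application itself, or from a JMX write-count attribute) and divide the difference by the elapsed time. A minimal sketch of the arithmetic, with made-up numbers:

```shell
# rate_per_sec: given two cumulative counts and the seconds elapsed
# between the samples, print the average operations per second.
rate_per_sec() {
    awk -v a="$1" -v b="$2" -v secs="$3" 'BEGIN { printf "%.1f\n", (b - a) / secs }'
}

rate_per_sec 1000 6000 10   # prints 500.0
```

The same subtraction works for packets stored, bytes written, or any other monotonically increasing counter.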
Re: help on backup muiltinode cluster
Hmm... Cassandra's fundamental features include fault tolerance, durability and replication. Just out of curiosity, why would you want to do backups? /Jason On Sat, Dec 7, 2013 at 3:31 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, Dec 6, 2013 at 6:41 AM, Amalrik Maia amal...@s1mbi0se.com.br wrote: hey guys, I'm trying to take backups of a multi-node cassandra and save them on S3. My idea is simply doing ssh to each server and using nodetool to create the snapshots, then pushing them to S3. https://github.com/synack/tablesnap So is this approach recommended? My concerns are about the inconsistencies this approach can lead to, since the snapshots are taken one by one and not in parallel. Should I worry about it, or does cassandra find a way to deal with inconsistencies when doing a restore? The backup is as consistent as your cluster is at any given moment, which is not necessarily consistent. Manual repair brings you closer to consistency, but only on data present when the repair started. =Rob
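The per-node approach described above can be sketched in dry-run form; the host names and snapshot tag are placeholders, and the loop only prints the commands it would run (nodetool snapshot -t sets the snapshot name):

```shell
# Dry-run sketch: print the snapshot command that would be run on each
# node. In a real backup you would execute these, then upload each
# node's snapshots/<tag> directories to S3 (or let tablesnap upload).
plan_backup() {
    tag="$1"; shift
    for host in "$@"; do
        echo "ssh $host nodetool snapshot -t $tag"
    done
}

plan_backup nightly node1 node2
```

Running the snapshots serially is what creates the node-to-node skew Rob mentions; even taken in parallel, the snapshots would only be as consistent as the cluster itself.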
Re: OOMs during high (read?) load in Cassandra 1.2.11
Hi, just taking a wild shot here, sorry if it does not help. Could it be thrown while reading an sstable? That is, try to find the configuration parameters for read operations and tune those settings down a little. Also check the chunk_length_kb. http://www.datastax.com/documentation/cql/3.1/webhelp/cql/cql_reference/cql_storage_options_c.html /Jason On Fri, Dec 6, 2013 at 6:01 PM, Klaus Brunner klaus.brun...@gmail.com wrote: We're getting fairly reproducible OOMs on a 2-node cluster using Cassandra 1.2.11, typically in situations with a heavy read load. A sample of some stack traces is at https://gist.github.com/KlausBrunner/7820902 - they're all failing somewhere down from table.getRow(), though I don't know if that's part of query processing, compaction, or something else. - The main CFs contain some 100k rows, none of them particularly wide. - Heap dumps invariably show a single huge byte array (~1.6 GiB, associated with the OOM'ing thread) hogging 80% of the Java heap. The array seems to contain all/many rows of one CF. - We're moderately certain there's no killer query with a huge result set involved here, but we can't see exactly what triggers this. - We've tried to switch to LeveledCompaction, to no avail. - Xms/Xmx is set to some 4 GB. - The logs show the usual signs of panic (flushing memtables) before actually OOMing. It seems that this scenario is often or even always after a compaction, but it's not quite conclusive. I'm somewhat worried that Cassandra would read so much data into a single contiguous byte[] at any point. Could this be related to compaction? Any ideas what we could do about this? Thanks Klaus
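On the chunk_length_kb point: Cassandra decompresses a whole chunk to serve a read, so an unusually large chunk size inflates per-read allocations. If that turns out to be relevant here, a smaller chunk can be set per table; the keyspace and table names below are placeholders:

```sql
-- Placeholder names; shrinks the compression chunk from the 64 KB
-- default to 4 KB for this one table.
ALTER TABLE my_keyspace.my_table
  WITH compression = {'sstable_compression': 'SnappyCompressor',
                      'chunk_length_kb': 4};
```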
Re: Write performance with 1.2.12
Hi srmore, Perhaps if you use jconsole and connect to the jvm using jmx. Then uner MBeans tab, start inspecting the GC metrics. /Jason On Fri, Dec 6, 2013 at 11:40 PM, srmore comom...@gmail.com wrote: On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak vicky@gmail.com wrote: Hard to say much without knowing about the cassandra configurations. The cassandra configuration is -Xms8G -Xmx8G -Xmn800m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=4 -XX:MaxTenuringThreshold=2 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly Yes compactions/GC's could skipe the CPU, I had similar behavior with my setup. Were you able to get around it ? -VK On Fri, Dec 6, 2013 at 7:40 PM, srmore comom...@gmail.com wrote: We have a 3 node cluster running cassandra 1.2.12, they are pretty big machines 64G ram with 16 cores, cassandra heap is 8G. The interesting observation is that, when I send traffic to one node its performance is 2x more than when I send traffic to all the nodes. We ran 1.0.11 on the same box and we observed a slight dip but not half as seen with 1.2.12. In both the cases we were writing with LOCAL_QUORUM. Changing CL to ONE make a slight improvement but not much. The read_Repair_chance is 0.1. We see some compactions running. following is my iostat -x output, sda is the ssd (for commit log) and sdb is the spinner. 
avg-cpu:  %user  %nice %system %iowait %steal  %idle
          66.46   0.00    8.95    0.01   0.00  24.58

Device: rrqm/s wrqm/s  r/s   w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda       0.00  27.60 0.00  4.40   0.00 256.00    58.18     0.01  2.55  1.32  0.58
sda1      0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
sda2      0.00  27.60 0.00  4.40   0.00 256.00    58.18     0.01  2.55  1.32  0.58
sdb       0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
sdb1      0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
dm-0      0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
dm-1      0.00   0.00 0.00  0.60   0.00   4.80     8.00     0.00  5.33  2.67  0.16
dm-2      0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
dm-3      0.00   0.00 0.00 24.80   0.00 198.40     8.00     0.24  9.80  0.13  0.32
dm-4      0.00   0.00 0.00  6.60   0.00  52.80     8.00     0.01  1.36  0.55  0.36
dm-5      0.00   0.00 0.00  0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
dm-6      0.00   0.00 0.00 24.80   0.00 198.40     8.00     0.29 11.60  0.13  0.32

I can see I am CPU bound here but couldn't figure out exactly what is causing it; is this caused by GC or compaction? I am thinking it is compaction: I see a lot of context switches and interrupts in my vmstat output. I don't see GC activity in the logs but see some compaction activity. Has anyone seen this, or know what can be done to free up the CPU? Thanks, Sandeep
Re: Drop keyspace via CQL hanging on master/trunk.
Hey Brian, just out of curiosity, why would you remove the cassandra data directory entirely? /Jason On Fri, Dec 6, 2013 at 2:38 AM, Brian O'Neill b...@alumni.brown.edu wrote: When running Cassandra from trunk/master, I see a drop keyspace command hang at the CQL prompt. To reproduce: 1) Removed my cassandra data directory entirely 2) Fired up cqlsh, and executed the following CQL commands in succession: bone@zen:~/git/boneill42/cassandra- bin/cqlsh Connected to Test Cluster at localhost:9160. [cqlsh 4.1.0 | Cassandra 2.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.38.0] Use HELP for help. cqlsh> describe keyspaces; system system_traces cqlsh> create keyspace test_keyspace with replication = {'class':'SimpleStrategy', 'replication_factor':'1'}; cqlsh> describe keyspaces; system test_keyspace system_traces cqlsh> drop keyspace test_keyspace; THIS HANGS INDEFINITELY Thoughts? User error? Worth filing an issue? One other note: this happens using the CQL java driver as well. -brian --- Brian O'Neill Chief Architect *Health Market Science* *The Science of Better Results* 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information transmitted in this email message is for the intended recipient only and may contain confidential and/or privileged material. If you received this email in error and are not the intended recipient, or the person responsible to deliver it to the intended recipient, please contact the sender at the email above and delete this email and any attachments and destroy any copies thereof. Any review, retransmission, dissemination, copying or other use of, or taking any action in reliance upon, this information by persons or entities other than the intended recipient is strictly prohibited.
Re: How to measure data transfer between data centers?
Hi, Will it be simpler to just measure the network interface of the node instead? /Jason On Thu, Dec 5, 2013 at 10:57 AM, Jacob Rhoden jacob.rho...@me.com wrote: http://unix.stackexchange.com/questions/41765/traffic-stats-per-network-port __ Sent from iPhone On 5 Dec 2013, at 5:44 am, Tom van den Berge t...@drillster.com wrote: Hi Chris, I think streaming is used for repair tasks, bulk loading and that kind of things, but not for regular replication traffic. I think you're right that I should look into network tools. I don't think cassandra can supply this information. Thanks, Tom On Wed, Dec 4, 2013 at 6:08 PM, Chris Burroughs chris.burrou...@gmail.com wrote: https://wiki.apache.org/cassandra/Metrics has per node Streaming metrics that include total bytes/in out. That is only a small bit of what you want though. For total DC bandwidth it might be more straightforward to measure this at the router/switch/fancy-network-gear level. On 12/03/2013 06:25 AM, Tom van den Berge wrote: Is there a way to know how much data is transferred between two nodes, or more specifically, between two data centers? I'm especially interested in how much data is being replicated from one data center to another, to know how much of the available bandwidth is used. Thanks, Tom -- Drillster BV Middenburcht 136 3452MT Vleuten Netherlands +31 30 755 5330 Open your free account at www.drillster.com
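On Linux, the interface-level measurement suggested above can be done by sampling the cumulative counters in /proc/net/dev and subtracting two readings. A sketch, assuming the interface name is its own whitespace-separated field (which holds on reasonably recent kernels):

```shell
# Print cumulative "rx_bytes tx_bytes" for one interface from a file
# in /proc/net/dev format; sample it twice and subtract the readings
# to get bytes transferred over the interval.
iface_bytes() {
    awk -v ifc="$2" 'BEGIN { target = ifc ":" }
                     $1 == target { print $2, $10 }' "$1"
}

# Real use: iface_bytes /proc/net/dev eth0
```

This counts all traffic on the interface, not just Cassandra replication, so it is an upper bound; the streaming metrics mentioned elsewhere in the thread are needed to break the number down further.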
Re: bin/cqlsh is missing cqlshlib
Hi, if you download the RPM from http://rpm.datastax.com/community/noarch/ (for example, cassandra20-2.0.3-1.noarch.rpm), it should contain cqlshlib, which is packaged into /usr/lib/python2.6/site-packages/cqlshlib. HTH /Jason On Tue, Dec 3, 2013 at 10:17 AM, Ritchie Iu r...@ixl.com wrote: No, there is no cqlshlib found at /usr/share/pyshared/cqlshlib, although that might be because I'm using Fedora, which isn't Debian. I did a search and I've found that cqlshlib is in several locations: /opt_build/fc17/lib/cassandra/pylib/cqlshlib, /usr/opt/apache-cassandra/1.1.4/top/cassandra/pylib/cqlshlib and /usr/opt/apache-cassandra/1.1.0/top/cassandra/pylib/cqlshlib So I'm guessing that means the start script doesn't know where to find the cqlshlib directory? Any idea which one of the above locations I should tell it to point to? Thanks, Ritchie
Re: replica verification
You could probably use one of these: nodetool getendpoints <keyspace> <cf> <key>, which prints the endpoints that own the key; or use cqlsh and desc keyspace; /Jason On Tue, Dec 3, 2013 at 12:16 AM, chandra Varahala hadoopandcassan...@gmail.com wrote: Hello Team, I have a cassandra cluster with 5 nodes and a replication factor of 1 initially. Now I changed the replication factor to 3 and ran nodetool repair. Is there a way I can verify that I have 3 replicas? Thanks Chandra
Re: bin/cqlsh is missing cqlshlib
Check if you have the cqlshlib installed? For debian, it is located at /usr/share/pyshared/cqlshlib /Jason On Tue, Dec 3, 2013 at 5:42 AM, Ritchie Iu r...@ixl.com wrote: Hello, I am trying to install and setup Cassandra on Fedora. So far I have successfully installed it using Yum by following the startup guide: http://www.datastax.com/documentation/cassandra/1.2/ webhelp/index.html#cassandra/install/installRHEL_t.html My problem is that when I run bin/cqlsh, I get the following error: Traceback (most recent call last): File bin/cqlsh, line 114, in module from cqlshlib import cqlhandling, cql3handling, pylexotron ImportError: No module named cqlshlib I'm aware that there was a similar bug about this a year ago ( https://issues.apache.org/jira/browse/CASSANDRA-3767) but it seems to have been fixed, so I'm not sure what I'm missing. Thank for your help, Ritchie
Re: Migration Cassandra 2.0 to Cassandra 2.0.2
Eh? Shouldn't you download from the official apache cassandra site? Well, I downloaded a copy from http://cassandra.apache.org/download/ and checked; see below, it is there $ tar -ztf apache-cassandra-2.0.2-bin.tar.gz | grep apache-cassandra-2.0.2.jar apache-cassandra-2.0.2/lib/apache-cassandra-2.0.2.jar /Jason On Thu, Nov 21, 2013 at 6:32 PM, Bonnet Jonathan. jonathan.bon...@externe.bnpparibas.com wrote: Jason Wee peichieh at gmail.com writes: I had the same version upgrade path you had but using debian binary package. Looks like it could be the java cannot find the main class, try find out by executing ps and grep for the cassandra process, then it should show a lot of classpath, check if you apache-cassandra-2.0.2.jar in the classpath. also, check on the jar file read permission. /Jason On Thu, Nov 21, 2013 at 2:30 AM, Robert Coli rcoli at eventbrite.com wrote: On Wed, Nov 20, 2013 at 5:44 AM, Bonnet Jonathan. jonathan.bonnet at externe.bnpparibas.com wrote: So i Deploy the binaries of the new version, and configure my cassandra.yaml with the same informations as before. Why deploy binaries instead of a binary package? =Rob Thanks Mr Coli and Mr Wee for your answers. Mr Coli, what's the difference between deploying binaries and the binary package? I downloaded the binary package from the Apache Cassandra homepage, am I wrong? Mr Wee, I think you hit the right way, cause my lib directory in my Cassandra_Home is different between the two versions.
In the Home for the old version /produits/cassandra/install_cassandra/apache-cassandra-2.0.0/lib i have: [cassandra at s00vl9925761 lib]$ ls -ltr total 14564 -rw-r- 1 cassandra cassandra 123898 Aug 28 15:07 thrift-server-0.3.0.jar -rw-r- 1 cassandra cassandra 42854 Aug 28 15:07 thrift-python-internal-only-0.7.0.zip -rw-r- 1 cassandra cassandra 55066 Aug 28 15:07 snaptree-0.1.jar -rw-r- 1 cassandra cassandra 1251514 Aug 28 15:07 snappy-java-1.0.5.jar -rw-r- 1 cassandra cassandra 270552 Aug 28 15:07 snakeyaml-1.11.jar -rw-r- 1 cassandra cassandra8819 Aug 28 15:07 slf4j-log4j12-1.7.2.jar -rw-r- 1 cassandra cassandra 26083 Aug 28 15:07 slf4j-api-1.7.2.jar -rw-r- 1 cassandra cassandra 134133 Aug 28 15:07 servlet-api-2.5-20081211.jar -rw-r- 1 cassandra cassandra 1128961 Aug 28 15:07 netty-3.5.9.Final.jar -rw-r- 1 cassandra cassandra 80800 Aug 28 15:07 metrics-core-2.0.3.jar -rw-r- 1 cassandra cassandra 134748 Aug 28 15:07 lz4-1.1.0.jar -rw-r- 1 cassandra cassandra 481534 Aug 28 15:07 log4j-1.2.16.jar -rw-r- 1 cassandra cassandra 347531 Aug 28 15:07 libthrift-0.9.0.jar -rw-r- 1 cassandra cassandra 16046 Aug 28 15:07 json-simple-1.1.jar -rw-r- 1 cassandra cassandra 91183 Aug 28 15:07 jline-1.0.jar -rw-r- 1 cassandra cassandra 17750 Aug 28 15:07 jbcrypt-0.3m.jar -rw-r- 1 cassandra cassandra5792 Aug 28 15:07 jamm-0.2.5.jar -rw-r- 1 cassandra cassandra 765648 Aug 28 15:07 jackson-mapper-asl-1.9.2.jar -rw-r- 1 cassandra cassandra 228286 Aug 28 15:07 jackson-core-asl-1.9.2.jar -rw-r- 1 cassandra cassandra 96046 Aug 28 15:07 high-scale-lib-1.1.2.jar -rw-r- 1 cassandra cassandra 1891110 Aug 28 15:07 guava-13.0.1.jar -rw-r- 1 cassandra cassandra 66843 Aug 28 15:07 disruptor-3.0.1.jar -rw-r- 1 cassandra cassandra 91982 Aug 28 15:07 cql-internal-only-1.4.0.zip -rw-r- 1 cassandra cassandra 54345 Aug 28 15:07 concurrentlinkedhashmap-lru-1.3.jar -rw-r- 1 cassandra cassandra 25490 Aug 28 15:07 compress-lzf-0.8.4.jar -rw-r- 1 cassandra cassandra 284220 Aug 28 15:07 commons-lang-2.6.jar 
-rw-r- 1 cassandra cassandra 30085 Aug 28 15:07 commons-codec-1.2.jar -rw-r- 1 cassandra cassandra 36174 Aug 28 15:07 commons-cli-1.1.jar -rw-r- 1 cassandra cassandra 1695790 Aug 28 15:07 apache-cassandra-thrift-2.0.0.jar -rw-r- 1 cassandra cassandra 71117 Aug 28 15:07 apache-cassandra-clientutil-2.0.0.jar -rw-r- 1 cassandra cassandra 3265185 Aug 28 15:07 apache-cassandra-2.0.0.jar -rw-r- 1 cassandra cassandra 1928009 Aug 28 15:07 antlr-3.2.jar drwxr-x--- 2 cassandra cassandra4096 Oct 1 14:16 licenses In my new home i have /produits/cassandra/install_cassandra/apache-cassandra-2.0.2/lib: [cassandra at s00vl9925761 lib]$ ls -ltr total 9956 -rw-r- 1 cassandra cassandra 123920 Oct 24 09:21 thrift-server-0.3.2.jar -rw-r- 1 cassandra cassandra 52477 Oct 24 09:21 thrift-python-internal-only-0.9.1.zip -rw-r- 1 cassandra cassandra 55066 Oct 24 09:21 snaptree-0.1.jar -rw-r- 1 cassandra cassandra 1251514 Oct 24 09:21 snappy-java-1.0.5.jar -rw-r- 1 cassandra cassandra 270552 Oct 24 09:21 snakeyaml-1.11.jar -rw-r- 1 cassandra cassandra 26083 Oct 24 09:21 slf4j-api-1.7.2.jar -rw-r- 1 cassandra cassandra 22291 Oct 24 09:21 reporter-config-2.1.0.jar -rw-r- 1
Re: Migration Cassandra 2.0 to Cassandra 2.0.2
I had the same version upgrade path as you, but using the Debian binary package. It looks like Java cannot find the main class. Try to find out by executing ps and grepping for the cassandra process; it should show the full classpath. Check whether apache-cassandra-2.0.2.jar is in the classpath, and also check the jar file's read permission. /Jason On Thu, Nov 21, 2013 at 2:30 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Nov 20, 2013 at 5:44 AM, Bonnet Jonathan. jonathan.bon...@externe.bnpparibas.com wrote: So i Deploy the binaries of the new version, and configure my cassandra.yaml with the same informations as before. Why deploy binaries instead of a binary package? =Rob
Re: Unsupported major.minor version 51.0
Sorry, I have no knowledge on Node.js, probably someone else might know. Jason On Wed, Sep 18, 2013 at 11:29 AM, Gary Zhao garyz...@gmail.com wrote: Thanks Jason. Does Node.js work with 2.0? I'm wondering which version should I run. Thanks. On Tue, Sep 17, 2013 at 8:24 PM, Jason Wee peich...@gmail.com wrote: cassandra 2.0, then use oracle or open jdk version 7. Jason On Wed, Sep 18, 2013 at 11:21 AM, Gary Zhao garyz...@gmail.com wrote: Hello I just saw this error. Anyone knows how to fix it? [root@gary-vm1 apache-cassandra-2.0.0]# bin/cassandra -f xss = -ea -javaagent:bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms4014M -Xmx4014M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss180k Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/cassandra/service/CassandraDaemon : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632) at java.lang.ClassLoader.defineClass(ClassLoader.java:616) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) Could not find the main class: org.apache.cassandra.service.CassandraDaemon. Program will exit. [root@gary-vm1 apache-cassandra-2.0.0]# java -version java version 1.6.0_24 Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode) Thanks Gary
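For reference, "major.minor version 51.0" identifies a class compiled for Java 7; Java 6 tops out at major version 50, so the 1.6.0_24 JVM shown above cannot load it. The major version is stored in bytes 6-7 of the class file and can be read directly; the helper name below is made up:

```shell
# class_major: print a .class file's major version number.
# 50 = Java 6, 51 = Java 7. Bytes 0-3 are the 0xCAFEBABE magic,
# bytes 4-5 the minor version, bytes 6-7 the major version.
class_major() {
    od -An -j6 -N2 -tu1 "$1" | awk '{ print $1 * 256 + $2 }'
}

# Example: class_major org/apache/cassandra/service/CassandraDaemon.class
```

The fix is either to run Cassandra 2.0 on a Java 7 JVM, as Jason says, or to rebuild it targeting Java 6.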
Re: HsHa
The nature of issue CASSANDRA-4573, compared to "Read an invalid frame size of 0", looks different; nevertheless, if someone could test whether the fix also covers the invalid frame size, that would be awesome! Jason On Wed, Aug 21, 2013 at 4:08 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: @Christopher, not sure if you noticed it but, CASSANDRA-4573 is now fixed in C* 2.0.0 RC2 = http://goo.gl/AGVTOF No idea if this could fix our issue Alain 2013/8/14 Jake Luciani jak...@gmail.com This is technically a Thrift message, not Cassandra; it happens when a client hangs up without closing the socket. You should be able to silence it by raising the class-specific log level; see log4j-server.properties as an example On Wed, Aug 14, 2013 at 9:59 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: @Committers/Experts, does this sound like a bug or like 4 PEBCAKs to you? Should we raise a JIRA? Alain 2013/8/14 Keith Wright kwri...@nanigans.com Same here on 1.2.4. From: Romain HARDOUIN romain.hardo...@urssaf.fr Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Wednesday, August 14, 2013 3:36 AM To: user@cassandra.apache.org user@cassandra.apache.org Subject: Re: HsHa The same goes for us. Romain Alain RODRIGUEZ arodr...@gmail.com wrote on 13/08/2013 18:10:05: From: Alain RODRIGUEZ arodr...@gmail.com To: user@cassandra.apache.org, Date: 13/08/2013 18:10 Subject: Re: HsHa I have had this anytime I try to switch to hsha since 0.8. I have always kept sync for this reason. I thought I was alone with this bug since I never had any clue about this on the mailing list. So +1. Alain 2013/8/13 Christopher Wirt chris.w...@struq.com Hello, I was trying out the hsha thrift server implementation and found that I get a fair amount of these appearing in the server logs. ERROR [Selector-Thread-9] 2013-08-13 15:39:10,433 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side?
ERROR [Selector-Thread-9] 2013-08-13 15:39:11,499 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? ERROR [Selector-Thread-9] 2013-08-13 15:39:11,695 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? ERROR [Selector-Thread-9] 2013-08-13 15:39:12,562 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? ERROR [Selector-Thread-1] 2013-08-13 15:39:12,660 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? ERROR [Selector-Thread-9] 2013-08-13 15:39:13,496 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? ERROR [Selector-Thread-9] 2013-08-13 15:39:14,281 TNonblockingServer.java (line 468) Read an invalid frame size of 0. Are you using TFramedTransport on the client side? Anyone seen this message before? know what it means? or issues it could hide? https://issues.apache.org/jira/browse/CASSANDRA-4573 in the comments suggests it might be a 10 client timeout but looking at JMX client stats the max value for read/write/slice is well below 10secs I’m using 1.2.8 on centos Cheers, Chris -- http://twitter.com/tjake
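Jake's suggestion of raising the class-specific log level would look roughly like the line below in log4j-server.properties. The logger name is inferred from the TNonblockingServer.java reference in the error output above, so treat it as an assumption and verify the class's actual package in your Thrift jar:

```properties
# Hide per-connection "invalid frame size" errors from the HsHa server;
# logger name inferred from the log lines above, not verified.
log4j.logger.org.apache.thrift.server.TNonblockingServer=FATAL
```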
Re: org.apache.cassandra.io.sstable.CorruptSSTableException
You can try nodetool scrub. If it does not work, try repair, then cleanup. We had this issue a few weeks back, but our version is 1.0.x.

On Mon, Aug 5, 2013 at 8:12 AM, Keith Wright kwri...@nanigans.com wrote: Re-sending hoping to get some help. Any ideas would be much appreciated!

From: Keith Wright kwri...@nanigans.com Date: Friday, August 2, 2013 3:01 PM To: user@cassandra.apache.org Subject: org.apache.cassandra.io.sstable.CorruptSSTableException

Hi all, we just added a node to our cluster (1.2.4, vnodes) and it appears to be running well, except that I see the new node is not making any progress compacting one of the CFs. The exception below is generated. My assumption is that the only way to handle this is to stop the node, delete the file in question, restart, and run repair. Thoughts?

org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 1249463589142530 starting at 5604968 would be larger than file /data/3/cassandra/data/users/global_user/users-global_user-ib-1550-Data.db length 14017479
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:168)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:177)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:152)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:139)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:36)
at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer$1.runMayThrow(ParallelCompactionIterable.java:288)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: dataSize of 1249463589142530 starting at 5604968 would be larger than file /data/3/cassandra/data/users/global_user/users-global_user-ib-1550-Data.db length 14017479
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
... 9 more
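For what it's worth, the check that throws here is a simple bounds test: the row's declared dataSize, starting at its offset, must fit inside the sstable file. A sketch (the function name is mine) using the numbers from the trace:

```python
def data_size_fits(data_size: int, start_offset: int, file_length: int) -> bool:
    """True if a row whose serialized size is data_size, starting at
    start_offset, still fits inside an sstable of file_length bytes."""
    return start_offset + data_size <= file_length

# Numbers from the trace: roughly a petabyte declared inside a 14 MB file.
print(data_size_fits(1249463589142530, 5604968, 14017479))  # False
```

A dataSize that absurdly large points at a corrupt length field in the sstable, not mere truncation.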
Re: unable to compact large rows
Would it be possible to delete this row and reinsert it? By the way, how large is that one row? Jason

On Wed, Jul 24, 2013 at 9:23 AM, Paul Ingalls paulinga...@gmail.com wrote: I'm getting constant exceptions during compaction of large rows. In fact, I have not seen one work, even starting from an empty DB. As soon as I start pushing in data, when a row hits the large threshold, it fails compaction with this type of stack trace:

INFO [CompactionExecutor:6] 2013-07-24 01:17:53,592 CompactionController.java (line 156) Compacting large row fanzo/tweets_by_id:352567939972603904 (153360688 bytes) incrementally
ERROR [CompactionExecutor:6] 2013-07-24 01:18:12,496 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:6,1,main]
java.lang.AssertionError: incorrect row data size 5722610 written to /mnt/datadrive/lib/cassandra/data/fanzo/tweets_by_id/fanzo-tweets_by_id-tmp-ic-1453-Data.db; correct is 5767384
at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:162)
at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

I'm not sure
what to do or where to look. Help…:) Thanks, Paul
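For context, the "Compacting large row ... incrementally" INFO line above is emitted when a row exceeds the in-memory compaction threshold set in cassandra.yaml. This is not a fix for the assertion itself, just the knob that governs when the incremental path kicks in; the value below is illustrative, not a recommendation:

```yaml
# cassandra.yaml (1.2-era): rows larger than this limit are compacted
# incrementally on disk instead of being held in memory.
in_memory_compaction_limit_in_mb: 64
```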
Re: nodetool ring generate strange info
Same host, multiple Cassandra instances? But it looks wrong; what Cassandra version? On Fri, May 10, 2013 at 3:19 PM, 杨辉强 huiqiangy...@yunrang.com wrote: Hi all, I use ./bin/nodetool -h 10.21.229.32 ring It generates lots of lines for the same host, like this:

10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875305964978355793
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875770246221977199
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875903273282028661
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 9028992266297813652
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 9130157610675408105
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 9145604352014775913
10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 9182228238626921304

Is this normal?
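For what it's worth, with vnodes enabled (num_tokens in cassandra.yaml; the 1.2 default when enabled is 256), each node owns many tokens, and nodetool ring prints one line per token, so repeated lines for one host are expected. A small sketch that summarizes such output per host:

```python
from collections import defaultdict

# Three of the nodetool ring lines quoted above.
ring_lines = [
    "10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875305964978355793",
    "10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875770246221977199",
    "10.21.229.32 rack1 Up Normal 928.3 MB 24.80% 8875903273282028661",
]

tokens_per_host = defaultdict(list)
for line in ring_lines:
    fields = line.split()
    # The host is the first field; the token is the last.
    tokens_per_host[fields[0]].append(int(fields[-1]))

# One host owning many tokens is exactly what vnodes produce.
print({host: len(tokens) for host, tokens in tokens_per_host.items()})
# {'10.21.229.32': 3}
```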
Re: Really odd issue (AWS related?)
top command? st: time stolen from this VM by the hypervisor. Jason

On Fri, Apr 26, 2013 at 9:54 AM, Michael Theroux mthero...@yahoo.com wrote: Sorry, not sure what CPU steal is :) I have the AWS console with detailed monitoring enabled... things seem to track close to the minute, so I can see the CPU load go to 0... then jump at about the minute Cassandra reports the dropped messages. -Mike

On Apr 25, 2013, at 9:50 PM, aaron morton wrote: The messages appear right after the node wakes up. Are you tracking CPU steal? - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 25/04/2013, at 4:15 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Apr 24, 2013 at 5:03 AM, Michael Theroux mthero...@yahoo.com wrote: Another related question. Once we see messages being dropped on one node, our cassandra client appears to see this, reporting errors. We use LOCAL_QUORUM with an RF of 3 on all queries. Any idea why clients would see an error? If only one node reports an error, shouldn't the consistency level prevent the client from seeing an issue?

If the client is talking to a broken/degraded coordinator node, RF/CL are unable to protect it from RPCTimeout. If it is unable to coordinate the request in a timely fashion, your clients will get errors. =Rob
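For reference, steal (the st column in top's CPU row) is the share of time the hypervisor ran something else while this VM wanted the CPU. It can be computed from two samples of the /proc/stat "cpu" counters; a sketch with made-up sample numbers:

```python
def steal_percent(before, after):
    """CPU steal % between two samples of the /proc/stat 'cpu' line.

    Each sample is the tuple of jiffy counters:
    (user, nice, system, idle, iowait, irq, softirq, steal).
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    return 100.0 * deltas[7] / total

# Made-up samples: 140 of 1000 elapsed jiffies were stolen.
print(steal_percent((100, 0, 50, 800, 10, 0, 0, 40),
                    (200, 0, 100, 1500, 20, 0, 0, 180)))  # 14.0
```

A sustained steal percentage in the double digits on EC2 is consistent with a noisy neighbor or an undersized instance, which would explain dropped messages right after the node "wakes up".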
Re: cql query not giving any result.
Here is a list of keywords and whether or not they are reserved. A reserved keyword cannot be used as an identifier unless you enclose the word in double quotation marks. Non-reserved keywords have a specific meaning in certain contexts but can be used as identifiers outside those contexts. http://www.datastax.com/docs/1.2/cql_cli/cql_lexicon#cql-keywords

On Fri, Mar 15, 2013 at 6:43 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, is it possible in Cassandra to have multiple columns with the same name? Like in this particular scenario I have two columns with the same name, key: the first one is the row key and the second one is a column name. Thanks and Regards Kuldeep

On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following CQL query is not returning any result: cqlsh:KunderaExamples select * from DOCTOR where key='kuldeep'; I have enabled secondary indexes on both columns. Screen shot is attached. Please help -- Thanks and Regards Kuldeep Kumar Mishra +919540965199 -- Thanks and Regards Kuldeep Kumar Mishra +919540965199
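As a hedged illustration of the quoting rule above (assuming key is reserved in your CQL version; note that double-quoting also makes identifiers case-sensitive):

```cql
-- Unquoted, key may be parsed as a keyword; quoted, it is an identifier.
SELECT * FROM "DOCTOR" WHERE "key" = 'kuldeep';
```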
Re: migrating from SimpleStrategy to NetworkTopologyStrategy
Probably also ensure port 7000 is reachable between the nodes. Jason

On Tue, Mar 12, 2013 at 4:11 AM, Dane Miller d...@optimalsocial.com wrote: Hi, I'd like to resurrect this thread from April 2012 - http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/migrating-from-SimpleStrategy-to-NetworkTopologyStrategy-td7481090.html - migrating from SimpleStrategy to NetworkTopologyStrategy. We're in a similar situation, and I'd like to understand this more thoroughly. In preparation for adding another datacenter to our cassandra cluster, I'd like to migrate from EC2Snitch + SimpleStrategy to GossipingPropertyFileSnitch + NetworkTopologyStrategy. Here are the steps I'm planning:

Change the snitch
1. set endpoint_snitch: GossipingPropertyFileSnitch in cassandra.yaml
2. configure cassandra-rackdc.properties to use a single rack and datacenter: rack=RAC1, dc=dc1
3. do a rolling restart of all nodes

Change replication strategy
4. for all keyspaces (except system*), alter keyspace ... with replication = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3}
5. run nodetool repair -pr on all nodes

Does this look right? Also, I'm curious whether/why step 5 is necessary, given the single rack configuration. Versions: cassandra 1.2.2, dsc12 1.2.2-1, Ubuntu 12.04 x86_64, Datastax AMI. Thanks! Dane
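Step 4 above, written out for a single hypothetical keyspace (repeat per keyspace, except system*):

```cql
-- Sketch of the replication change; my_keyspace is a placeholder name.
ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
```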
Re: Need help
Shouldn't be difficult to google what you want for starters... but here are some links:
http://wiki.apache.org/cassandra/GettingStarted
http://wiki.apache.org/cassandra/ClientOptions
http://wiki.apache.org/cassandra/HadoopSupport
http://www.slideshare.net/jeromatron/cassandrahadoop-integration
Jason

On Fri, Mar 8, 2013 at 10:11 PM, oualid ait wafli oualid.aitwa...@gmail.com wrote: Hi, I am new to Cassandra. I need some examples (hands-on labs, use cases...) of deploying Cassandra on two nodes. I also need examples of how to configure Hadoop and Cassandra to work together. Thanks
Re: Cassandra automatic setup
Hi, you can use cassandra-cli / cqlsh with the --file option to load the DDL, or a cassandra client: http://wiki.apache.org/cassandra/ClientOptions Jason

On Fri, Mar 8, 2013 at 12:43 AM, vck veesee...@gmail.com wrote: Hi, we are just in the process of setting up DSE Cassandra to be used for our services. At this point, I have manually created the keyspace, CFs, etc., and we have different environments like dev, certification, staging, production, so I have to do this manually at least 4 times to set up Cassandra on all my envs. Would you recommend a standard way of doing this, e.g. create scripts that run the rpm installations, create CFs, etc.? I am not really aware of what standard practices exist. thanks ~v
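One way to avoid repeating the manual setup four times is to keep the DDL in a template and render one file per environment, then load each with the --file option mentioned above. A hypothetical sketch; the keyspace name, environment names, and replication factors are placeholders:

```python
# Hypothetical: render the same schema per environment so it can be
# loaded everywhere with `cqlsh --file` / `cassandra-cli --file`.
DDL_TEMPLATE = (
    "CREATE KEYSPACE {keyspace} WITH replication = "
    "{{'class': 'SimpleStrategy', 'replication_factor': {rf}}};"
)

ENVIRONMENTS = {"dev": 1, "certification": 1, "staging": 3, "production": 3}

def render_ddl(keyspace: str) -> dict:
    """Return env -> DDL text, ready to write to one file per env."""
    return {env: DDL_TEMPLATE.format(keyspace=keyspace, rf=rf)
            for env, rf in ENVIRONMENTS.items()}

for env, ddl in render_ddl("myks").items():
    print(env, "->", ddl)
```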
Re: Cassandra OOM, many deletedColumn
Hmm, did you manage to take a look using nodetool tpstats? That may give you further indication. Jason

On Thu, Mar 7, 2013 at 1:56 PM, 金剑 jinjia...@gmail.com wrote: Hi, my version is 1.1.7. Our use case: we have an index column family to record how many resources are stored for a user. The number might vary from tens to millions. We provide a feature that lets a user delete resources by prefix. We found that some Cassandra nodes will OOM after some period. The cluster is a kind of cross-datacenter ring.

1. Exception in the cassandra log:

ERROR [Thread-5810] 2013-02-04 05:38:13,882 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5810,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(ThreadPoolExecutor.java:758)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-5819] 2013-02-04 05:38:13,888 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5819,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-36] 2013-02-04 05:38:13,898 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-36,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-3990] 2013-02-04 05:38:13,902 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-3990,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [ACCEPT-/10.139.50.62] AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ACCEPT-/10.139.50.62,5,main]
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:710)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:137)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:699)
INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 374) Timed out replaying hints to /23.20.84.240; aborting further deliveries
INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 392) Finished hinted handoff of 0 rows to endpoint
INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 296) Started hinted handoff for token: 3

2. From the heap dump, there are many DeletedColumn objects found, rooted from thread readStage.

Please help: where might the problem be? Best Regards! Jian Jin
Re: Unable to instantiate cache provider org.apache.cassandra.cache.SerializingCacheProvider
Version 1.0.8. Just curious, what is the mechanism for off-heap caches in 1.1? Thank you. /Jason

On Mon, Mar 4, 2013 at 11:49 PM, aaron morton aa...@thelastpickle.com wrote: What version are you using? As of 1.1, off-heap caches no longer require JNA https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L327 Also the row and key caches are now set globally, not per CF https://github.com/apache/cassandra/blob/trunk/NEWS.txt#L324 Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 1/03/2013, at 1:33 AM, Jason Wee peich...@gmail.com wrote: This happened some time ago, but for the sake of helping others if they encounter it: each column family has a row cache provider, which you can read in the schema, for example: ... and row_cache_provider = 'SerializingCacheProvider' ... If Cassandra cannot start that cache provider for some reason, it defaults to the ConcurrentLinkedHashCacheProvider. The SerializingCacheProvider requires the JNA lib; if you place the library into the cassandra lib directory, this warning should not happen again.
Re: Problem with CQL
You need an equals operator in your query. For instance: SELECT * FROM users WHERE country = 'malaysia' AND age > 20

On Thu, Feb 28, 2013 at 10:04 PM, Everton Lima peitin.inu...@gmail.com wrote: Hello, I was using CQL 2. I have the following query: SELECT * FROM users WHERE age > 20 AND age < 25; The table was created as follows: CREATE TABLE users (name PRIMARY KEY, age float); After creating the table and inserting some data I created the secondary index: CREATE INDEX age_index ON users (age); When I execute a query like SELECT * FROM users WHERE age = 22; it works fine. But when I try something like SELECT * FROM users WHERE age > 20 I receive the error: Bad Request: No indexed columns present in by-columns clause with equals operator. Can someone help me, please? -- Everton Lima Aleixo Master's student in Computer Science at UFG Programmer at LUPA
Re: Counting problem
There is a LIMIT option; find it in the docs. On Fri, Feb 22, 2013 at 3:41 AM, Sri Ramya ramya.1...@gmail.com wrote: Hi, Cassandra displays a maximum of 100 rows from a column family by default. Can I increase it? If it is possible, please mention it here. Thank you
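For illustration: the 100-row cap is the cassandra-cli default, while cqlsh's default SELECT limit is 10000; either way the LIMIT clause overrides it (the column family name below is a placeholder):

```cql
-- Return up to 1000 rows instead of the default cap.
SELECT * FROM my_columnfamily LIMIT 1000;
```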
Re: Creating a keyspace fails
cqlsh> CREATE KEYSPACE demodb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
cqlsh> use demodb;
cqlsh:demodb>

On Tue, Jan 22, 2013 at 7:04 PM, Paul van Hoven paul.van.ho...@googlemail.com wrote: CREATE KEYSPACE demodb WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1';
Re: Creating a keyspace fails
Maybe a typo, or they forgot to update the doc... but anyway, you can use the HELP command when you are in cqlsh. For example:

cqlsh> HELP CREATE_KEYSPACE;
CREATE KEYSPACE <ksname> WITH replication = {'class': '<strategy>' [, '<option>': <val>]};

On Tue, Jan 22, 2013 at 8:06 PM, Paul van Hoven paul.van.ho...@googlemail.com wrote: Okay, that worked. Why is the statement from the tutorial wrong? I mean, why would a company like datastax post something like this?

2013/1/22 Jason Wee peich...@gmail.com:
cqlsh> CREATE KEYSPACE demodb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
cqlsh> use demodb;
cqlsh:demodb>

On Tue, Jan 22, 2013 at 7:04 PM, Paul van Hoven paul.van.ho...@googlemail.com wrote: CREATE KEYSPACE demodb WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1';
Re: Cassandra 1.1.2 - 1.1.8 upgrade
Always check NEWS.txt. For instance, for Cassandra 1.1.3 you need to run nodetool upgradesstables if your CF has counters. On Wed, Jan 16, 2013 at 11:58 PM, Mike mthero...@yahoo.com wrote: Hello, we are looking to upgrade our Cassandra cluster from 1.1.2 to 1.1.8 (or possibly 1.1.9, depending on timing). It is my understanding that rolling upgrades of Cassandra are supported, so as we upgrade our cluster we can do so one node at a time without experiencing downtime. Has anyone hit any gotchas recently that I should be aware of before performing this upgrade? In order to upgrade, are the JAR files the only thing that needs to change? Can everything else remain as-is? Thanks, -Mike