i3 instances seem to have these issues more than other instance types. This is not the first report I have heard about.
Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo <http://linkedin.com/in/carlosjuzarterolo>
Mobile: +351 918 918 100
www.pythian.com

On Thu, Apr 6, 2017 at 5:36 PM, Cogumelos Maravilha <cogumelosmaravi...@sapo.pt> wrote:

> Yes, but this time I am going to leave plenty of time between killing and pickup.
> Thanks a lot.
>
> On 04/06/2017 05:31 PM, Avi Kivity wrote:
>
> Your disk is bad. Kill that instance and hope someone else gets it.
>
> On 04/06/2017 07:27 PM, Cogumelos Maravilha wrote:
>
> Interesting
>
> [ 720.693768] blk_update_request: I/O error, dev nvme0n1, sector 1397303056
> [ 750.698840] blk_update_request: I/O error, dev nvme0n1, sector 1397303080
> [ 1416.202103] blk_update_request: I/O error, dev nvme0n1, sector 1397303080
>
> On 04/06/2017 05:26 PM, Avi Kivity wrote:
>
> Is there anything in dmesg?
>
> On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:
>
> Now it dies and restarts (systemd) without logging why.
>
> system.log
>
> INFO [Native-Transport-Requests-2] 2017-04-06 16:06:55,362 AuthCache.java:172 - (Re)initializing RolesCache (validity period/update interval/max entries) (2000/2000/1000)
> INFO [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
>
> debug.log
>
> DEBUG [GossipStage:1] 2017-04-06 16:16:56,272 FailureDetector.java:457 - Ignoring interval time of 2496703934 for /10.100.120.52
> DEBUG [GossipStage:1] 2017-04-06 16:16:59,090 FailureDetector.java:457 - Ignoring interval time of 2818071981 for /10.100.120.161
> INFO [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
> DEBUG [main] 2017-04-06 16:17:42,540 YamlConfigurationLoader.java:108 - Loading settings from file:/etc/cassandra/cassandra.yaml
>
> On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:
>
> find /mnt/cassandra/ \! -user cassandra
> nothing
>
> I've found some "strange" solutions on the Internet:
> chmod -R 2777 /tmp
> chmod -R 2775 cassandra folder
>
> Let's give it some time to see the result.
>
> On 04/06/2017 03:14 PM, Michael Shuler wrote:
>
> All it takes is one frustrated `sudo cassandra` run. Checking only the top-level directory ownership is insufficient, since root could own files/dirs created below the top level. Find all files not owned by user cassandra: `find /mnt/cassandra/ \! -user cassandra`
>
> Just another thought.
>
> --
> Michael
>
> On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
>
> From cassandra.yaml:
>
> hints_directory: /mnt/cassandra/hints
> data_file_directories:
>     - /mnt/cassandra/data
> commitlog_directory: /mnt/cassandra/commitlog
> saved_caches_directory: /mnt/cassandra/saved_caches
>
> drwxr-xr-x 3 cassandra cassandra 23 Apr 5 16:03 mnt/
>
> drwxr-xr-x 6 cassandra cassandra 68 Apr 5 16:17 ./
> drwxr-xr-x 3 cassandra cassandra 23 Apr 5 16:03 ../
> drwxr-xr-x 2 cassandra cassandra 80 Apr 6 10:07 commitlog/
> drwxr-xr-x 8 cassandra cassandra 124 Apr 5 16:17 data/
> drwxr-xr-x 2 cassandra cassandra 72 Apr 5 16:20 hints/
> drwxr-xr-x 2 cassandra cassandra 49 Apr 5 20:17 saved_caches/
>
> cassand+ 2267 1 99 10:18 ? 00:02:56 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
>
> /dev/mapper/um_vg-xfs_lv 885G 27G 858G 4% /mnt
>
> On /etc/security/limits.conf
>
> * - memlock unlimited
> * - nofile 100000
> * - nproc 32768
> * - as unlimited
>
> On /etc/security/limits.d/cassandra.conf
>
> cassandra - memlock unlimited
> cassandra - nofile 100000
> cassandra - as unlimited
> cassandra - nproc 32768
>
> On /etc/sysctl.conf
>
> vm.max_map_count = 1048575
>
> On /etc/systcl.d/cassanda.conf
>
> vm.max_map_count = 1048575
> net.ipv4.tcp_keepalive_time=600
>
> On /etc/pam.d/su
> ...
> session required pam_limits.so
> ...
>
> Distro is the current Ubuntu LTS.
> Thanks
>
> On 04/06/2017 10:39 AM, benjamin roth wrote:
>
> Cassandra cannot write an SSTable to disk. Are you sure the disk/volume where SSTables reside (normally /var/lib/cassandra/data) is writeable for the CS user and has enough free space?
> The CDC warning also implies that.
> The other warnings indicate you are probably not running CS as root and you did not set an appropriate limit for max open files. Running out of open files can also be a reason for the IO error.
>
> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha <cogumelosmaravi...@sapo.pt>:
>
> Hi list,
>
> I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are type i3.xlarge (AWS) with 32GB, 2 cores and an 885G SSD formatted as XFS on LVM. I have one node that is always dying and I don't understand why. Can anyone give me some hints, please? All nodes use the same configuration.
>
> Thanks in advance.
>
> INFO [IndexSummaryManager:1] 2017-04-06 05:22:18,352 IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:22,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Caused by: java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76) ~[na:1.8.0_121]
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388) ~[na:1.8.0_121]
>     at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169) ~[apache-cassandra-3.10.jar:3.10]
>     ... 15 common frames omitted
> INFO [IndexSummaryManager:1] 2017-04-06 06:22:18,366 IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:31,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Caused by: java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76) ~[na:1.8.0_121]
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388) ~[na:1.8.0_121]
>     at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169) ~[apache-cassandra-3.10.jar:3.10]
>     ... 15 common frames omitted
> INFO [main] 2017-04-06 07:11:57,289 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
>
> Some ERROR messages:
>
> ERROR [MemtablePostFlush:2] 2017-04-05 23:35:46,339 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:2,5,main]
> ERROR [MemtablePostFlush:3] 2017-04-05 23:44:08,471 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:3,5,main]
> ERROR [MemtablePostFlush:4] 2017-04-05 23:54:41,224 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:4,5,main]
> ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:19:13,453 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/10.0.120.52,5,main]
> ERROR [epollEventLoopGroup-2-6] 2017-04-06 03:24:41,006 CassandraDaemon.java:229 - Exception in thread Thread[epollEventLoopGroup-2-6,10,main]
> ERROR [Native-Transport-Requests-36] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-49] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [IndexSummaryManager:1] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-69] 2017-04-06 03:25:45,916 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-46] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [SharedPool-Worker-136] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-156] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [SharedPool-Worker-92] 2017-04-06 03:26:24,696 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-48] 2017-04-06 03:26:24,696 ?:? - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-66] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-77] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [GossipTasks:1] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-133] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-135] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [ScheduledFastTasks:1] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-70] 2017-04-06 03:27:11,569 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [IndexSummaryManager:1] 2017-04-06 03:27:17,821 CassandraDaemon.java:229 - Exception in thread Thread[IndexSummaryManager:1,1,main]
> ERROR [Native-Transport-Requests-103] 2017-04-06 03:27:24,049 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-69] 2017-04-06 03:27:24,049 SEPWorker.java:145 - Failed to execute task, unexpected exception killed worker: {}
> ERROR [SharedPool-Worker-98] 2017-04-06 03:27:24,049 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:27:55,079 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [epollEventLoopGroup-2-5] 2017-04-06 03:27:55,079 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-64] 2017-04-06 03:28:43,285 SEPWorker.java:145 - Failed to execute task, unexpected exception killed worker: {}
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:22,5,main]
> ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:31,5,main]
>
> Also some WARNs:
>
> WARN [main] 2017-04-06 09:26:49,725 CLibrary.java:178 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
>
> WARN [main] 2017-04-06 09:25:07,355 StartupChecks.java:157 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
>
> WARN [main] 2017-04-06 09:25:07,369 SigarLibrary.java:174 - Cassandra server running in degraded mode. Is swap disabled? : true, Address space adequate? : true, nofile limit adequate? : false, nproc limit adequate? : true
>
> WARN [main] 2017-04-06 09:25:07,091 DatabaseDescriptor.java:493 - Small cdc volume detected at /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 2502. You can override this in cassandra.yaml
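
For anyone who lands on this thread with the same symptom, here is a minimal Python sketch of the two checks suggested above: looking for block-layer I/O errors in kernel log text (the dmesg check) and listing files not owned by a given user (the `find ... \! -user cassandra` check). The function names `block_io_errors` and `files_not_owned_by` are made up for illustration, and the regular expression only assumes the `blk_update_request` line format quoted in this thread; other kernels may word the message differently.

```python
import os
import re
from collections import defaultdict

# Matches kernel lines in the form quoted in this thread, e.g.
# [ 720.693768] blk_update_request: I/O error, dev nvme0n1, sector 1397303056
IO_ERROR_RE = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>[^,\s]+), sector (?P<sector>\d+)"
)


def block_io_errors(kernel_log_text):
    """Return {device: sorted list of distinct failing sectors}."""
    sectors = defaultdict(set)
    for line in kernel_log_text.splitlines():
        match = IO_ERROR_RE.search(line)
        if match:
            sectors[match.group("dev")].add(int(match.group("sector")))
    return {dev: sorted(found) for dev, found in sectors.items()}


def files_not_owned_by(root, uid):
    """Yield every path at or below root whose owner uid differs from uid."""
    # lstat() is used so symlinks are judged by their own owner, not the target's.
    if os.lstat(root).st_uid != uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != uid:
                yield path
```

To use it, pass the saved output of `dmesg` to `block_io_errors`, and pass `/mnt/cassandra` together with `pwd.getpwnam("cassandra").pw_uid` to `files_not_owned_by` (both the path and the user name are taken from this thread and will differ on other setups). Repeated errors on the same device, as in the dmesg output above, point at the disk rather than at permissions or limits.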