[
https://issues.apache.org/jira/browse/CASSANDRA-6716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Stupp resolved CASSANDRA-6716.
-------------------------------------
Resolution: Cannot Reproduce
Reproduced In: 2.2.3, 2.0.5 (was: 2.0.5, 2.2.3)
Found the cause of that hard-link issue in bash by typing {{history | grep
sstablescrub}}, which returns a non-empty result - shortly after a {{service
cassandra start}}
/me banging head on desk
> snapshots constantly fail with "Tried to hard link to file that does not
> exist"
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-6716
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6716
> Project: Cassandra
> Issue Type: Bug
> Environment: Cassandra 2.0.5 (built from source), Linux, 6 nodes, JDK
> 1.7
> Reporter: Nikolai Grigoriev
> Attachments: system.log.gz
>
>
> It seems that since recently I have started getting a number of exceptions
> like "File not found" on all Cassandra nodes. Currently I am getting an
> exception like this every couple of seconds on each node, for different
> keyspaces and CFs.
> I have tried to restart the nodes, tried to scrub them. No luck so far. It
> seems that scrub cannot complete on any of these nodes, at some point it
> fails because of the file that it can't find.
> One one of the nodes currently the "nodetool scrub" command fails instantly
> and consistently with this exception:
> {code}
> # /opt/cassandra/bin/nodetool scrub
> Exception in thread "main" java.lang.RuntimeException: Tried to hard link to
> file that does not exist
> /mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28049-Data.db
> at
> org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:75)
> at
> org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:1215)
> at
> org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1826)
> at
> org.apache.cassandra.db.ColumnFamilyStore.scrub(ColumnFamilyStore.java:1122)
> at
> org.apache.cassandra.service.StorageService.scrub(StorageService.java:2159)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
> at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
> at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
> at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
> at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
> at
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
> at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
> at
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
> at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
> at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
> at
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
> at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
> at sun.rmi.transport.Transport$1.run(Transport.java:177)
> at sun.rmi.transport.Transport$1.run(Transport.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
> at
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
> at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
> at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {code}
> Also I have noticed that the files that are missing are often (or maybe
> always?) referred to in the log as follows:
> {quote}
> WARN 00:06:10,597 At level 3,
> SSTableReader(path='/mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-26776-Data.db')
> [DecoratedKey(-9053060597280257896,
> 0010f582cddaca974d7198ae30f194ccfd0c00001000000000004b818d00000000000000010000100000000000004000000000000000000300),
> DecoratedKey(-8855915848970248008,
> 00103ce153dfeeb547fb881a51adf611f6cf0000100000000000f04f4700000000000000010000100000000000004000000000000000000500)]
> overlaps
> SSTableReader(path='/mnt/disk2/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28022-Data.db')
> [DecoratedKey(-8964446543595889729,
> 001043214a8bdcfd46a3b8ea71da2d57bb9a0000100000000001117c0d00000000000000000000100000000000004000000000000000000100),
> DecoratedKey(-8848132752710859808,
> 0010d1f6de8039d54218bf5b1e184335df5f000010000000000062e52600000000000000010000100000000000004000000000000000000400)].
> This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the
> fact that you have dropped sstables from another node into the data
> directory. Sending back to L0. If you didn't drop in sstables, and have not
> yet run scrub, you should do so since you may also have rows out-of-order
> within an sstable
> WARN [RMI TCP Connection(2)-10.3.45.158] 2014-02-18 00:06:10,597
> LeveledManifest.java (line 171) At level 3,
> SSTableReader(path='/mnt/disk5/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-26776-Data.db')
> [DecoratedKey(-9053060597280257896,
> 0010f582cddaca974d7198ae30f194ccfd0c00001000000000004b818d00000000000000010000100000000000004000000000000000000300),
> DecoratedKey(-8855915848970248008,
> 00103ce153dfeeb547fb881a51adf611f6cf0000100000000000f04f4700000000000000010000100000000000004000000000000000000500)]
> overlaps
> SSTableReader(path='/mnt/disk2/cassandra/data/mykeyspace_jmeter/test_contacts/mykeyspace_jmeter-test_contacts-jb-28022-Data.db')
> [DecoratedKey(-8964446543595889729,
> 001043214a8bdcfd46a3b8ea71da2d57bb9a0000100000000001117c0d00000000000000000000100000000000004000000000000000000100),
> DecoratedKey(-8848132752710859808,
> 0010d1f6de8039d54218bf5b1e184335df5f000010000000000062e52600000000000000010000100000000000004000000000000000000400)].
> This could be caused by a bug in Cassandra 1.1.0 .. 1.1.3 or due to the
> fact that you have dropped sstables from another node into the data
> directory. Sending back to L0. If you didn't drop in sstables, and have not
> yet run scrub, you should do so since you may also have rows out-of-order
> within an sstable
> {quote}
> I never had anything but Cassandra 2.0 on these systems. Also I have
> recreated my test data from scratch with 2.0.4.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)