[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930391#comment-13930391 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

Bill - thanks for the heads up. As far as I know nobody else has seen this, and I haven't been able to reproduce it even with a threefold increase in batchers. Excluding C* folders from AV processing is probably something we need to document for its performance implications, regardless of file locking.

Windows7 AccessDeniedException on commit log
--------------------------------------------

                Key: CASSANDRA-6736
                URL: https://issues.apache.org/jira/browse/CASSANDRA-6736
            Project: Cassandra
         Issue Type: Bug
         Components: Core
        Environment: Windows 7, quad core, 8GB RAM, single Cassandra node, Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283
           Reporter: Bill Mitchell
           Assignee: Joshua McKenzie
        Attachments: 2014-02-18-22-16.log

Similar to the data file deletion of CASSANDRA-6283, under heavy load with logged batches, I am seeing a problem where the commit log cannot be deleted:

ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
FSWriteError in C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
	at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150)
	at org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217)
	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
	at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
	at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
	at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
	at java.nio.file.Files.delete(Unknown Source)
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
	... 5 more

(Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.)

In this particular case, I was trying to do 100 million inserts into two tables in parallel, one with a single wide row and one with narrow rows, and the error appeared after inserting 43,151,232 rows. So it does take a while to trip over this timing issue. It may be aggravated by the size of the batches: this test was writing 10,000 rows to each table in a batch. When I switch the same test from using a logged batch to an unlogged batch, no such failure appears. So the issue could be related to the use of large, logged batches, or it could be that unlogged batches just change the probability of failure.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13929908#comment-13929908 ]

Bill Mitchell commented on CASSANDRA-6736:
------------------------------------------

Josh, by accident I logged into my development machine from home this evening, found it had stopped with the COMMIT_LOG_ALLOCATOR exception, and coincidentally noticed that the Kaspersky virus scan was still running. This suggests that this may never have been a C* issue, but rather interference from the antivirus software -- a hypothesis consistent with the issue appearing only on my development machine at work and never on my laptop. So, unless someone else reports a similar symptom, I suggest we close this out.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919532#comment-13919532 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

Bill - could you attach the test script you used to generate this error?
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919566#comment-13919566 ]

Bill Mitchell commented on CASSANDRA-6736:
------------------------------------------

That's a bit awkward, as the JUnit test cases live within the framework of a complete application, and stripping them down to something more minimal that does not carry all of our application framework will be a little work. As I've already built a 2.0.5 version with the leak detect patch, if you want me to try a version with a larger test or diagnostic patch, that would be straightforward. I'll drop you an email to explore other options.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919589#comment-13919589 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

No worries - I can pursue reproducing this externally w/ccm. Didn't realize your test cases were integrated w/a larger app.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918170#comment-13918170 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

I've reproduced that error with scrubDataDirectories locally as well - I'll create a new JIRA out of that since it's separate from the commit log issue. Historically, when errors like this popped up, repeatedly attempting to start C* eventually resolved it, and I've seen that locally on my system as well.

The rejectedException from a ThreadPoolExecutor after shutdown is expected behavior. There's been some discussion on other tickets about handling that more gracefully, but the general consensus seems to be that it's Working As Intended, though noisy.

The SSTableDeletingTask issue you saw after exhausting your heap and restarting may be related to CASSANDRA-6283, as we're seeing the same there.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918188#comment-13918188 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

Created CASSANDRA-6795 to track the scrubDataDirectories exception.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918373#comment-13918373 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

1st off - scratch what I said above about the SSTableDeletingTask issue and 6283: we're not seeing that stack there. We're seeing tasks enqueued as expected, but not getting file access errors.

2nd: regarding the CommitLogSegment error, from the javadocs on MappedByteBuffer: "A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected." It looks like we've seen this type of situation before - from the comments in SSTableDeletingTask.java: "Deleting sstables is tricky because the mmapping might not have been finalized yet, and delete will fail (on Windows) until it is (we only force the unmapping on SUN VMs). Additionally, we need to make sure to delete the data file first, so on restart the others will be recognized as GCable."

Our cleanup code in CommitLogSegment.java doesn't appear to take this into account:

{code:title=CommitLogSegment.java:148}
close(); // FileUtils.clean(buffer) in this method
if (deleteFile)
    FileUtils.deleteWithConfirm(logFile);
{code}

Unless FileUtils.clean(buffer) actually works on Windows - which SSTableDeletingTask would imply it does not - we may need to defer deletion attempts until after the buffer has been finalized. Creating a DeferredDeletionTask that fires after GCs and retries the delete to deal with this condition should be pretty straightforward.

Relevant bug from the Java bug database: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4724038
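The deferred-deletion idea above can be sketched roughly as follows. This is a hypothetical illustration, not Cassandra's actual code (the helper name deleteWithGcRetry is made up for this sketch): retry the delete a few times, nudging the GC between attempts so that a collected MappedByteBuffer releases the file mapping that makes Windows refuse the delete.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeferredDelete {
    // Hypothetical sketch: retry deletion a few times, encouraging the GC
    // between attempts so that a garbage-collected MappedByteBuffer releases
    // its file mapping (Windows refuses to delete a file with a live mapping).
    static boolean deleteWithGcRetry(Path file, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(file);
                return true;
            } catch (IOException e) { // e.g. AccessDeniedException on Windows
                System.gc();          // encourage finalization of mapped buffers
                try {
                    Thread.sleep(100); // give the GC/finalizer a moment
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // An unmapped temp file deletes on the first attempt on any OS.
        Path tmp = Files.createTempFile("commitlog-demo", ".log");
        System.out.println(deleteWithGcRetry(tmp, 3));
    }
}
```

Relying on System.gc() is best-effort only, which is why a real fix would more likely unmap the buffer explicitly before deleting.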
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918438#comment-13918438 ]

Jonathan Ellis commented on CASSANDRA-6736:
-------------------------------------------

I think that comment was saying two related things:
- We need to unmap before deleting on Windows
- We only have code to unmap for the Sun (Oracle) VM, so all bets are off for e.g. the IBM VM

I have no reason to believe FileUtils.clean does not work on Windows.
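For context, the Sun-VM-only unmap referred to above is conventionally done by invoking the non-public cleaner attached to a direct buffer via reflection; other JVMs have no such hook, hence "all bets are off". A rough, hypothetical sketch of that pattern (illustrative only, not Cassandra's exact FileUtils.clean implementation; on JDK 9+ the reflective access is blocked and the method falls through to the failure path):

```java
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

public class UnmapSketch {
    // Hypothetical sketch: eagerly release a mapped buffer's native mapping by
    // invoking its hidden cleaner via reflection. Returns false on JVMs where
    // this hook is unavailable (non-Sun VMs, or JDK 9+ with encapsulated
    // internals) - there, the delete has to wait until the buffer is GC'd.
    static boolean tryClean(ByteBuffer buffer) {
        if (!buffer.isDirect())
            return false; // only direct/mapped buffers hold native mappings
        try {
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            if (cleaner == null)
                return false;
            Method clean = cleaner.getClass().getMethod("clean");
            clean.setAccessible(true);
            clean.invoke(cleaner);
            return true;
        } catch (Throwable t) {
            return false; // "all bets are off" on this VM
        }
    }

    public static void main(String[] args) {
        // A heap buffer has no mapping to clean, so this reports false.
        System.out.println(tryClean(ByteBuffer.allocate(16)));
    }
}
```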
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918580#comment-13918580 ] Joshua McKenzie commented on CASSANDRA-6736:
---------------------------------------------

Stress-testing the FileUtils MappedByteBuffer and RAF.close() on Windows looks clean. I agree that there's no indication that's the issue - thanks for clarifying the Windows + non-Sun JVM nature of that earlier comment.

> Windows7 AccessDeniedException on commit log
> --------------------------------------------
>
>                 Key: CASSANDRA-6736
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6736
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Windows 7, quad core, 8GB RAM, single Cassandra node, Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283
>            Reporter: Bill Mitchell
>            Assignee: Joshua McKenzie
>         Attachments: 2014-02-18-22-16.log
>
> Similar to the data file deletion of CASSANDRA-6283, under heavy load with logged batches, I am seeing a problem where the commit log cannot be deleted:
> ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
> FSWriteError in C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
> 	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
> 	at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
> 	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> 	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> 	at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
> 	at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
> 	at java.nio.file.Files.delete(Unknown Source)
> 	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
> 	... 5 more
> (Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.)
> In this particular case, I was trying to do 100 million inserts into two tables in parallel, one with a single wide row and one with narrow rows, and the error appeared after inserting 43,151,232 rows. So it does take a while to trip over this timing issue. It may be aggravated by the size of the batches: this test was writing 10,000 rows to each table in a batch. When I switch the same test from a logged batch to an unlogged batch, no such failure appears. So the issue could be related to the use of large, logged batches, or it could be that unlogged batches just change the probability of failure.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
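For reference, the MappedByteBuffer + RAF.close() pattern being stress-tested can be reduced to a few lines. This is a hypothetical sketch, not the actual FileUtils code: it maps a region of a file, closes the RandomAccessFile, and then attempts the delete. On POSIX filesystems the delete succeeds regardless; on Windows an un-collected MappedByteBuffer can keep the file undeletable even after the handle is closed, which is one classic source of AccessDeniedException on delete.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the pattern under test: map a region of a file, close the RAF,
// then try to delete the file. On POSIX the delete succeeds; on Windows it
// can fail with AccessDeniedException until the MappedByteBuffer is
// garbage-collected, because the mapping holds the file open.
class MapThenDelete {
    static boolean mapCloseDelete() {
        try {
            Path p = Files.createTempFile("segment-", ".log");
            MappedByteBuffer map;
            try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
                map = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1024);
                map.put(0, (byte) 1);      // touch the mapping
            }                              // RAF and channel now closed; mapping lives on
            Files.delete(p);               // POSIX: fine; Windows: may throw
            return Files.notExists(p);
        } catch (IOException e) {
            return false;                  // e.g. AccessDeniedException on Windows
        }
    }
}
```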
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917188#comment-13917188 ] Bill Mitchell commented on CASSANDRA-6736:
-------------------------------------------

Off and on, I've been following the commentary in CASSANDRA-6283, but I opened this report as my issue seems to be a different situation and path through the code. In particular:

1. I'm not seeing any reports from the leak detection patch, except in the log cited in CASSANDRA-6721.
2. As I'm working in a test environment, I'm not dealing with any node repair issues. With only a single node, I reboot it when it hangs, which clears the locks.
3. As these are only test cases, and I am frequently changing the schema, I am deleting and recreating the keyspaces, so I would not notice any lingering data files. After CASSANDRA-6721 I gave up and changed my test environment to use a new unique keyspace name, except when I really want to exercise multiple runs against the same database.
4. As these are only tests, I disabled snapshots in the cassandra.yaml, so I'm not seeing those locks.
5. Similarly, after CASSANDRA-6721, I simply disabled saved key caching, so I'm not seeing any issues around those files.

All of which may mean I've avoided the issues mentioned in CASSANDRA-6283 and am hitting a different set of issues. I can confirm that the issue here is not exclusive to the use of logged batches. On Thursday I ran into the same COMMIT-LOG-ALLOCATOR failure using smaller, overlapped unlogged batches to the two tables (breaking the larger batch down into smaller segments, and overlapping one segment against one table while the next segment is applied to the other table). So the use of large, logged batches just makes the problem likely enough for me to see it fairly consistently. I have seen the file deletion failure in a couple of other situations where it seemed a secondary result of an earlier failure. If it helps, I will describe these below.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917195#comment-13917195 ] Bill Mitchell commented on CASSANDRA-6736:
-------------------------------------------

The first time I came across the deleteWithConfirm failure on one of the Index files, the previous run had been trying to use the SnappyCompressor on the table with lots of small rows, and it died with a Java heap space error during garbage collection:

INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is 2130051072
ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:7,5,main]
java.lang.OutOfMemoryError: Java heap space
	at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353)
	at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
	at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
	at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
	at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
	at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
	at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
	at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379)
	at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332)
	at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
	at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
	at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
	at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123)
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java (line 141) Stop listening to thrift clients
INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line 181) Stop listening for CQL clients

Soon after restart, 1.5 million rows into the test, I ran into the deleteWithConfirm failure:

INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380) Completed flushing; nothing needed to be retained. Commitlog position was ReplayPosition(segmentId=1392560341059, position=6159762)
ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line 192) Exception in thread Thread[NonPeriodicTasks:1,5,main]
FSWriteError in C:\Program Files\DataStax Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106)
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13917201#comment-13917201 ] Bill Mitchell commented on CASSANDRA-6736:
-------------------------------------------

Trying to be polite, I started using drain to shut down Cassandra before rebooting the machine. In one case, this provoked numerous "ThreadPoolExecutor has shutdown" messages underneath the compactor:

INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:23,743 StorageService.java (line 947) DRAINING: starting drain process
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:23,783 ThriftServer.java (line 141) Stop listening to thrift clients
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:24,980 Server.java (line 181) Stop listening for CQL clients
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:24,980 Gossiper.java (line 1251) Announcing shutdown
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:27,001 MessagingService.java (line 665) Waiting for messaging service to quiesce
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:27,040 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-sr@1217138300(1825983/4411193 serialized/live bytes, 29946 ops)
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:27,040 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-etol@703118381(2963818/46129889 serialized/live bytes, 68926 ops)
INFO [FlushWriter:272] 2014-02-24 08:34:27,040 Memtable.java (line 333) Writing Memtable-sr@1217138300(1825983/4411193 serialized/live bytes, 29946 ops)
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:27,054 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-events@899982591(188/1880 serialized/live bytes, 7 ops)
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:27,075 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-events_timeline@1379706298(16/160 serialized/live bytes, 1 ops)
INFO [FlushWriter:273] 2014-02-24 08:34:27,075 Memtable.java (line 333) Writing Memtable-etol@703118381(2963818/46129889 serialized/live bytes, 68926 ops)
INFO [ACCEPT-localhost/127.0.0.1] 2014-02-24 08:34:27,144 MessagingService.java (line 875) MessagingService has terminated the accept() thread
INFO [FlushWriter:272] 2014-02-24 08:34:27,411 Memtable.java (line 373) Completed flushing C:\Program Files\DataStax Community\data\data\testdb_1393207231382\sr\testdb_1393207231382-sr-jb-473-Data.db (428854 bytes) for commitlog position ReplayPosition(segmentId=1393178353775, position=18771262)
INFO [FlushWriter:272] 2014-02-24 08:34:27,411 Memtable.java (line 333) Writing Memtable-events@899982591(188/1880 serialized/live bytes, 7 ops)
INFO [FlushWriter:273] 2014-02-24 08:34:27,932 Memtable.java (line 373) Completed flushing C:\Program Files\DataStax Community\data\data\testdb_1393207231382\etol\testdb_1393207231382-etol-jb-1563-Data.db (1012805 bytes) for commitlog position ReplayPosition(segmentId=1393178353775, position=18771262)
INFO [FlushWriter:273] 2014-02-24 08:34:27,933 Memtable.java (line 333) Writing Memtable-events_timeline@1379706298(16/160 serialized/live bytes, 1 ops)
INFO [FlushWriter:272] 2014-02-24 08:34:28,366 Memtable.java (line 373) Completed flushing C:\Program Files\DataStax Community\data\data\OpsCenter\events\OpsCenter-events-jb-32-Data.db (184 bytes) for commitlog position ReplayPosition(segmentId=1393178353775, position=18771262)
INFO [FlushWriter:273] 2014-02-24 08:34:28,456 Memtable.java (line 373) Completed flushing C:\Program Files\DataStax Community\data\data\OpsCenter\events_timeline\OpsCenter-events_timeline-jb-39-Data.db (47 bytes) for commitlog position ReplayPosition(segmentId=1393178353775, position=18771262)
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:28,457 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-compaction_history@814197203(1725/19675 serialized/live bytes, 45 ops)
INFO [FlushWriter:272] 2014-02-24 08:34:28,458 Memtable.java (line 333) Writing Memtable-compaction_history@814197203(1725/19675 serialized/live bytes, 45 ops)
INFO [RMI TCP Connection(2095)-127.0.0.1] 2014-02-24 08:34:28,458 ColumnFamilyStore.java (line 784) Enqueuing flush of Memtable-sstable_activity@446592137(13500/207410 serialized/live bytes, 1442 ops)
INFO [FlushWriter:273] 2014-02-24 08:34:28,458 Memtable.java (line 333) Writing Memtable-sstable_activity@446592137(13500/207410 serialized/live bytes, 1442 ops)
INFO [FlushWriter:273] 2014-02-24 08:34:28,732 Memtable.java (line 373) Completed flushing C:\Program Files\DataStax Community\data\data\system\sstable_activity\system-sstable_activity-jb-428-Data.db (4072 bytes) for commitlog position ReplayPosition(segmentId=1393178353775, position=18771471)
INFO [FlushWriter:272] 2014-02-24 08:34:28,761 Memtable.java (line 373) Completed flushing C:\Program
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914687#comment-13914687 ] Joshua McKenzie commented on CASSANDRA-6736:
---------------------------------------------

Bill - could I get you to take a quick look at CASSANDRA-6283? There are some other potential issues with file handles on 2.0.5, and I'd like to get your take on whether you're seeing these same problems or not.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908981#comment-13908981 ] Joshua McKenzie commented on CASSANDRA-6736:
---------------------------------------------

I should have clarified - I don't think this is a hard-link file locking problem, but rather a regular old race condition on file access. Deferred delete would mimic ext* behavior on NTFS; however, if our intent is synchronized segment access, then this definitely looks like a bug. The Allocator thread checking cfLastWrite.isEmpty() on a regular HashMap to see if there are any dirty CFs, vs. CommitLogSegment.write() marking CFs dirty in that container, looks like a candidate for a race that could potentially lead to the results we're seeing. We could test this by dropping the eval time for the polling in the Allocator and slamming the CommitLog with mutations in a unit test.
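The suspected interaction above - a writer thread marking CFs dirty in a plain HashMap while the allocator thread polls isEmpty() - can be sketched as a toy model. The names (SegmentDirtyTracker, markDirty, markClean, isUnused) are illustrative and borrowed loosely from the comment, not Cassandra's actual API; the ConcurrentHashMap here stands in for the synchronization a plain HashMap would lack across threads.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the suspected race: the mutation path marks a CF dirty while
// the allocator thread polls isUnused() to decide whether the segment's file
// can be recycled or deleted. With a plain HashMap the two threads have no
// happens-before edge, so the allocator could observe a stale "empty" view
// and delete a file still in use; a ConcurrentHashMap makes the visibility
// well-defined.
class SegmentDirtyTracker {
    private final Map<UUID, Integer> cfLastWrite = new ConcurrentHashMap<>();

    // Called from the mutation path (writer thread).
    void markDirty(UUID cfId, int position) {
        cfLastWrite.put(cfId, position);
    }

    // Called once a CF has been flushed past `position`; returning null from
    // the remapping function removes the entry.
    void markClean(UUID cfId, int position) {
        cfLastWrite.computeIfPresent(cfId, (id, last) -> last <= position ? null : last);
    }

    // Polled by the allocator thread: only an empty map means the segment is
    // safe to discard.
    boolean isUnused() {
        return cfLastWrite.isEmpty();
    }
}
```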
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909395#comment-13909395 ] Benedict commented on CASSANDRA-6736:
--------------------------------------

That should be better commented, but I didn't worry about it too much the last time I massaged this code, as it's all gone in 2.1. However, that isn't a race condition, for two reasons:
1) discardCompletedSegments() is called on the same thread as CLS.write(), so they're definitely not racing.
2) If you look closely, segment.isUnused() is only called if iter.hasNext(), i.e. if it isn't the currently active (i.e. being modified) segment. Dirty marking happens before any actual modification of the CLS that would result in it moving from being the active segment. So even if they were happening on two different threads, it would be impossible for CLS.write() to be modifying it whilst isUnused() is being called.
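The second point - that isUnused() is only ever consulted for segments that are not the active tail of the queue - can be illustrated with a toy version of the iteration. The structure and names (SegmentScan, countRecyclable) are mine for illustration, not the actual CommitLog code:

```java
import java.util.Deque;
import java.util.Iterator;

// Illustrative sketch of the hasNext() guard: the loop advances through the
// segment queue but only examines isUnused() on a segment when iter.hasNext()
// is still true, i.e. when that segment is NOT the last (active) one. The
// active segment is therefore never inspected while it is being written.
class SegmentScan {
    static final class Segment {
        final boolean unused;
        Segment(boolean unused) { this.unused = unused; }
        boolean isUnused() { return unused; }
    }

    // Counts leading non-active segments that are safe to recycle; assumes
    // the deque holds at least one segment (the active tail).
    static int countRecyclable(Deque<Segment> segments) {
        int recyclable = 0;
        Iterator<Segment> iter = segments.iterator();
        Segment s = iter.next();
        while (iter.hasNext()) {           // s is not the active (last) segment
            if (!s.isUnused())
                break;                     // stop at the first dirty segment
            recyclable++;
            s = iter.next();
        }
        return recyclable;                 // the active segment was never examined
    }
}
```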
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909439#comment-13909439 ] Bill Mitchell commented on CASSANDRA-6736:
-------------------------------------------

{quote}Is this reproducible?{quote}
In a statistical sense, yes: I have encountered this failure repeatedly when trying to insert 100 million rows into two tables, one with a single wide row and the other with narrow rows. The failure does not happen at exactly the same point, which suggests it is a timing problem. The test does use randomly generated data, so one cannot completely rule out a data dependency. There are also reads intermixed with the inserts: for each segment of 10,000 rows, the test checks for duplicate values and removes them before inserting the new, unique values.

Test history of the 100 million insert test:
2014-02-04 passed
2014-02-07 passed
2014-02-10 failed after 16,707,724 rows (changing the second table to have a few wide rows)
2014-02-10 failed after 6,215,471 rows
2014-02-12 failed after 21,038,110 rows (after installing 2.0.5)
2014-02-12 failed after 63,397,406 rows
2014-02-14 failed after 33,974,034 rows
2014-02-16 passed, using unlogged batch instead of logged batch
2014-02-18 failed after 43,151,232 rows, using logged batch
2014-02-20 failed after 54,263,560 rows
2014-02-21 passed, logged batch but with only 1,000 rows inserted in each table instead of 10,000

The failures were observed on 2.0.3, 2.0.4, and 2.0.5, so they are not restricted to the most recent build. I don't have a hypothesis for why the test passed on 02-04 and 02-07. I tried the reduced batch size of 1,000 pairs of rows under the hypothesis that the failure has something to do with the large batch size causing a large commit log that needs to be compacted and cannot just be marked as complete. Of course, one success does not necessarily mean that the problem cannot happen with the smaller batch size; it may just change the timing such that one is less likely to hit the failure.
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908981#comment-13908981 ]

Joshua McKenzie commented on CASSANDRA-6736:
--------------------------------------------

As mentioned, this looks like it's probably a race condition on commit log access that we've dodged on *nix because the OS allows you to delete files even if another process has them open. Whatever solution for deferred deletion we come up with on CASSANDRA-6283 should also apply here, though we'll need to test it for this use case. Is this reproducible?

> Windows7 AccessDeniedException on commit log
> --------------------------------------------
>
>                 Key: CASSANDRA-6736
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6736
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Windows 7, quad core, 8GB RAM, single Cassandra node, Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283
>            Reporter: Bill Mitchell
>            Assignee: Joshua McKenzie
>         Attachments: 2014-02-18-22-16.log
>
> Similar to the data file deletion of CASSANDRA-6283, under heavy load with logged batches, I am seeing a problem where the commit log cannot be deleted:
>
> ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
> FSWriteError in C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
> 	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
> 	at org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217)
> 	at org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax Community\data\commitlog\CommitLog-3-1392761510706.log
> 	at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
> 	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> 	at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> 	at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
> 	at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
> 	at java.nio.file.Files.delete(Unknown Source)
> 	at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
> 	... 5 more
>
> (Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.)
>
> In this particular case, I was trying to do 100 million inserts into two tables in parallel, one with a single wide row and one with narrow rows, and the error appeared after inserting 43,151,232 rows, so it does take a while to trip over this timing issue. It may be aggravated by the size of the batches: this test was writing 10,000 rows to each table in a batch. When I switch the same test from a logged batch to an unlogged batch, no such failure appears. So the issue could be related to the use of large, logged batches, or it could be that unlogged batches just change the probability of failure.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
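The platform difference described in the comment above can be demonstrated directly: on POSIX systems, `Files.delete` unlinks the name even while another `FileChannel` still holds the file open, whereas the same call on Windows fails with the `AccessDeniedException` seen in the stack trace. A minimal sketch (the class and method names here are illustrative, not Cassandra code):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OpenFileDeleteDemo {
    // Attempts to delete a file while a channel to it is still open.
    // On *nix this succeeds (the name is unlinked; the open channel keeps
    // the inode alive). On Windows the delete below would instead throw
    // java.nio.file.AccessDeniedException, which is the failure in this ticket.
    static boolean deleteWhileOpen() throws IOException {
        Path segment = Files.createTempFile("CommitLog-demo", ".log");
        try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.WRITE)) {
            Files.delete(segment);
            return !Files.exists(segment);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("deleted while open: " + deleteWhileOpen());
    }
}
```

Run on Linux, the delete succeeds while the channel is open; this is the behavior that masks the race condition there.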
[jira] [Commented] (CASSANDRA-6736) Windows7 AccessDeniedException on commit log
[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909189#comment-13909189 ]

Jonathan Ellis commented on CASSANDRA-6736:
-------------------------------------------

In the CL case we never create hard links, so it should be possible to delete segments only once they are closed. (I'm pretty sure that's how it's supposed to work now, in which case there must be a bug.)
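The close-before-delete ordering Jonathan describes can be sketched as follows. This is an illustrative pattern under the assumption that the segment's only open handle is known to the caller; the `discard` helper here is hypothetical, not the actual `CommitLogSegment.discard` implementation:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SafeSegmentDiscard {
    // Hypothetical discard: close the segment's channel first, then delete.
    // On Windows, deleting while the channel is open fails with
    // AccessDeniedException; once every handle is closed, the delete
    // succeeds on both Windows and *nix.
    static void discard(Path segment, FileChannel ch) throws IOException {
        ch.close();
        Files.delete(segment);
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("CommitLog-demo", ".log");
        FileChannel ch = FileChannel.open(segment, StandardOpenOption.WRITE);
        discard(segment, ch);
        System.out.println("exists after discard: " + Files.exists(segment));
    }
}
```

Since there are no hard links to commit log segments, "closed" implies "no remaining handles", so this ordering is sufficient; the bug would then be a code path that reaches the delete while a channel is still open.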