[ https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917195#comment-13917195 ]
Bill Mitchell edited comment on CASSANDRA-6736 at 3/1/14 9:15 PM: ------------------------------------------------------------------ The first time I came across the deleteWithConfirm failure on one of the Index files, on the previous run I was trying to use the SnappyCompressor on the table with lots of small rows and it died with a Java heap space error during garbage collection: INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is 2130051072 ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:7,5,main] java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353) at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392) at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355) at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124) at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85) at org.apache.cassandra.db.Column$1.computeNext(Column.java:75) at org.apache.cassandra.db.Column$1.computeNext(Column.java:64) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82) at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157) at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140) at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144) at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185) at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379) INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java (line 141) Stop listening to thrift clients INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line 181) Stop listening for CQL clients Soon after restart, 1.5 million rows into the test, I ran into the deleteWithConfirm failure: INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380) Completed flushing; nothing needed to be retained. Commitlog position was ReplayPosition(segmentId=1392560341059, position=6159762) ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line 192) Exception in thread Thread[NonPeriodicTasks:1,5,main] FSWriteError in C:\Program Files\DataStax Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120) at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106) at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:145) at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:86) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.nio.file.FileSystemException: C:\Program Files\DataStax Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source) at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source) at java.nio.file.Files.delete(Unknown Source) at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116) ... 10 more ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 FileUtils.java (line 417) Stopping gossiper WARN [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 StorageService.java (line 278) Stopping gossip by operator request I chalked this up to expecting that clean up might not be reliable after a heap failure, or perhaps this was another side-effect of dropping and re-creating a keyspace and table with the same name. I did not try to reproduce this; instead I increased the JVM space to 2.5G and did not try the SnappyCompressor again on that table until much later, after I introduced the use of unlogged batches. was (Author: wtmitchell3): The first time I came across the deleteWithConfirm failure on one of the Index files, on the previous run I was trying to use the SnappyCompressor on the table with lots of small rows and it died with a Java heap space error during garbage collection: INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116) GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is 2130051072 ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:7,5,main] java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353) at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392) at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355) at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124) at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85) at org.apache.cassandra.db.Column$1.computeNext(Column.java:75) at org.apache.cassandra.db.Column$1.computeNext(Column.java:64) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82) at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157) at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140) at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144) at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185) at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379) INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java (line 141) Stop listening to thrift clients INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line 181) Stop listening for CQL clients Soon after restart, 1.5 million rows into the test, I ran into the deleteWithConfirm failure: INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380) Completed flushing; nothing needed to be retained. Commitlog position was ReplayPosition(segmentId=1392560341059, position=6159762) ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line 192) Exception in thread Thread[NonPeriodicTasks:1,5,main] FSWriteError in C:\Program Files\DataStax Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120) at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106) at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:145) at org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:86) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.nio.file.FileSystemException: C:\Program Files\DataStax Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source) at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source) at java.nio.file.Files.delete(Unknown Source) at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116) ... 10 more ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 FileUtils.java (line 417) Stopping gossiper WARN [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 StorageService.java (line 278) Stopping gossip by operator request I chalked this up to expecting that clean up might not be reliable after a heap failure, or perhaps this was another side-effect of dropping and re-creating a keyspace and table with the same name. > Windows7 AccessDeniedException on commit log > --------------------------------------------- > > Key: CASSANDRA-6736 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6736 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Windows 7, quad core, 8GB RAM, single Cassandra node, > Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283 > Reporter: Bill Mitchell > Assignee: Joshua McKenzie > Attachments: 2014-02-18-22-16.log > > > Similar to the data file deletion of CASSANDRA-6283, under heavy load with > logged batches, I am seeing a problem where the Commit log cannot be deleted: > ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java > (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main] > FSWriteError in C:\Program Files\DataStax > Community\data\commitlog\CommitLog-3-1392761510706.log > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120) > at > org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150) > at > org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217) > at > org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at java.lang.Thread.run(Unknown Source) > Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax > Community\data\commitlog\CommitLog-3-1392761510706.log > at sun.nio.fs.WindowsException.translateToIOException(Unknown Source) > at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) > at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source) > at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source) > at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source) > at java.nio.file.Files.delete(Unknown Source) > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116) > ... 5 more > (Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.) > In this particular case, I was trying to do 100 million inserts into two > tables in parallel, one with a single wide row and one with narrow rows, and > the error appeared after inserting 43,151,232 rows. So it does take a while > to trip over this timing issue. > It may be aggravated by the size of the batches. This test was writing 10,000 > rows to each table in a batch. > When I try switching the same test from using a logged batch to an unlogged > batch, and no such failure appears. So the issue could be related to the use > of large, logged batches, or it could be that unlogged batches just change > the probability of failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)