[ 
https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917195#comment-13917195
 ] 

Bill Mitchell edited comment on CASSANDRA-6736 at 3/1/14 9:15 PM:
------------------------------------------------------------------

The first time I came across the deleteWithConfirm failure on one of the Index 
files, on the previous run I was trying to use the SnappyCompressor on the 
table with lots of small rows and it died with a Java heap space error during 
garbage collection:

  INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116) 
GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is 
2130051072
 ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192) 
Exception in thread Thread[ReadStage:7,5,main]
 java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353)
        at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at 
org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
        at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
        at 
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
        at 
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
        at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123)
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
        at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
        at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560)
        at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
 INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java 
(line 141) Stop listening to thrift clients
  INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line 
181) Stop listening for CQL clients

Soon after restart, 1.5 million rows into the test, I ran into the 
deleteWithConfirm failure:

  INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380) 
Completed flushing; nothing needed to be retained.  Commitlog position was 
ReplayPosition(segmentId=1392560341059, position=6159762)
 ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line 
192) Exception in thread Thread[NonPeriodicTasks:1,5,main]
 FSWriteError in C:\Program Files\DataStax 
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106)
        at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:145)
        at 
org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:86)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown
 Source)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown
 Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.FileSystemException: C:\Program Files\DataStax 
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db: The process cannot 
access the file because it is being used by another process.

        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
        at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
        at java.nio.file.Files.delete(Unknown Source)
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
        ... 10 more
ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 FileUtils.java (line 417) 
Stopping gossiper
  WARN [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 StorageService.java (line 
278) Stopping gossip by operator request

I chalked this up to expecting that clean up might not be reliable after a heap 
failure, or perhaps this was another side-effect of dropping and re-creating a 
keyspace and table with the same name.  I did not try to reproduce this; 
instead I increased the JVM space to 2.5G and did not try the SnappyCompressor 
again on that table until much later, after I introduced the use of unlogged 
batches.  


was (Author: wtmitchell3):
The first time I came across the deleteWithConfirm failure on one of the Index 
files, on the previous run I was trying to use the SnappyCompressor on the 
table with lots of small rows and it died with a Java heap space error during 
garbage collection:

  INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116) 
GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is 
2130051072
 ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192) 
Exception in thread Thread[ReadStage:7,5,main]
 java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353)
        at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at 
org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
        at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
        at 
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
        at 
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
        at 
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
        at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123)
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
        at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at 
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
        at 
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
        at 
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
        at 
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560)
        at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
 INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java 
(line 141) Stop listening to thrift clients
  INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line 
181) Stop listening for CQL clients

Soon after restart, 1.5 million rows into the test, I ran into the 
deleteWithConfirm failure:

  INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380) 
Completed flushing; nothing needed to be retained.  Commitlog position was 
ReplayPosition(segmentId=1392560341059, position=6159762)
 ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line 
192) Exception in thread Thread[NonPeriodicTasks:1,5,main]
 FSWriteError in C:\Program Files\DataStax 
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106)
        at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:145)
        at 
org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:86)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown
 Source)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown
 Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.FileSystemException: C:\Program Files\DataStax 
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db: The process cannot 
access the file because it is being used by another process.

        at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
        at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
        at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
        at java.nio.file.Files.delete(Unknown Source)
        at 
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
        ... 10 more
ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 FileUtils.java (line 417) 
Stopping gossiper
  WARN [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 StorageService.java (line 
278) Stopping gossip by operator request

I chalked this up to expecting that clean up might not be reliable after a heap 
failure, or perhaps this was another side-effect of dropping and re-creating a 
keyspace and table with the same name.  

> Windows7 AccessDeniedException on commit log 
> ---------------------------------------------
>
>                 Key: CASSANDRA-6736
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6736
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Windows 7, quad core, 8GB RAM, single Cassandra node, 
> Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283
>            Reporter: Bill Mitchell
>            Assignee: Joshua McKenzie
>         Attachments: 2014-02-18-22-16.log
>
>
> Similar to the data file deletion of CASSANDRA-6283, under heavy load with 
> logged batches, I am seeing a problem where the Commit log cannot be deleted:
>  ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java 
> (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
>  FSWriteError in C:\Program Files\DataStax 
> Community\data\commitlog\CommitLog-3-1392761510706.log
>       at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
>       at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150)
>       at 
> org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217)
>       at 
> org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax 
> Community\data\commitlog\CommitLog-3-1392761510706.log
>       at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
>       at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>       at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>       at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
>       at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
>       at java.nio.file.Files.delete(Unknown Source)
>       at 
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
>       ... 5 more
> (Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.)
> In this particular case, I was trying to do 100 million inserts into two 
> tables in parallel, one with a single wide row and one with narrow rows, and 
> the error appeared after inserting 43,151,232 rows.  So it does take a while 
> to trip over this timing issue.  
> It may be aggravated by the size of the batches. This test was writing 10,000 
> rows to each table in a batch.  
> When I try switching the same test from using a logged batch to an unlogged 
> batch, and no such failure appears. So the issue could be related to the use 
> of large, logged batches, or it could be that unlogged batches just change 
> the probability of failure.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to