[
https://issues.apache.org/jira/browse/CASSANDRA-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917195#comment-13917195
]
Bill Mitchell commented on CASSANDRA-6736:
------------------------------------------
The first time I came across the deleteWithConfirm failure on one of the Index
files, on the previous run I was trying to use the SnappyCompressor on the
table with lots of small rows and it died with a Java heap space error during
garbage collection:
INFO [ScheduledTasks:1] 2014-02-16 01:31:12,160 GCInspector.java (line 116)
GC for ConcurrentMarkSweep: 1458 ms for 3 collections, 124495112 used; max is
2130051072
ERROR [ReadStage:7] 2014-02-16 01:31:13,820 CassandraDaemon.java (line 192)
Exception in thread Thread[ReadStage:7,5,main]
java.lang.OutOfMemoryError: Java heap space
at
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:353)
at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
at
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
at
org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
at
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at
com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
at
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:379)
at
org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332)
at
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
at
org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at
org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
at
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
at
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
at
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
at
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:123)
at
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
at
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
at
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
at
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
at
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
at
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
INFO [StorageServiceShutdownHook] 2014-02-16 01:31:13,953 ThriftServer.java
(line 141) Stop listening to thrift clients
INFO [StorageServiceShutdownHook] 2014-02-16 01:31:14,037 Server.java (line
181) Stop listening for CQL clients
Soon after restart, 1.5 million rows into the test, I ran into the
deleteWithConfirm failure:
INFO [FlushWriter:5] 2014-02-16 08:53:57,084 Memtable.java (line 380)
Completed flushing; nothing needed to be retained. Commitlog position was
ReplayPosition(segmentId=1392560341059, position=6159762)
ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,969 CassandraDaemon.java (line
192) Exception in thread Thread[NonPeriodicTasks:1,5,main]
FSWriteError in C:\Program Files\DataStax
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:106)
at org.apache.cassandra.io.sstable.SSTable.delete(SSTable.java:145)
at
org.apache.cassandra.io.sstable.SSTableDeletingTask.run(SSTableDeletingTask.java:86)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown
Source)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown
Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.file.FileSystemException: C:\Program Files\DataStax
Community\data\data\testdb\sr\testdb-sr-jb-117-Index.db: The process cannot
access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
at java.nio.file.Files.delete(Unknown Source)
at
org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
... 10 more
ERROR [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 FileUtils.java (line 417)
Stopping gossiper
WARN [NonPeriodicTasks:1] 2014-02-16 08:53:57,970 StorageService.java (line
278) Stopping gossip by operator request
I chalked this up to expecting that clean up might not be reliable after a heap
failure, or perhaps this was another side-effect of dropping and re-creating a
keyspace and table with the same name.
> Windows7 AccessDeniedException on commit log
> ---------------------------------------------
>
> Key: CASSANDRA-6736
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6736
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Windows 7, quad core, 8GB RAM, single Cassandra node,
> Cassandra 2.0.5 with leakdetect patch from CASSANDRA-6283
> Reporter: Bill Mitchell
> Assignee: Joshua McKenzie
> Attachments: 2014-02-18-22-16.log
>
>
> Similar to the data file deletion of CASSANDRA-6283, under heavy load with
> logged batches, I am seeing a problem where the Commit log cannot be deleted:
> ERROR [COMMIT-LOG-ALLOCATOR] 2014-02-18 22:15:58,252 CassandraDaemon.java
> (line 192) Exception in thread Thread[COMMIT-LOG-ALLOCATOR,5,main]
> FSWriteError in C:\Program Files\DataStax
> Community\data\commitlog\CommitLog-3-1392761510706.log
> at
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:120)
> at
> org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:150)
> at
> org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:217)
> at
> org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:95)
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.nio.file.AccessDeniedException: C:\Program Files\DataStax
> Community\data\commitlog\CommitLog-3-1392761510706.log
> at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
> at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
> at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
> at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
> at java.nio.file.Files.delete(Unknown Source)
> at
> org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:116)
> ... 5 more
> (Attached in 2014-02-18-22-16.log is a larger excerpt from the cassandra.log.)
> In this particular case, I was trying to do 100 million inserts into two
> tables in parallel, one with a single wide row and one with narrow rows, and
> the error appeared after inserting 43,151,232 rows. So it does take a while
> to trip over this timing issue.
> It may be aggravated by the size of the batches. This test was writing 10,000
> rows to each table in a batch.
> When I try switching the same test from using a logged batch to an unlogged
> batch, and no such failure appears. So the issue could be related to the use
> of large, logged batches, or it could be that unlogged batches just change
> the probability of failure.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)