Matthias Vetter created AMQ-4171:
------------------------------------
Summary: KahaDB / Journal rotateWriteFIle() sometimes too fast?
Key: AMQ-4171
URL: https://issues.apache.org/jira/browse/AMQ-4171
Project: ActiveMQ
Issue Type: Bug
Components: Broker
Affects Versions: 5.6.0
Environment: Tested on Windows (7)
Reporter: Matthias Vetter
On very fast test machines we got predictable FileNotFoundExceptions when a
KahaDB data file was to be replaced by its successor.
See this from the log file:
30.10.2012 21:00:56.359 [ActiveMQ Transport: tcp:///192.168.111.128:3134] [ ]
master-My-Company-S1-kellner20-vm Message
ID:kellner20-vm-3125-1351618296390-23:1:1:2086:1 sent to
queue://My-Company.dxm.request.provisionToTS
30.10.2012 21:00:56.359 [BrokerService[master-My-Company-S1-kellner20-vm]
Task-197] [ ] My-Company.dxm.request.provisionToTS toPageIn: 1, Inflight: 0,
pagedInMessages.size 0, enqueueCount: 2125, dequeueCount: 2124
-------------------------------------------------------------------------------
30.10.2012 21:00:56.390 [ConcurrentQueueStoreAndDispatch] [ ] *** SEVERE ***
Called from org.apache.activemq.store.kahadb.MessageDatabase.store()
KahaDB failed to store to Journal
-------------------------------------------------------------------------------
java.io.FileNotFoundException:
activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log (The system
cannot find the path specified)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(Unknown Source)
at
org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
at
org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:309)
at
org.apache.kahadb.journal.DataFileAppender$1.run(DataFileAppender.java:188)
30.10.2012 21:00:56.375 [BrokerService[master-My-Company-S1-kellner20-vm]
Task-197] [ ]
org.apache.activemq.broker.region.cursors.QueueStorePrefetch@1afd356:My-Company.dxm.request.provisionToTS,batchResetNeeded=false,storeHasMessages=true,size=0,cacheEnabled=true
- fillBatch
30.10.2012 21:00:56.421 [ConcurrentQueueStoreAndDispatch] [ ] NOTIFICATION:
HighWater of SEVERE exceeded (value=1, highwater=1)!
30.10.2012 21:00:56.437 [ConcurrentQueueStoreAndDispatch] [ ] Stopping the
broker due to IO exception, java.io.FileNotFoundException:
activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log (The system
cannot find the path specified)
java.io.FileNotFoundException:
activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log (The system
cannot find the path specified)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(Unknown Source)
at
org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
at
org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:309)
at
org.apache.kahadb.journal.DataFileAppender$1.run(DataFileAppender.java:188)
********************
(end of log file quotation)
********************
Having evaluated all possibilities of the failure, we finally patched the
following method of the class Journal::
synchronized DataFile rotateWriteFile() {
int nextNum = !dataFiles.isEmpty() ?
dataFiles.getTail().getDataFileId().intValue() + 1 : 1;
File file = getFile(nextNum);
DataFile nextWriteFile = new DataFile(file, nextNum,
preferedFileLength);
PATCH START try {
Thread.sleep(10000);
} catch (InterruptedException e) {
e.printStackTrace();
}
PATCH END
// actually allocate the disk space
fileMap.put(nextWriteFile.getDataFileId(), nextWriteFile);
fileByFileMap.put(file, nextWriteFile);
dataFiles.addLast(nextWriteFile);
return nextWriteFile;
}
*************
This helped, the IO Exception never occurred again after this patch.
It seems that the "nextWriteFile"-object was returned, while the allocating of
the new Data File wasn't finished yet.
Anyway, it is very strange that we should have detected this problem, and it
never occurred earlier - at least no similar bug (???) seems to have been
reported. Therefore, a critical examination of possibly deeper causes could be
useful.
Best regards,
Matthias Vetter
ATOS, Frankfurt
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira