[
https://issues.apache.org/jira/browse/IGNITE-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633986#comment-16633986
]
Maxim Muzafarov commented on IGNITE-9741:
-----------------------------------------
[~akalashnikov]
Thanks for your PR. It will help me a lot!
I've found that some {{WalCompactionTest}} cases can also be affected
by multiple cluster de-activation\activation cycles (e.g.
{{.testCompressorToleratesEmptyWalSegmentsLogOnly()}}).
Can we double-check it?
I may be mistaken here. I just want to be sure that everything is OK.
For example, here is a stack trace of an exception that is probably related to
compression after de-activation\activation:
{code}
[2018-09-30 12:45:51,159][ERROR][wal-file-compressor-%wal.WalCompactionTest0%-0-#17889%wal.WalCompactionTest0%][FileWriteAheadLogManager] Compression of WAL segment [idx=0] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: Failed to initialize WAL segment: /data/teamcity/work/9198da4c51c3e112/work/db/wal/archive/wal_WalCompactionTest0/0000000000000000.wal
    at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:412)
    at org.apache.ignite.internal.processors.cache.persistence.wal.SingleSegmentLogicalRecordsIterator.advanceSegment(SingleSegmentLogicalRecordsIterator.java:109)
    at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:163)
    at org.apache.ignite.internal.processors.cache.persistence.wal.SingleSegmentLogicalRecordsIterator.advance(SingleSegmentLogicalRecordsIterator.java:119)
    at org.apache.ignite.internal.processors.cache.persistence.wal.SingleSegmentLogicalRecordsIterator.<init>(SingleSegmentLogicalRecordsIterator.java:82)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.compressSegmentToFile(FileWriteAheadLogManager.java:2144)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2072)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.access$4500(FileWriteAheadLogManager.java:2016)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressor.body(FileWriteAheadLogManager.java:1994)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
    at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.position(RandomAccessFileIO.java:48)
    at org.apache.ignite.internal.processors.cache.persistence.file.FileIODecorator.position(FileIODecorator.java:41)
    at org.apache.ignite.internal.processors.cache.persistence.wal.io.SimpleFileInput.<init>(SimpleFileInput.java:59)
    at org.apache.ignite.internal.processors.cache.persistence.wal.io.SimpleSegmentFileInputFactory.createFileInput(SimpleSegmentFileInputFactory.java:31)
    at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readSegmentHeader(RecordV1Serializer.java:258)
    at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:383)
    ... 10 more
[2018-09-30 12:46:06,204][ERROR][main][root] Test failed.
junit.framework.AssertionFailedError
    at junit.framework.Assert.fail(Assert.java:55)
    at junit.framework.Assert.assertTrue(Assert.java:22)
    at junit.framework.Assert.assertTrue(Assert.java:31)
    at junit.framework.TestCase.assertTrue(TestCase.java:201)
    at org.apache.ignite.internal.processors.cache.persistence.db.wal.WalCompactionTest.testCompressorToleratesEmptyWalSegments(WalCompactionTest.java:299)
    at org.apache.ignite.internal.processors.cache.persistence.db.wal.WalCompactionTest.testCompressorToleratesEmptyWalSegmentsLogOnly(WalCompactionTest.java:235)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2177)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:143)
    at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:2092)
    at java.lang.Thread.run(Thread.java:748)
{code}
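For reference, the {{ClosedByInterruptException}} in the trace above is what the JDK throws when a thread whose interrupt status is already set performs I/O on a {{FileChannel}}. Below is a minimal standalone sketch of that JDK behaviour only. It is not Ignite code: the class name, the method name and the temp file are made up for illustration, and it does not claim to show how the compressor thread actually got interrupted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo class, not part of Ignite. */
class InterruptedChannelDemo {
    /**
     * Returns true if a read on a FileChannel fails with
     * ClosedByInterruptException when the interrupt flag is already set.
     */
    static boolean readFailsWhenInterrupted() {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("wal-demo", ".wal");
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Simulate a worker thread that was interrupted earlier.
                Thread.currentThread().interrupt();
                try {
                    ch.read(ByteBuffer.allocate(16));
                    return false;
                }
                catch (ClosedByInterruptException e) {
                    // The JDK closed the channel because of the pending interrupt.
                    return true;
                }
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        finally {
            // Clear the flag so that the caller is not affected by the demo.
            Thread.interrupted();
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                }
                catch (IOException ignored) {
                    // Best-effort cleanup of the temp file.
                }
            }
        }
    }
}
```

Once the channel has been closed this way it cannot be reused, so any reader built on top of it has to be re-created.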
> SegmentArchivedStorage and SegmentCompressStorage remain `interrupted` after
> de-activation occurs before activation
> ------------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-9741
> URL: https://issues.apache.org/jira/browse/IGNITE-9741
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.6
> Reporter: Maxim Muzafarov
> Assignee: Anton Kalashnikov
> Priority: Critical
> Fix For: 2.7
>
>
> The {{FileWriteAheadLogManager}} now contains:
> {{private final SegmentAware segmentAware;}}
>
> The SegmentAware has the `interrupt()` method:
> {code:java|title=SegmentAware:216}
> /**
> * Interrupt waiting on related objects.
> */
> public void interrupt() {
> segmentArchivedStorage.interrupt();
> segmentCompressStorage.interrupt();
> segmentCurrStateStorage.interrupt();
> }
> {code}
>
> On de-activation of the {{FileWriteAheadLogManager}}, this method sets the
> `interrupted` field (e.g. in SegmentArchivedStorage) to `true`, but the field
> is never reverted to `false` after activation.
> {code:java|title=SegmentArchivedStorage:114}
> /**
> * Interrupt waiting on this object.
> */
> synchronized void interrupt() {
> interrupted = true;
> notifyAll();
> }
> {code}
>
> So, after de-activation the SegmentArchivedStorage always remains interrupted.
> This can lead to undefined behaviour, e.g. the exchange worker hangs.
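
The flag handling described above could be sketched as follows. This is a simplified, hypothetical stand-in for SegmentArchivedStorage and not the actual Ignite class or the actual fix: the {{reset()}} method, the field names and the use of {{IllegalStateException}} are all assumptions made for illustration.

```java
/** Hypothetical stand-in for SegmentArchivedStorage; only the flag handling is modelled. */
class SegmentStorageSketch {
    /** Set on de-activation, see interrupt(). */
    private boolean interrupted;

    /** Index of the last archived segment (assumed field, -1 means none). */
    private long lastArchivedIdx = -1;

    /** Interrupt waiting on this object (same shape as in the issue description). */
    synchronized void interrupt() {
        interrupted = true;
        notifyAll();
    }

    /** Assumed counterpart to be called on activation; the name is hypothetical. */
    synchronized void reset() {
        interrupted = false;
    }

    /** Marks a segment as archived and wakes up waiters. */
    synchronized void setLastArchivedIdx(long idx) {
        lastArchivedIdx = idx;
        notifyAll();
    }

    /** Waits until the segment is archived; fails if the storage is interrupted. */
    synchronized void awaitSegmentArchived(long idx) {
        try {
            while (lastArchivedIdx < idx && !interrupted)
                wait();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Thread was interrupted while waiting", e);
        }

        // Without reset() this check fails forever after the first de-activation.
        if (interrupted)
            throw new IllegalStateException("Storage is interrupted");
    }

    synchronized boolean isInterrupted() {
        return interrupted;
    }
}
```

In this sketch, every call to {{awaitSegmentArchived()}} fails once {{interrupt()}} has been called, until {{reset()}} is invoked. That is the kind of behaviour that would be expected if the flag is never cleared on activation.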