[ https://issues.apache.org/jira/browse/IGNITE-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Muzafarov updated IGNITE-9040:
------------------------------------
    Labels: iep-14  (was: )

> StopNodeFailureHandler is not able to stop node correctly on node segmentation
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-9040
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9040
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.6
>            Reporter: Sergey Chugunov
>            Assignee: Sergey Chugunov
>            Priority: Major
>              Labels: iep-14
>             Fix For: 2.7
>
>
> When the *IGNITE_WAL_LOG_TX_RECORDS* flag is set, special TxRecords are
> logged to the WAL even on node stop.
> With the STOP segmentation policy, *StopNodeFailureHandler* is used to stop
> the segmented node, and it marks the node's state as invalid. As a result,
> all write requests to the WAL fail.
> So, as part of the stop-on-segmentation procedure, the node needs to log a
> TxRecord but cannot, because its state is marked as invalid. The stop
> procedure therefore finishes incorrectly, and some threads started by the
> node are not cleaned up.
> Exception example:
> {noformat}
> [2018-07-20 13:35:36,358][ERROR][node-stopper][ZookeeperDiscoverySpiTest0] Failed to pre-stop processor: GridProcessorAdapter []
> class org.apache.ignite.IgniteException: Failed to log TxRecord: TxRecord [state=PREPARED, nearXidVer=GridCacheVersion [topVer=143562918, order=1532082921780, nodeOrder=3], writeVer=GridCacheVersion [topVer=143562918, order=1532082921781, nodeOrder=1], super=TimeStampRecord [timestamp=1532082936349]]
>       at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1132)
>       at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:968)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onComplete(GridDhtTxPrepareFuture.java:983)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onDone(GridDhtTxPrepareFuture.java:717)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onDone(GridDhtTxPrepareFuture.java:105)
>       at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:462)
>       at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.cancelClientFutures(GridCacheMvccManager.java:425)
>       at org.apache.ignite.internal.processors.cache.GridCacheMvccManager.onStop(GridCacheMvccManager.java:410)
>       at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:984)
>       at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2134)
>       at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2082)
>       at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2595)
>       at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2558)
>       at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:374)
>       at org.apache.ignite.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: class org.apache.ignite.internal.pagemem.wal.StorageException: Failed to perform WAL operation (environment was invalidated by a previous error)
>       at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkNode(FileWriteAheadLogManager.java:1504)
>       at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.access$6100(FileWriteAheadLogManager.java:143)
>       at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.addRecord(FileWriteAheadLogManager.java:2611)
>       at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileWriteHandle.access$1500(FileWriteAheadLogManager.java:2521)
>       at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:758)
>       at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.state(IgniteTxAdapter.java:1127)
>       ... 15 more
> {noformat}
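
The ordering problem described in the issue can be illustrated with a small toy model. This is only a sketch and not Ignite code: ToyWal, ToyNode and stopOnSegmentation are hypothetical names, and the assumption that thread cleanup sits behind the failed WAL write in the same stop step is a simplification, since the report does not say exactly which cleanup is skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every class and method here is hypothetical, not Ignite API.
final class SegmentationStopSketch {

    /** Stand-in for the WAL: rejects all writes once the node is marked invalid. */
    static final class ToyWal {
        private boolean invalidated;
        private final List<String> records = new ArrayList<>();

        void invalidate() {
            invalidated = true;
        }

        void log(String record) {
            if (invalidated)
                throw new IllegalStateException("WAL environment was invalidated");
            records.add(record);
        }

        int size() {
            return records.size();
        }
    }

    /** Stand-in for a node with one pending transaction and some worker threads. */
    static final class ToyNode {
        final ToyWal wal = new ToyWal();
        private final boolean logTxRecords; // models IGNITE_WAL_LOG_TX_RECORDS
        private boolean workersStopped;

        ToyNode(boolean logTxRecords) {
            this.logTxRecords = logTxRecords;
        }

        /** Stop procedure: a failed pre-stop step is reported and its cleanup is skipped. */
        void stop() {
            try {
                // Cancelling the pending transaction writes a TxRecord when the flag is on.
                if (logTxRecords)
                    wal.log("TxRecord[state=PREPARED]");

                // Assumption: thread cleanup comes after the WAL write in the same step.
                workersStopped = true;
            }
            catch (IllegalStateException e) {
                System.err.println("Failed to pre-stop processor: " + e.getMessage());
            }
        }

        boolean workersStopped() {
            return workersStopped;
        }
    }

    /** Models the handler from the report: invalidate the node first, then stop it. */
    static void stopOnSegmentation(ToyNode node) {
        node.wal.invalidate();
        node.stop();
    }

    public static void main(String[] args) {
        ToyNode regular = new ToyNode(true);
        regular.stop();
        System.out.println("regular stop, workers stopped: " + regular.workersStopped());

        ToyNode segmented = new ToyNode(true);
        stopOnSegmentation(segmented);
        System.out.println("segmented stop, workers stopped: " + segmented.workersStopped());
    }
}
```

In this model a regular stop writes the TxRecord and reaches the cleanup, while the stop-on-segmentation path invalidates the WAL first, so the write throws and the cleanup is never reached. With the flag off, the same path completes, which matches the report's observation that the problem depends on IGNITE_WAL_LOG_TX_RECORDS.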



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
