[ 
https://issues.apache.org/jira/browse/IGNITE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Gura reassigned IGNITE-6228:
-----------------------------------

    Assignee: Andrey Gura

> Avoid closing page store file with ClosedByInterruptException when user 
> thread is interrupted
> ---------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-6228
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6228
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>    Affects Versions: 2.1
>            Reporter: Ivan Rakov
>            Assignee: Andrey Gura
>             Fix For: 2.3
>
>         Attachments: RestartGridTest.java
>
>
> If cache proxy is in synchronous mode, user thread may be interrupted during 
> read from file page store file. This will cause closing of partition file 
> with ClosedByInterruptException.
> Example stacktrace:
> {noformat}
> class org.apache.ignite.IgniteCheckedException: Runtime failure on lookup 
> row: 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@717729d
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070)
>       at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276)
>       at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:394)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:371)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:2952)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:61)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:52)
>       at 
> org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:38)
>       at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1012)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:198)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:868)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheGateway.leaveNoLock(GridCacheGateway.java:240)
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheGateway.leave(GridCacheGateway.java:225)
>       at 
> org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onLeave(GatewayProtectedCacheProxy.java:1680)
>       at 
> org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:875)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.db.RestartGridTest$TestService.execute(RestartGridTest.java:160)
>       at 
> org.apache.ignite.internal.processors.service.GridServiceProcessor$2.run(GridServiceProcessor.java:1160)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: class org.apache.ignite.IgniteCheckedException: Read error
>       at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:356)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:287)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:272)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:570)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:488)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:129)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.treeMeta(BPlusTree.java:822)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$7700(BPlusTree.java:81)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Get.init(BPlusTree.java:2392)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1099)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1065)
>       ... 20 more
> Caused by: java.nio.channels.ClosedByInterruptException
>       at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>       at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
>       at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:724)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:67)
>       at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:319)
>       ... 30 more
> {noformat}
> Any subsequent file operation will throw exception. Furthermore, this 
> potentially may break LFS crash recovery.
> We should either handle ClosedByInterruptException or delegate file I/O 
> operations to internal thread.
> Reproducer test is attached.
> S2R:
> 1) Comment all lines in GridAbstractTest#beforeTestsStarted
> 2) Run test until failure (usually, 10-20 runs are enough).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to