[jira] [Created] (IGNITE-12128) Potentially pds corruption on a failed node during checkpoint
Dmitriy Govorukhin created IGNITE-12128: --- Summary: Potentially pds corruption on a failed node during checkpoint Key: IGNITE-12128 URL: https://issues.apache.org/jira/browse/IGNITE-12128 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin There is a case where we start a checkpoint but have not yet created the checkpoint (CP) marker file, while PageMemory may already have started flushing dirty checkpoint pages to the page store. If the node crashes at this moment, we can end up in an inconsistent state: the checkpoint marker has not been written to disk yet, but some pages for this checkpoint already have. If we try to recover from this state, we can get any sort of corruption problem, because the recovery logic may not recognize that the crash happened during a checkpoint. -- This message was sent by Atlassian Jira (v8.3.2#803003)
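The ordering invariant behind IGNITE-12128 can be sketched as follows. This is a minimal illustration and not Ignite code: the class `CheckpointMarkerSketch`, its methods and the `<id>-START.bin` / `<id>-END.bin` file names are assumptions made for the example. The idea is that the start marker is written before any page is flushed, so recovery can detect a crash that happened mid-checkpoint.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch (not Ignite code): write the marker before flushing pages. */
final class CheckpointMarkerSketch {
    /** Creates a scratch directory for the example. */
    static Path newTempDir() {
        try {
            return Files.createTempDirectory("cp-sketch");
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates an empty marker file; SYNC requests a synchronous write of the file. */
    private static void writeMarker(Path file) {
        try {
            Files.write(file, new byte[0], StandardOpenOption.CREATE_NEW, StandardOpenOption.SYNC);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Must be called before the first page of this checkpoint is written to the page store. */
    static void begin(Path dir, long cpId) {
        writeMarker(dir.resolve(cpId + "-START.bin"));
    }

    /** Must be called only after all pages of this checkpoint have been written. */
    static void end(Path dir, long cpId) {
        writeMarker(dir.resolve(cpId + "-END.bin"));
    }

    /** Recovery check: a START marker without an END marker means the crash was mid-checkpoint. */
    static boolean crashedDuringCheckpoint(Path dir, long cpId) {
        return Files.exists(dir.resolve(cpId + "-START.bin"))
            && !Files.exists(dir.resolve(cpId + "-END.bin"));
    }
}
```

With this ordering, pages written between `begin` and `end` are always covered by a visible START marker. A real implementation would also need to make the directory entry durable, which the sketch leaves out.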
[jira] [Created] (IGNITE-12127) WAL writer may close file IO with unflushed changes when MMAP is disabled
Dmitriy Govorukhin created IGNITE-12127: --- Summary: WAL writer may close file IO with unflushed changes when MMAP is disabled Key: IGNITE-12127 URL: https://issues.apache.org/jira/browse/IGNITE-12127 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Most likely the issue manifests itself as the following critical error: {code} 2019-08-27 14:52:31.286 ERROR 26835 --- [wal-write-worker%null-#447] ROOT : Critical system error detected. Will be handled accordingly to configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to write buffer.]] org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to write buffer. at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3444) [ignite-core-2.5.7.jar!/:2.5.7] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.body(FileWriteAheadLogManager.java:3249) [ignite-core-2.5.7.jar!/:2.5.7] at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.5.7.jar!/:2.5.7] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201] Caused by: java.nio.channels.ClosedChannelException: null at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110) ~[na:1.8.0_201] at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:253) ~[na:1.8.0_201] at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.position(RandomAccessFileIO.java:48) ~[ignite-core-2.5.7.jar!/:2.5.7] at org.apache.ignite.internal.processors.cache.persistence.file.FileIODecorator.position(FileIODecorator.java:41) ~[ignite-core-2.5.7.jar!/:2.5.7] at org.apache.ignite.internal.processors.cache.persistence.file.AbstractFileIO.writeFully(AbstractFileIO.java:111) 
~[ignite-core-2.5.7.jar!/:2.5.7] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3437) [ignite-core-2.5.7.jar!/:2.5.7] ... 3 common frames omitted {code}
It appears that the following sequence is possible:
* Thread A attempts to log a large record which does not fit into the current segment; {{addRecord}} fails and thread A starts segment rollover. It successfully runs {{flushOrWait(null)}} and gets de-scheduled before adding the switch segment record
* Thread B attempts to log another record, which fits exactly up to the end of the current segment. The record is added to the buffer
* Thread A resumes and fails to add the switch segment record. No flush is performed and the thread immediately proceeds to the WAL writer close
* The WAL writer thread wakes up, sees that there is a CLOSE request, closes the file IO and immediately proceeds to write the unflushed changes, causing the exception.
An unconditional flush after the switch segment record write should fix the issue.
Besides the bug itself, I suggest the following changes to {{FileWriteHandleImpl}} ({{FileWriteAheadLogManager}} in earlier versions):
* There is an {{fsync(filePtr)}} call inside {{close()}}; however, {{fsync()}} checks the {{stop}} flag (which is set inside {{close}}) and returns immediately after {{flushOrWait()}} if the flag is set - this is very confusing. After all, {{close()}} itself explicitly calls {{force}} after the flush
* There is an ignored IO exception in mmap mode - this should be propagated to the failure handler
* In the WAL writer, we check for file CLOSE and then attempt to write to (possibly) the same write handle - the write should always come before the close
* In the WAL writer, there are racy reads of the current handle - it would be better to read the current handle once and then operate on it during the whole loop iteration
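The last two suggestions in IGNITE-12127 (write before close, read the handle once per iteration) can be illustrated with a simplified, single-threaded model. This is only a sketch under stated assumptions: `WalWriterSketch`, `Handle` and `iterate` are hypothetical names, the "file" is an in-memory buffer, and the synchronization a real WAL writer needs is left out.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical single-threaded model of one WAL writer loop iteration (not Ignite code). */
final class WalWriterSketch {
    /** Stand-in for a write handle: buffered records plus an in-memory "file". */
    static final class Handle {
        final Queue<byte[]> pending = new ArrayDeque<>();
        final ByteArrayOutputStream file = new ByteArrayOutputStream();
        boolean closeRequested;
        boolean closed;

        /** Mimics the failure from the stack trace: writing to a closed channel fails. */
        void write(byte[] rec) {
            if (closed)
                throw new IllegalStateException("channel is closed");

            file.write(rec, 0, rec.length);
        }
    }

    /** The handle is passed in once and used for the whole iteration: drain first, close last. */
    static void iterate(Handle h) {
        byte[] rec;

        // Flush everything that was buffered before looking at the close request.
        while ((rec = h.pending.poll()) != null)
            h.write(rec);

        // Only now honor the CLOSE request, so no unflushed data is left behind.
        if (h.closeRequested)
            h.closed = true;
    }
}
```

Checking `closeRequested` before draining `pending` would reproduce the reported failure in this model, because `write` would then be called on a closed handle.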
[jira] [Created] (IGNITE-12110) Bugs & tests fixes
Dmitriy Govorukhin created IGNITE-12110: --- Summary: Bugs & tests fixes Key: IGNITE-12110 URL: https://issues.apache.org/jira/browse/IGNITE-12110 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-12102) idle_verify should show info about lost partitions
Dmitriy Govorukhin created IGNITE-12102: --- Summary: idle_verify should show info about lost partitions Key: IGNITE-12102 URL: https://issues.apache.org/jira/browse/IGNITE-12102 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin In the current implementation, idle_verify does not show lost partitions, so the check reports that everything is fine even though that is not true.
[jira] [Created] (IGNITE-12081) Page replacement can reload invalid page during checkpoint
Dmitriy Govorukhin created IGNITE-12081: --- Summary: Page replacement can reload invalid page during checkpoint Key: IGNITE-12081 URL: https://issues.apache.org/jira/browse/IGNITE-12081 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin
There is a race between {{writeCheckpointPages}} and the page replacement process:
* Checkpointer thread begins a checkpoint
* Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page content *and clear dirty flag*
* Page replacement tries to find a page for replacement and chooses this page; the page is thrown away
* Before the page is written back to the store, the page is acquired again. As a result, an older copy of the page is brought back to memory, which causes all kinds of corruption exceptions and assertions.
The attached unit test demonstrates the issue. It is likely that all baselines are affected starting from 2.4.
As a part of this ticket, we must add more unit tests for the checkpointing protocol invariants we rely on. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (IGNITE-12060) Incorrect row size calculation, lead to tree corruption.
Dmitriy Govorukhin created IGNITE-12060: --- Summary: Incorrect row size calculation, lead to tree corruption. Key: IGNITE-12060 URL: https://issues.apache.org/jira/browse/IGNITE-12060 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.8
[jira] [Created] (IGNITE-12057) Persistence files are stored to temp dir
Dmitriy Govorukhin created IGNITE-12057: --- Summary: Persistence files are stored to temp dir Key: IGNITE-12057 URL: https://issues.apache.org/jira/browse/IGNITE-12057 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin h2. Description Check this thread: [https://stackoverflow.com/questions/56951913/ignite-persistent-schema-tables-disappeared-sometimes/56977212#56977212] This prospect almost dropped us because the company could not figure out why persistence files disappeared upon restarts. They had turned off the WARN logging level and could not see our warning saying that the files are written to a temp directory. I've updated Ignite docs: [https://apacheignite.readme.io/docs/distributed-persistent-store#section-persistence-path-management]
[jira] [Created] (IGNITE-12048) Bugs & tests fixes
Dmitriy Govorukhin created IGNITE-12048: --- Summary: Bugs & tests fixes Key: IGNITE-12048 URL: https://issues.apache.org/jira/browse/IGNITE-12048 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
- Page replacement can reload invalid page during checkpoint
There is a race between {{writeCheckpointPages}} and the page replacement process:
* Checkpointer thread begins a checkpoint
* Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page content *and clear dirty flag*
* Page replacement tries to find a page for replacement and chooses this page; the page is thrown away
* Before the page is written back to the store, the page is acquired again. As a result, an older copy of the page is brought back to memory, which causes all kinds of corruption exceptions and assertions.
- checkpointReadLock() may hang during node stop
I got this hang during one of the PDS (Indexing) runs (thread dump is attached). The following code hangs:
{code:java}
checkpointer.wakeupForCheckpoint(0, "too many dirty pages").cpBeginFut
    .getUninterruptibly();
{code}
It looks like {{wakeupForCheckpoint}} can be called after the checkpointer is stopped, in which case {{cpBeginFut}} will never be completed.
- Fixed ZookeeperDiscoveryCommunicationFailureTest.testCommunicationFailureResolve_CachesInfo1
- Fixed *.testFailAfterStart
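One way to avoid the {{checkpointReadLock()}} hang described in IGNITE-12048 is to make sure that every future handed out by the checkpointer is eventually completed, including on stop. The sketch below shows that idea with `CompletableFuture`. `CheckpointerSketch` and its methods are hypothetical names for illustration and do not claim to be the fix that was applied in Ignite.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch (not Ignite code): no checkpoint-begin future is left incomplete on stop. */
final class CheckpointerSketch {
    private boolean stopped;

    /** Future that callers wait on until the next checkpoint begins. */
    private final CompletableFuture<Void> cpBeginFut = new CompletableFuture<>();

    /** After stop, returns an already-failed future instead of one that never completes. */
    synchronized CompletableFuture<Void> wakeupForCheckpoint(String reason) {
        if (stopped) {
            CompletableFuture<Void> failed = new CompletableFuture<>();

            failed.completeExceptionally(
                new IllegalStateException("Checkpointer is stopped, reason ignored: " + reason));

            return failed;
        }

        return cpBeginFut;
    }

    /** Fails the pending future so that threads already waiting on it are released. */
    synchronized void stop() {
        stopped = true;

        cpBeginFut.completeExceptionally(new IllegalStateException("Node is stopping"));
    }
}
```

A caller blocked on the returned future then gets an exception on node stop instead of waiting forever.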
[jira] [Created] (IGNITE-11953) BTree corruption caused by byte array values
Dmitriy Govorukhin created IGNITE-11953: --- Summary: BTree corruption caused by byte array values Key: IGNITE-11953 URL: https://issues.apache.org/jira/browse/IGNITE-11953 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin In some cases for caches with cache group, we can get BTree corruption exception. {code} 09:53:58,890][SEVERE][sys-stripe-10-#11][] Critical system error detected. Will be handled accordingly to configured handler [hnd=CustomFailureHandler [ignoreCriticalErrors=false, disabled=false][StopNodeOrHaltFailureHandler [tryStop=false, timeout=0]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.transactions.IgniteTxHeuristicCheckedException: Committing a transaction has produced runtime exception]]class org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: Committing a transaction has produced runtime exception at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.heuristicException(IgniteTxAdapter.java:800) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:922) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:799) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:608) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:478) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:535) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1055) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:931) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:887) at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:117) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:209) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:207) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1129) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:594) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1568) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1196) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1092) at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:504) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) at java.lang.Thread.run(Thread.java:748) Caused by: class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=427, val=Grkg1DUF3yQE6tC9Se50mi5w.T, hasValBytes=true], hash=1872857770, cacheId=-420893003] at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1811) at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1620) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1603) at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2131) at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:442) at
[jira] [Created] (IGNITE-11934) Bugs & tests fixes
Dmitriy Govorukhin created IGNITE-11934: --- Summary: Bugs & tests fixes Key: IGNITE-11934 URL: https://issues.apache.org/jira/browse/IGNITE-11934 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
This issue contains fixes for several issues:
* AssertionError occurs on the client when the coordinator is killed (with ZK discovery)
* IgniteVersionUtils#BUILD_TSTAMP_DATE_FORMATTER is used in a non-thread-safe manner.
* Possible discovery race on node joining with Authenticator.
* PageLocksCommand#parseArguments cannot properly parse the user and password arguments if they are at the end of the argument list.
* Test CheckpointFreeListTest.testRestoreFreeListCorrectlyAfterRandomStop failed on TC
* IgniteWalFlushBackgroundSelfTest.testFailWhileStart & IgniteWalFlushLogOnlySelfTest.testFailWhileStart fail in the disk compression suite.
* IgniteClientConnectAfterCommunicationFailureTest fails
* Add scale factor for PageLockTrackerTests
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
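On the formatter item in IGNITE-11934: `java.text.SimpleDateFormat` instances are not synchronized, so sharing one in a static field across threads is unsafe, while `java.time.format.DateTimeFormatter` is immutable and thread-safe. The sketch below shows the latter option. The class name `BuildTimestampSketch`, the `yyyyMMdd` pattern and the UTC zone are assumptions for the example, not a statement of how the issue was fixed in Ignite.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: a shared formatter that is safe to use from many threads. */
final class BuildTimestampSketch {
    /** DateTimeFormatter is immutable, so a single static instance can be shared. */
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    /** Formats a build timestamp given in seconds since the epoch. */
    static String format(long epochSeconds) {
        return FMT.format(Instant.ofEpochSecond(epochSeconds));
    }
}
```

If `SimpleDateFormat` has to be kept, the usual alternatives are a per-thread instance or external synchronization.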
[jira] [Created] (IGNITE-11835) Support JMX/control.sh API for page lock dump
Dmitriy Govorukhin created IGNITE-11835: --- Summary: Support JMX/control.sh API for page lock dump Key: IGNITE-11835 URL: https://issues.apache.org/jira/browse/IGNITE-11835 Project: Ignite Issue Type: Sub-task Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-11824) Integrate PageLockTracker to DataStructure (per-thread tracker)
Dmitriy Govorukhin created IGNITE-11824: --- Summary: Integrate PageLockTracker to DataStructure (per-thread tracker) Key: IGNITE-11824 URL: https://issues.apache.org/jira/browse/IGNITE-11824 Project: Ignite Issue Type: Sub-task Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin After [IGNITE-11750] is completed, we will have a structure for tracking page locks per thread. The next step is to integrate it into the diagnostic API and implement a component that creates this structure for each thread.
[jira] [Created] (IGNITE-11786) Implement thread-local stack for trucking page locks
Dmitriy Govorukhin created IGNITE-11786: --- Summary: Implement thread-local stack for trucking page locks Key: IGNITE-11786 URL: https://issues.apache.org/jira/browse/IGNITE-11786 Project: Ignite Issue Type: Sub-task Reporter: Dmitriy Govorukhin The new structure should work as a stack. When a thread obtains a lock, we push the pageId (+meta) onto the top of the stack; when the thread releases the lock, we pop the pageId from the stack. There are cases when a thread unlocks a page that was locked not in the current frame but in a previous one (for example, during some page splits in the B-tree); in this case, we should walk down the stack, find this page and update its meta.
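The stack described in IGNITE-11786 can be sketched as below. This is a minimal illustration under stated assumptions: `PageLockStackSketch` and its methods are hypothetical names, the per-entry meta is omitted, and an out-of-order unlock simply removes the entry where the ticket speaks of updating its meta.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Ignite code): per-thread stack of locked page ids. */
final class PageLockStackSketch {
    /** One independent stack per thread, created lazily on first access. */
    private static final ThreadLocal<PageLockStackSketch> PER_THREAD =
        ThreadLocal.withInitial(PageLockStackSketch::new);

    /** The last element is the top of the stack. */
    private final List<Long> stack = new ArrayList<>();

    /** Returns the stack that belongs to the calling thread. */
    static PageLockStackSketch current() {
        return PER_THREAD.get();
    }

    /** Called when the thread obtains a page lock: push onto the top. */
    void onLock(long pageId) {
        stack.add(pageId);
    }

    /**
     * Called when the thread releases a page lock. Usually the page is on top,
     * but the search walks down the stack to support out-of-order unlocks.
     *
     * @return {@code true} if the page was found and removed.
     */
    boolean onUnlock(long pageId) {
        for (int i = stack.size() - 1; i >= 0; i--) {
            if (stack.get(i) == pageId) {
                stack.remove(i);

                return true;
            }
        }

        return false;
    }

    /** Copy of the current content, bottom first, for example for a diagnostic dump. */
    List<Long> snapshot() {
        return new ArrayList<>(stack);
    }
}
```

For example, after locking pages 1, 2 and 3 and then unlocking page 2, `snapshot()` contains 1 and 3, with 3 still on top.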
[jira] [Created] (IGNITE-11738) Incorrect check ObjectInput.available() in CacheMetricsSnapshot
Dmitriy Govorukhin created IGNITE-11738: --- Summary: Incorrect check ObjectInput.available() in CacheMetricsSnapshot Key: IGNITE-11738 URL: https://issues.apache.org/jira/browse/IGNITE-11738 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-11641) Server node copies a lot of WAL files in WAL archive after restart
Dmitriy Govorukhin created IGNITE-11641: --- Summary: Server node copies a lot of WAL files in WAL archive after restart Key: IGNITE-11641 URL: https://issues.apache.org/jira/browse/IGNITE-11641 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Pre-condition: PDS is enabled, wal_path and wal_archive_path are set in config file. 1. Cluster is up and running. Some data uploaded into caches. 2. Start load to generate a lot of files in wal archive (more than files in wal directory). 3. Stop some node and delete all files from wal archive. 4. Start node. In this case node copies WAL files from WAL dir into wal archive dir again and again until the amount of files will be the same it was in wal archive before stop. Here is information from server node log {code} 10:10:17,054][INFO][main][GridCacheDatabaseSharedManager] Restoring partition state for local groups. [10:10:18,522][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal, dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal] [10:10:18,523][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=1, segIdx=1, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal, dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal] [10:10:20,631][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal, dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal] [10:10:20,632][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=2, segIdx=2, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal, 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal] [10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal, dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal] [10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=3, segIdx=3, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal, dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal] [10:10:23,995][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal, dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal] [10:10:23,996][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=4, segIdx=4, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal, dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal] [10:10:24,644][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal, dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal] [10:10:24,645][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=5, segIdx=5, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal, dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal] [10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal, 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal] [10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Starting to copy WAL segment [absIdx=6, segIdx=6, origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal, dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal] [10:10:26,043][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] Copied file [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal,
[jira] [Created] (IGNITE-11509) Remove DistributedBaselineConfiguration and replace to methods on IgniteCluster
Dmitriy Govorukhin created IGNITE-11509: --- Summary: Remove DistributedBaselineConfiguration and replace to methods on IgniteCluster Key: IGNITE-11509 URL: https://issues.apache.org/jira/browse/IGNITE-11509 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-11095) Failed WalCompactionTest flaky test
Dmitriy Govorukhin created IGNITE-11095: --- Summary: Failed WalCompactionTest flaky test Key: IGNITE-11095 URL: https://issues.apache.org/jira/browse/IGNITE-11095 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-10974) Grid may hangs if an exception is thrown from PageMemoryImpl.beforeReleaseWrite()
Dmitriy Govorukhin created IGNITE-10974: --- Summary: Grid may hangs if an exception is thrown from PageMemoryImpl.beforeReleaseWrite() Key: IGNITE-10974 URL: https://issues.apache.org/jira/browse/IGNITE-10974 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin {code} [2019-01-17 14:35:15,953][WARN ][main][root] Thread dump at 2019/01/17 14:35:15 UTC [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#857%failure.IoomFailureHandlerTest0%", id=931, state=TIMED_WAITING, blockCnt=0, waitCnt=1] [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1] [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [17:35:15]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748) [17:35:15]W: [org.apache.ignite:ignite-core] [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#856%failure.IoomFailureHandlerTest0%", id=930, state=TIMED_WAITING, blockCnt=0, waitCnt=1] [17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1] [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [17:35:15]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748) [17:35:15]W: [org.apache.ignite:ignite-core] [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#855%failure.IoomFailureHandlerTest0%", id=929, state=TIMED_WAITING, blockCnt=0, waitCnt=1] [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1] [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) [17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) [17:35:15]W: [org.apache.ignite:ignite-core] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [17:35:15]W: [org.apache.ignite:ignite-core] at java.lang.Thread.run(Thread.java:748) [17:35:15]W: [org.apache.ignite:ignite-core] [17:35:15]W: [org.apache.ignite:ignite-core] Thread [name="sys-#854%failure.IoomFailureHandlerTest0%", id=928, state=TIMED_WAITING, blockCnt=0, waitCnt=1] [17:35:15]W: [org.apache.ignite:ignite-core] Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec, ownerName=null, ownerId=-1] [17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native Method) [17:35:15]W:
[jira] [Created] (IGNITE-10909) GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky test fail in Cache 1
Dmitriy Govorukhin created IGNITE-10909: --- Summary: GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky test fail in Cache 1 Key: IGNITE-10909 URL: https://issues.apache.org/jira/browse/IGNITE-10909 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin
[jira] [Created] (IGNITE-10908) GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail with NPE in Service Grid (legacy mode)
Dmitriy Govorukhin created IGNITE-10908: --- Summary: GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail with NPE in Service Grid (legacy mode) Key: IGNITE-10908 URL: https://issues.apache.org/jira/browse/IGNITE-10908 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin
[jira] [Created] (IGNITE-10907) IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky failed in Basic 1
Dmitriy Govorukhin created IGNITE-10907: --- Summary: IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky failed in Basic 1 Key: IGNITE-10907 URL: https://issues.apache.org/jira/browse/IGNITE-10907 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin
[jira] [Created] (IGNITE-10891) IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead flaky in PDS indexing
Dmitriy Govorukhin created IGNITE-10891: --- Summary: IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead flaky in PDS indexing Key: IGNITE-10891 URL: https://issues.apache.org/jira/browse/IGNITE-10891 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin
[jira] [Created] (IGNITE-10883) IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky failed in PDS4
Dmitriy Govorukhin created IGNITE-10883: --- Summary: IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky failed in PDS4 Key: IGNITE-10883 URL: https://issues.apache.org/jira/browse/IGNITE-10883 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin [testStopCachesOnDeactivation|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-3436991258700651390=testDetails] The first problem is in the test: it does not check that rebalance has completed after the test action is performed. The second problem is in an assert: there is no guarantee that the cache will not be destroyed before the checkpoint completes. {code} Failed to notify listener: o.a.i.i.processors.cache.WalStateManager$3...@31e26a1java.lang.AssertionError at org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:510) at org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:505) at org.apache.ignite.internal.util.lang.IgniteInClosureX.apply(IgniteInClosureX.java:38) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4280) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4275) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490) at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:456) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3904) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3353) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3119) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) {code}
[jira] [Created] (IGNITE-10508) Need to support the new checkpoint feature not wait for the previous operation to complete
Dmitriy Govorukhin created IGNITE-10508: --- Summary: Need to support the new checkpoint feature not wait for the previous operation to complete Key: IGNITE-10508 URL: https://issues.apache.org/jira/browse/IGNITE-10508 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin There are cases when we must trigger a checkpoint so that some operations can be sure that all operations finished before the checkpoint. It is necessary to support running a checkpoint without waiting for the previous checkpoint to complete. Solution: Merge checkpoint pages and append new dirty pages to the current checkpoint. Restrictions: - Triggering a new checkpoint should not wait for the previous checkpoint operation to complete. - It should not break crash recovery mechanisms. - Only one merge is allowed in the first implementation (potential OOM if we try to merge many checkpoint operations). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10341) Missed loss policy tests with persistence
Dmitriy Govorukhin created IGNITE-10341: --- Summary: Missed loss policy tests with persistence Key: IGNITE-10341 URL: https://issues.apache.org/jira/browse/IGNITE-10341 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin After IGNITE-10207 was implemented, the test that checks the loss policy when persistence is enabled was removed. This was a mistake; the change needs to be reverted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10290) Map.Entry interface for key cache may lead to incorrect calculation hash code
Dmitriy Govorukhin created IGNITE-10290: --- Summary: Map.Entry interface for key cache may lead to incorrect calculation hash code Key: IGNITE-10290 URL: https://issues.apache.org/jira/browse/IGNITE-10290 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Attachments: Reproducer.java If the Map.Entry interface is used for a key, we may look up (key, value) in the store with an incorrectly calculated hash code for the binary representation. The problem is that GridPartitionedSingleGetFuture#localGet() and GridPartitionedGetFuture#localGet() do not execute prepareForCache before reading cacheDataRow from the row store. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10285) U.doInParallel may lead to deadlock
Dmitriy Govorukhin created IGNITE-10285: --- Summary: U.doInParallel may lead to deadlock Key: IGNITE-10285 URL: https://issues.apache.org/jira/browse/IGNITE-10285 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Attachments: dump.rtf There is a case in which we can get a deadlock on the thread pool. If doInParallel is called from sys-pool threads and the number of such threads equals the sys-pool size, we get a deadlock: the threads in sys-pool run doInParallel through the same sys-pool and wait on the future indefinitely, because no thread is free to complete the doInParallel operation, which itself requires threads from sys-pool. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
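The starvation pattern described in IGNITE-10285 can be reproduced with plain java.util.concurrent, without any Ignite code. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, a fixed thread pool stands in for sys-pool, and a 200 ms timeout stands in for the indefinite wait so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the sys-pool scenario; not Ignite code. */
public class PoolSelfDeadlockSketch {
    /** Returns how many callers timed out waiting for work submitted to their own pool. */
    static int countStarvedCallers(int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        CountDownLatch allGaveUp = new CountDownLatch(poolSize);
        AtomicInteger starved = new AtomicInteger();
        List<Future<Object>> outers = new ArrayList<>();

        try {
            for (int i = 0; i < poolSize; i++) {
                outers.add(pool.submit(() -> {
                    // Make sure every pool thread is occupied by an outer task.
                    allStarted.countDown();
                    allStarted.await();

                    // The outer task needs a thread from the same pool to make progress.
                    Future<?> inner = pool.submit(() -> { });

                    try {
                        inner.get(200, TimeUnit.MILLISECONDS); // Unbounded in the real scenario.
                    }
                    catch (TimeoutException e) {
                        starved.incrementAndGet();
                    }
                    finally {
                        allGaveUp.countDown();
                    }

                    // Hold the thread until every caller has given up, to keep the demo deterministic.
                    allGaveUp.await();

                    return null;
                }));
            }

            for (Future<Object> f : outers)
                f.get();
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
        finally {
            pool.shutdown();
        }

        return starved.get();
    }

    public static void main(String[] args) {
        System.out.println(countStarvedCallers(2));
    }
}
```

With a pool of two threads both callers starve. This is consistent with the idea of either bounding how many pool threads may run such nested work or submitting the nested work to a different pool.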
[jira] [Created] (IGNITE-10252) Cache.get() may be mapped to the node with partition state is "MOVING"
Dmitriy Govorukhin created IGNITE-10252: --- Summary: Cache.get() may be mapped to the node with partition state is "MOVING" Key: IGNITE-10252 URL: https://issues.apache.org/jira/browse/IGNITE-10252 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin After IGNITE-5357 was implemented, in some cases a get may be mapped to a node whose partition state is "MOVING" for a PARTITIONED cache, which may lead to assertion errors (we do not allow reads from moving partitions). The original issue discussed only replicated caches; it is not clear why this was implemented for partitioned caches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10207) Missed loss policy checks
Dmitriy Govorukhin created IGNITE-10207: --- Summary: Missed loss policy checks Key: IGNITE-10207 URL: https://issues.apache.org/jira/browse/IGNITE-10207 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin In some cases (client reconnect, new client join, etc.) PartitionLossPolicy may validate an operation incorrectly: null is returned for READ_ONLY_SAFE on a lost partition. To reproduce, run CacheResultIsNotNullOnPartitionLossTest (1000 times) with random node stops. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9898) Checkpointer thread hangs on await async task complete
Dmitriy Govorukhin created IGNITE-9898: -- Summary: Checkpointer thread hangs on await async task complete Key: IGNITE-9898 URL: https://issues.apache.org/jira/browse/IGNITE-9898 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin In some cases, the thread pool counters can be reset while an async task is executing, and the checkpointer thread then hangs on await: {code} [19:36:01] : [Step 4/5] [2018-10-15 16:36:01,435][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:03] : [Step 4/5] [2018-10-15 16:36:03,435][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:05] : [Step 4/5] [2018-10-15 16:36:05,436][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:07] : [Step 4/5] [2018-10-15 16:36:07,436][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:09] : [Step 4/5] [2018-10-15 16:36:09,437][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:11] : [Step 4/5] [2018-10-15 16:36:11,437][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted,
pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:13] : [Step 4/5] [2018-10-15 16:36:13,438][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:15] : [Step 4/5] [2018-10-15 16:36:15,439][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:17] : [Step 4/5] [2018-10-15 16:36:17,440][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:19] : [Step 4/5] [2018-10-15 16:36:19,441][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 [19:36:21] : [Step 4/5] [2018-10-15 16:36:21,442][INFO ][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager] Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, initialized=true, err=null, activeCnt=0 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9426) IgniteAtomicSequence benchmarks
Dmitriy Govorukhin created IGNITE-9426: -- Summary: IgniteAtomicSequence benchmarks Key: IGNITE-9426 URL: https://issues.apache.org/jira/browse/IGNITE-9426 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Need to create JMH and Yardstick benchmarks for the atomic sequence in order to be able to measure future performance improvements -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9260) StandaloneWalRecordsIterator broken on WalSegmentTailReachedException not in work dir
Dmitriy Govorukhin created IGNITE-9260: -- Summary: StandaloneWalRecordsIterator broken on WalSegmentTailReachedException not in work dir Key: IGNITE-9260 URL: https://issues.apache.org/jira/browse/IGNITE-9260 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin After IGNITE-9050 was implemented, StandaloneWalRecordsIterator became broken, because in standalone mode the iteration can stop at any moment once the last available segment has been fully read. The validation implemented in IGNITE-9050 is not applicable to standalone mode. We need to change the behavior and validate that iteration stops in the last available WAL segment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool
Dmitriy Govorukhin created IGNITE-9244: -- Summary: Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool Key: IGNITE-9244 URL: https://issues.apache.org/jira/browse/IGNITE-9244 Project: Ignite Issue Type: Bug Environment: In the current implementation, GridDhtPartitionsEvictor processes the partitions to evict one by one. A GridDhtPartitionsEvictor is created for each cache group, so if we try to evict as many groups as there are threads in the sys pool, the group evictors will take all available threads in the sys pool. This causes sending messages via the sys pool to hang. As a fix, I suggest limiting concurrent execution via the sys pool or using another pool for this purpose. Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9050) WALIterator should throws exception if iterator stopped in the WALArchive but not in WALWork
Dmitriy Govorukhin created IGNITE-9050: -- Summary: WALIterator should throws exception if iterator stopped in the WALArchive but not in WALWork Key: IGNITE-9050 URL: https://issues.apache.org/jira/browse/IGNITE-9050 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin The iterator stops iteration if the next WAL record pointer does not equal the expected one (WalSegmentTailReachedException). If this happens during iteration over segments in the WAL archive, it means the WAL is corrupted and we cannot ignore this: the WAL is not fully read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9049) Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough
Dmitriy Govorukhin created IGNITE-9049: -- Summary: Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough Key: IGNITE-9049 URL: https://issues.apache.org/jira/browse/IGNITE-9049 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin There is a situation where several threads call addRecord when the free space is running out (a rollOver to the next WAL segment is needed) and no thread writes SWITCH_SEGMENT_RECORD. As a result, the end of the file contains garbage. If we iterate over this segment, the iterator stops when it tries to read the next record and stumbles on the garbage at the end of the file, so the log is not fully read. Any operation that requires the iterator may be broken (crash recovery, delta rebalance, etc.). Example: File size is 1024 bytes, the current tail position is 768 (free space 256). 1. Thread-1 calls addRecord (size 128) -> tail is updated to 896. 2. Thread-2 calls addRecord (size 128) -> tail is updated to 1024 (free space exhausted). Neither thread has written any data yet; each has only reserved a position for writing (SegmentedRingByteBuffer.offer). 3. Thread-3 calls addRecord (size 128) -> not enough space -> rollOver and CAS of the stop flag to TRUE. 4. Thread-1 and Thread-2 try to write their data and cannot. FileWriteHandle.addRecord {code} if (buf == null || (stop.get() && rec.type() != SWITCH_SEGMENT_RECORD)) return null; // Can not write to this segment, need to switch to the next one. {code} Thread-3 cannot write SWITCH_SEGMENT_RECORD because there is not enough space. Thread-1 and Thread-2 cannot write their data because stop is TRUE. We have garbage from position 768 to 1024. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
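The numbers in the IGNITE-9049 example can be replayed with a small single-threaded simulation. This is only a sketch under stated assumptions: SegmentTailSketch, reserve and write are hypothetical names that mimic the reserve-then-write pattern and the stop-flag guard quoted in the issue; they are not the FileWriteHandle or SegmentedRingByteBuffer implementation.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of tail reservation in one WAL segment; not Ignite code. */
public class SegmentTailSketch {
    static final long FILE_SIZE = 1024;
    static final long INITIAL_TAIL = 768;

    final AtomicLong tail = new AtomicLong(INITIAL_TAIL);
    final AtomicBoolean stop = new AtomicBoolean(false);
    long writtenBytes;

    /** Reserves space for a record; returns its offset, or -1 if a rollover is needed. */
    long reserve(int size) {
        while (true) {
            long cur = tail.get();

            if (cur + size > FILE_SIZE) {
                stop.compareAndSet(false, true); // Rollover requested; segment is closing.

                return -1;
            }

            if (tail.compareAndSet(cur, cur + size))
                return cur;
        }
    }

    /** Mimics the guard from the issue: data records are rejected once stop is set. */
    boolean write(long offset, int size) {
        if (offset < 0 || stop.get())
            return false;

        writtenBytes += size;

        return true;
    }

    /** Replays steps 1-4 and returns the number of reserved bytes that were never written. */
    static long simulateGarbageBytes() {
        SegmentTailSketch seg = new SegmentTailSketch();

        long off1 = seg.reserve(128); // Thread-1: tail 768 -> 896.
        long off2 = seg.reserve(128); // Thread-2: tail 896 -> 1024.
        seg.reserve(128);             // Thread-3: no space, stop flag set to TRUE.

        seg.write(off1, 128);         // Rejected: stop is already TRUE.
        seg.write(off2, 128);         // Rejected: stop is already TRUE.

        return seg.tail.get() - INITIAL_TAIL - seg.writtenBytes;
    }

    public static void main(String[] args) {
        System.out.println(simulateGarbageBytes()); // prints 256
    }
}
```

The 256 unwritten bytes correspond to the garbage range from position 768 to 1024 in the example.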
[jira] [Created] (IGNITE-9047) Add idleVerify check for GridCommonAbstractTest
Dmitriy Govorukhin created IGNITE-9047: -- Summary: Add idleVerify check for GridCommonAbstractTest Key: IGNITE-9047 URL: https://issues.apache.org/jira/browse/IGNITE-9047 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin Since we have idleVerify (a consistency check between primary and backups), it would be useful to add this command to the abstract test class so that consistency can be verified after some test scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9042) Transaction with small timeout may lead to inconsistent partition state
Dmitriy Govorukhin created IGNITE-9042: -- Summary: Transaction with small timeout may lead to inconsistent partition state Key: IGNITE-9042 URL: https://issues.apache.org/jira/browse/IGNITE-9042 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Attachments: Reproducer.java A transaction with a small timeout may lead to an inconsistent partition state. A reproducer is attached. The problem is in GridDhtTxPrepareFuture.sendPrepareRequests(): if the timeout is reached during iteration over tx.dhtMap().values(), we do not send GridDhtTxPrepareRequest to some backups. As a result, those backups know nothing about the transaction and do not participate in the commit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8973) Need to support dump for idle_verify
Dmitriy Govorukhin created IGNITE-8973: -- Summary: Need to support dump for idle_verify Key: IGNITE-8973 URL: https://issues.apache.org/jira/browse/IGNITE-8973 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin In the current implementation, idle_verify checks consistency between primary and backup partitions. It would be useful to be able to dump the current state of all partitions to a file. This dump can help investigate problems with partition counters or sizes, because it is a cluster-wide snapshot of partition hashes for a given partition state (the hash includes all keys in the partition). idle_verify --dump - calculate partition hash and print into standard output idle_verify --dump {path} - calculate partition hash and write output to file -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8929) WAL should not disable for the group if none a partition is not assigned to a local node.
Dmitriy Govorukhin created IGNITE-8929: -- Summary: WAL should not disable for the group if none a partition is not assigned to a local node. Key: IGNITE-8929 URL: https://issues.apache.org/jira/browse/IGNITE-8929 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8827) Disable WAL during apply updates on recovery
Dmitriy Govorukhin created IGNITE-8827: -- Summary: Disable WAL during apply updates on recovery Key: IGNITE-8827 URL: https://issues.apache.org/jira/browse/IGNITE-8827 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8707) DataStorageMetrics.getTotalAllocatedSize metric does not account reserved partition page header.
Dmitriy Govorukhin created IGNITE-8707: -- Summary: DataStorageMetrics.getTotalAllocatedSize metric does not account reserved partition page header. Key: IGNITE-8707 URL: https://issues.apache.org/jira/browse/IGNITE-8707 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8685) Incorrect size for switch segment record
Dmitriy Govorukhin created IGNITE-8685: -- Summary: Incorrect size for switch segment record Key: IGNITE-8685 URL: https://issues.apache.org/jira/browse/IGNITE-8685 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin We have an invariant that the switch segment record must have a size of one byte. However, in the current implementation the size is calculated with the overhead for storing the CRC and WAL pointer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8661) WALItreater is not stopped if can not deserialize record
Dmitriy Govorukhin created IGNITE-8661: -- Summary: WALItreater is not stopped if can not deserialize record Key: IGNITE-8661 URL: https://issues.apache.org/jira/browse/IGNITE-8661 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8607) [.NET] Support metrics changes in DataStorageMetricsMXBean
Dmitriy Govorukhin created IGNITE-8607: -- Summary: [.NET] Support metrics changes in DataStorageMetricsMXBean Key: IGNITE-8607 URL: https://issues.apache.org/jira/browse/IGNITE-8607 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8583) DataStorageMetricsMXBean.getOffHeapSize include checkpoint buffer size, this is not clear
Dmitriy Govorukhin created IGNITE-8583: -- Summary: DataStorageMetricsMXBean.getOffHeapSize include checkpoint buffer size, this is not clear Key: IGNITE-8583 URL: https://issues.apache.org/jira/browse/IGNITE-8583 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8475) Create new IgniteCache decorator with fair async methonds
Dmitriy Govorukhin created IGNITE-8475: -- Summary: Create new IgniteCache decorator with fair async methonds Key: IGNITE-8475 URL: https://issues.apache.org/jira/browse/IGNITE-8475 Project: Ignite Issue Type: Improvement Components: cache Affects Versions: 2.4 Reporter: Dmitriy Govorukhin Fix For: None GridCacheAdapter.syncOp calls awaitLastFut(), which waits for the last async operation to complete. This means that all async operations issued from one thread are executed one by one rather than in parallel, so the async operations are not really async. Example for an atomic cache: f1=cache.getAsync(key1); f2=cache.getAsync(key2); f1 will always complete before f2. We need to create a new decorator for IgniteCache that returns an IgniteCache proxy with fair async operations: IgniteCache.withFairAsync() -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8464) WALItreater broken (race on the switch to the next segment during iteration and concurrent archiving same segment)
Dmitriy Govorukhin created IGNITE-8464: -- Summary: WALItreater broken (race on the switch to the next segment during iteration and concurrent archiving same segment) Key: IGNITE-8464 URL: https://issues.apache.org/jira/browse/IGNITE-8464 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin FileArchiver {code} final SegmentArchiveResult res = archiveSegment(toArchive); synchronized (this) { while (locked.containsKey(toArchive) && !stopped) wait(); } // Firstly, format working file if (!stopped) formatFile(res.getOrigWorkFile()); synchronized (this) { // Then increase counter to allow rollover on clean working file changeLastArchivedIndexAndNotifyWaiters(toArchive); notifyAll(); } {code} Some thread may try to read segments while the archiver is formatting the file in the work dir (formatFile is not synchronized) and the last archived index has not been updated yet. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8346) FileDownloaderTest is not included in the test suite
Dmitriy Govorukhin created IGNITE-8346: -- Summary: FileDownloaderTest is not included in the test suite Key: IGNITE-8346 URL: https://issues.apache.org/jira/browse/IGNITE-8346 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin org.apache.ignite.internal.processors.cache.persistence.file.FileDownloaderTest -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8341) .NET: Add new metrics for data storage
Dmitriy Govorukhin created IGNITE-8341: -- Summary: .NET: Add new metrics for data storage Key: IGNITE-8341 URL: https://issues.apache.org/jira/browse/IGNITE-8341 Project: Ignite Issue Type: New Feature Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8340) .NET Implement new JMX metrics for transactions
Dmitriy Govorukhin created IGNITE-8340: -- Summary: .NET Implement new JMX metrics for transactions Key: IGNITE-8340 URL: https://issues.apache.org/jira/browse/IGNITE-8340 Project: Ignite Issue Type: New Feature Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8078) Add new metrics for data storage
Dmitriy Govorukhin created IGNITE-8078: -- Summary: Add new metrics for data storage Key: IGNITE-8078 URL: https://issues.apache.org/jira/browse/IGNITE-8078 Project: Ignite Issue Type: New Feature Reporter: Dmitriy Govorukhin 1. Create new MXbean for each index, IndexMxBean {code} class IndexMxBean{ /** The number of PUT operations on the index. */ long processedPuts(); /** The number of GET operations on the index. */ long processedGets(); /** The total index size in bytes. */ long getIndexSize(); } {code} 2. Add new metrics for data storage and cache group. {code} class CacheGroupMetricsMXBean{ /** The total index size in bytes */ long getIndexTotalSize(); long getPKIndexTotalSize(); long getTotalsize(); long getDataSize(); String getType(); } {code} {code} class DataRegionMXBean{ long getIndexTotalSize(); long getPKIndexTotalSize(); long getTotalsize(); long getdataSize(); long offheapUsedSize(); long pagesRead(); long pagesWriten(); long pagesReplaced(); long dirtyPagesForNextCheckpoint(); } {code} {code} class DataStorageMXbean{ long getIndexTotalSize(); long getPKIndexTotalSize(); long getTotalsize(); long offHeapSize(); long offheapUsedSize(); long getDataSize(); long pagesRead(); long pagesWriten(); long pagesReplaced(); long checkpointTotalTime(); long dirtyPagesForNextCheckpoint(); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8077) Implement new JMX metrics for transactions
Dmitriy Govorukhin created IGNITE-8077: -- Summary: Implement new JMX metrics for transactions Key: IGNITE-8077 URL: https://issues.apache.org/jira/browse/IGNITE-8077 Project: Ignite Issue Type: New Feature Reporter: Dmitriy Govorukhin These additional metrics should be implemented to monitor transactions. {code} class TransactionsMXbean{ /** The number of transactions which were committed. */ long txCommited(); /** The number of transactions which were rolled back. */ long txRollBacked(); /** The number of transactions in prepared state. */ long txPrepared(); /** The number of keys locked on the node. */ long lockedKeys(); /** The number of transactions for which node is the initiator. */ int ownerTxs(); } {code} For more details, see IgniteTxAdapter and IgniteTxManager. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7933) The error writing wal point to cp/node-start file can lead to the inability to start node
Dmitriy Govorukhin created IGNITE-7933: -- Summary: The error writing wal point to cp/node-start file can lead to the inability to start node Key: IGNITE-7933 URL: https://issues.apache.org/jira/browse/IGNITE-7933 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7865) Need to provide method WAL manager for return serialize version
Dmitriy Govorukhin created IGNITE-7865: -- Summary: Need to provide method WAL manager for return serialize version Key: IGNITE-7865 URL: https://issues.apache.org/jira/browse/IGNITE-7865 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin {code} public interface IgniteWriteAheadLogManager { . /** * @return Current serializer version. */ public int serializerVersion(); . } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7755) Potentially crash during write cp-***-start.bin can lead to the impossibility of recovering
Dmitriy Govorukhin created IGNITE-7755: -- Summary: Potentially crash during write cp-***-start.bin can lead to the impossibility of recovering Key: IGNITE-7755 URL: https://issues.apache.org/jira/browse/IGNITE-7755 Project: Ignite Issue Type: Bug Affects Versions: 2.3, 2.4 Reporter: Dmitriy Govorukhin Fix For: 2.5 A node can crash after cp-***-start.bin is created but before its content (the WAL pointer) is written. On recovery, an attempt to read the WAL pointer then fails with an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7747) WAL manage getAndReserveWalFiles should not throw exception if segments not found
Dmitriy Govorukhin created IGNITE-7747: -- Summary: WAL manage getAndReserveWalFiles should not throw exception if segments not found Key: IGNITE-7747 URL: https://issues.apache.org/jira/browse/IGNITE-7747 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7679) Move all test plugins to a separate module
Dmitriy Govorukhin created IGNITE-7679: -- Summary: Move all test plugins to a separate module Key: IGNITE-7679 URL: https://issues.apache.org/jira/browse/IGNITE-7679 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.5 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-6857) Multi stage start cache
Dmitriy Govorukhin created IGNITE-6857: -- Summary: Multi stage start cache Key: IGNITE-6857 URL: https://issues.apache.org/jira/browse/IGNITE-6857 Project: Ignite Issue Type: Improvement Security Level: Public (Viewable by anyone) Components: cache Reporter: Dmitriy Govorukhin Fix For: 2.4 In the current implementation, a cache starts in one stage (the cache start exchange). We must provide the ability to start a cache in more than one stage (for internal operations, such as transaction recovery). No cache operations should occur between the start stages. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6547) Need to support ability log timestamp for wal tx and data records
Dmitriy Govorukhin created IGNITE-6547: -- Summary: Need to support ability log timestamp for wal tx and data records Key: IGNITE-6547 URL: https://issues.apache.org/jira/browse/IGNITE-6547 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin It may be useful for WAL analysis. It must also be configurable via a system property (IGNITE_WAL_LOGGIN_TIMESTAMP). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6513) Add ability manage version for WAL serializer via system properties
Dmitriy Govorukhin created IGNITE-6513: -- Summary: Add ability manage version for WAL serializer via system properties Key: IGNITE-6513 URL: https://issues.apache.org/jira/browse/IGNITE-6513 Project: Ignite Issue Type: Improvement Affects Versions: 2.1, 2.0, 2.2, 2.3 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6493) IgnitePdsWalTlbTest.testWalDirectOutOfMemory() hangs
Dmitriy Govorukhin created IGNITE-6493: -- Summary: IgnitePdsWalTlbTest.testWalDirectOutOfMemory() hangs Key: IGNITE-6493 URL: https://issues.apache.org/jira/browse/IGNITE-6493 Project: Ignite Issue Type: Bug Components: persistence Affects Versions: 2.2 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.3 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6480) Incorrect server node filter in GridDiscoveryManager
Dmitriy Govorukhin created IGNITE-6480: -- Summary: Incorrect server node filter in GridDiscoveryManager Key: IGNITE-6480 URL: https://issues.apache.org/jira/browse/IGNITE-6480 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.3 GridDiscoveryManager.serverTopologyNodes returns a collection of nodes that includes daemon nodes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6439) IgnitePersistentStoreSchemaLoadTest is broken
Dmitriy Govorukhin created IGNITE-6439: -- Summary: IgnitePersistentStoreSchemaLoadTest is broken Key: IGNITE-6439 URL: https://issues.apache.org/jira/browse/IGNITE-6439 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin After the nodes are started, the cluster must be activated explicitly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6307) If getAll() fails with NPE, onHeap entry is not removed, for local cache
Dmitriy Govorukhin created IGNITE-6307: -- Summary: If getAll() fails with NPE, onHeap entry is not removed, for local cache Key: IGNITE-6307 URL: https://issues.apache.org/jira/browse/IGNITE-6307 Project: Ignite Issue Type: Bug Affects Versions: 2.0 Reporter: Dmitriy Govorukhin Fix For: 2.3 GridCacheLocalFullApiSelfTest.testGetAllWithNulls {code} final Set<String> c = new HashSet<>(); c.add("key1"); c.add(null); GridTestUtils.assertThrows(log, new Callable<Void>() { @Override public Void call() throws Exception { cache.getAll(c); return null; } }, NullPointerException.class, null); {code} After getAll() fails, the entry for "key1" may remain in the on-heap map; this depends on the iteration order of the collection passed to getAll(). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6210) inefficient memory consumption for checkpoint buffer
Dmitriy Govorukhin created IGNITE-6210: -- Summary: inefficient memory consumption for checkpoint buffer Key: IGNITE-6210 URL: https://issues.apache.org/jira/browse/IGNITE-6210 Project: Ignite Issue Type: Bug Components: persistence Affects Versions: 2.1 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 The current implementation allows configuring the checkpoint buffer size in PersistentStoreConfiguration, but a checkpoint buffer is created for each memory policy, with a size equal to the one specified in PersistentStoreConfiguration. For example: {code} PersistentStoreConfiguration prCfg = new PersistentStoreConfiguration(); prCfg.setCheckpointingPageBufferSize(5L * 1024L * 1024L * 1024L); // 5GB. MemoryConfiguration memCfg = new MemoryConfiguration(); MemoryPolicyConfiguration pl1 = new MemoryPolicyConfiguration(); pl1.setMaxSize(100L * 1024L * 1024L); // 100MB. MemoryPolicyConfiguration pl2 = new MemoryPolicyConfiguration(); pl2.setMaxSize(10L * 1024L * 1024L * 1024L); // 10GB. memCfg.setMemoryPolicies(pl1, pl2); {code} pl1 (max size 100MB) will have a checkpoint buffer of 5GB, and pl2 (max size 10GB) will also have a buffer of 5GB. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
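The memory cost described in IGNITE-6210 is simple arithmetic, sketched below in plain Java. The class and method names are hypothetical, and the sketch assumes exactly what the issue states: one buffer of the configured size is allocated per memory policy, regardless of the policy's own maximum size.

```java
/** Hypothetical arithmetic for per-policy checkpoint buffers; not Ignite code. */
public class CheckpointBufferMath {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Total buffer memory when every memory policy gets its own buffer of the configured size. */
    static long totalBufferBytes(long configuredBufferBytes, long... policyMaxSizes) {
        return configuredBufferBytes * policyMaxSizes.length;
    }

    public static void main(String[] args) {
        long total = totalBufferBytes(5 * GB, 100 * MB, 10 * GB);

        // Two policies, 5GB each: 10GB of checkpoint buffers in total.
        System.out.println(total / GB + "GB");
    }
}
```

For the configuration in the issue this gives 10GB of checkpoint buffers, and the 100MB policy carries a buffer roughly fifty times larger than the policy itself.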
[jira] [Created] (IGNITE-6179) Test fail DynamicIndexReplicatedAtomicConcurrentSelfTest.testClientReconnectWithCacheRestart
Dmitriy Govorukhin created IGNITE-6179: -- Summary: Test fail DynamicIndexReplicatedAtomicConcurrentSelfTest.testClientReconnectWithCacheRestart Key: IGNITE-6179 URL: https://issues.apache.org/jira/browse/IGNITE-6179 Project: Ignite Issue Type: Bug Components: sql Reporter: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 The test fails with an assertion error: {code} [2017-08-24 18:34:06,207][ERROR][tcp-client-disco-msg-worker-#61%index.DynamicIndexReplicatedAtomicConcurrentSelfTest4%][IgniteClientReconnectAbstractTest$TestTcpDiscoverySpi] Failed to unmarshal discovery custom message. java.lang.AssertionError at org.apache.ignite.internal.processors.query.GridQueryProcessor.onSchemaFinishDiscovery(GridQueryProcessor.java:498) at org.apache.ignite.internal.processors.query.GridQueryProcessor.onDiscovery(GridQueryProcessor.java:894) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onCustomEvent(GridCacheProcessor.java:2906) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:660) at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:560) at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2391) at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processCustomMessage(ClientImpl.java:2297) at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processDiscoveryMessage(ClientImpl.java:1874) at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1758) at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6154) Incorrect check checkpoint pages collection
Dmitriy Govorukhin created IGNITE-6154: -- Summary: Incorrect check checkpoint pages collection Key: IGNITE-6154 URL: https://issues.apache.org/jira/browse/IGNITE-6154 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 The !F.empty(collection) check in the checkpoint thread is incorrect. Every element must be checked, because the collection is a collection of GridMultiCollectionWrapper instances, and a non-empty outer collection may still contain only empty multi-collections. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
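A minimal sketch of the difference between the two checks, using plain java.util collections as a stand-in for GridMultiCollectionWrapper. The class and method names are hypothetical, and the real Ignite types may expose a different API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class NestedEmptyCheck {
    /** Returns true only if at least one inner collection actually holds an element. */
    static boolean hasAnyElement(Collection<? extends Collection<?>> outer) {
        if (outer == null)
            return false;

        for (Collection<?> inner : outer) {
            if (inner != null && !inner.isEmpty())
                return true;
        }

        return false;
    }

    public static void main(String[] args) {
        // The outer collection has two elements, but both inner collections are empty.
        List<List<Integer>> pages = Arrays.asList(
            Collections.<Integer>emptyList(),
            Collections.<Integer>emptyList());

        System.out.println(!pages.isEmpty());     // prints "true": the shallow check reports work to do
        System.out.println(hasAnyElement(pages)); // prints "false": the full check finds nothing
    }
}
```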
[jira] [Created] (IGNITE-6100) Test fail IgnitePdsRecoveryAfterFileCorruptionTest.testPageRecoveryAfterFileCorruption
Dmitriy Govorukhin created IGNITE-6100: -- Summary: Test fail IgnitePdsRecoveryAfterFileCorruptionTest.testPageRecoveryAfterFileCorruption Key: IGNITE-6100 URL: https://issues.apache.org/jira/browse/IGNITE-6100 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Test leads to a memory leak -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6098) Test fail IgniteDataIntegrityTests.testExpandBuffer
Dmitriy Govorukhin created IGNITE-6098: -- Summary: Test fail IgniteDataIntegrityTests.testExpandBuffer Key: IGNITE-6098 URL: https://issues.apache.org/jira/browse/IGNITE-6098 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin {code} junit.framework.AssertionFailedError: expected:<0> but was:<13> at junit.framework.Assert.fail(Assert.java:57) at junit.framework.Assert.failNotEquals(Assert.java:329) at junit.framework.Assert.assertEquals(Assert.java:78) at junit.framework.Assert.assertEquals(Assert.java:234) at junit.framework.Assert.assertEquals(Assert.java:241) at junit.framework.TestCase.assertEquals(TestCase.java:409) at org.apache.ignite.internal.processors.cache.persistence.db.wal.crc.IgniteDataIntegrityTests.testExpandBuffer(IgniteDataIntegrityTests.java:138) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6091) Test fail CacheLateAffinityAssignmentTest.testRandomOperations
Dmitriy Govorukhin created IGNITE-6091: -- Summary: Test fail CacheLateAffinityAssignmentTest.testRandomOperations Key: IGNITE-6091 URL: https://issues.apache.org/jira/browse/IGNITE-6091 Project: Ignite Issue Type: Test Affects Versions: 2.1 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.2 Flaky test {code} junit.framework.AssertionFailedError: Unexpected error: javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to perform cache operation (cache topology is not valid): join-cache-24 at junit.framework.Assert.fail(Assert.java:57) at junit.framework.Assert.assertTrue(Assert.java:22) at junit.framework.TestCase.assertTrue(TestCase.java:192) at org.apache.ignite.internal.processors.cache.distributed.CacheLateAffinityAssignmentTest.checkCaches(CacheLateAffinityAssignmentTest.java:2176) at org.apache.ignite.internal.processors.cache.distributed.CacheLateAffinityAssignmentTest.afterTest(CacheLateAffinityAssignmentTest.java:225) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6082) Test fail DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentPutRemove
Dmitriy Govorukhin created IGNITE-6082: -- Summary: Test fail DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentPutRemove Key: IGNITE-6082 URL: https://issues.apache.org/jira/browse/IGNITE-6082 Project: Ignite Issue Type: Test Affects Versions: 2.0 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6074) Test fail IgniteWalRecoveryTest#testWalBigObjectNodeCancel
Dmitriy Govorukhin created IGNITE-6074: -- Summary: Test fail IgniteWalRecoveryTest#testWalBigObjectNodeCancel Key: IGNITE-6074 URL: https://issues.apache.org/jira/browse/IGNITE-6074 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.2 Exception in thread "exchange-worker-#79%wal.IgniteWalRecoveryTest1%" java.lang.AssertionError at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput.ensure(FileInput.java:110) at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.ensure(FileInput.java:303) at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readFully(FileInput.java:351) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readDataEntry(RecordV1Serializer.java:1550) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:799) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:673) at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:208) at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:153) at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:120) at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:43) at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:41) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1321) at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:534) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:671) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:463) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1954) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:748) Suppressed: class org.apache.ignite.internal.processors.cache.persistence.wal.crc.IgniteDataIntegrityViolationException: val: 78139338 writtenCrc: 262144 at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.close(FileInput.java:320) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:680) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6064) Rework control.sh script.
Dmitriy Govorukhin created IGNITE-6064: -- Summary: Rework control.sh script. Key: IGNITE-6064 URL: https://issues.apache.org/jira/browse/IGNITE-6064 Project: Ignite Issue Type: Improvement Components: general Affects Versions: 2.1, 2.0 Reporter: Dmitriy Govorukhin Fix For: 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6058) Test fail testTransformResourceInjection broken
Dmitriy Govorukhin created IGNITE-6058: -- Summary: Test fail testTransformResourceInjection broken Key: IGNITE-6058 URL: https://issues.apache.org/jira/browse/IGNITE-6058 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Fix For: 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6052) Check cluster state from daemon node return incorrect cluster state
Dmitriy Govorukhin created IGNITE-6052: -- Summary: Check cluster state from daemon node return incorrect cluster state Key: IGNITE-6052 URL: https://issues.apache.org/jira/browse/IGNITE-6052 Project: Ignite Issue Type: Bug Affects Versions: 2.0 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 A daemon node must request the cluster state from a server node via the compute grid. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6015) Test fail in Ignite Cache 2: GridCachePartitionedGetAndTransformStoreSelfTest.testGetAndTransform
Dmitriy Govorukhin created IGNITE-6015: -- Summary: Test fail in Ignite Cache 2: GridCachePartitionedGetAndTransformStoreSelfTest.testGetAndTransform Key: IGNITE-6015 URL: https://issues.apache.org/jira/browse/IGNITE-6015 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Fix For: 2.2 java.util.concurrent.TimeoutException: Test has been timed out [test=testGetAndTransform, timeout=30] at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:1949) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:208) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:156) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:82) at 
org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:82) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:951) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:831) at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:729) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59) at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183) at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156) at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537) at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196) at org.apache.maven.cli.MavenCli.main(MavenCli.java:141) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290) at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230) at 
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409) at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:3 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5968) Test fail in Ignite Cache 2: GridCachePartitionNotLoadedEventSelfTest.testPrimaryAndBackupDead
Dmitriy Govorukhin created IGNITE-5968: -- Summary: Test fail in Ignite Cache 2: GridCachePartitionNotLoadedEventSelfTest.testPrimaryAndBackupDead Key: IGNITE-5968 URL: https://issues.apache.org/jira/browse/IGNITE-5968 Project: Ignite Issue Type: Test Reporter: Dmitriy Govorukhin Fix For: 2.2 java.util.concurrent.TimeoutException: Test has been timed out [test=testPrimaryAndBackupDead, timeout=30] at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:1949) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5967) Flaky fail in Ignite Java Client: RedisProtocolStringSelfTest.testGetSet
Dmitriy Govorukhin created IGNITE-5967: -- Summary: Flaky fail in Ignite Java Client: RedisProtocolStringSelfTest.testGetSet Key: IGNITE-5967 URL: https://issues.apache.org/jira/browse/IGNITE-5967 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Fix For: 2.2 RedisProtocolStringSelfTest.testGetSet redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream. at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199) at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) at redis.clients.jedis.Protocol.process(Protocol.java:151) at redis.clients.jedis.Protocol.read(Protocol.java:215) at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340) at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:259) at redis.clients.jedis.Connection.getBulkReply(Connection.java:248) at redis.clients.jedis.Jedis.get(Jedis.java:153) at org.apache.ignite.internal.processors.rest.protocols.tcp.redis.RedisProtocolStringSelfTest.testGetSet(RedisProtocolStringSelfTest.java:62) --- Stdout: --- [2017-08-07 06:28:44,379][INFO ][main][root] >>> Starting test: RedisProtocolStringSelfTest#testGetSet <<< [2017-08-07 06:28:52,390][INFO ][main][root] >>> Stopping test: RedisProtocolStringSelfTest#testGetSet in 8010 ms <<< --- Stderr: --- [2017-08-07 06:28:52,389][ERROR][main][root] Test failed. redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream. 
at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199) at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) at redis.clients.jedis.Protocol.process(Protocol.java:151) at redis.clients.jedis.Protocol.read(Protocol.java:215) at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340) at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:259) at redis.clients.jedis.Connection.getBulkReply(Connection.java:248) at redis.clients.jedis.Jedis.get(Jedis.java:153) at org.apache.ignite.internal.processors.rest.protocols.tcp.redis.RedisProtocolStringSelfTest.testGetSet(RedisProtocolStringSelfTest.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:176) at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2000) at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:132) at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1915) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5741) Replace HeapByteBuffer to DirectByteBuffer in WAL RecordsIterator
Dmitriy Govorukhin created IGNITE-5741: -- Summary: Replace HeapByteBuffer to DirectByteBuffer in WAL RecordsIterator Key: IGNITE-5741 URL: https://issues.apache.org/jira/browse/IGNITE-5741 Project: Ignite Issue Type: Improvement Components: persistence Affects Versions: 2.1 Reporter: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 In the current implementation we can get an OOM while iterating over WAL records, because some WAL records may be very large (more than 64 MB). We read using a HeapByteBuffer, so FileChannel reserves a temporary direct buffer of the same size as our HeapByteBuffer, but by default FileChannel cannot allocate a buffer larger than 64 MB (maxMemory = VM.maxDirectMemory()). {code} [16:07:54]W: [org.apache.ignite:ignite-core] [2017-07-12 13:07:54,809][ERROR][exchange-worker-#55966%node0-primary%][GridDhtPartitionsExchangeFuture] Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], nodeId=105056d5, evt=DISCOVERY_CUSTOM_EVT] [16:07:54]W: [org.apache.ignite:ignite-core] java.lang.OutOfMemoryError: Direct buffer memory [16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.Bits.reserveMemory(Bits.java:658) [16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) [16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) [16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174) [16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.IOUtil.read(IOUtil.java:195) [16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:62) [16:07:54]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileInput.ensure(FileInput.java:116) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.ensure(FileInput.java:303) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readByte(FileInput.java:376) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readUnsignedByte(FileInput.java:385) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:697) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:673) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:243) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.advanceSegment(FileWriteAheadLogManager.java:2452) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:149) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2352) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2290) [16:07:54]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:553) [16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1313) [16:07:54]W: [org.apache.ignite:ignite-core] at
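The idea in IGNITE-5741 can be illustrated with a short standalone sketch. When a direct buffer is passed to FileChannel.read, the JDK reads straight into it and does not allocate a temporary direct buffer sized to the request. A large record can therefore be pulled through one small, reusable direct buffer. The class name, method name, and the 64 KB chunk size are assumptions made for illustration. This is not the change that was made in Ignite.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class ChunkedDirectRead {
    /** Reads len bytes starting at pos through a bounded, reusable direct buffer. */
    static byte[] read(FileChannel ch, long pos, int len, ByteBuffer directBuf) throws IOException {
        byte[] out = new byte[len];
        int done = 0;

        while (done < len) {
            directBuf.clear();
            // Never ask for more than the buffer can hold or than is still needed.
            directBuf.limit(Math.min(directBuf.capacity(), len - done));

            int n = ch.read(directBuf, pos + done);
            if (n < 0)
                throw new EOFException("Unexpected end of file at offset " + (pos + done));

            directBuf.flip();
            directBuf.get(out, done, n);
            done += n;
        }

        return out;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".bin");
        try {
            byte[] data = new byte[1_000_000];
            for (int i = 0; i < data.length; i++)
                data[i] = (byte) i;
            Files.write(file, data);

            // Direct memory use stays at 64 KB regardless of the record size.
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);

            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                byte[] back = read(ch, 0, data.length, buf);
                System.out.println(Arrays.equals(data, back)); // prints "true"
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```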
[jira] [Created] (IGNITE-5704) Do not allow cluster to be deactivated if the joining node does not complete start all components
Dmitriy Govorukhin created IGNITE-5704: -- Summary: Do not allow cluster to be deactivated if the joining node does not complete start all components Key: IGNITE-5704 URL: https://issues.apache.org/jira/browse/IGNITE-5704 Project: Ignite Issue Type: Improvement Components: general Affects Versions: 2.0, 2.1 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Priority: Critical Fix For: 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5603) All daemon node, can be only client daemon, server daemon is not allow.
Dmitriy Govorukhin created IGNITE-5603: -- Summary: All daemon node, can be only client daemon, server daemon is not allow. Key: IGNITE-5603 URL: https://issues.apache.org/jira/browse/IGNITE-5603 Project: Ignite Issue Type: Improvement Components: general Reporter: Dmitriy Govorukhin Priority: Critical Fix For: 2.1 There is no reason to support daemon server nodes right now. Rework the current functionality to prevent a server node from being a daemon. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5600) Compute hang if send broadcast runnable on daemon cluster group
Dmitriy Govorukhin created IGNITE-5600: -- Summary: Compute hang if send broadcast runnable on daemon cluster group Key: IGNITE-5600 URL: https://issues.apache.org/jira/browse/IGNITE-5600 Project: Ignite Issue Type: Bug Reporter: Dmitriy Govorukhin Fix For: 2.1 Steps to reproduce: 1. Start a server node. 2. Start a daemon client and a daemon server. 3. Try to run a compute task on the daemon nodes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5520) IgniteChangeGlobalStateFailOverTest hangs activate on join node
Dmitriy Govorukhin created IGNITE-5520: -- Summary: IgniteChangeGlobalStateFailOverTest hangs activate on join node Key: IGNITE-5520 URL: https://issues.apache.org/jira/browse/IGNITE-5520 Project: Ignite Issue Type: Bug Components: cache Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5518) Rework test on join active/inactive node
Dmitriy Govorukhin created IGNITE-5518: -- Summary: Rework test on join active/inactive node Key: IGNITE-5518 URL: https://issues.apache.org/jira/browse/IGNITE-5518 Project: Ignite Issue Type: Task Reporter: Dmitriy Govorukhin Fix For: 2.1 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5480) Need to correctly handle deactivation process if an exception occured during deactivation
Dmitriy Govorukhin created IGNITE-5480: -- Summary: Need to correctly handle deactivation process if an exception occured during deactivation Key: IGNITE-5480 URL: https://issues.apache.org/jira/browse/IGNITE-5480 Project: Ignite Issue Type: Task Components: cache, persistence Affects Versions: 2.0 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-5076) Optimization of multi-threaded start nodes in tests
Dmitriy Govorukhin created IGNITE-5076: -- Summary: Optimization of multi-threaded start nodes in tests Key: IGNITE-5076 URL: https://issues.apache.org/jira/browse/IGNITE-5076 Project: Ignite Issue Type: Improvement Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.0 A concurrent start is more effective if a coordinator already exists before the multi-threaded node start begins. If all nodes are started concurrently, they compete for the coordinator role, which is inefficient. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
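One way to read the proposal in IGNITE-5076 is sketched below. The first node is started on the calling thread so that it can take the coordinator role, and only then are the remaining nodes started in parallel. The class name and the startNode callback are hypothetical stand-ins for whatever the test framework uses to start a node by index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

final class CoordinatorFirstStart {
    /** Starts node 0 synchronously, then nodes 1..cnt-1 on a thread pool, and waits for all of them. */
    static void startNodes(int cnt, int threads, IntConsumer startNode) throws Exception {
        if (cnt <= 0)
            return;

        // The coordinator is up before any other node tries to join.
        startNode.accept(0);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futs = new ArrayList<>();

            for (int i = 1; i < cnt; i++) {
                final int idx = i;
                futs.add(pool.submit(() -> startNode.accept(idx)));
            }

            // Propagate any start failure to the caller.
            for (Future<?> f : futs)
                f.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> order = Collections.synchronizedList(new ArrayList<Integer>());

        startNodes(5, 3, idx -> order.add(idx));

        System.out.println("first started: " + order.get(0)); // prints "first started: 0"
        System.out.println("total started: " + order.size()); // prints "total started: 5"
    }
}
```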
[jira] [Created] (IGNITE-5019) Deadlock in GridCacheVariableTopologySelfTest.testNodeStop
Dmitriy Govorukhin created IGNITE-5019: -- Summary: Deadlock in GridCacheVariableTopologySelfTest.testNodeStop Key: IGNITE-5019 URL: https://issues.apache.org/jira/browse/IGNITE-5019 Project: Ignite Issue Type: Bug Components: general Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.0 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4931) Safe way for deactivate cluster.
Dmitriy Govorukhin created IGNITE-4931: -- Summary: Safe way for deactivate cluster. Key: IGNITE-4931 URL: https://issues.apache.org/jira/browse/IGNITE-4931 Project: Ignite Issue Type: Task Components: general Affects Versions: 2.0 Reporter: Dmitriy Govorukhin Fix For: 2.1 We must provide a safe way to deactivate the cluster: we must wait until all cache operations, transactions, etc. have completed before starting the deactivation process. The current implementation does not wait for transactions to complete (the cache is forcibly stopped during a transaction). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
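The "wait for in-flight operations, then deactivate" requirement can be sketched with a read-write lock. Operations hold the read lock while they run, and deactivation takes the write lock, so it blocks until every running operation has finished. This is a single-process illustration under that assumption. The class and method names are hypothetical, and a real cluster-wide implementation would also need distributed coordination.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

final class DeactivationGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Guarded by lock. */
    private boolean active = true;

    /** Runs an operation under the read lock; rejects it once the gate is deactivated. */
    <T> T runOperation(Supplier<T> op) {
        lock.readLock().lock();
        try {
            if (!active)
                throw new IllegalStateException("Cluster is not active");

            return op.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Blocks until all in-flight operations have released the read lock, then flips the flag. */
    void deactivate() {
        lock.writeLock().lock();
        try {
            active = false;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DeactivationGate gate = new DeactivationGate();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean();

        Thread tx = new Thread(() -> gate.runOperation(() -> {
            started.countDown();
            try {
                Thread.sleep(200); // simulated long-running transaction
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finished.set(true);
            return null;
        }));

        tx.start();
        started.await();   // the operation now holds the read lock
        gate.deactivate(); // returns only after the operation has completed

        System.out.println("operation finished before deactivation: " + finished.get()); // prints "... true"
        tx.join();
    }
}
```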
[jira] [Created] (IGNITE-4919) Remove support BinaryIdentityResolver
Dmitriy Govorukhin created IGNITE-4919: -- Summary: Remove support BinaryIdentityResolver Key: IGNITE-4919 URL: https://issues.apache.org/jira/browse/IGNITE-4919 Project: Ignite Issue Type: Task Components: binary Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Fix For: 2.0 We must get rid of BinaryIdentityResolver, because the new memory mode requires a stable binary key representation. [discussion|http://apache-ignite-developers.2346864.n4.nabble.com/Stable-binary-key-representation-td15904.html] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (IGNITE-4235) Can't get user exception if was on remote service
Dmitriy Govorukhin created IGNITE-4235: -- Summary: Can't get user exception if was on remote service Key: IGNITE-4235 URL: https://issues.apache.org/jira/browse/IGNITE-4235 Project: Ignite Issue Type: Bug Components: general Affects Versions: 1.7 Reporter: Dmitriy Govorukhin Fix For: 2.0 The user exception cannot be obtained if it was thrown on a remote node. A reproducer is in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4178) Support permission builder
Dmitriy Govorukhin created IGNITE-4178: -- Summary: Support permission builder Key: IGNITE-4178 URL: https://issues.apache.org/jira/browse/IGNITE-4178 Project: Ignite Issue Type: Improvement Components: general Affects Versions: 1.7 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Priority: Minor Fix For: 1.8 Provides a convenient way to create a permission set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (IGNITE-4044) Add an option to always authenticate local node
Dmitriy Govorukhin created IGNITE-4044: -- Summary: Add an option to always authenticate local node Key: IGNITE-4044 URL: https://issues.apache.org/jira/browse/IGNITE-4044 Project: Ignite Issue Type: Bug Affects Versions: 1.8 Reporter: Dmitriy Govorukhin Assignee: Dmitriy Govorukhin Currently the authenticator is called during startup only if the new node is the first one in the topology. This is counterintuitive and introduces unpredictable behavior when global authentication is enabled: the node may or may not call the authenticator depending on the start order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)