[jira] [Created] (IGNITE-12128) Potentially pds corruption on a failed node during checkpoint

2019-08-30 Thread Dmitriy Govorukhin (Jira)
Dmitriy Govorukhin created IGNITE-12128:
---

 Summary: Potentially pds corruption on a failed node during 
checkpoint
 Key: IGNITE-12128
 URL: https://issues.apache.org/jira/browse/IGNITE-12128
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


There is a case where we start a checkpoint but have not yet created the CP 
marker file, while PageMemory may already start flushing dirty checkpoint pages 
to the page store. If the node crashes at this moment, we can end up in an 
inconsistent state, because the checkpoint marker has not been written to disk 
but some pages for this checkpoint already have been. If we try to recover from 
this state, we can get any sort of corruption. Recovery logic may not recognize 
that the crash happened during a checkpoint, because the marker file was never 
written at checkpoint start even though some pages for this checkpoint were.
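
A minimal sketch of the ordering this report implies, under stated assumptions: the class name, the {{-START}}/{{-END}} marker naming and the single {{pages.bin}} file are hypothetical and are not taken from the Ignite code base. It only illustrates the idea that a start marker is forced to disk before any page is written, so that recovery can tell an interrupted checkpoint from a clean state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; class, method, and file names are assumptions, not Ignite code. */
class CheckpointMarkerSketch {
    /** Creates an empty marker file and forces it to disk before returning. */
    static void writeMarker(Path dir, String name) throws IOException {
        try (FileChannel ch = FileChannel.open(dir.resolve(name),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.force(true);
        }
    }

    /** Writes the START marker, then the pages, then (unless a crash is simulated) the END marker. */
    static void checkpoint(Path dir, long cpId, byte[][] pages, boolean crashBeforeEnd) throws IOException {
        writeMarker(dir, cpId + "-START"); // forced to disk before any page reaches the store

        try (FileChannel store = FileChannel.open(dir.resolve("pages.bin"),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            for (byte[] page : pages) {
                ByteBuffer buf = ByteBuffer.wrap(page);

                while (buf.hasRemaining())
                    store.write(buf);
            }

            store.force(true);
        }

        if (crashBeforeEnd)
            return; // simulated crash: the END marker is never written

        writeMarker(dir, cpId + "-END");
    }

    /** Recovery check: a START marker without a matching END marker means the checkpoint was interrupted. */
    static boolean interrupted(Path dir, long cpId) {
        return Files.exists(dir.resolve(cpId + "-START")) && !Files.exists(dir.resolve(cpId + "-END"));
    }

    /** Runs one interrupted and one complete checkpoint in a temporary directory. */
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("cp-sketch");

            checkpoint(dir, 1L, new byte[][] {{1, 2, 3}}, true);
            checkpoint(dir, 2L, new byte[][] {{4, 5, 6}}, false);

            return new boolean[] {interrupted(dir, 1L), interrupted(dir, 2L)};
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] res = demo();

        System.out.println("cp1 interrupted=" + res[0] + ", cp2 interrupted=" + res[1]);
    }
}
```

If pages were written before the START marker, a crash in between would leave modified pages with no marker at all, which is the state described above.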



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IGNITE-12127) WAL writer may close file IO with unflushed changes when MMAP is disabled

2019-08-30 Thread Dmitriy Govorukhin (Jira)
Dmitriy Govorukhin created IGNITE-12127:
---

 Summary: WAL writer may close file IO with unflushed changes when 
MMAP is disabled
 Key: IGNITE-12127
 URL: https://issues.apache.org/jira/browse/IGNITE-12127
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


Most likely the issue manifests itself as the following critical error:
{code}
2019-08-27 14:52:31.286 ERROR 26835 --- [wal-write-worker%null-#447] ROOT : 
Critical system error detected. Will be handled accordingly to configured 
handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, 
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
o.a.i.i.processors.cache.persistence.StorageException: Failed to write buffer.]]
org.apache.ignite.internal.processors.cache.persistence.StorageException: 
Failed to write buffer.
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3444)
 [ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.body(FileWriteAheadLogManager.java:3249)
 [ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) 
[ignite-core-2.5.7.jar!/:2.5.7]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201]
Caused by: java.nio.channels.ClosedChannelException: null
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110) 
~[na:1.8.0_201]
at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:253) 
~[na:1.8.0_201]
at 
org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.position(RandomAccessFileIO.java:48)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.file.FileIODecorator.position(FileIODecorator.java:41)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.file.AbstractFileIO.writeFully(AbstractFileIO.java:111)
 ~[ignite-core-2.5.7.jar!/:2.5.7]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$WALWriter.writeBuffer(FileWriteAheadLogManager.java:3437)
 [ignite-core-2.5.7.jar!/:2.5.7]
... 3 common frames omitted
{code}

It appears that the following sequence is possible:
 * Thread A attempts to log a large record which does not fit into the segment, 
{{addRecord}} fails and thread A starts segment rollover. It successfully 
runs {{flushOrWait(null)}} and gets de-scheduled before adding the switch 
segment record
 * Thread B attempts to log another record, which fits exactly up to the end of 
the current segment. The record is added to the buffer
 * Thread A resumes and fails to add the switch segment record. No flush is 
performed and the thread immediately proceeds to the wal-writer close
 * WAL writer thread wakes up, sees that there is a CLOSE request, closes the 
file IO and immediately proceeds to write the unflushed changes, causing the 
exception.

Unconditional flush after switch segment record write should fix the issue.

Besides the bug itself, I suggest the following changes to the 
{{FileWriteHandleImpl}} ({{FileWriteAheadLogManager}} in earlier versions):
 * There is an {{fsync(filePtr)}} call inside {{close()}}; however, {{fsync()}} 
checks the {{stop}} flag (which is set inside {{close}}) and returns 
immediately after {{flushOrWait()}} if the flag is set - this is very 
confusing. After all, the {{close()}} itself explicitly calls {{force}} after 
flush
 * There is an ignored IO exception in mmap mode - this should be propagated to 
the failure handler
 * In WAL writer, we check for file CLOSE and then attempt to write to 
(possibly) the same write handle - the write should always happen before the 
close
 * In WAL writer, there are racy reads of current handle - it would be better 
if we read the current handle once and then operate on it during the whole loop 
iteration
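
The last two suggestions can be sketched as follows. This is a simplified, hypothetical model ({{WalWriterSketch}}, {{Handle}} and all method names are assumptions, not the actual {{FileWriteAheadLogManager}} code): one loop iteration reads the current handle once, writes everything that is pending, and only then honours a close request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of a WAL writer loop; not the Ignite implementation. */
class WalWriterSketch {
    /** Stand-in for a segment write handle. */
    static final class Handle {
        final List<String> written = new ArrayList<>();
        boolean closed;

        void write(String rec) {
            if (closed)
                throw new IllegalStateException("write after close");

            written.add(rec);
        }

        void close() {
            closed = true;
        }
    }

    final AtomicReference<Handle> current = new AtomicReference<>(new Handle());

    private final List<String> pending = new ArrayList<>(); // guarded by this
    private boolean closeRequested;                         // guarded by this

    synchronized void append(String rec) {
        pending.add(rec);
    }

    synchronized void requestClose() {
        closeRequested = true;
    }

    /** One iteration of the writer loop. */
    void iterate() {
        Handle h = current.get(); // read the current handle exactly once per iteration

        List<String> batch;
        boolean close;

        synchronized (this) {
            batch = new ArrayList<>(pending);
            pending.clear();
            close = closeRequested;
            closeRequested = false;
        }

        for (String rec : batch)
            h.write(rec); // write first ...

        if (close)
            h.close();    // ... close afterwards, on the same handle
    }

    /** A record is buffered and a close is requested before the writer wakes up. */
    static String demo() {
        WalWriterSketch w = new WalWriterSketch();
        Handle h = w.current.get();

        w.append("a");
        w.append("b");
        w.requestClose();
        w.iterate();

        return "written=" + h.written + ", closed=" + h.closed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the order reversed (close, then write), the same scenario would throw from {{Handle.write}}, which mirrors the {{ClosedChannelException}} in the stack trace above.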





[jira] [Created] (IGNITE-12110) Bugs & tests fixes

2019-08-27 Thread Dmitriy Govorukhin (Jira)
Dmitriy Govorukhin created IGNITE-12110:
---

 Summary:  Bugs & tests fixes
 Key: IGNITE-12110
 URL: https://issues.apache.org/jira/browse/IGNITE-12110
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-12102) idle_verify should show info about lost partitions

2019-08-26 Thread Dmitriy Govorukhin (Jira)
Dmitriy Govorukhin created IGNITE-12102:
---

 Summary: idle_verify should show info about lost partitions
 Key: IGNITE-12102
 URL: https://issues.apache.org/jira/browse/IGNITE-12102
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


In the current implementation, idle_verify does not show lost partitions, so the 
check reports that everything is fine even when it is not.





[jira] [Created] (IGNITE-12081) Page replacement can reload invalid page during checkpoint

2019-08-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-12081:
---

 Summary: Page replacement can reload invalid page during checkpoint
 Key: IGNITE-12081
 URL: https://issues.apache.org/jira/browse/IGNITE-12081
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


There is a race between {{writeCheckpointPages}} and page replacement process:
 * Checkpointer thread begins a checkpoint
 * Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page 
content *and clear dirty flag*
 * Page replacement tries to find a page for replacement and chooses this page, 
the page is thrown away
 * Before the page is written back to the store, the page is acquired again.

As a result, an older copy of the page is brought back to memory, which causes 
all kinds of corruption exceptions and assertions.

The attached unit test demonstrates the issue. It is likely that all baselines 
are affected starting from 2.4

As a part of this ticket, we must add more unit-tests for checkpointing 
protocol invariants we rely on.
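
The steps above can be replayed in a small single-threaded model. Everything here is hypothetical (class name, version counters, the {{inFlight}} set); in particular, refusing to evict a page that has been copied for the checkpoint but not yet written is only one possible mitigation, not necessarily the fix applied in this ticket.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-threaded model of the race; not Ignite code. */
class PageReplacementRaceSketch {
    private final Map<Long, Integer> store = new HashMap<>();  // pageId -> version on disk
    private final Map<Long, Integer> memory = new HashMap<>(); // pageId -> version in memory
    private final Set<Long> dirty = new HashSet<>();
    private final Set<Long> inFlight = new HashSet<>();        // copied for checkpoint, not yet written
    private final boolean guardInFlight;

    PageReplacementRaceSketch(boolean guardInFlight) {
        this.guardInFlight = guardInFlight;
    }

    /** Updates a page in memory and marks it dirty. */
    void update(long id, int ver) {
        memory.put(id, ver);
        dirty.add(id);
    }

    /** Copies the page content for the checkpoint and clears the dirty flag. */
    int copyForCheckpoint(long id) {
        dirty.remove(id);
        inFlight.add(id);

        return memory.get(id);
    }

    /** Completes the checkpoint write of a previously copied page. */
    void writeToStore(long id, int ver) {
        store.put(id, ver);
        inFlight.remove(id);
    }

    /** Page replacement: only clean pages (and, with the guard, only pages not in flight) are evicted. */
    boolean tryEvict(long id) {
        if (dirty.contains(id) || (guardInFlight && inFlight.contains(id)))
            return false;

        memory.remove(id);

        return true;
    }

    /** Acquires a page, reloading it from the store if it is not in memory. */
    int acquire(long id) {
        Integer ver = memory.get(id);

        if (ver == null) {
            ver = store.get(id);
            memory.put(id, ver);
        }

        return ver;
    }

    /** Returns the page version seen by a reader that acquires the page mid-checkpoint. */
    static int scenario(boolean guard) {
        PageReplacementRaceSketch pm = new PageReplacementRaceSketch(guard);

        pm.store.put(1L, 1);
        pm.memory.put(1L, 1);
        pm.update(1L, 2);                    // newest version lives only in memory

        int copy = pm.copyForCheckpoint(1L); // checkpointer copies the page and clears the dirty flag
        pm.tryEvict(1L);                     // page replacement picks the now-clean page
        int seen = pm.acquire(1L);           // page is acquired before the write-back
        pm.writeToStore(1L, copy);           // write-back happens too late for the reader

        return seen;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + scenario(false) + ", with guard: " + scenario(true));
    }
}
```

Without the guard the reader sees the older on-disk version; with it, eviction is refused and the reader sees the current in-memory version.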



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (IGNITE-12060) Incorrect row size calculation, lead to tree corruption.

2019-08-12 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-12060:
---

 Summary: Incorrect row size calculation, lead to tree corruption.
 Key: IGNITE-12060
 URL: https://issues.apache.org/jira/browse/IGNITE-12060
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.8








[jira] [Created] (IGNITE-12057) Persistence files are stored to temp dir

2019-08-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-12057:
---

 Summary: Persistence files are stored to temp dir
 Key: IGNITE-12057
 URL: https://issues.apache.org/jira/browse/IGNITE-12057
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


h2. Description
Check this thread:
[https://stackoverflow.com/questions/56951913/ignite-persistent-schema-tables-disappeared-sometimes/56977212#56977212]

This prospect almost dropped us because the company could not figure out why 
persistence files disappear upon restarts. They had turned off the WARN logging 
level and so could not see our warning saying that the files are written to a 
temporary directory.

I've updated Ignite docs:
[https://apacheignite.readme.io/docs/distributed-persistent-store#section-persistence-path-management]





[jira] [Created] (IGNITE-12048) Bugs & tests fixes

2019-08-07 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-12048:
---

 Summary: Bugs & tests fixes
 Key: IGNITE-12048
 URL: https://issues.apache.org/jira/browse/IGNITE-12048
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


Page replacement can reload invalid page during checkpoint

There is a race between {{writeCheckpointPages}} and page replacement process:
 * Checkpointer thread begins a checkpoint
 * Checkpointer thread calls {{getPageForCheckpoint()}}, which will copy page 
content *and clear dirty flag*
 * Page replacement tries to find a page for replacement and chooses this page, 
the page is thrown away
 * Before the page is written back to the store, the page is acquired again.

As a result, an older copy of the page is brought back to memory, which causes 
all kinds of corruption exceptions and assertions.

-

checkpointReadLock() may hang during node stop

I got this hang during one of the PDS (Indexing) runs (thread dump is attached). 
The following code hangs:
{code:java}
checkpointer.wakeupForCheckpoint(0, "too many dirty pages").cpBeginFut
.getUninterruptibly();
{code}
It looks like {{wakeupForCheckpoint}} can be called after the checkpointer is 
stopped, in which case {{cpBeginFut}} will never be completed.
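
One way to avoid this kind of hang, sketched with {{java.util.concurrent.CompletableFuture}} instead of Ignite's internal future classes. The class and its fields are hypothetical and only mirror the names used above: the stop path fails the outstanding future, and any wakeup after stop returns an already-failed future, so a waiter is always released.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch; not the Ignite Checkpointer. */
class CheckpointerStopSketch {
    private boolean stopped; // guarded by this

    /** Completed when the next checkpoint begins (never completed normally in this sketch). */
    private final CompletableFuture<Void> cpBeginFut = new CompletableFuture<>();

    /** Returns a future for the checkpoint begin, or an already-failed future if stopped. */
    synchronized CompletableFuture<Void> wakeupForCheckpoint(String reason) {
        if (stopped) {
            CompletableFuture<Void> failed = new CompletableFuture<>();

            failed.completeExceptionally(
                new IllegalStateException("Checkpointer is stopped [reason=" + reason + ']'));

            return failed;
        }

        return cpBeginFut;
    }

    /** Marks the checkpointer as stopped and releases everyone waiting on the pending future. */
    synchronized void stop() {
        stopped = true;

        cpBeginFut.completeExceptionally(new IllegalStateException("Checkpointer is stopped"));
    }

    public static void main(String[] args) {
        CheckpointerStopSketch cp = new CheckpointerStopSketch();

        CompletableFuture<Void> before = cp.wakeupForCheckpoint("too many dirty pages");

        cp.stop();

        CompletableFuture<Void> after = cp.wakeupForCheckpoint("too many dirty pages");

        System.out.println("before stop failed=" + before.isCompletedExceptionally()
            + ", after stop failed=" + after.isCompletedExceptionally());
    }
}
```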

-

Fixed 
ZookeeperDiscoveryCommunicationFailureTest.testCommunicationFailureResolve_CachesInfo1

Fixed  *.testFailAfterStart





[jira] [Created] (IGNITE-11953) BTree corruption caused by byte array values

2019-07-02 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11953:
---

 Summary: BTree corruption caused by byte array values
 Key: IGNITE-11953
 URL: https://issues.apache.org/jira/browse/IGNITE-11953
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases for caches with cache group, we can get BTree corruption 
exception.

{code}
[09:53:58,890][SEVERE][sys-stripe-10-#11][] Critical system error detected. Will 
be handled accordingly to configured handler [hnd=CustomFailureHandler 
[ignoreCriticalErrors=false, disabled=false][StopNodeOrHaltFailureHandler 
[tryStop=false, timeout=0]], failureCtx=FailureContext [type=CRITICAL_ERROR, 
err=class o.a.i.i.transactions.IgniteTxHeuristicCheckedException: Committing a 
transaction has produced runtime exception]]class 
org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: 
Committing a transaction has produced runtime exception
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.heuristicException(IgniteTxAdapter.java:800)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:922)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:799)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:608)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:478)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:535)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1055)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:931)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:887)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:117)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:209)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:207)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1129)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:594)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1568)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1196)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1092)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:504)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
Caused by: class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=427, 
val=Grkg1DUF3yQE6tC9Se50mi5w.T, hasValBytes=true], hash=1872857770, 
cacheId=-420893003]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1811)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1620)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1603)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2131)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:442)
at 

[jira] [Created] (IGNITE-11934) Bugs & tests fixes

2019-06-18 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11934:
---

 Summary:  Bugs & tests fixes
 Key: IGNITE-11934
 URL: https://issues.apache.org/jira/browse/IGNITE-11934
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


This issue contains fixes for several issues:
 * AssertionError occurs on the client when the coordinator is killed (with ZK 
discovery)
 * IgniteVersionUtils#BUILD_TSTAMP_DATE_FORMATTER is used in a non-thread-safe 
manner.
 * Possible discovery race on node joining with Authenticator.
 * PageLocksCommand#parseArguments cannot properly parse the user and password 
arguments if they are at the end of the arguments list.
 * Test CheckpointFreeListTest.testRestoreFreeListCorrectlyAfterRandomStop 
fails on TC
 * IgniteWalFlushBackgroundSelfTest.testFailWhileStart & 
IgniteWalFlushLogOnlySelfTest.testFailWhileStart fail in disk compression suite.
 * IgniteClientConnectAfterCommunicationFailureTest fails
 * Add scale factor for PageLockTrackerTests
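
For the date formatter item in the list above, a common remedy for a shared {{java.text.SimpleDateFormat}} (which is documented as not synchronized) is to switch to {{java.time.format.DateTimeFormatter}}, which is immutable and thread-safe. The sketch below is an assumption-laden illustration: the class name and the {{yyyyMMdd}} pattern are hypothetical, and the ticket does not say which approach the actual fix takes.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch; the pattern and names are assumptions, not taken from IgniteVersionUtils. */
class BuildTimestampSketch {
    /** DateTimeFormatter instances are immutable, so one shared constant is safe across threads. */
    static final DateTimeFormatter BUILD_DATE_FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Parses a build date string using the shared formatter. */
    static LocalDate parseBuildDate(String s) {
        return LocalDate.parse(s, BUILD_DATE_FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(parseBuildDate("20190618"));
    }
}
```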



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11835) Support JMX/control.sh API for page lock dump

2019-05-06 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11835:
---

 Summary: Support JMX/control.sh API for page lock dump
 Key: IGNITE-11835
 URL: https://issues.apache.org/jira/browse/IGNITE-11835
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-11824) Integrate PageLockTracker to DataStructure (per-thread tracker)

2019-04-30 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11824:
---

 Summary: Integrate PageLockTracker to DataStructure (per-thread 
tracker)
 Key: IGNITE-11824
 URL: https://issues.apache.org/jira/browse/IGNITE-11824
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


Once [IGNITE-11750] is completed, we will have a structure for tracking page 
locks per thread. The next step is to integrate it into the diagnostic API and 
implement a component that creates this structure per thread.





[jira] [Created] (IGNITE-11786) Implement thread-local stack for tracking page locks

2019-04-19 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11786:
---

 Summary: Implement thread-local stack for tracking page locks
 Key: IGNITE-11786
 URL: https://issues.apache.org/jira/browse/IGNITE-11786
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Govorukhin


The new structure should work as a stack. 
When a thread obtains a lock, we push the pageId (+meta) onto the top of the 
stack; when the thread releases the lock, we pop the pageId from the stack. 
There are cases when a thread may unlock a page that is not in the current 
stack frame but in a previous one (e.g. some page splits in the B-tree); in 
this case, we should walk down the stack, find this page and update its meta.
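
A minimal sketch of such a structure, under stated assumptions: the class and method names are hypothetical, the "meta" is reduced to a released marker, and pageId 0 is assumed never to be a valid page. Unlocking the top entry pops it; unlocking a deeper entry marks it as released, and released entries are dropped once they reach the top.

```java
import java.util.Arrays;

/** Hypothetical sketch of a per-thread page lock stack; not the Ignite implementation. */
class PageLockStackSketch {
    /** Marker for an entry unlocked out of order (assumes 0 is never a valid pageId). */
    private static final long RELEASED = 0L;

    /** One stack per thread, so no synchronization is needed on push/pop. */
    private static final ThreadLocal<PageLockStackSketch> STACK =
        ThreadLocal.withInitial(PageLockStackSketch::new);

    private long[] pageIds = new long[16];
    private int head;

    /** Returns the stack of the calling thread. */
    static PageLockStackSketch current() {
        return STACK.get();
    }

    /** Pushes the pageId on lock acquisition. */
    void onLock(long pageId) {
        if (head == pageIds.length)
            pageIds = Arrays.copyOf(pageIds, head * 2);

        pageIds[head++] = pageId;
    }

    /** Pops the pageId if it is on top; otherwise walks down the stack and marks the entry as released. */
    void onUnlock(long pageId) {
        for (int i = head - 1; i >= 0; i--) {
            if (pageIds[i] == pageId) {
                pageIds[i] = RELEASED;

                break;
            }
        }

        // Drop the top entry and any previously released entries directly below it.
        while (head > 0 && pageIds[head - 1] == RELEASED)
            head--;
    }

    /** Returns the pageIds that are still locked, bottom of the stack first. */
    long[] held() {
        return Arrays.stream(pageIds, 0, head).filter(id -> id != RELEASED).toArray();
    }

    /** Current stack depth, including entries that are marked as released. */
    int depth() {
        return head;
    }

    public static void main(String[] args) {
        PageLockStackSketch s = current();

        s.onLock(1L);
        s.onLock(2L);
        s.onLock(3L);
        s.onUnlock(2L); // out-of-order unlock: entry is marked, not popped

        System.out.println(Arrays.toString(s.held()) + " depth=" + s.depth());
    }
}
```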






[jira] [Created] (IGNITE-11738) Incorrect check ObjectInput.available() in CacheMetricsSnapshot

2019-04-12 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11738:
---

 Summary: Incorrect check  ObjectInput.available() in 
CacheMetricsSnapshot
 Key: IGNITE-11738
 URL: https://issues.apache.org/jira/browse/IGNITE-11738
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-11641) Server node copies a lot of WAL files in WAL archive after restart

2019-03-27 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11641:
---

 Summary: Server node copies a lot of WAL files in WAL archive 
after restart
 Key: IGNITE-11641
 URL: https://issues.apache.org/jira/browse/IGNITE-11641
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


Pre-condition: PDS is enabled, wal_path and wal_archive_path are set in config 
file.

1. Cluster is up and running. Some data uploaded into caches.
2. Start load to generate a lot of files in wal archive (more than files in wal 
directory).
3. Stop some node and delete all files from wal archive.
4. Start node.

In this case the node copies WAL files from the WAL dir into the wal archive dir 
again and again until the number of files matches what was in the wal archive 
before the stop.

Here is the relevant output from the server node log:

{code}
[10:10:17,054][INFO][main][GridCacheDatabaseSharedManager] Restoring partition 
state for local groups.
[10:10:18,522][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal]
[10:10:18,523][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=1, segIdx=1, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal]
[10:10:20,631][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal]
[10:10:20,632][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=2, segIdx=2, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal]
[10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal]
[10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=3, segIdx=3, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal]
[10:10:23,995][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal]
[10:10:23,996][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=4, segIdx=4, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal]
[10:10:24,644][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal]
[10:10:24,645][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=5, segIdx=5, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal]
[10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal,
 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal]
[10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Starting to copy WAL segment [absIdx=6, segIdx=6, 
origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal,
 
dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal]
[10:10:26,043][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] 
Copied file 
[src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0006.wal,
 

[jira] [Created] (IGNITE-11509) Remove DistributedBaselineConfiguration and replace to methods on IgniteCluster

2019-03-08 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11509:
---

 Summary: Remove DistributedBaselineConfiguration and replace to 
methods on IgniteCluster
 Key: IGNITE-11509
 URL: https://issues.apache.org/jira/browse/IGNITE-11509
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-11095) Failed WalCompactionTest flaky test

2019-01-25 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-11095:
---

 Summary: Failed WalCompactionTest flaky test
 Key: IGNITE-11095
 URL: https://issues.apache.org/jira/browse/IGNITE-11095
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-10974) Grid may hang if an exception is thrown from PageMemoryImpl.beforeReleaseWrite()

2019-01-18 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10974:
---

 Summary: Grid may hang if an exception is thrown from 
PageMemoryImpl.beforeReleaseWrite()
 Key: IGNITE-10974
 URL: https://issues.apache.org/jira/browse/IGNITE-10974
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


 

 

{code}

[2019-01-17 14:35:15,953][WARN ][main][root] Thread dump at 2019/01/17 14:35:15 
UTC
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#857%failure.IoomFailureHandlerTest0%", id=931, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#856%failure.IoomFailureHandlerTest0%", id=930, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#855%failure.IoomFailureHandlerTest0%", id=929, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[17:35:15]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[17:35:15]W: [org.apache.ignite:ignite-core] 
[17:35:15]W: [org.apache.ignite:ignite-core] Thread 
[name="sys-#854%failure.IoomFailureHandlerTest0%", id=928, state=TIMED_WAITING, 
blockCnt=0, waitCnt=1]
[17:35:15]W: [org.apache.ignite:ignite-core] Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4339baec,
 ownerName=null, ownerId=-1]
[17:35:15]W: [org.apache.ignite:ignite-core] at sun.misc.Unsafe.park(Native 
Method)
[17:35:15]W: 

[jira] [Created] (IGNITE-10909) GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky test fail in Cache 1

2019-01-11 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10909:
---

 Summary: GridCacheBalancingStoreSelfTest.testConcurrentLoad flaky 
test fail in Cache 1
 Key: IGNITE-10909
 URL: https://issues.apache.org/jira/browse/IGNITE-10909
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin








[jira] [Created] (IGNITE-10908) GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail with NPE in Service Grid (legacy mode)

2019-01-11 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10908:
---

 Summary: 
GridServiceProcessorBatchDeploySelfTest.testDeployAllTopologyChange flaky fail 
with NPE in Service Grid (legacy mode) 
 Key: IGNITE-10908
 URL: https://issues.apache.org/jira/browse/IGNITE-10908
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-10907) IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky failed in Basic 1

2019-01-11 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10907:
---

 Summary: 
IgniteUtilsSelfTest.testDoInParallelWithStealingJobRunTaskInExecutor flaky 
failed in Basic 1
 Key: IGNITE-10907
 URL: https://issues.apache.org/jira/browse/IGNITE-10907
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-10891) IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead flaky in PDS indexing

2019-01-11 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10891:
---

 Summary: IgnitePdsThreadInterruptionTest.testInterruptsOnLFSRead 
flaky in PDS indexing
 Key: IGNITE-10891
 URL: https://issues.apache.org/jira/browse/IGNITE-10891
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin








[jira] [Created] (IGNITE-10883) IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky failed in PDS4

2019-01-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10883:
---

 Summary: IgniteRebalanceOnCachesStoppingOrDestroyingTest flaky 
failed in PDS4
 Key: IGNITE-10883
 URL: https://issues.apache.org/jira/browse/IGNITE-10883
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


[testStopCachesOnDeactivation|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-3436991258700651390=testDetails]

The first problem is in the test: it does not check that rebalance has completed 
after the test action is performed. The second problem is in an assert: there is 
no guarantee that the cache will not be destroyed before the checkpoint completes.

{code}

Failed to notify listener: o.a.i.i.processors.cache.WalStateManager$3...@31e26a1java.lang.AssertionError
    at org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:510)
    at org.apache.ignite.internal.processors.cache.WalStateManager$3.applyx(WalStateManager.java:505)
    at org.apache.ignite.internal.util.lang.IgniteInClosureX.apply(IgniteInClosureX.java:38)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4280)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointProgress$1.onDone(GridCacheDatabaseSharedManager.java:4275)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:456)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointEnd(GridCacheDatabaseSharedManager.java:3904)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:3353)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:3119)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)

{code}





[jira] [Created] (IGNITE-10508) Need to support the new checkpoint feature not wait for the previous operation to complete

2018-12-03 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10508:
---

 Summary: Need to support the new checkpoint feature not wait for 
the previous operation to complete
 Key: IGNITE-10508
 URL: https://issues.apache.org/jira/browse/IGNITE-10508
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


There are cases when we should trigger a checkpoint so that some operations can 
be sure that all preceding operations finished before the checkpoint. It is 
necessary to support running a checkpoint without waiting for the previous 
checkpoint to complete.

Solution:

Merge the checkpoint pages and append the new dirty pages to the current checkpoint.

Restrictions:

- Triggering a new checkpoint should not wait for the previous checkpoint 
operation to complete.

- It should not break the crash recovery mechanisms.

- Only one merge is allowed in the first implementation (potential OOM if we 
try to merge many checkpoint operations).
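
The solution and the "only one merge" restriction can be illustrated with a small model. This is only a sketch under assumptions: CheckpointMergeModel, tryMerge and pageCount are hypothetical names, pages are represented by plain page ids, and nothing here is taken from the actual checkpointer code.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical model, not Ignite code: a running checkpoint that accepts one merge. */
class CheckpointMergeModel {
    /** Page ids that the running checkpoint still has to write. */
    private final Set<Long> pages = new LinkedHashSet<>();

    /** Set once a second checkpoint request has been merged in. */
    private boolean merged;

    CheckpointMergeModel(Collection<Long> initialDirtyPages) {
        pages.addAll(initialDirtyPages);
    }

    /**
     * Appends new dirty pages to the running checkpoint without waiting for it.
     *
     * @return false if a merge already happened (the caller has to wait instead).
     */
    synchronized boolean tryMerge(Collection<Long> newDirtyPages) {
        if (merged)
            return false;

        pages.addAll(newDirtyPages); // Duplicate page ids are written only once.
        merged = true;

        return true;
    }

    synchronized int pageCount() {
        return pages.size();
    }
}
```

Refusing the second merge keeps the page set bounded, which is the OOM concern mentioned in the restrictions.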





[jira] [Created] (IGNITE-10341) Missed loss policy tests with persistence

2018-11-20 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10341:
---

 Summary: Missed loss policy tests with persistence
 Key: IGNITE-10341
 URL: https://issues.apache.org/jira/browse/IGNITE-10341
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-10207 was implemented, the test that checks the loss policy when 
persistence is enabled was removed. This was a mistake; the change needs to be 
reverted.





[jira] [Created] (IGNITE-10290) Map.Entry interface for key cache may lead to incorrect calculation hash code

2018-11-15 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10290:
---

 Summary: Map.Entry interface for key cache may lead to incorrect 
calculation hash code
 Key: IGNITE-10290
 URL: https://issues.apache.org/jira/browse/IGNITE-10290
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: Reproducer.java

If the Map.Entry interface is used for a key, we may try to find the (key, value) 
pair in the store with an incorrectly calculated hash code for the binary 
representation.
The problem is that GridPartitionedSingleGetFuture#localGet() and 
GridPartitionedGetFuture#localGet() do not execute prepareForCache before 
reading the cacheDataRow from the row store.





[jira] [Created] (IGNITE-10285) U.doInParallel may lead to deadlock

2018-11-15 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10285:
---

 Summary: U.doInParallel may lead to deadlock
 Key: IGNITE-10285
 URL: https://issues.apache.org/jira/browse/IGNITE-10285
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: dump.rtf

There is a case when we can get a deadlock on the thread pool.
If doInParallel is called from sys-pool threads and the number of such threads 
equals the sys-pool size, we get a deadlock: the threads in sys-pool try to run 
doInParallel through the same sys-pool and wait on the future indefinitely, 
because no thread can complete a doInParallel operation that itself requires 
threads from sys-pool.
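
The starvation pattern described above can be reproduced with a plain fixed-size executor. This is a minimal sketch that does not use U.doInParallel or the real sys-pool; PoolStarvationDemo and starves are hypothetical names, and a timeout is used so the demo reports the hang instead of blocking forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo, not Ignite code: nested submissions to a saturated fixed pool. */
class PoolStarvationDemo {
    /**
     * Occupies every thread of a fixed pool with a task that submits a subtask
     * to the same pool and waits for it.
     *
     * @return true if the first outer task was still blocked after the timeout.
     */
    static boolean starves(int poolSize, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        List<Future<Integer>> outer = new ArrayList<>();

        try {
            for (int i = 0; i < poolSize; i++) {
                outer.add(pool.submit(() -> {
                    allStarted.countDown();
                    allStarted.await(); // Make sure every pool thread is busy.

                    // The subtask is queued, but no thread is free to run it.
                    return pool.submit(() -> 42).get();
                }));
            }

            try {
                outer.get(0).get(timeoutMs, TimeUnit.MILLISECONDS);

                return false;
            }
            catch (TimeoutException e) {
                return true;
            }
        }
        finally {
            pool.shutdownNow(); // Interrupt the stuck threads so the JVM can exit.
        }
    }
}
```

Calling PoolStarvationDemo.starves(2, 500) is expected to return true, because both pool threads wait for subtasks that can never be scheduled.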





[jira] [Created] (IGNITE-10252) Cache.get() may be mapped to the node with partition state is "MOVING"

2018-11-14 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10252:
---

 Summary: Cache.get() may be mapped to the node with partition 
state is "MOVING"
 Key: IGNITE-10252
 URL: https://issues.apache.org/jira/browse/IGNITE-10252
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-5357 was implemented, in some cases a get may be mapped to a node 
where the partition state is "MOVING" for a PARTITIONED cache, which may lead to 
assertion errors (we do not allow reads from moving partitions). The original 
issue discussed only replicated caches; it is not clear why this was 
implemented for partitioned caches.





[jira] [Created] (IGNITE-10207) Missed loss policy checks

2018-11-09 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-10207:
---

 Summary: Missed loss policy checks
 Key: IGNITE-10207
 URL: https://issues.apache.org/jira/browse/IGNITE-10207
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases (client reconnect, new client join, etc.) PartitionLossPolicy may 
validate an operation incorrectly: null is returned for READ_ONLY_SAFE on a lost 
partition.
To reproduce, run CacheResultIsNotNullOnPartitionLossTest (1000 times) with a 
random node stop.





[jira] [Created] (IGNITE-9898) Checkpointer thread hangs on await async task complete

2018-10-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9898:
--

 Summary: Checkpointer thread hangs on await async task complete
 Key: IGNITE-9898
 URL: https://issues.apache.org/jira/browse/IGNITE-9898
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In some cases, the thread pool counters can be reset while an async task is 
executing, and then the checkpointer hangs on await:

{code}
[19:36:01] : [Step 4/5] [2018-10-15 16:36:01,435][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:03] : [Step 4/5] [2018-10-15 16:36:03,435][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:05] : [Step 4/5] [2018-10-15 16:36:05,436][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:07] : [Step 4/5] [2018-10-15 16:36:07,436][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:09] : [Step 4/5] [2018-10-15 16:36:09,437][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:11] : [Step 4/5] [2018-10-15 16:36:11,437][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:13] : [Step 4/5] [2018-10-15 16:36:13,438][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:15] : [Step 4/5] [2018-10-15 16:36:15,439][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:17] : [Step 4/5] [2018-10-15 16:36:17,440][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:19] : [Step 4/5] [2018-10-15 16:36:19,441][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0
[19:36:21] : [Step 4/5] [2018-10-15 16:36:21,442][INFO 
][db-checkpoint-thread-#21691%db.IgnitePdsPageEvictionDuringPartitionClearTest0%][GridCacheDatabaseSharedManager]
 Await checkpoint pool tasks comleted, pendingTaskCnt=2, completedTaskCnt=3, 
initialized=true, err=null, activeCnt=0

{code}





[jira] [Created] (IGNITE-9426) IgniteAtomicSequence benchmarks

2018-08-29 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9426:
--

 Summary: IgniteAtomicSequence benchmarks
 Key: IGNITE-9426
 URL: https://issues.apache.org/jira/browse/IGNITE-9426
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


We need to create JMH and Yardstick benchmarks for the atomic sequence in order 
to be able to measure future performance improvements.





[jira] [Created] (IGNITE-9260) StandaloneWalRecordsIterator broken on WalSegmentTailReachedException not in work dir

2018-08-13 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9260:
--

 Summary: StandaloneWalRecordsIterator broken on 
WalSegmentTailReachedException not in work dir
 Key: IGNITE-9260
 URL: https://issues.apache.org/jira/browse/IGNITE-9260
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


After IGNITE-9050 was implemented, StandaloneWalRecordsIterator became broken, 
because in standalone mode the iteration can stop at any moment once the 
last available segment has been fully read. The validation implemented in 
IGNITE-9050 is not applicable to standalone mode. We need to change the 
behavior and validate that the iteration stops in the last available WAL 
segment.
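
The proposed validation could look roughly like the check below. It is only a sketch: WalTailValidation and validateStop are hypothetical names, segments are identified by plain indexes, and the real iterator classes are not used.

```java
/** Hypothetical sketch, not Ignite code: accept a tail only in the last available segment. */
class WalTailValidation {
    /**
     * @param stoppedAtSegment Index of the segment where the iteration stopped.
     * @param lastAvailableSegment Index of the last segment that is available for reading.
     * @throws IllegalStateException If the iteration stopped before the last available segment.
     */
    static void validateStop(long stoppedAtSegment, long lastAvailableSegment) {
        if (stoppedAtSegment < lastAvailableSegment) {
            throw new IllegalStateException("WAL tail reached in segment " + stoppedAtSegment +
                ", but segments up to " + lastAvailableSegment + " are available");
        }
    }
}
```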





[jira] [Created] (IGNITE-9244) Partition eviction may use all threads in sys pool, it leads to hangs send a message via sys pool

2018-08-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9244:
--

 Summary: Partition eviction may use all threads in sys pool, it 
leads to hangs send a message via sys pool 
 Key: IGNITE-9244
 URL: https://issues.apache.org/jira/browse/IGNITE-9244
 Project: Ignite
  Issue Type: Bug
 Environment: In the current implementation, GridDhtPartitionsEvictor 
evicts partitions one by one.
A GridDhtPartitionsEvictor is created for each cache group. If we try to evict 
as many groups as the sys pool size, the group evictors take all available 
threads in the sys pool, and sending a message via the sys pool hangs. As a fix, I 
suggest limiting concurrent execution via the sys pool or using another pool for 
this purpose.
Reporter: Dmitriy Govorukhin
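
One way to limit concurrent execution via a shared pool, as suggested in the Environment field, is a semaphore acquired by the submitter. This is a minimal sketch with a plain executor; BoundedEvictionDemo and peakConcurrency are hypothetical names, and the sleep is only a stand-in for evicting one partition.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch, not Ignite code: cap how many pool threads one kind of task may use. */
class BoundedEvictionDemo {
    /** Runs taskCnt tasks on a shared pool, at most maxConcurrent at a time, and returns the observed peak. */
    static int peakConcurrency(int poolSize, int maxConcurrent, int taskCnt) throws InterruptedException {
        ExecutorService sysPool = Executors.newFixedThreadPool(poolSize);
        Semaphore permits = new Semaphore(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < taskCnt; i++) {
            permits.acquire(); // The submitter waits here, not a pool thread.

            sysPool.execute(() -> {
                try {
                    int now = running.incrementAndGet();

                    peak.accumulateAndGet(now, Math::max);

                    Thread.sleep(10); // Stand-in for evicting one partition.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finally {
                    running.decrementAndGet();
                    permits.release();
                }
            });
        }

        sysPool.shutdown();
        sysPool.awaitTermination(10, TimeUnit.SECONDS);

        return peak.get();
    }
}
```

With a pool of 8 threads and 2 permits, at least 6 threads always stay free for other work such as message processing.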








[jira] [Created] (IGNITE-9050) WALIterator should throws exception if iterator stopped in the WALArchive but not in WALWork

2018-07-23 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9050:
--

 Summary: WALIterator should throws exception if iterator stopped 
in the WALArchive but not in WALWork
 Key: IGNITE-9050
 URL: https://issues.apache.org/jira/browse/IGNITE-9050
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


The iterator stops iteration if the next WAL record pointer is not equal to the 
expected one (WalSegmentTailReachedException). If this happens during iteration 
over segments in the WAL archive, the WAL is corrupted and we cannot ignore 
this: the WAL is not fully read.





[jira] [Created] (IGNITE-9049) Missed SWITCH_SEGMENT_RECORD at the end of WAL file but space enough

2018-07-22 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9049:
--

 Summary: Missed SWITCH_SEGMENT_RECORD at the end of WAL file but 
space enough 
 Key: IGNITE-9049
 URL: https://issues.apache.org/jira/browse/IGNITE-9049
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


There is a situation when several threads call addRecord as the free space runs 
out (a rollOver to the next WAL segment is needed) and no thread writes 
SWITCH_SEGMENT_RECORD. As a result, the end of the file contains garbage. If 
we iterate over this segment, the iterator stops when it tries to read the next 
record and stumbles on the garbage at the end of the file, so the log is not 
fully read. Any operation that requires the iterator may be broken (crash 
recovery, delta rebalance, etc.).

Example:
File size: 1024 bytes
Current tail position: 768 (free space: 256)

1. Thread-1 calls addRecord (size 128) -> tail is updated to 896.
2. Thread-2 calls addRecord (size 128) -> tail is updated to 1024 (no free space left).
No thread has written any data yet; each has only reserved a position for its 
write (SegmentedRingByteBuffer.offer).

3. Thread-3 calls addRecord (size 128) -> not enough space -> rollOver and CAS 
of the stop flag to TRUE.

4. Thread-1 and Thread-2 try to write their data and cannot.

FileWriteHandle.addRecord
{code}
if (buf == null || (stop.get() && rec.type() != SWITCH_SEGMENT_RECORD))
    return null; // Can not write to this segment, need to switch to the next one.

{code}

Thread-3 cannot write SWITCH_SEGMENT_RECORD because there is not enough space.
Thread-1 and Thread-2 cannot write their data because the stop flag is TRUE.

As a result, positions 768 to 1024 contain garbage.
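
The arithmetic of the example can be replayed with a small single-threaded model. This is a sketch under assumptions: SegmentTailModel, reserve, write and garbageBytes are hypothetical names, the three threads are replayed sequentially in the order given above, and no Ignite classes are used.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the scenario above; it does not use Ignite classes. */
class SegmentTailModel {
    final long fileSize;
    final long initialTail;
    final AtomicLong tail;
    final AtomicBoolean stop = new AtomicBoolean();

    /** Bytes actually written after the initial tail. */
    long written;

    SegmentTailModel(long fileSize, long initialTail) {
        this.fileSize = fileSize;
        this.initialTail = initialTail;
        this.tail = new AtomicLong(initialTail);
    }

    /** Reserves space; returns the start offset, or -1 (and sets the stop flag) if the segment is full. */
    long reserve(int size) {
        while (true) {
            long cur = tail.get();

            if (cur + size > fileSize) {
                stop.compareAndSet(false, true);

                return -1;
            }

            if (tail.compareAndSet(cur, cur + size))
                return cur;
        }
    }

    /** Mirrors the check quoted above: once stop is set, reserved data is not written. */
    boolean write(long offset, int size) {
        if (offset < 0 || stop.get())
            return false;

        written += size;

        return true;
    }

    /** Replays the example and returns the number of reserved but unwritten bytes. */
    static long garbageBytes() {
        SegmentTailModel seg = new SegmentTailModel(1024, 768);

        long off1 = seg.reserve(128); // Thread-1: tail -> 896.
        long off2 = seg.reserve(128); // Thread-2: tail -> 1024.
        seg.reserve(128);             // Thread-3: no space, stop = true.

        seg.write(off1, 128);         // Rejected: stop is already set.
        seg.write(off2, 128);         // Rejected: stop is already set.

        return seg.tail.get() - seg.initialTail - seg.written;
    }
}
```

In this model garbageBytes() yields 256, which matches the 768 to 1024 range from the example.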






[jira] [Created] (IGNITE-9047) Add idleVerify check for GridCommonAbstractTest

2018-07-21 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9047:
--

 Summary: Add idleVerify check for GridCommonAbstractTest
 Key: IGNITE-9047
 URL: https://issues.apache.org/jira/browse/IGNITE-9047
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


Since we have idleVerify (a consistency check between primary and backup 
partitions), it would be useful to add this check to the abstract test class so 
that consistency can be verified after test scenarios. 





[jira] [Created] (IGNITE-9042) Transaction with small timeout may lead to inconsistent partition state

2018-07-20 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-9042:
--

 Summary: Transaction with small timeout may lead to inconsistent 
partition state
 Key: IGNITE-9042
 URL: https://issues.apache.org/jira/browse/IGNITE-9042
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Attachments: Reproducer.java

A transaction with a small timeout may lead to an inconsistent partition state. 
A reproducer is attached.

The problem is in GridDhtTxPrepareFuture.sendPrepareRequests(): if the timeout is 
reached during iteration over tx.dhtMap().values(), we do not send 
GridDhtTxPrepareRequest to some backups. Those backups know nothing about the 
transaction and do not participate in the commit.
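
The effect can be shown with a toy loop that checks a deadline on every iteration. This is only an illustration, assuming the send loop stops once the timeout is reached; PrepareTimeoutModel and sendPrepare are hypothetical names, and the clock is injected so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical model, not Ignite code: a send loop that is cut short by a timeout. */
class PrepareTimeoutModel {
    /** Returns the backups that received a prepare request before the deadline passed. */
    static List<String> sendPrepare(List<String> backups, long deadline, LongSupplier clock) {
        List<String> sent = new ArrayList<>();

        for (String backup : backups) {
            if (clock.getAsLong() >= deadline)
                break; // Timed out mid-iteration: the remaining backups never hear about the tx.

            sent.add(backup);
        }

        return sent;
    }
}
```

With three backups and a clock that reaches the deadline before the third iteration, only the first two backups receive the request, so the third one cannot participate in the commit.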





[jira] [Created] (IGNITE-8973) Need to support dump for idle_verify

2018-07-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8973:
--

 Summary: Need to support dump for idle_verify 
 Key: IGNITE-8973
 URL: https://issues.apache.org/jira/browse/IGNITE-8973
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


In the current implementation, idle_verify checks consistency between primary 
and backup partitions. It would be useful to be able to dump the current state 
of all partitions to a file. Such a dump can help investigate problems with 
partition counters or sizes, because it is a cluster-wide snapshot of partition 
hashes for a given partition state (the hash includes all keys in the partition).

idle_verify --dump - calculates partition hashes and prints them to standard output
idle_verify --dump {path} - calculates partition hashes and writes the output to a file






[jira] [Created] (IGNITE-8929) WAL should not disable for the group if none a partition is not assigned to a local node.

2018-07-04 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8929:
--

 Summary: WAL should not disable for the group if none a partition 
is not assigned to a local node.
 Key: IGNITE-8929
 URL: https://issues.apache.org/jira/browse/IGNITE-8929
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8827) Disable WAL during apply updates on recovery

2018-06-19 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8827:
--

 Summary: Disable WAL during apply updates on recovery
 Key: IGNITE-8827
 URL: https://issues.apache.org/jira/browse/IGNITE-8827
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8707) DataStorageMetrics.getTotalAllocatedSize metric does not account reserved partition page header.

2018-06-05 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8707:
--

 Summary: DataStorageMetrics.getTotalAllocatedSize metric does not 
account reserved partition page header.
 Key: IGNITE-8707
 URL: https://issues.apache.org/jira/browse/IGNITE-8707
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8685) Incorrect size for switch segment record

2018-06-04 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8685:
--

 Summary: Incorrect size for switch segment record 
 Key: IGNITE-8685
 URL: https://issues.apache.org/jira/browse/IGNITE-8685
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


We have an invariant that the switch segment record should have a size of one byte.
However, in the current implementation the size is calculated with overhead for 
storing the CRC and the WAL pointer.





[jira] [Created] (IGNITE-8661) WALItreater is not stopped if can not deserialize record

2018-05-31 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8661:
--

 Summary: WALItreater is not stopped if can not deserialize record 
 Key: IGNITE-8661
 URL: https://issues.apache.org/jira/browse/IGNITE-8661
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8607) [.NET] Support metrics changes in DataStorageMetricsMXBean

2018-05-24 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8607:
--

 Summary: [.NET] Support metrics changes in DataStorageMetricsMXBean
 Key: IGNITE-8607
 URL: https://issues.apache.org/jira/browse/IGNITE-8607
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8583) DataStorageMetricsMXBean.getOffHeapSize include checkpoint buffer size, this is not clear

2018-05-23 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8583:
--

 Summary: DataStorageMetricsMXBean.getOffHeapSize include 
checkpoint buffer size, this is not clear
 Key: IGNITE-8583
 URL: https://issues.apache.org/jira/browse/IGNITE-8583
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8475) Create new IgniteCache decorator with fair async methonds

2018-05-11 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8475:
--

 Summary: Create new IgniteCache decorator with fair async methonds
 Key: IGNITE-8475
 URL: https://issues.apache.org/jira/browse/IGNITE-8475
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Affects Versions: 2.4
Reporter: Dmitriy Govorukhin
 Fix For: None


GridCacheAdapter.syncOp calls awaitLastFut(), which waits for the last async 
operation to complete. 

This means that all async operations issued from one thread are executed one by 
one rather than in parallel. The async operations are not truly async. 

Example for an atomic cache: 

f1=cache.getAsync(key1); 
f2=cache.getAsync(key2); 

f1 will always complete before f2. 

We need to create a new decorator for IgniteCache that returns an IgniteCache 
proxy with fair async operations.

 IgniteCache.withFairAsync()
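
The difference between the two behaviours can be modelled with CompletableFuture. This is a sketch under assumptions, not the Ignite implementation: AsyncOrderingModel, chained and fair are hypothetical names, and "chained" only imitates the effect of waiting for the last future before starting the next operation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical model, not Ignite code: chained versus fair start of async operations. */
class AsyncOrderingModel {
    /** Future of the most recently issued chained operation. */
    private CompletableFuture<?> last = CompletableFuture.completedFuture(null);

    /** Names of operations in the order in which they actually started. */
    final List<String> started = new ArrayList<>();

    /** Chained mode: the operation starts only after the previous one has completed. */
    <T> CompletableFuture<T> chained(String name, Supplier<CompletableFuture<T>> op) {
        CompletableFuture<T> fut = last.thenCompose(ignored -> {
            started.add(name);

            return op.get();
        });

        last = fut;

        return fut;
    }

    /** Fair mode: the operation starts immediately, regardless of earlier ones. */
    <T> CompletableFuture<T> fair(String name, Supplier<CompletableFuture<T>> op) {
        started.add(name);

        return op.get();
    }
}
```

In chained mode a second get does not even start while the first one is pending; in fair mode both start right away and may complete in any order.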





[jira] [Created] (IGNITE-8464) WALItreater broken (race on the switch to the next segment during iteration and concurrent archiving same segment)

2018-05-10 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8464:
--

 Summary: WALItreater broken (race on the switch to the next 
segment during iteration and concurrent archiving same segment)
 Key: IGNITE-8464
 URL: https://issues.apache.org/jira/browse/IGNITE-8464
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin


FileArchiver

{code}

final SegmentArchiveResult res = archiveSegment(toArchive);

synchronized (this) {
    while (locked.containsKey(toArchive) && !stopped)
        wait();
}

// Firstly, format working file
if (!stopped)
    formatFile(res.getOrigWorkFile());

synchronized (this) {
    // Then increase counter to allow rollover on clean working file
    changeLastArchivedIndexAndNotifyWaiters(toArchive);

    notifyAll();
}

{code}

Another thread may try to read a segment while the archiver is formatting the 
file in the work directory (formatFile is not synchronized) and the last 
archived index has not yet been updated.





[jira] [Created] (IGNITE-8346) FileDownloaderTest is not included in the test suite

2018-04-20 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8346:
--

 Summary: FileDownloaderTest is not included in the test suite
 Key: IGNITE-8346
 URL: https://issues.apache.org/jira/browse/IGNITE-8346
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin


org.apache.ignite.internal.processors.cache.persistence.file.FileDownloaderTest 





[jira] [Created] (IGNITE-8341) .NET: Add new metrics for data storage

2018-04-20 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8341:
--

 Summary: .NET: Add new metrics for data storage
 Key: IGNITE-8341
 URL: https://issues.apache.org/jira/browse/IGNITE-8341
 Project: Ignite
  Issue Type: New Feature
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8340) .NET Implement new JMX metrics for transactions

2018-04-20 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8340:
--

 Summary: .NET Implement new JMX metrics for transactions
 Key: IGNITE-8340
 URL: https://issues.apache.org/jira/browse/IGNITE-8340
 Project: Ignite
  Issue Type: New Feature
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-8078) Add new metrics for data storage

2018-03-29 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8078:
--

 Summary: Add new metrics for data storage
 Key: IGNITE-8078
 URL: https://issues.apache.org/jira/browse/IGNITE-8078
 Project: Ignite
  Issue Type: New Feature
Reporter: Dmitriy Govorukhin


1. Create new MXbean for each index, IndexMxBean
{code}
interface IndexMxBean {
    /** The number of PUT operations on the index. */
    long processedPuts();

    /** The number of GET operations on the index. */
    long processedGets();

    /** The total index size in bytes. */
    long getIndexSize();
}
{code}

2. Add new metrics for data storage and cache group.
{code}
interface CacheGroupMetricsMXBean {
    /** The total index size in bytes. */
    long getIndexTotalSize();

    long getPKIndexTotalSize();

    long getTotalsize();

    long getDataSize();

    String getType();
}
{code}

{code}
interface DataRegionMXBean {
    long getIndexTotalSize();
    long getPKIndexTotalSize();
    long getTotalsize();
    long getdataSize();
    long offheapUsedSize();
    long pagesRead();
    long pagesWriten();
    long pagesReplaced();
    long dirtyPagesForNextCheckpoint();
}
{code}

{code}
interface DataStorageMXbean {
    long getIndexTotalSize();
    long getPKIndexTotalSize();
    long getTotalsize();
    long offHeapSize();
    long offheapUsedSize();
    long getDataSize();
    long pagesRead();
    long pagesWriten();
    long pagesReplaced();
    long checkpointTotalTime();
    long dirtyPagesForNextCheckpoint();
}
{code}






[jira] [Created] (IGNITE-8077) Implement new JMX metrics for transactions

2018-03-29 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-8077:
--

 Summary: Implement new JMX metrics for transactions
 Key: IGNITE-8077
 URL: https://issues.apache.org/jira/browse/IGNITE-8077
 Project: Ignite
  Issue Type: New Feature
Reporter: Dmitriy Govorukhin


These additional metrics should be implemented to monitor transactions.

{code}
interface TransactionsMXbean {
    /** The number of transactions that were committed. */
    long txCommited();

    /** The number of transactions that were rolled back. */
    long txRollBacked();

    /** The number of transactions in the prepared state. */
    long txPrepared();

    /** The number of keys locked on the node. */
    long lockedKeys();

    /** The number of transactions for which the node is the initiator. */
    int ownerTxs();
}
{code}

For more details, see IgniteTxAdapter and IgniteTxManager.





[jira] [Created] (IGNITE-7933) The error writing wal point to cp/node-start file can lead to the inability to start node

2018-03-13 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-7933:
--

 Summary: The error writing wal point to cp/node-start file can 
lead to the inability to start node
 Key: IGNITE-7933
 URL: https://issues.apache.org/jira/browse/IGNITE-7933
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin








[jira] [Created] (IGNITE-7865) Need to provide method WAL manager for return serialize version

2018-03-02 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-7865:
--

 Summary: Need to provide method WAL manager for return serialize 
version
 Key: IGNITE-7865
 URL: https://issues.apache.org/jira/browse/IGNITE-7865
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin


{code}
public interface IgniteWriteAheadLogManager {
.
/**
 * @return Current serializer version.
 */
public int serializerVersion();
.
}
{code}





[jira] [Created] (IGNITE-7755) Potentially crash during write cp-***-start.bin can lead to the impossibility of recovering

2018-02-19 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-7755:
--

 Summary: Potentially crash during write cp-***-start.bin can lead 
to the impossibility of recovering
 Key: IGNITE-7755
 URL: https://issues.apache.org/jira/browse/IGNITE-7755
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.3, 2.4
Reporter: Dmitriy Govorukhin
 Fix For: 2.5


The node can crash after cp-***-start.bin is created but before its content (the 
WAL pointer) is written. On recovery, an attempt to read the WAL pointer fails 
with an exception.
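
A common way to avoid a marker file that exists but has no content is to write a temporary file first and then rename it atomically. The sketch below only illustrates that general technique; it is an assumption, not necessarily the fix chosen for this ticket, and MarkerFileWriter, writeAtomically and the ".tmp" suffix are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch, not Ignite code: publish a marker file only after its content is on disk. */
class MarkerFileWriter {
    static void writeAtomically(Path marker, byte[] content) throws IOException {
        Path tmp = marker.resolveSibling(marker.getFileName() + ".tmp");

        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.CREATE,
            StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(content);

            while (buf.hasRemaining())
                ch.write(buf);

            ch.force(true); // Flush data and metadata before the rename.
        }

        // A crash before this point leaves only the temporary file, never an empty marker.
        Files.move(tmp, marker, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

On recovery, leftover temporary files can simply be ignored or deleted, because a marker under its final name always has complete content.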





[jira] [Created] (IGNITE-7747) WAL manage getAndReserveWalFiles should not throw exception if segments not found

2018-02-19 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-7747:
--

 Summary: WAL manage getAndReserveWalFiles should not throw 
exception if segments not found
 Key: IGNITE-7747
 URL: https://issues.apache.org/jira/browse/IGNITE-7747
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin








[jira] [Created] (IGNITE-7679) Move all test plugins to a separate module

2018-02-12 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-7679:
--

 Summary: Move all test plugins to a separate module
 Key: IGNITE-7679
 URL: https://issues.apache.org/jira/browse/IGNITE-7679
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.5








[jira] [Created] (IGNITE-6857) Multi stage start cache

2017-11-09 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6857:
--

 Summary: Multi stage start cache
 Key: IGNITE-6857
 URL: https://issues.apache.org/jira/browse/IGNITE-6857
 Project: Ignite
  Issue Type: Improvement
  Security Level: Public (Viewable by anyone)
  Components: cache
Reporter: Dmitriy Govorukhin
 Fix For: 2.4


In the current implementation, a cache starts in one stage (the cache start 
exchange). We must provide the ability to start a cache in more than one stage 
(for internal operations, such as transaction recovery).
No cache operations should occur between the start stages.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (IGNITE-6547) Need to support ability log timestamp for wal tx and data records

2017-10-03 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6547:
--

 Summary: Need to support ability log timestamp for wal tx and data 
records
 Key: IGNITE-6547
 URL: https://issues.apache.org/jira/browse/IGNITE-6547
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


It may be useful for WAL analysis. It must also be configurable via a system 
property (IGNITE_WAL_LOGGIN_TIMESTAMP).
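
Reading such a switch from a JVM system property could look like the sketch below. The property name is copied from the ticket as written; WalTimestampFlag, enabled and describe are hypothetical names, and the record is represented by a plain string.

```java
/** Hypothetical sketch, not Ignite code: a boolean switch read from a JVM system property. */
class WalTimestampFlag {
    /** Property name exactly as written in the ticket. */
    static final String PROP = "IGNITE_WAL_LOGGIN_TIMESTAMP";

    /** Returns true only if the property is set to "true" (case-insensitive). */
    static boolean enabled() {
        return Boolean.getBoolean(PROP);
    }

    /** Appends the timestamp to a record description only when the switch is on. */
    static String describe(String rec, long ts) {
        return enabled() ? rec + " [ts=" + ts + "]" : rec;
    }
}
```

The switch would be turned on by starting the JVM with -DIGNITE_WAL_LOGGIN_TIMESTAMP=true.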





[jira] [Created] (IGNITE-6513) Add ability manage version for WAL serializer via system properties

2017-09-27 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6513:
--

 Summary: Add ability manage version for WAL serializer via system 
properties
 Key: IGNITE-6513
 URL: https://issues.apache.org/jira/browse/IGNITE-6513
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.1, 2.0, 2.2, 2.3
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.4








[jira] [Created] (IGNITE-6493) IgnitePdsWalTlbTest.testWalDirectOutOfMemory() hangs

2017-09-25 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6493:
--

 Summary: IgnitePdsWalTlbTest.testWalDirectOutOfMemory() hangs
 Key: IGNITE-6493
 URL: https://issues.apache.org/jira/browse/IGNITE-6493
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.2
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.3








[jira] [Created] (IGNITE-6480) Incorrect server node filter in GridDiscoveryManager

2017-09-22 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6480:
--

 Summary: Incorrect server node filter in GridDiscoveryManager
 Key: IGNITE-6480
 URL: https://issues.apache.org/jira/browse/IGNITE-6480
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.3


GridDiscoveryManager.serverTopologyNodes returns a collection of nodes that 
includes daemon nodes.
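
The intended filtering can be sketched as follows. This is an illustration only: TopologyNode and ServerNodeFilter are hypothetical stand-ins, not the real Ignite node type or the actual GridDiscoveryManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a cluster node; the real Ignite types are not used here.
final class TopologyNode {
    final String name;
    final boolean client;
    final boolean daemon;

    TopologyNode(String name, boolean client, boolean daemon) {
        this.name = name;
        this.client = client;
        this.daemon = daemon;
    }
}

final class ServerNodeFilter {
    /** Keeps only nodes that are neither clients nor daemons. */
    static List<TopologyNode> serverNodes(List<TopologyNode> all) {
        List<TopologyNode> res = new ArrayList<>();

        for (TopologyNode n : all) {
            if (!n.client && !n.daemon)
                res.add(n);
        }

        return res;
    }

    public static void main(String[] args) {
        List<TopologyNode> all = Arrays.asList(
            new TopologyNode("srv", false, false),
            new TopologyNode("client", true, false),
            new TopologyNode("daemon", false, true));

        for (TopologyNode n : serverNodes(all))
            System.out.println(n.name);
    }
}
```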





[jira] [Created] (IGNITE-6439) IgnitePersistentStoreSchemaLoadTest is broken

2017-09-19 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6439:
--

 Summary: IgnitePersistentStoreSchemaLoadTest is broken
 Key: IGNITE-6439
 URL: https://issues.apache.org/jira/browse/IGNITE-6439
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


After the nodes start, the cluster must be activated explicitly.





[jira] [Created] (IGNITE-6307) If getAll() fails with NPE, onHeap entry is not removed, for local cache

2017-09-08 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6307:
--

 Summary: If getAll() fails with NPE, onHeap entry is not removed, 
for local cache
 Key: IGNITE-6307
 URL: https://issues.apache.org/jira/browse/IGNITE-6307
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.0
Reporter: Dmitriy Govorukhin
 Fix For: 2.3


GridCacheLocalFullApiSelfTest.testGetAllWithNulls

{code}
final Set<String> c = new HashSet<>();

c.add("key1");
c.add(null);

// getAll() must reject the null key with a NullPointerException.
GridTestUtils.assertThrows(log, new Callable<Void>() {
    @Override public Void call() throws Exception {
        cache.getAll(c);

        return null;
    }
}, NullPointerException.class, null);
{code}

After getAll() fails, the entry for "key1" may remain in the on-heap map. Whether 
it does depends on the iteration order of the collection passed to getAll().
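
The order dependence can be illustrated with a small stand-alone sketch. It assumes that getAll() creates an entry for each key as it iterates and fails on the first null key; the class below is hypothetical and only models that behavior with a plain map, it is not the Ignite cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the behavior described above, not Ignite code.
final class GetAllOrderSketch {
    /** Stand-in for the on-heap entry map. */
    final Map<String, String> heap = new HashMap<>();

    /** Creates an entry per key in iteration order and fails on the first null key. */
    void getAll(Iterable<String> keys) {
        for (String key : keys) {
            if (key == null)
                throw new NullPointerException("Null key");

            heap.put(key, "val-" + key);
        }
    }

    /** Returns how many entries are left behind after a failed getAll(). */
    static int leftovers(List<String> keys) {
        GetAllOrderSketch cache = new GetAllOrderSketch();

        try {
            cache.getAll(keys);
        }
        catch (NullPointerException e) {
            // Expected: the key list contains null.
        }

        return cache.heap.size();
    }

    public static void main(String[] args) {
        System.out.println(leftovers(Arrays.asList("key1", null))); // "key1" visited first
        System.out.println(leftovers(Arrays.asList(null, "key1"))); // null visited first
    }
}
```

When "key1" is visited before the null key, one entry is left behind; when the null key comes first, none is.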





[jira] [Created] (IGNITE-6210) inefficient memory consumption for checkpoint buffer

2017-08-29 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6210:
--

 Summary: inefficient memory consumption for checkpoint buffer
 Key: IGNITE-6210
 URL: https://issues.apache.org/jira/browse/IGNITE-6210
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.1
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2


The current implementation allows the checkpoint buffer size to be configured in 
PersistentStoreConfiguration, but a separate checkpoint buffer is created for each 
memory policy, and each buffer has the size specified in 
PersistentStoreConfiguration.

For example:
{code}
PersistentStoreConfiguration prCfg = new PersistentStoreConfiguration();
prCfg.setCheckpointingPageBufferSize(5L * 1024L * 1024L * 1024L); // 5GB.

MemoryConfiguration memCfg = new MemoryConfiguration();

MemoryPolicyConfiguration pl1 = new MemoryPolicyConfiguration();

pl1.setMaxSize(100L * 1024L * 1024L); // 100 Mb.

MemoryPolicyConfiguration pl2 = new MemoryPolicyConfiguration();

pl2.setMaxSize(10L * 1024L * 1024L * 1024L); // 10GB.

memCfg.setMemoryPolicies(pl1, pl2);
{code}

pl1 (max size 100 MB) will have a 5 GB checkpoint buffer, and pl2 (max size 
10 GB) will also have a 5 GB buffer.
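
The resulting memory cost can be worked out with simple arithmetic. The sketch below assumes, as described above, that one buffer of the configured size is allocated per memory policy; the class and method names are hypothetical.

```java
// Hypothetical arithmetic helper, not Ignite code.
final class CheckpointBufferMath {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Total bytes reserved when every memory policy gets its own buffer of bufSize bytes. */
    static long totalBufferBytes(long bufSize, int policies) {
        return bufSize * policies;
    }

    public static void main(String[] args) {
        long total = totalBufferBytes(5 * GB, 2); // pl1 and pl2 from the example above

        System.out.println("Total checkpoint buffers: " + total / GB + " GB");
        System.out.println("pl1 max size: " + (100 * MB) / MB + " MB");
    }
}
```

For the two policies in the example this reserves 10 GB of checkpoint buffers in total, even though pl1 itself can hold at most 100 MB of data.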






[jira] [Created] (IGNITE-6179) Test fail DynamicIndexReplicatedAtomicConcurrentSelfTest.testClientReconnectWithCacheRestart

2017-08-24 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6179:
--

 Summary: Test fail 
DynamicIndexReplicatedAtomicConcurrentSelfTest.testClientReconnectWithCacheRestart
 Key: IGNITE-6179
 URL: https://issues.apache.org/jira/browse/IGNITE-6179
 Project: Ignite
  Issue Type: Bug
  Components: sql
Reporter: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2


The test fails with an assertion error:
{code}
[2017-08-24 18:34:06,207][ERROR][tcp-client-disco-msg-worker-#61%index.DynamicIndexReplicatedAtomicConcurrentSelfTest4%][IgniteClientReconnectAbstractTest$TestTcpDiscoverySpi] Failed to unmarshal discovery custom message.
java.lang.AssertionError
at org.apache.ignite.internal.processors.query.GridQueryProcessor.onSchemaFinishDiscovery(GridQueryProcessor.java:498)
at org.apache.ignite.internal.processors.query.GridQueryProcessor.onDiscovery(GridQueryProcessor.java:894)
at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onCustomEvent(GridCacheProcessor.java:2906)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:660)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:560)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2391)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processCustomMessage(ClientImpl.java:2297)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processDiscoveryMessage(ClientImpl.java:1874)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1758)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{code}





[jira] [Created] (IGNITE-6154) Incorrect check checkpoint pages collection

2017-08-22 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6154:
--

 Summary: Incorrect check checkpoint pages collection
 Key: IGNITE-6154
 URL: https://issues.apache.org/jira/browse/IGNITE-6154
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2


The checkpoint thread uses an incorrect check, !F.empty(collection).
Every element must be checked instead, because the collection is a collection of 
GridMultiCollectionWrapper instances, and each of these multi-collections must be 
checked for emptiness.
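
The difference between the two checks can be sketched with plain Java collections. This is an assumption-laden illustration: it uses java.util collections in place of GridMultiCollectionWrapper, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a "full" emptiness check over nested collections.
final class NestedEmptyCheck {
    /** Returns true if at least one inner collection contains an element. */
    static <T> boolean hasPages(Collection<? extends Collection<T>> parts) {
        for (Collection<T> part : parts) {
            if (part != null && !part.isEmpty())
                return true;
        }

        return false;
    }

    public static void main(String[] args) {
        List<List<Integer>> allEmpty = Arrays.asList(
            Collections.<Integer>emptyList(),
            Collections.<Integer>emptyList());

        // The outer collection is not empty, yet there is nothing to write.
        System.out.println("outer non-empty: " + !allEmpty.isEmpty());
        System.out.println("has pages: " + hasPages(allEmpty));
    }
}
```

A check on the outer collection alone reports work to do whenever any wrapper exists, even if every wrapper is empty.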





[jira] [Created] (IGNITE-6100) Test fail IgnitePdsRecoveryAfterFileCorruptionTest.testPageRecoveryAfterFileCorruption

2017-08-17 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6100:
--

 Summary: Test fail 
IgnitePdsRecoveryAfterFileCorruptionTest.testPageRecoveryAfterFileCorruption 
 Key: IGNITE-6100
 URL: https://issues.apache.org/jira/browse/IGNITE-6100
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


Test leads to a memory leak





[jira] [Created] (IGNITE-6098) Test fail IgniteDataIntegrityTests.testExpandBuffer

2017-08-17 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6098:
--

 Summary: Test fail IgniteDataIntegrityTests.testExpandBuffer
 Key: IGNITE-6098
 URL: https://issues.apache.org/jira/browse/IGNITE-6098
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


{code}
junit.framework.AssertionFailedError: expected:<0> but was:<13>
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.failNotEquals(Assert.java:329)
at junit.framework.Assert.assertEquals(Assert.java:78)
at junit.framework.Assert.assertEquals(Assert.java:234)
at junit.framework.Assert.assertEquals(Assert.java:241)
at junit.framework.TestCase.assertEquals(TestCase.java:409)
at org.apache.ignite.internal.processors.cache.persistence.db.wal.crc.IgniteDataIntegrityTests.testExpandBuffer(IgniteDataIntegrityTests.java:138)
{code}





[jira] [Created] (IGNITE-6091) Test fail CacheLateAffinityAssignmentTest.testRandomOperations

2017-08-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6091:
--

 Summary: Test fail 
CacheLateAffinityAssignmentTest.testRandomOperations
 Key: IGNITE-6091
 URL: https://issues.apache.org/jira/browse/IGNITE-6091
 Project: Ignite
  Issue Type: Test
Affects Versions: 2.1
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.2


Flaky test
{code}
junit.framework.AssertionFailedError: Unexpected error: javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to perform cache operation (cache topology is not valid): join-cache-24
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.assertTrue(Assert.java:22)
at junit.framework.TestCase.assertTrue(TestCase.java:192)
at org.apache.ignite.internal.processors.cache.distributed.CacheLateAffinityAssignmentTest.checkCaches(CacheLateAffinityAssignmentTest.java:2176)
at org.apache.ignite.internal.processors.cache.distributed.CacheLateAffinityAssignmentTest.afterTest(CacheLateAffinityAssignmentTest.java:225)
{code}





[jira] [Created] (IGNITE-6082) Test fail DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentPutRemove

2017-08-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6082:
--

 Summary: Test fail 
DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentPutRemove 
 Key: IGNITE-6082
 URL: https://issues.apache.org/jira/browse/IGNITE-6082
 Project: Ignite
  Issue Type: Test
Affects Versions: 2.0
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.2








[jira] [Created] (IGNITE-6074) Test fail IgniteWalRecoveryTest#testWalBigObjectNodeCancel

2017-08-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6074:
--

 Summary: Test fail IgniteWalRecoveryTest#testWalBigObjectNodeCancel
 Key: IGNITE-6074
 URL: https://issues.apache.org/jira/browse/IGNITE-6074
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.2


Exception in thread "exchange-worker-#79%wal.IgniteWalRecoveryTest1%" java.lang.AssertionError
at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput.ensure(FileInput.java:110)
at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.ensure(FileInput.java:303)
at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readFully(FileInput.java:351)
at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readDataEntry(RecordV1Serializer.java:1550)
at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:799)
at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:673)
at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:208)
at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:153)
at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:120)
at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:43)
at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:41)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1321)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:534)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onClusterStateChangeRequest(GridDhtPartitionsExchangeFuture.java:671)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:463)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1954)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:748)
Suppressed: class org.apache.ignite.internal.processors.cache.persistence.wal.crc.IgniteDataIntegrityViolationException: val: 78139338 writtenCrc: 262144
at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.close(FileInput.java:320)
at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:680)





[jira] [Created] (IGNITE-6064) Rework control.sh script.

2017-08-15 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6064:
--

 Summary: Rework control.sh script.
 Key: IGNITE-6064
 URL: https://issues.apache.org/jira/browse/IGNITE-6064
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.1, 2.0
Reporter: Dmitriy Govorukhin
 Fix For: 2.2








[jira] [Created] (IGNITE-6058) Test fail testTransformResourceInjection broken

2017-08-14 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6058:
--

 Summary: Test fail testTransformResourceInjection broken
 Key: IGNITE-6058
 URL: https://issues.apache.org/jira/browse/IGNITE-6058
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
 Fix For: 2.2








[jira] [Created] (IGNITE-6052) Check cluster state from daemon node return incorrect cluster state

2017-08-14 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6052:
--

 Summary: Check cluster state from daemon node return incorrect 
cluster state
 Key: IGNITE-6052
 URL: https://issues.apache.org/jira/browse/IGNITE-6052
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.0
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2


A daemon node must request the cluster state from a server node via the compute grid.





[jira] [Created] (IGNITE-6015) Test fail in Ignite Cache 2: GridCachePartitionedGetAndTransformStoreSelfTest.testGetAndTransform

2017-08-09 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-6015:
--

 Summary: Test fail in Ignite Cache 2: 
GridCachePartitionedGetAndTransformStoreSelfTest.testGetAndTransform 
 Key: IGNITE-6015
 URL: https://issues.apache.org/jira/browse/IGNITE-6015
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
 Fix For: 2.2


java.util.concurrent.TimeoutException: Test has been timed out [test=testGetAndTransform, timeout=30]
at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:1949)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:208)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:156)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:82)
at org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:82)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:951)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:831)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:729)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:3





[jira] [Created] (IGNITE-5968) Test fail in Ignite Cache 2: GridCachePartitionNotLoadedEventSelfTest.testPrimaryAndBackupDead

2017-08-07 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5968:
--

 Summary: Test fail in Ignite Cache 2: 
GridCachePartitionNotLoadedEventSelfTest.testPrimaryAndBackupDead
 Key: IGNITE-5968
 URL: https://issues.apache.org/jira/browse/IGNITE-5968
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Govorukhin
 Fix For: 2.2


java.util.concurrent.TimeoutException: Test has been timed out [test=testPrimaryAndBackupDead, timeout=30]
at org.apache.ignite.testframework.junits.GridAbstractTest.runTest(GridAbstractTest.java:1949)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)





[jira] [Created] (IGNITE-5967) Flaky fail in Ignite Java Client: RedisProtocolStringSelfTest.testGetSet

2017-08-07 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5967:
--

 Summary: Flaky fail in Ignite Java Client: 
RedisProtocolStringSelfTest.testGetSet 
 Key: IGNITE-5967
 URL: https://issues.apache.org/jira/browse/IGNITE-5967
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Fix For: 2.2


RedisProtocolStringSelfTest.testGetSet 

redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
at redis.clients.jedis.Protocol.process(Protocol.java:151)
at redis.clients.jedis.Protocol.read(Protocol.java:215)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:259)
at redis.clients.jedis.Connection.getBulkReply(Connection.java:248)
at redis.clients.jedis.Jedis.get(Jedis.java:153)
at org.apache.ignite.internal.processors.rest.protocols.tcp.redis.RedisProtocolStringSelfTest.testGetSet(RedisProtocolStringSelfTest.java:62)
--- Stdout: ---
[2017-08-07 06:28:44,379][INFO ][main][root] >>> Starting test: RedisProtocolStringSelfTest#testGetSet <<<
[2017-08-07 06:28:52,390][INFO ][main][root] >>> Stopping test: RedisProtocolStringSelfTest#testGetSet in 8010 ms <<<
--- Stderr: ---
[2017-08-07 06:28:52,389][ERROR][main][root] Test failed.
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
at redis.clients.jedis.Protocol.process(Protocol.java:151)
at redis.clients.jedis.Protocol.read(Protocol.java:215)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:259)
at redis.clients.jedis.Connection.getBulkReply(Connection.java:248)
at redis.clients.jedis.Jedis.get(Jedis.java:153)
at org.apache.ignite.internal.processors.rest.protocols.tcp.redis.RedisProtocolStringSelfTest.testGetSet(RedisProtocolStringSelfTest.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:176)
at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2000)
at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:132)
at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1915)
at java.lang.Thread.run(Thread.java:745)





[jira] [Created] (IGNITE-5741) Replace HeapByteBuffer to DirectByteBuffer in WAL RecordsIterator

2017-07-12 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5741:
--

 Summary: Replace HeapByteBuffer to DirectByteBuffer in WAL 
RecordsIterator
 Key: IGNITE-5741
 URL: https://issues.apache.org/jira/browse/IGNITE-5741
 Project: Ignite
  Issue Type: Improvement
  Components: persistence
Affects Versions: 2.1
Reporter: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2


In the current implementation we can get an OOM while iterating over WAL records, 
because some WAL records may be very large (more than 64 MB). We read using a 
HeapByteBuffer, so FileChannel reserves a temporary direct buffer of the same size 
as our HeapByteBuffer, but by default FileChannel cannot allocate a buffer larger 
than 64 MB (maxMemory = VM.maxDirectMemory()).
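
A minimal sketch of the proposed direction, assuming the reader allocates one fixed-size direct buffer up front and reuses it for every read, so the channel has no reason to reserve a temporary direct buffer per call. The class, method names, and chunk size are hypothetical; this is not the Ignite RecordsIterator code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration, not Ignite code.
final class DirectBufferRead {
    /** Reads the whole file through one reusable direct buffer and returns the byte count. */
    static long readAll(Path file, int chunkSize) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(chunkSize); // allocated once, reused below
        long total = 0;

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (true) {
                buf.clear();

                int n = ch.read(buf); // fills the direct buffer, returns -1 at end of file

                if (n < 0)
                    break;

                total += n;
            }
        }

        return total;
    }

    /** Writes a temporary file of fileSize zero bytes and reads it back in chunks. */
    static long demo(int fileSize, int chunkSize) {
        try {
            Path tmp = Files.createTempFile("wal-sketch", ".bin");

            try {
                Files.write(tmp, new byte[fileSize]);

                return readAll(tmp, chunkSize);
            }
            finally {
                Files.delete(tmp);
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Bytes read: " + demo(10_000, 4_096));
    }
}
```

With this layout the direct memory needed stays bounded by the chunk size, regardless of how large an individual record is.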

{code}
[16:07:54]W: [org.apache.ignite:ignite-core] [2017-07-12 13:07:54,809][ERROR][exchange-worker-#55966%node0-primary%][GridDhtPartitionsExchangeFuture] Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1], nodeId=105056d5, evt=DISCOVERY_CUSTOM_EVT]
[16:07:54]W: [org.apache.ignite:ignite-core] java.lang.OutOfMemoryError: Direct buffer memory
[16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.Bits.reserveMemory(Bits.java:658)
[16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
[16:07:54]W: [org.apache.ignite:ignite-core] at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
[16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
[16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.IOUtil.read(IOUtil.java:195)
[16:07:54]W: [org.apache.ignite:ignite-core] at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:149)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:62)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput.ensure(FileInput.java:116)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.ensure(FileInput.java:303)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readByte(FileInput.java:376)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileInput$Crc32CheckingFileInput.readUnsignedByte(FileInput.java:385)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:697)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readRecord(RecordV1Serializer.java:673)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:243)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.advanceSegment(FileWriteAheadLogManager.java:2452)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:149)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2352)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.<init>(FileWriteAheadLogManager.java:2290)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.replay(FileWriteAheadLogManager.java:553)
[16:07:54]W: [org.apache.ignite:ignite-core] at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1313)
[16:07:54]W: [org.apache.ignite:ignite-core] at

[jira] [Created] (IGNITE-5704) Do not allow cluster to be deactivated if the joining node does not complete start all components

2017-07-05 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5704:
--

 Summary: Do not allow cluster to be deactivated if the joining 
node does not complete start all components
 Key: IGNITE-5704
 URL: https://issues.apache.org/jira/browse/IGNITE-5704
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 2.0, 2.1
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.2








[jira] [Created] (IGNITE-5603) All daemon node, can be only client daemon, server daemon is not allow.

2017-06-28 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5603:
--

 Summary: All daemon node, can be only client daemon, server daemon 
is not allow.
 Key: IGNITE-5603
 URL: https://issues.apache.org/jira/browse/IGNITE-5603
 Project: Ignite
  Issue Type: Improvement
  Components: general
Reporter: Dmitriy Govorukhin
Priority: Critical
 Fix For: 2.1


There is no reason to have a daemon server node right now. Rework the current 
functionality to prevent a server node from being a daemon.





[jira] [Created] (IGNITE-5600) Compute hang if send broadcast runnable on daemon cluster group

2017-06-28 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5600:
--

 Summary: Compute hang if send broadcast runnable on daemon cluster 
group
 Key: IGNITE-5600
 URL: https://issues.apache.org/jira/browse/IGNITE-5600
 Project: Ignite
  Issue Type: Bug
Reporter: Dmitriy Govorukhin
 Fix For: 2.1


Steps to reproduce:
1. Start a server node.
2. Start a daemon client and a daemon server.
3. Try to run a compute task on the daemon nodes.






[jira] [Created] (IGNITE-5520) IgniteChangeGlobalStateFailOverTest hangs activate on join node

2017-06-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5520:
--

 Summary: IgniteChangeGlobalStateFailOverTest hangs activate on 
join node
 Key: IGNITE-5520
 URL: https://issues.apache.org/jira/browse/IGNITE-5520
 Project: Ignite
  Issue Type: Bug
  Components: cache
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.1








[jira] [Created] (IGNITE-5518) Rework test on join active/inactive node

2017-06-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5518:
--

 Summary: Rework test on join active/inactive node 
 Key: IGNITE-5518
 URL: https://issues.apache.org/jira/browse/IGNITE-5518
 Project: Ignite
  Issue Type: Task
Reporter: Dmitriy Govorukhin
 Fix For: 2.1








[jira] [Created] (IGNITE-5480) Need to correctly handle deactivation process if an exception occured during deactivation

2017-06-13 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5480:
--

 Summary: Need to correctly handle deactivation process if an 
exception occured during deactivation
 Key: IGNITE-5480
 URL: https://issues.apache.org/jira/browse/IGNITE-5480
 Project: Ignite
  Issue Type: Task
  Components: cache, persistence
Affects Versions: 2.0
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.2








[jira] [Created] (IGNITE-5076) Optimization of multi-threaded start nodes in tests

2017-04-25 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5076:
--

 Summary: Optimization of multi-threaded start nodes in tests
 Key: IGNITE-5076
 URL: https://issues.apache.org/jira/browse/IGNITE-5076
 Project: Ignite
  Issue Type: Improvement
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.0


A multi-threaded node start is more effective if a coordinator already exists 
before the remaining nodes are started concurrently. If all nodes are started 
concurrently, they compete for the coordinator role, which is inefficient.
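
The idea above can be sketched in plain Java as follows. This is only an 
illustration, not the actual test framework code: the class name 
CoordinatorFirstStart and the startNode callback are hypothetical, and the 
callback stands in for whatever really starts a node with the given index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

/** Hypothetical helper: start one node first, then the rest in parallel. */
public class CoordinatorFirstStart {
    /**
     * Starts node 0 on the calling thread, then nodes 1..cnt-1 concurrently.
     *
     * @param cnt Total number of nodes to start.
     * @param startNode Hypothetical callback that starts the node with the given index.
     */
    public static void startAll(int cnt, IntConsumer startNode) throws Exception {
        if (cnt <= 0)
            return;

        // The first node starts alone, so nobody competes with it for the coordinator role.
        startNode.accept(0);

        if (cnt == 1)
            return;

        ExecutorService pool = Executors.newFixedThreadPool(cnt - 1);

        try {
            List<Future<?>> futs = new ArrayList<>();

            for (int i = 1; i < cnt; i++) {
                final int idx = i;

                futs.add(pool.submit(() -> startNode.accept(idx)));
            }

            // Wait for every node and propagate the first start failure, if any.
            for (Future<?> fut : futs)
                fut.get();
        }
        finally {
            pool.shutdown();
        }
    }
}
```

Only the first start is serialized; the remaining nodes still start in 
parallel, so the concurrency benefit is kept.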



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (IGNITE-5019) Deadlock in GridCacheVariableTopologySelfTest.testNodeStop

2017-04-18 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-5019:
--

 Summary: Deadlock in GridCacheVariableTopologySelfTest.testNodeStop
 Key: IGNITE-5019
 URL: https://issues.apache.org/jira/browse/IGNITE-5019
 Project: Ignite
  Issue Type: Bug
  Components: general
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (IGNITE-4931) Safe way for deactivate cluster.

2017-04-07 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-4931:
--

 Summary: Safe way for deactivate cluster.
 Key: IGNITE-4931
 URL: https://issues.apache.org/jira/browse/IGNITE-4931
 Project: Ignite
  Issue Type: Task
  Components: general
Affects Versions: 2.0
Reporter: Dmitriy Govorukhin
 Fix For: 2.1


We must provide a safe way to deactivate the cluster: we must wait until all 
cache operations, transactions, etc. have completed before starting the 
deactivation process. The current implementation does not wait for 
transactions to complete (it forcibly stops the cache during a transaction).
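
One possible way to express "wait for in-flight operations, then deactivate" 
is a read/write lock gate. The sketch below is plain Java and does not reflect 
how Ignite implements deactivation; the class DeactivationGate and its methods 
are hypothetical names used for illustration only.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical gate: operations hold the read lock, deactivation takes the write lock. */
public class DeactivationGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Guarded by {@link #lock}. */
    private boolean active = true;

    /**
     * Runs the operation if the gate is still active.
     *
     * @return {@code false} if the gate was already deactivated and the operation was rejected.
     */
    public boolean runOperation(Runnable op) {
        lock.readLock().lock();

        try {
            if (!active)
                return false;

            op.run();

            return true;
        }
        finally {
            lock.readLock().unlock();
        }
    }

    /** Blocks until all in-flight operations have finished, then marks the gate inactive. */
    public void deactivate() {
        lock.writeLock().lock();

        try {
            active = false;
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    /** @return Whether operations are still accepted. */
    public boolean isActive() {
        lock.readLock().lock();

        try {
            return active;
        }
        finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, deactivate() cannot proceed while any operation is running, 
and operations submitted after deactivation are rejected instead of being 
interrupted halfway.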



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (IGNITE-4919) Remove support BinaryIdentityResolver

2017-04-05 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-4919:
--

 Summary: Remove support BinaryIdentityResolver
 Key: IGNITE-4919
 URL: https://issues.apache.org/jira/browse/IGNITE-4919
 Project: Ignite
  Issue Type: Task
  Components: binary
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
 Fix For: 2.0


We must get rid of BinaryIdentityResolver, because the new memory mode 
requires a stable binary key representation. 
[discussion|http://apache-ignite-developers.2346864.n4.nabble.com/Stable-binary-key-representation-td15904.html]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (IGNITE-4235) Can't get user exception if was on remote service

2016-11-16 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-4235:
--

 Summary: Can't get user exception if was on remote service
 Key: IGNITE-4235
 URL: https://issues.apache.org/jira/browse/IGNITE-4235
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 1.7
Reporter: Dmitriy Govorukhin
 Fix For: 2.0


Can't get the user exception if it was thrown on a remote node. A reproducer is in the attached file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (IGNITE-4178) Support permission builder

2016-11-07 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-4178:
--

 Summary: Support permission builder
 Key: IGNITE-4178
 URL: https://issues.apache.org/jira/browse/IGNITE-4178
 Project: Ignite
  Issue Type: Improvement
  Components: general
Affects Versions: 1.7
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin
Priority: Minor
 Fix For: 1.8


 Provides a convenient way to create a permission set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (IGNITE-4044) Add an option to always authenticate local node

2016-10-06 Thread Dmitriy Govorukhin (JIRA)
Dmitriy Govorukhin created IGNITE-4044:
--

 Summary: Add an option to always authenticate local node
 Key: IGNITE-4044
 URL: https://issues.apache.org/jira/browse/IGNITE-4044
 Project: Ignite
  Issue Type: Bug
Affects Versions: 1.8
Reporter: Dmitriy Govorukhin
Assignee: Dmitriy Govorukhin


Currently the authenticator is called during startup only if the new node is 
the first one in the topology. This is counterintuitive and introduces 
unpredictable behavior when global authentication is enabled: the node may or 
may not call the authenticator depending on the starting order.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)