[jira] [Updated] (IGNITE-14139) Incorrect initialize checkpoint-runner-cpu thread pool

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14139:
---
Release Note: Fixed an issue where the CPU checkpoint pool size was not initialized.

> Incorrect initialize checkpoint-runner-cpu thread pool
> --
>
> Key: IGNITE-14139
> URL: https://issues.apache.org/jira/browse/IGNITE-14139
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The first initialization of the CPU checkpoint thread pool is incorrect.
> Look at the constructor of {{CheckpointWorkflow}}.
> First, we initialize the pool:
> {code:java}
> this.checkpointCollectPagesInfoPool = initializeCheckpointPool();
> {code}
> and only afterwards do we set the pool size:
> {code:java}
> this.checkpointCollectInfoThreads = checkpointCollectInfoThreads;
> {code}
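
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the real {{CheckpointWorkflow}}: the class names and the {{initializePoolSize()}} helper are hypothetical, and an integer stands in for the thread pool. It only shows that a field read before its assignment in a constructor still holds its default value.

```java
// Hypothetical, simplified stand-ins; these are not the real CheckpointWorkflow.
final class BuggyWorkflowSketch {
    private final int poolSize;     // size the pool would be created with
    private int collectInfoThreads; // configured size, assigned too late

    BuggyWorkflowSketch(int collectInfoThreads) {
        // The pool is sized first, while the field still holds its default value 0.
        this.poolSize = initializePoolSize();
        this.collectInfoThreads = collectInfoThreads;
    }

    private int initializePoolSize() {
        return collectInfoThreads;
    }

    int poolSize() {
        return poolSize;
    }
}

final class FixedWorkflowSketch {
    private final int poolSize;
    private final int collectInfoThreads;

    FixedWorkflowSketch(int collectInfoThreads) {
        // Assign the configured size first, then size the pool from it.
        this.collectInfoThreads = collectInfoThreads;
        this.poolSize = initializePoolSize();
    }

    private int initializePoolSize() {
        return collectInfoThreads;
    }

    int poolSize() {
        return poolSize;
    }
}

final class InitOrderDemo {
    public static void main(String[] args) {
        System.out.println("buggy pool size: " + new BuggyWorkflowSketch(4).poolSize());
        System.out.println("fixed pool size: " + new FixedWorkflowSketch(4).poolSize());
    }
}
```

With a requested size of 4, the buggy variant reports 0 and the fixed variant reports 4.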



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-14140) Checkpointer thread holds write lock too long

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14140:
---
Release Note: Reduced the time the exclusive checkpoint lock is held.

> Checkpointer thread holds write lock too long
> -
>
> Key: IGNITE-14140
> URL: https://issues.apache.org/jira/browse/IGNITE-14140
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The free-list flushing optimization can block db-checkpoint-thread while it 
> holds the write lock. This might block all transactions for several hundred 
> milliseconds.
> {noformat}
> "db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
>    java.lang.Thread.State: RUNNABLE
>   at sun.misc.Unsafe.getObjectVolatile(Native Method)
>   at java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
>   at java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
>   at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
>   at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
>   at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
>   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> We can reduce the time spent under the write lock if we switch off the 
> optimization before the lock is acquired and re-enable it after the lock is 
> released.





[jira] [Updated] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14138:
---
Release Note: Fixed a node stopping in some cases when historical rebalance 
could not find reserved WAL segments. The node now refuses to supply 
partitions historically.

> Historical rebalance kills cluster
> --
>
> Key: IGNITE-14138
> URL: https://issues.apache.org/jira/browse/IGNITE-14138
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to continue supplying [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1
> org.apache.ignite.IgniteCheckedException: Failed to continue supplying [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629) [ignite-core.jar]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.ignite.IgniteCheckedException: Could not find start pointer for partition [part=4, partCntrSince=1115]
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322) ~[ignite-core.jar]
>   ... 16 more
> {noformat}
> I believe that it should throw IgniteHistoricalIteratorException instead of 
> IgniteCheckedException, so that it can be handled properly and rebalancing can 
> fall back to a full rebalance instead of killing nodes.

[jira] [Commented] (IGNITE-14139) Incorrect initialize checkpoint-runner-cpu thread pool

2021-02-08 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281595#comment-17281595
 ] 

Vladislav Pyatkov commented on IGNITE-14139:


[~akalashnikov] Please look at this patch.

> Incorrect initialize checkpoint-runner-cpu thread pool
> --
>
> Key: IGNITE-14139
> URL: https://issues.apache.org/jira/browse/IGNITE-14139
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The first initialization of the CPU checkpoint thread pool is incorrect.
> Look at the constructor of {{CheckpointWorkflow}}.
> First, we initialize the pool:
> {code:java}
> this.checkpointCollectPagesInfoPool = initializeCheckpointPool();
> {code}
> and only afterwards do we set the pool size:
> {code:java}
> this.checkpointCollectInfoThreads = checkpointCollectInfoThreads;
> {code}





[jira] [Comment Edited] (IGNITE-13393) Tracing: Atomic cache read/write flow.

2021-02-08 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281589#comment-17281589
 ] 

Alexey Scherbakov edited comment on IGNITE-13393 at 2/9/21, 7:43 AM:
-

[~alapin]

I've left some comments in the PR.
 Additional questions:

1. > Agreed, however currently it's not possible to trace specified operations 
properly cause we don't have compute tracing

Why can't we at least trace/log the event at the top level? I see you have 
implemented this for the removeAll operation, but not for clear. What's the 
reason for this?

2. I've noticed a lot of newly introduced CACHE_*_FUTURE spans. Spans of this 
kind don't make sense to me: they don't provide any useful information, are 
tied to the internal implementation (actually, they are named after internal 
futures), and multiply the number of traced spans. All operations should be 
traced at the request/response level with intermediate phases. For example, a 
cache read operation consists of map/remap and read. 
 Moreover, I haven't noticed anything like this in transaction tracing. 
 In my opinion, all these spans can be removed without any drawbacks.

3. Can you provide several trace examples for put/putAll, get/getAll, and 
remove/removeAll, including the remap phase? It's difficult to assess the 
correctness of the tracing flow without them.

4. Does this code have any performance impact? At the very least, NOOP tracing 
shouldn't cause any measurable drop.


was (Author: ascherbakov):
[~alapin]

I've left some comments in the PR.
 Additional questions:

1. > Agreed, however currently it's not possible to trace specified operations 
properly cause we don't have compute tracing

Why can't we at least trace/log the event at the top level? I see you have 
implemented this for the removeAll operation, but not for clear. What's the 
reason for this?

2. I've noticed a lot of newly introduced CACHE_*_FUTURE spans. Spans of this 
kind don't make sense to me: they don't provide any useful information, are 
tied to the internal implementation (actually, they are named after internal 
futures), and multiply the number of traced spans. All operations should be 
traced at the request/response level with intermediate phases. For example, a 
cache read operation consists of map/remap and read. 
 Moreover, I haven't noticed anything like this in transaction tracing. 
 In my opinion, all these spans can be removed without any drawbacks.

3. Can you provide several trace examples for put/putAll, get/getAll, and 
remove/removeAll, including the remap phase? It's difficult to assess the 
correctness of the tracing flow without them.

4. Does this code have any performance impact? At the very least, NOOP tracing 
shouldn't have any measurable effect on it.

> Tracing: Atomic cache read/write flow.
> --
>
> Key: IGNITE-13393
> URL: https://issues.apache.org/jira/browse/IGNITE-13393
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement tracing for atomic cache operations:
>  * put
>  * putAll
>  * putAsync
>  * putAllAsync
>  * remove
>  * removeAll
>  * removeAsync
>  * removeAllAsync
>  * get
>  * getAll
>  * getAsync
>  * getAllAsync
> Also add ability to include root cache read/write operations to tx tracing 
> flow.
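
The review comments above ask for one span per operation at the request level and for a NOOP tracer that adds no measurable cost. A minimal sketch of that shape follows. All types and span names here ({{SketchTracer}}, {{SketchSpan}}, {{cache.put}}, {{cache.get}}) are hypothetical and are not Ignite's tracing SPI; the sketch only shows a single top-level span per cache call and a NOOP tracer that hands out one shared span without allocating.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracing interfaces, used only for this illustration.
interface SketchSpan {
    void end();
}

interface SketchTracer {
    SketchSpan start(String name);
}

// NOOP tracer: always returns the same shared span, so tracing costs one call.
final class NoopTracer implements SketchTracer {
    private static final SketchSpan NOOP_SPAN = () -> { };

    @Override public SketchSpan start(String name) {
        return NOOP_SPAN;
    }
}

// Recording tracer: remembers the names of started spans, useful in tests.
final class RecordingTracer implements SketchTracer {
    final List<String> started = new ArrayList<>();

    @Override public SketchSpan start(String name) {
        started.add(name);
        return () -> { };
    }
}

// One span per top-level operation, no spans for internal futures.
final class TracedCacheSketch {
    private final SketchTracer tracer;
    private final Map<String, String> store = new HashMap<>();

    TracedCacheSketch(SketchTracer tracer) {
        this.tracer = tracer;
    }

    void put(String key, String val) {
        SketchSpan span = tracer.start("cache.put");
        try {
            store.put(key, val);
        } finally {
            span.end();
        }
    }

    String get(String key) {
        SketchSpan span = tracer.start("cache.get");
        try {
            return store.get(key);
        } finally {
            span.end();
        }
    }
}
```

Swapping {{RecordingTracer}} for {{NoopTracer}} leaves the cache code unchanged, which is what makes the NOOP overhead easy to measure separately.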





[jira] [Commented] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281591#comment-17281591
 ] 

Vladislav Pyatkov commented on IGNITE-14138:


[~sk0x50] Please review my changes.

> Historical rebalance kills cluster
> --
>
> Key: IGNITE-14138
> URL: https://issues.apache.org/jira/browse/IGNITE-14138
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to continue supplying [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1
> org.apache.ignite.IgniteCheckedException: Failed to continue supplying [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629) [ignite-core.jar]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.ignite.IgniteCheckedException: Could not find start pointer for partition [part=4, partCntrSince=1115]
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322) ~[ignite-core.jar]
>   ... 16 more
> {noformat}
> I believe that it should throw IgniteHistoricalIteratorException instead of 
> IgniteCheckedException, so that it can be handled properly and rebalancing can 
> fall back to a full rebalance instead of killing nodes.
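
The suggestion above relies on the caller being able to tell a "history is unavailable" failure apart from any other failure. A minimal sketch of that dispatch follows. The exception classes, the {{SupplyOutcome}} enum and both methods are hypothetical stand-ins, not the real Ignite classes or their hierarchy; the sketch only shows how a dedicated exception type lets the supplier fall back instead of escalating to the failure handler.

```java
// Hypothetical exception types that mirror the idea, not the real Ignite classes.
class SketchCheckedException extends Exception {
    SketchCheckedException(String msg) {
        super(msg);
    }
}

class SketchHistoricalIteratorException extends SketchCheckedException {
    SketchHistoricalIteratorException(String msg) {
        super(msg);
    }
}

enum SupplyOutcome { HISTORICAL, FULL_REBALANCE, NODE_FAILURE }

final class SupplierSketch {
    // Stand-in for opening a historical (WAL-based) iterator.
    static void openHistoricalIterator(boolean walPointerFound, boolean useSpecificType)
        throws SketchCheckedException {
        if (walPointerFound)
            return;

        String msg = "Could not find start pointer for partition";

        if (useSpecificType)
            throw new SketchHistoricalIteratorException(msg);

        throw new SketchCheckedException(msg);
    }

    // Stand-in for the demand handler: only the specific type is recoverable.
    static SupplyOutcome handleDemand(boolean walPointerFound, boolean useSpecificType) {
        try {
            openHistoricalIterator(walPointerFound, useSpecificType);

            return SupplyOutcome.HISTORICAL;
        } catch (SketchHistoricalIteratorException e) {
            return SupplyOutcome.FULL_REBALANCE; // recoverable: fall back
        } catch (SketchCheckedException e) {
            return SupplyOutcome.NODE_FAILURE;   // generic error: treated as critical
        }
    }
}
```

In this model, a missing WAL pointer reported with the generic type ends in {{NODE_FAILURE}}, while the same condition reported with the specific type ends in {{FULL_REBALANCE}}.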




[jira] [Updated] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14138:
---
Reviewer: Slava Koptilin

> Historical rebalance kills cluster
> --
>
> Key: IGNITE-14138
> URL: https://issues.apache.org/jira/browse/IGNITE-14138
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to continue supplying [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1
> org.apache.ignite.IgniteCheckedException: Failed to continue supplying [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109) [ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157) [ignite-core.jar]
>   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629) [ignite-core.jar]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.ignite.IgniteCheckedException: Could not find start pointer for partition [part=4, partCntrSince=1115]
>   at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195) ~[ignite-core.jar]
>   at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322) ~[ignite-core.jar]
>   ... 16 more
> {noformat}
> I believe that it should throw IgniteHistoricalIteratorException instead of 
> IgniteCheckedException, so that it can be handled properly and rebalancing can 
> fall back to a full rebalance instead of killing nodes.





[jira] [Comment Edited] (IGNITE-13393) Tracing: Atomic cache read/write flow.

2021-02-08 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281589#comment-17281589
 ] 

Alexey Scherbakov edited comment on IGNITE-13393 at 2/9/21, 7:40 AM:
-

[~alapin]

I've left some comments in the PR.
 Additional questions:

1. > Agreed, however currently it's not possible to trace specified operations 
properly cause we don't have compute tracing

Why can't we at least trace/log the event at the top level? I see you have 
implemented this for the removeAll operation, but not for clear. What's the 
reason for this?

2. I've noticed a lot of newly introduced CACHE_*_FUTURE spans. Spans of this 
kind don't make sense to me: they don't provide any useful information, are 
tied to the internal implementation (actually, they are named after internal 
futures), and multiply the number of traced spans. All operations should be 
traced at the request/response level with intermediate phases. For example, a 
cache read operation consists of map/remap and read. 
 Moreover, I haven't noticed anything like this in transaction tracing. 
 In my opinion, all these spans can be removed without any drawbacks.

3. Can you provide several trace examples for put/putAll, get/getAll, and 
remove/removeAll, including the remap phase? It's difficult to assess the 
correctness of the tracing flow without them.

4. Does this code have any performance impact? At the very least, NOOP tracing 
shouldn't have any measurable effect on it.


was (Author: ascherbakov):
[~alapin]

I've left some comments in the PR.
Additional questions:


1. > Agreed, however currently it's not possible to trace specified operations 
properly cause we don't have compute tracing

Why can't we at least trace/log the event at the top level? I see you have 
implemented this for the removeAll operation, but not for clear. What's the 
reason for this?

2. I've noticed a lot of newly introduced CACHE_*_FUTURE spans. Spans of this 
kind don't make sense to me: they don't provide any useful information, are 
tied to the internal implementation (actually, they are named after internal 
futures), and multiply the number of traced spans. All operations should be 
traced at the request/response level with intermediate phases. For example, a 
cache read operation consists of map/remap and read. 
Moreover, I haven't noticed anything like this in transaction tracing. 
In my opinion, all these spans can be removed without any drawbacks.

> Tracing: Atomic cache read/write flow.
> --
>
> Key: IGNITE-13393
> URL: https://issues.apache.org/jira/browse/IGNITE-13393
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement tracing for atomic cache operations:
>  * put
>  * putAll
>  * putAsync
>  * putAllAsync
>  * remove
>  * removeAll
>  * removeAsync
>  * removeAllAsync
>  * get
>  * getAll
>  * getAsync
>  * getAllAsync
> Also add ability to include root cache read/write operations to tx tracing 
> flow.





[jira] [Commented] (IGNITE-13393) Tracing: Atomic cache read/write flow.

2021-02-08 Thread Alexey Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281589#comment-17281589
 ] 

Alexey Scherbakov commented on IGNITE-13393:


[~alapin]

I've left some comments in the PR.
Additional questions:


1. > Agreed, however currently it's not possible to trace specified operations 
properly cause we don't have compute tracing

Why can't we at least trace/log the event at the top level? I see you have 
implemented this for the removeAll operation, but not for clear. What's the 
reason for this?

2. I've noticed a lot of newly introduced CACHE_*_FUTURE spans. Spans of this 
kind don't make sense to me: they don't provide any useful information, are 
tied to the internal implementation (actually, they are named after internal 
futures), and multiply the number of traced spans. All operations should be 
traced at the request/response level with intermediate phases. For example, a 
cache read operation consists of map/remap and read. 
Moreover, I haven't noticed anything like this in transaction tracing. 
In my opinion, all these spans can be removed without any drawbacks.

> Tracing: Atomic cache read/write flow.
> --
>
> Key: IGNITE-13393
> URL: https://issues.apache.org/jira/browse/IGNITE-13393
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Implement tracing for atomic cache operations:
>  * put
>  * putAll
>  * putAsync
>  * putAllAsync
>  * remove
>  * removeAll
>  * removeAsync
>  * removeAllAsync
>  * get
>  * getAll
>  * getAsync
>  * getAllAsync
> Also add ability to include root cache read/write operations to tx tracing 
> flow.





[jira] [Commented] (IGNITE-14139) Incorrect initialize checkpoint-runner-cpu thread pool

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281585#comment-17281585
 ] 

Ignite TC Bot commented on IGNITE-14139:


{panel:title=Branch: [pull/8770/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5864413]]

{panel}
{panel:title=Branch: [pull/8770/head] Base: [master] : New Tests 
(2)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}PDS 1{color} [[tests 
2|https://ci.ignite.apache.org/viewLog.html?buildId=5863443]]
* {color:#013220}IgnitePdsTestSuite: 
IgnitePdsCheckpointSimpleTest.testStartNodeWithDefaultCpThreads - PASSED{color}
* {color:#013220}IgnitePdsTestSuite: 
IgnitePdsCheckpointSimpleTest.testStartNodeWithNonDefaultCpThreads - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5863473buildTypeId=IgniteTests24Java8_RunAll]

> Incorrect initialize checkpoint-runner-cpu thread pool
> --
>
> Key: IGNITE-14139
> URL: https://issues.apache.org/jira/browse/IGNITE-14139
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The first initialization of the CPU checkpoint thread pool is incorrect.
> Look at the constructor of {{CheckpointWorkflow}}.
> First, we initialize the pool:
> {code:java}
> this.checkpointCollectPagesInfoPool = initializeCheckpointPool();
> {code}
> and only afterwards do we set the pool size:
> {code:java}
> this.checkpointCollectInfoThreads = checkpointCollectInfoThreads;
> {code}





[jira] [Commented] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281580#comment-17281580
 ] 

Anton Vinogradov commented on IGNITE-14137:
---

I just found that we have failures in both of our environments.
So it seems the problem is in the code.

> Detect and fix failures reasons (nightly runs fails)
> 
>
> Key: IGNITE-14137
> URL: https://issues.apache.org/jira/browse/IGNITE-14137
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Critical
>
> Jenkins runs fail; 1-4 ... 60 tests are affected.





[jira] [Updated] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-14103:

Release Note: .NET Thin Client: Added an automated check on client start that 
the client binary configuration is compatible with the server binary configuration.

> .NET Thin Client: Retrieve binary configuration from server
> ---
>
> Key: IGNITE-14103
> URL: https://issues.apache.org/jira/browse/IGNITE-14103
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms, thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.11
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Thin clients require manual binary configuration currently. Settings like 
> compact footer and simple/full name mapper should be set to match the cluster 
> settings. Extend the protocol to retrieve those settings automatically on 
> start.
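
One way to picture the idea is a small reconciliation step that runs once the server's settings have been retrieved. This is only a sketch in Java: the types, the method name, and the policy (adopt the server's values and record a warning for each difference) are all hypothetical and are not taken from the actual thin client protocol or the .NET implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are part of the Ignite API.
class BinaryConfigSketch {
    /** The two settings named in the description above. */
    static class BinarySettings {
        final boolean compactFooter;
        final boolean simpleNameMapper;

        BinarySettings(boolean compactFooter, boolean simpleNameMapper) {
            this.compactFooter = compactFooter;
            this.simpleNameMapper = simpleNameMapper;
        }
    }

    /** Settings the client ends up using, plus any warnings to log. */
    static class Reconciled {
        final BinarySettings effective;
        final List<String> warnings = new ArrayList<>();

        Reconciled(BinarySettings effective) {
            this.effective = effective;
        }
    }

    // Assumed policy: use the server's values and warn about every mismatch.
    static Reconciled reconcile(BinarySettings client, BinarySettings server) {
        Reconciled res = new Reconciled(
            new BinarySettings(server.compactFooter, server.simpleNameMapper));

        if (client.compactFooter != server.compactFooter)
            res.warnings.add("compactFooter: client=" + client.compactFooter
                + ", server=" + server.compactFooter);

        if (client.simpleNameMapper != server.simpleNameMapper)
            res.warnings.add("simpleNameMapper: client=" + client.simpleNameMapper
                + ", server=" + server.simpleNameMapper);

        return res;
    }
}
```

A client that starts with matching settings gets no warnings; a client whose settings differ ends up with the server's values and one warning per differing setting.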





[jira] [Commented] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281555#comment-17281555
 ] 

Pavel Tupitsyn commented on IGNITE-14103:
-

Merged to master: 971b3e11e2e546e5bc0d29c4eb88c355eebc6da6

> .NET Thin Client: Retrieve binary configuration from server
> ---
>
> Key: IGNITE-14103
> URL: https://issues.apache.org/jira/browse/IGNITE-14103
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms, thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.11
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Thin clients require manual binary configuration currently. Settings like 
> compact footer and simple/full name mapper should be set to match the cluster 
> settings. Extend the protocol to retrieve those settings automatically on 
> start.





[jira] [Issue Comment Deleted] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-14103:

Comment: was deleted

(was: {panel:title=Branch: [pull/8733/head] Base: [master] : Possible Blockers 
(50)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Java Client{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862609]]
* IgniteClientTestSuite: 
ClientTcpUnreachableMultiNodeSelfTest.testTopologyListener - Test has low fail 
rate in base branch 0,0% and is not flaky

{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862632]]

{color:#d04437}ZooKeeper (Discovery) 1{color} [[tests 3 Out Of Memory Error 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862627]]
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testWithPersistence1 - Test 
has low fail rate in base branch 0,0% and is not flaky
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testDuplicatedNodeId - Test 
has low fail rate in base branch 0,0% and is not flaky
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testLargeUserAttribute3 - Test 
has low fail rate in base branch 1,3% and is not flaky

{color:#d04437}Cache 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862646]]
* IgniteBinaryCacheTestSuite: 
CacheWithDifferentDataRegionConfigurationTest.firstNodeHasDefaultAndSecondWithTwoRegionsDefaultAndPersistenceAcceptable
 - Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}Thin Client: Java{color} [[tests 42 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862612]]
* ClientTestSuite: 
ReliabilityTestPartitionAwareAsync.testReconnectionThrottling - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testPessimisticRepeatableReadsTransactionHoldsLock - Test has 
low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testPessimisticSerializableTransactionHoldsLock - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: FunctionalTest.testOptimitsticRepeatableReadUpdatesValue - 
Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testOptimitsticSerializableTransactionHoldsLock - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestPartitionAware.testReconnectionThrottling - 
Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestPartitionAwareAsync.testSingleServerFailover 
- Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testSingleServerFailover - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTest.testTxWithIdIntersection - Test has low fail 
rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testTxWithIdIntersection - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testFailover - Test has low fail rate 
in base branch 0,0% and is not flaky
... and 31 tests blockers

{panel}
{panel:title=Branch: [pull/8733/head] Base: [master] : New Tests 
(17)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}SPI{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862620]]
* {color:#013220}IgniteSpiTestSuite: 
TcpCommunicationHandshakeTimeoutTest.testSocketForcedClosedBecauseSlowReadFromSocket
 - PASSED{color}

{color:#8b}Platform .NET (Core Linux){color} [[tests 
8|https://ci.ignite.apache.org/viewLog.html?buildId=5862666]]
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperExtendingBasicMapperOnServerProducesLogWarning
 - PASSED{color}

{color:#8b}Platform .NET{color} [[tests 

[jira] [Commented] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281545#comment-17281545
 ] 

Ignite TC Bot commented on IGNITE-14103:


{panel:title=Branch: [pull/8733/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863785]]

{panel}
{panel:title=Branch: [pull/8733/head] Base: [master] : New Tests 
(16)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Platform .NET (Core Linux){color} [[tests 
8|https://ci.ignite.apache.org/viewLog.html?buildId=5862666]]
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperExtendingBasicMapperOnServerProducesLogWarning
 - PASSED{color}

{color:#8b}Platform .NET{color} [[tests 
8|https://ci.ignite.apache.org/viewLog.html?buildId=5862665]]
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperExtendingBasicMapperOnServerProducesLogWarning
 - PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5862690buildTypeId=IgniteTests24Java8_RunAll]

> .NET Thin Client: Retrieve binary configuration from server
> ---
>
> Key: IGNITE-14103
> URL: https://issues.apache.org/jira/browse/IGNITE-14103
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms, thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.11
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Thin clients require manual binary configuration currently. Settings like 
> compact footer and simple/full name mapper should be set to match the cluster 
> settings. Extend the protocol to retrieve those settings automatically on 
> start.





[jira] [Created] (IGNITE-14144) Document C++ thin client transactions

2021-02-08 Thread Nikita Safonov (Jira)
Nikita Safonov created IGNITE-14144:
---

 Summary: Document C++ thin client transactions
 Key: IGNITE-14144
 URL: https://issues.apache.org/jira/browse/IGNITE-14144
 Project: Ignite
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 2.10
Reporter: Nikita Safonov
Assignee: Nikita Safonov


We need to document the C++ thin client transactions functionality.





[jira] [Created] (IGNITE-14143) Document metric for processed keys when rebuilding indexes.

2021-02-08 Thread Nikita Safonov (Jira)
Nikita Safonov created IGNITE-14143:
---

 Summary: Document metric for processed keys when rebuilding 
indexes.
 Key: IGNITE-14143
 URL: https://issues.apache.org/jira/browse/IGNITE-14143
 Project: Ignite
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 2.10
Reporter: Nikita Safonov
Assignee: Nikita Safonov


We need to document a new metric for processed keys when rebuilding indexes.





[jira] [Created] (IGNITE-14142) Document control.(sh|bin) command to get an arbitrary SystemView

2021-02-08 Thread Nikita Safonov (Jira)
Nikita Safonov created IGNITE-14142:
---

 Summary: Document control.(sh|bin) command to get an arbitrary 
SystemView
 Key: IGNITE-14142
 URL: https://issues.apache.org/jira/browse/IGNITE-14142
 Project: Ignite
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 2.10
Reporter: Nikita Safonov
Assignee: Nikita Safonov


We need to document the new control.(sh|bat) command that gets an arbitrary 
SystemView.





[jira] [Commented] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281417#comment-17281417
 ] 

Ignite TC Bot commented on IGNITE-14138:


{panel:title=Branch: [pull/8769/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863779]]

{panel}
{panel:title=Branch: [pull/8769/head] Base: [master] : New Tests 
(4217)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}JCache TCK 1.1{color} [[tests 
4216|https://ci.ignite.apache.org/viewLog.html?buildId=5863779]]
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testContinuousQuery - 
PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testReplacexAsyncOld - 
PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testIterator - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testRemovexAsyncOld - 
PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testOptimisticTxMissingKeyNoCommit
 - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testInvokeReturnValueGetOptimisticRepeatableRead
 - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testInvokeAllOptimisticRepeatableRead
 - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testPessimisticTxRepeatableRead 
- PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testPutAll - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testRemove - PASSED{color}
* {color:#013220}InterceptorCacheConfigVariationsFullApiTestSuite: 
InterceptorCacheConfigVariationsFullApiTest_27.testPutIfAbsentAsyncConcurrent - 
PASSED{color}
... and 4205 new tests

{color:#8b}PDS 4{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862892]]
* {color:#013220}IgnitePdsTestSuite4: CacheRebalanceWithRemovedWalSegment.test 
- PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5862919buildTypeId=IgniteTests24Java8_RunAll]

> Historical rebalance kills cluster
> --
>
> Key: IGNITE-14138
> URL: https://issues.apache.org/jira/browse/IGNITE-14138
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error 
> detected. Will be handled accordingly to configured handler 
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
> o.a.i.IgniteCheckedException: Failed to continue supplying 
> [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, 
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1
> org.apache.ignite.IgniteCheckedException: Failed to continue supplying 
> [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, 
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
>  [ignite-core.jar]
>   at 
> 

[jira] [Commented] (IGNITE-14116) .NET: Review LongRunning tests

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281376#comment-17281376
 ] 

Ignite TC Bot commented on IGNITE-14116:


{panel:title=Branch: [pull/8773/head] Base: [master] : Possible Blockers 
(7)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Control Utility (Zookeeper){color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863887]]

{color:#d04437}Cache 2{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863849]]

{color:#d04437}Cache 5{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5863852]]
* org.apache.ignite.testsuites.IgniteCacheTestSuite5: 
org.apache.ignite.internal.processors.cache.CacheSerializableTransactionsTest. 
- History for base branch is absent.

{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863834]]

{color:#d04437}Java Client{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863811]]

{color:#d04437}Platform .NET (Core Linux){color} [[tests 1 TC_SERVICE_MESSAGE 
|https://ci.ignite.apache.org/viewLog.html?buildId=5863868]]
* dll: ServicesTestAsync.TestDeployAll(False) - Test has low fail rate in base 
branch 0,0% and is not flaky

{color:#d04437}Continuous Query 3{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5863857]]
* IgniteCacheQuerySelfTestSuite5: 
GridCacheContinuousQueryConcurrentTest.testRestartReplicated - Test has low 
fail rate in base branch 0,0% and is not flaky

{panel}
{panel:title=Branch: [pull/8773/head] Base: [master] : New Tests 
(698)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#8b}Cache 5{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5863852]]
* {color:#8b}org.apache.ignite.testsuites.IgniteCacheTestSuite5: 
org.apache.ignite.internal.processors.cache.CacheSerializableTransactionsTest. 
- FAILED{color}

{color:#8b}JCache TCK 1.1{color} [[tests 
71|https://ci.ignite.apache.org/viewLog.html?buildId=5863834]]
* 
{color:#013220}org.apache.ignite.internal.processors.cache.IgniteDynamicCacheAndNodeStop.testCacheAndNodeStop
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testEntryProcessorRemove
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testInvokeAllAsyncMultipleKeysAvgTime
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testCacheStatistics
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testPutIfAbsent
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testGetMetricsSnapshot
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testCacheSizeWorksAsSize
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testGetAllAvgTime
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testRemoveAvgTime
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.local.GridCacheAtomicLocalTckMetricsSelfTestImpl.testConditionReplace
 - PASSED{color}
* 
{color:#013220}org.apache.ignite.internal.processors.cache.distributed.GridCacheClientModesTcpClientDiscoveryAbstractTest$CaseClientReplicatedAtomic.testGetFromClientNode
 - PASSED{color}
... and 60 new tests

{color:#8b}Platform .NET (Long Running){color} [[tests 
626|https://ci.ignite.apache.org/viewLog.html?buildId=5863871]]
* {color:#013220}exe: ServicesTestFullFooter.TestDuckTyping(False) - 
PASSED{color}
* {color:#013220}exe: ServicesTestFullFooter.TestGetServiceProxy(True) - 
PASSED{color}
* {color:#013220}exe: ServicesTestFullFooter.TestGetServiceProxy(False) - 
PASSED{color}
* {color:#013220}exe: 
ClientClusterDiscoveryTestsBase.TestClientWithOneEndpointDiscoversAllServers - 
PASSED{color}
* {color:#013220}exe: 
ClientClusterDiscoveryTestsBase.TestClientDiscoversJoinedServersAndRemovesDisconnected
 - PASSED{color}
* {color:#013220}exe: CacheQueriesTestSimpleName.TestTextQuery(False,False) - 
PASSED{color}
* {color:#013220}exe: 
ClientClusterDiscoveryTestsBase.TestClientWithOneEndpointDiscoversAllServers - 
PASSED{color}
* {color:#013220}exe: 
ClientClusterDiscoveryTestsBase.TestClientDiscoversJoinedServersAndRemovesDisconnected
 - PASSED{color}
* {color:#013220}exe: CacheAbstractTest.TestClearKeys - PASSED{color}
* {color:#013220}exe: 

[jira] [Updated] (IGNITE-14141) Remove unnecessary storage configuration from PageStore

2021-02-08 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-14141:
-
Fix Version/s: 2.11

> Remove unnecessary storage configuration from PageStore
> ---
>
> Key: IGNITE-14141
> URL: https://issues.apache.org/jira/browse/IGNITE-14141
> Project: Ignite
>  Issue Type: Task
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
> Fix For: 2.11
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{DataStorageConfiguration}} is used only for getting the {{pageSize}} in 
> the {{FilePageStore}} implementation and can be removed.
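
The refactoring can be sketched as replacing a whole-configuration dependency with the single value that is actually read. The classes below are hypothetical and only mimic the shape of the dependency; they are not the real {{FilePageStore}} or {{DataStorageConfiguration}}.

```java
// Hypothetical sketch; these classes are illustrative and not Ignite's own.
class PageStoreSketch {
    /** Stand-in for a large configuration object with many unrelated settings. */
    static class StorageConfig {
        int pageSize = 4096;
        int walSegments = 10;   // unrelated to the page store
    }

    /** Before: the store holds the whole configuration but reads one field. */
    static class StoreWithConfig {
        private final StorageConfig cfg;

        StoreWithConfig(StorageConfig cfg) {
            this.cfg = cfg;
        }

        long pageOffset(long pageIdx) {
            return pageIdx * cfg.pageSize;
        }
    }

    /** After: the store receives only the page size it needs. */
    static class StoreWithPageSize {
        private final int pageSize;

        StoreWithPageSize(int pageSize) {
            this.pageSize = pageSize;
        }

        long pageOffset(long pageIdx) {
            return pageIdx * pageSize;
        }
    }
}
```

Both variants compute the same offsets, but the second can be constructed and tested without building a configuration object.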





[jira] [Created] (IGNITE-14141) Remove unnecessary storage configuration from PageStore

2021-02-08 Thread Maxim Muzafarov (Jira)
Maxim Muzafarov created IGNITE-14141:


 Summary: Remove unnecessary storage configuration from PageStore
 Key: IGNITE-14141
 URL: https://issues.apache.org/jira/browse/IGNITE-14141
 Project: Ignite
  Issue Type: Task
Reporter: Maxim Muzafarov
Assignee: Maxim Muzafarov


The {{DataStorageConfiguration}} is used only for getting the {{pageSize}} in 
the {{FilePageStore}} implementation and can be removed.





[jira] [Commented] (IGNITE-14116) .NET: Review LongRunning tests

2021-02-08 Thread Igor Sapego (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281318#comment-17281318
 ] 

Igor Sapego commented on IGNITE-14116:
--

[~ptupitsyn] looks good.

> .NET: Review LongRunning tests
> --
>
> Key: IGNITE-14116
> URL: https://issues.apache.org/jira/browse/IGNITE-14116
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Trivial
>  Labels: .NET
> Fix For: 2.11
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{TestUtils.CategoryIntensive}} is supposed to be applied to long-running 
> tests, so that we can exclude that category and do a quick test run.
> * Review current test durations and apply the attribute to all tests that 
> take over 2 or 3 seconds.
> * Review test fixtures that take a long time to set up.
> * Update DEVNOTES with a command to run quick tests only (exclude long and 
> examples).
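
The issue itself concerns .NET test categories. The same "tag the slow tests, then exclude the tag" idea is sketched here in plain Java with a hypothetical {{@Intensive}} annotation and a reflection-based filter. None of these names come from the Ignite code base, and a real test runner would provide its own category or tag filter.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of category-based test selection.
class QuickRunSketch {
    /** Marks a test as long-running (illustrative annotation). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Intensive {
    }

    /** Sample fixture with one quick and one long-running test. */
    static class SampleTests {
        public void testFast() {
        }

        @Intensive
        public void testSlowClusterRestart() {
        }
    }

    /** Returns the names of test methods that are not marked as intensive, sorted. */
    static List<String> quickTests(Class<?> cls) {
        List<String> names = new ArrayList<>();

        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().startsWith("test") && !m.isAnnotationPresent(Intensive.class))
                names.add(m.getName());
        }

        Collections.sort(names);

        return names;
    }
}
```

Calling `QuickRunSketch.quickTests(QuickRunSketch.SampleTests.class)` selects only the unmarked method, which is the behaviour a quick test run relies on.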





[jira] [Commented] (IGNITE-13725) Add snapshot check command

2021-02-08 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281296#comment-17281296
 ] 

Maxim Muzafarov commented on IGNITE-13725:
--

[~alex_pl], [~xtern] 

Folks, can you review my changes?

> Add snapshot check command
> --
>
> Key: IGNITE-13725
> URL: https://issues.apache.org/jira/browse/IGNITE-13725
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>  Labels: iep-43
> Fix For: 2.11
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The user must be able to validate a created snapshot for consistency in the 
> same way as the {{idle_verify}} procedure does.
> It is necessary to introduce a new {{SnapshotMetadata}} structure to save the 
> additional snapshot meta information. The verify procedure must collect all 
> snapshot metadata parts and check them for completeness and consistency.
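
The completeness half of such a check can be sketched as follows. This is an illustration only: the fields given to {{SnapshotMetadata}} here (a node ID and the set of partitions that node saved) are assumptions, and the real structure and verification procedure may look quite different.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a completeness check over collected metadata parts.
class SnapshotCheckSketch {
    /** Assumed shape: which node wrote the part and which partitions it contains. */
    static class SnapshotMetadata {
        final String nodeId;
        final Set<Integer> partitions;

        SnapshotMetadata(String nodeId, Set<Integer> partitions) {
            this.nodeId = nodeId;
            this.partitions = partitions;
        }
    }

    /** Returns the partitions in [0, totalPartitions) that no metadata part covers. */
    static Set<Integer> missingPartitions(Collection<SnapshotMetadata> parts, int totalPartitions) {
        Set<Integer> missing = new TreeSet<>();

        for (int p = 0; p < totalPartitions; p++)
            missing.add(p);

        for (SnapshotMetadata m : parts)
            missing.removeAll(m.partitions);

        return missing;
    }
}
```

An empty result means every partition is covered by at least one metadata part; a consistency check in the style of {{idle_verify}} would then additionally compare the contents of the copies, which this sketch does not attempt.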





[jira] [Commented] (IGNITE-13725) Add snapshot check command

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281292#comment-17281292
 ] 

Ignite TC Bot commented on IGNITE-13725:


{panel:title=Branch: [pull/8715/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/8715/head] Base: [master] : New Tests 
(9)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Control Utility{color} [[tests 
2|https://ci.ignite.apache.org/viewLog.html?buildId=5863140]]
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerWithSSLTest.testCheckSnapshot - PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testCheckSnapshot - PASSED{color}

{color:#8b}Control Utility (Zookeeper){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5857393]]
* {color:#013220}ZookeeperIgniteControlUtilityTestSuite: 
GridCommandHandlerTest.testCheckSnapshot - PASSED{color}

{color:#8b}Basic 3{color} [[tests 
6|https://ci.ignite.apache.org/viewLog.html?buildId=5857339]]
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheckPartitionCounters - 
PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheckMissedGroup - 
PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheckMissedMeta - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheckMissedPart - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheck - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterSnapshotSelfTest.testClusterSnapshotCheckWithNodeFilter - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5859054buildTypeId=IgniteTests24Java8_RunAll]

> Add snapshot check command
> --
>
> Key: IGNITE-13725
> URL: https://issues.apache.org/jira/browse/IGNITE-13725
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>  Labels: iep-43
> Fix For: 2.11
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The user must be able to validate a created snapshot for consistency in the 
> same way as the {{idle_verify}} procedure does.
> It is necessary to introduce a new {{SnapshotMetadata}} structure to save the 
> additional snapshot meta information. The verify procedure must collect all 
> snapshot metadata parts and check them for completeness and consistency.





[jira] [Commented] (IGNITE-14131) IgniteCompute tasks with same name, running from one node and different ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute call.

2021-02-08 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281290#comment-17281290
 ] 

Vladislav Pyatkov commented on IGNITE-14131:


[~zstan] I left several comments.

> IgniteCompute tasks with same name, running from one node and different 
> ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute 
> call.
> ---
>
> Key: IGNITE-14131
> URL: https://issues.apache.org/jira/browse/IGNITE-14131
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute
>Affects Versions: 2.9.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The root cause of this problem is the assumption that one node can have only 
> one class loader per class name. As a result, calling multiple tasks with 
> different class loaders leads to large cache growth on the server side and 
> eventually to an OOM in the JVM metaspace. Additionally, we can't use p2p from 
> multiple threads, for example when an Ignite instance is shared as a Spring bean.
>  





[jira] [Issue Comment Deleted] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-14103:

Comment: was deleted

(was: {panel:title=Branch: [pull/8733/head] Base: [master] : Possible Blockers 
(3)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}SPI{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5858456]]
* IgniteSpiTestSuite: 
IgniteClientReconnectEventHandlingTest.testClientReconnect - Test has low fail 
rate in base branch 1,3% and is not flaky

{color:#d04437}Cassandra Store{color} [[tests 
2|https://ci.ignite.apache.org/viewLog.html?buildId=5858514]]
* org.apache.ignite.testsuites.cassandra.store.IgniteCassandraStoreTestSuite: 
org.apache.ignite.tests.CassandraDirectPersistenceTest. - History for base 
branch is absent.
* IgniteCassandraStoreTestSuite: 
CassandraDirectPersistenceTest.pojoStrategyTransactionTest - Test has low fail 
rate in base branch 0,0% and is not flaky

{panel}
{panel:title=Branch: [pull/8733/head] Base: [master] : New Tests 
(15)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#8b}Platform .NET (Core Linux){color} [[tests 
7|https://ci.ignite.apache.org/viewLog.html?buildId=5858502]]
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}

{color:#8b}Platform .NET{color} [[tests 
7|https://ci.ignite.apache.org/viewLog.html?buildId=5858501]]
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}exe: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}

{color:#8b}Cassandra Store{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5858514]]
* 
{color:#8b}org.apache.ignite.testsuites.cassandra.store.IgniteCassandraStoreTestSuite:
 org.apache.ignite.tests.CassandraDirectPersistenceTest. - FAILED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5858526buildTypeId=IgniteTests24Java8_RunAll])

> .NET Thin Client: Retrieve binary configuration from server
> ---
>
> Key: IGNITE-14103
> URL: https://issues.apache.org/jira/browse/IGNITE-14103
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms, thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.11
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Thin clients require manual binary configuration currently. Settings like 
> compact footer and simple/full name mapper should be set to match the cluster 
> settings. Extend the protocol to retrieve those settings automatically on 
> start.





[jira] [Commented] (IGNITE-10073) .NET: Add NuGet package without embedded Ignite JARs

2021-02-08 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281206#comment-17281206
 ] 

Pavel Tupitsyn commented on IGNITE-10073:
-

Merged to master: a03313e1dd546b5d5a638e2a7e4b859f3f82f220

> .NET: Add NuGet package without embedded Ignite JARs
> 
>
> Key: IGNITE-10073
> URL: https://issues.apache.org/jira/browse/IGNITE-10073
> Project: Ignite
>  Issue Type: Improvement
>  Components: documentation, platforms
>Affects Versions: 2.6
>Reporter: Alexey Kukushkin
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET, sbcf
> Attachments: ignite-10073-vs-2.8.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The existing Apache.Ignite NuGet package includes Ignite JARs deployed into 
> the "libs" directory in the .NET project output directory upon the package 
> installation.
> We prefer using external Ignite JARs from $IGNITE_HOME/libs instead of the 
> JARs in the local libs directory.
> Right now we have to manually remove the local "libs" directory after every 
> Apache.Ignite package installation or upgrade.
> It would help us to have another Ignite NuGet package without the embedded 
> Ignite JARs.





[jira] [Commented] (IGNITE-13056) SchemaManager refactoring

2021-02-08 Thread Andrey Mashenkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281136#comment-17281136
 ] 

Andrey Mashenkov commented on IGNITE-13056:
---

[~timonin.maksim],  
I've left a few comments on the PR.

> SchemaManager refactoring
> -
>
> Key: IGNITE-13056
> URL: https://issues.apache.org/jira/browse/IGNITE-13056
> Project: Ignite
>  Issue Type: New Feature
>  Components: sql
>Affects Versions: 2.8
>Reporter: Nikolay Izhikov
>Assignee: Maksim Timonin
>Priority: Major
>  Labels: IEP-49, calcite
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Since Ignite intends to support several SQL engines, we need to make index 
> handling independent of the SQL engine in use.
> We should also consider moving all index-related machinery to the core 
> module so that it is available to any module that wants to use it.
> Requirements so far:
> * Ability to track indexes in several engines





[jira] [Updated] (IGNITE-14140) Checkpointer thread holds write lock too long

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14140:
---
Description: 
The free-list flushing optimization can stall db-checkpoint-thread while it 
holds the write lock. This may block all transactions for several hundred 
milliseconds.
{noformat}
"db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 
os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
   java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.getObjectVolatile(Native Method)
at 
java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
at 
java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
{noformat}
We can reduce the time spent under the write lock by switching the 
optimization off before the lock is acquired and re-enabling it after the 
lock is released.

  was:
The free-list flushing optimization can stall db-checkpoint-thread while it 
holds the write lock. This may block all transactions for several hundred 
milliseconds.
{noformat}
"db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 
os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
   java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.getObjectVolatile(Native Method)
at 
java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
at 
java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
{noformat}
We can reduce the time spent under the write lock by switching the 
optimization off before the lock is acquired and re-enabling it after the 
lock is released.
This image confirms that all of the time is spent storing the metadata cache.


> Checkpointer thread holds write lock too long
> -
>
> Key: IGNITE-14140
> URL: 

[jira] [Updated] (IGNITE-14140) Checkpointer thread holds write lock too long

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-14140:
---
Description: 
The free-list flushing optimization can stall db-checkpoint-thread while it 
holds the write lock. This may block all transactions for several hundred 
milliseconds.
{noformat}
"db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 
os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
   java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.getObjectVolatile(Native Method)
at 
java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
at 
java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
{noformat}
We can reduce the time spent under the write lock by switching the 
optimization off before the lock is acquired and re-enabling it after the 
lock is released.
This image confirms that all of the time is spent storing the metadata cache.

  was:
The free-list flushing optimization can stall db-checkpoint-thread while it 
holds the write lock. This may block all transactions for several hundred 
milliseconds.

{noformat}
"db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
   java.lang.Thread.State: RUNNABLE
    at sun.misc.Unsafe.getObjectVolatile(Native Method)
    at java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
    at java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)
{noformat}

We can reduce the time spent under the write lock by switching the 
optimization off before the lock is acquired and re-enabling it after the 
lock is released.
This image confirms that all of the time is spent storing the metadata cache.


> Checkpointer thread holds write lock too long
> -
>
> Key: IGNITE-14140
> URL: https://issues.apache.org/jira/browse/IGNITE-14140
> Project: 

[jira] [Created] (IGNITE-14140) Checkpointer thread holds write lock too long

2021-02-08 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-14140:
--

 Summary: Checkpointer thread holds write lock too long
 Key: IGNITE-14140
 URL: https://issues.apache.org/jira/browse/IGNITE-14140
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Reporter: Vladislav Pyatkov
Assignee: Vladislav Pyatkov


The free-list flushing optimization can stall db-checkpoint-thread while it 
holds the write lock. This may block all transactions for several hundred 
milliseconds.

{noformat}
"db-checkpoint-thread-#334%DPL_GRID%DplGridNodeName%" #667 daemon prio=5 os_prio=0 tid=0x7e765c123800 nid=0xee0b8 runnable [0x7e767f535000]
   java.lang.Thread.State: RUNNABLE
    at sun.misc.Unsafe.getObjectVolatile(Native Method)
    at java.util.concurrent.atomic.AtomicReferenceArray.getRaw(AtomicReferenceArray.java:130)
    at java.util.concurrent.atomic.AtomicReferenceArray.get(AtomicReferenceArray.java:125)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList.getBucketCache(AbstractFreeList.java:690)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:374)
    at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:343)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:373)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:336)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.syncMetadata(GridCacheOffheapManager.java:322)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onMarkCheckpointBegin(GridCacheOffheapManager.java:247)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:281)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:388)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:264)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)
{noformat}

We can reduce the time spent under the write lock by switching the 
optimization off before the lock is acquired and re-enabling it after the 
lock is released.
This image confirms that all of the time is spent storing the metadata cache.
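
The ordering proposed above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the actual Ignite change: the class CheckpointLockSketch, the cachingEnabled flag and the saveMetadata callback are hypothetical stand-ins for the checkpoint lock, the free-list caching optimization and the metadata save step seen in the stack trace.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in; none of these names come from the Ignite code base.
final class CheckpointLockSketch {
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    // Assumed switch for the free-list caching optimization.
    private final AtomicBoolean cachingEnabled = new AtomicBoolean(true);

    boolean isCachingEnabled() {
        return cachingEnabled.get();
    }

    boolean isWriteLockHeldByCurrentThread() {
        return checkpointLock.isWriteLockedByCurrentThread();
    }

    void markCheckpointBegin(Runnable saveMetadata) {
        // Switch the optimization off before taking the exclusive lock.
        cachingEnabled.set(false);
        try {
            checkpointLock.writeLock().lock();
            try {
                // Work done here blocks every thread that needs the read lock.
                saveMetadata.run();
            } finally {
                checkpointLock.writeLock().unlock();
            }
        } finally {
            // Re-enable the optimization only after the write lock is released.
            cachingEnabled.set(true);
        }
    }
}
```

The nested try/finally blocks keep the two steps ordered even if the metadata save throws: the lock is always released first, and the flag is restored afterwards.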





[jira] [Comment Edited] (IGNITE-13912) Incorrect calculation of WAL segments that should be deleted from WAL archive

2021-02-08 Thread shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278893#comment-17278893
 ] 

shivakumar edited comment on IGNITE-13912 at 2/8/21, 3:06 PM:
--

Hi [~ktkale...@gridgain.com]

I have uploaded the uber jar to 
[https://drive.google.com/file/d/1SZS4248QAv5qVWOYJccHCPrmCWEiWXBf/view?usp=sharing]

Install an Ignite cluster with persistence enabled. The tables created by 
this JDBC program use "INVENTORY" as the schema and also need a cache group 
configuration, so add the configuration below to your Ignite configuration 
(refer to the attached ignite-config XML file).

=


 
 INVENTORY
 
 


 
 


 
 
 
 
 
 
 

==

Download the jar and run the JDBC program with the command below:

*java -jar inventory-1.0-SNAPSHOT-shaded.jar com.test.app.InventoryCreate 
config.properties yes* 

To re-run the JDBC program, pass no as the last argument (this skips the 
cleanup and re-creation of the previously created tables and starts inserting 
data directly):

*java -jar inventory-1.0-SNAPSHOT-shaded.jar com.test.app.InventoryCreate 
config.properties no* 

The config.properties file that must be passed to this client program is also 
attached to this ticket. Update the ignite_db_url property in that file (the 
default is jdbc:ignite:thin://localhost:10800).

Connect to and disconnect from Visor during data ingestion and observe the 
WAL usage.



> Incorrect calculation of WAL segments that should be deleted from WAL archive
> -
>
> Key: IGNITE-13912
> URL: https://issues.apache.org/jira/browse/IGNITE-13912
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Critical
> Fix For: 2.10
>
> Attachments: config.properties, ignite-config, reproducer.zip, 
> server1-full-wal-checkpoint.log, wal-checkpoint-logs, wal_dir_contents, 
> wal_grows_from_peak.PNG, wal_issue_reproduced.PNG, wal_usage.PNG, 
> wal_usage_dec12.PNG, wal_usage_dec22nd_binary.PNG
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Now there is an incorrect calculation of WAL segments that should be deleted 
> from WAL archive. Since we delete only those segments whose total size should 
> not exceed *DataStorageConfiguration#maxWalArchiveSize * 
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*, but should be up to  
> DataStorageConfiguration#maxWalArchiveSize * 
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*. Therefore, an excess of 
> *DataStorageConfiguration#maxWalArchiveSize* occurs.





[jira] [Commented] (IGNITE-13912) Incorrect calculation of WAL segments that should be deleted from WAL archive

2021-02-08 Thread shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281126#comment-17281126
 ] 

shivakumar commented on IGNITE-13912:
-

Hi [~ktkale...@gridgain.com]

Were you able to reproduce the issue with the new jar provided?

 

> Incorrect calculation of WAL segments that should be deleted from WAL archive
> -
>
> Key: IGNITE-13912
> URL: https://issues.apache.org/jira/browse/IGNITE-13912
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Critical
> Fix For: 2.10
>
> Attachments: config.properties, ignite-config, reproducer.zip, 
> server1-full-wal-checkpoint.log, wal-checkpoint-logs, wal_dir_contents, 
> wal_grows_from_peak.PNG, wal_issue_reproduced.PNG, wal_usage.PNG, 
> wal_usage_dec12.PNG, wal_usage_dec22nd_binary.PNG
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Now there is an incorrect calculation of WAL segments that should be deleted 
> from WAL archive. Since we delete only those segments whose total size should 
> not exceed *DataStorageConfiguration#maxWalArchiveSize * 
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*, but should be up to  
> DataStorageConfiguration#maxWalArchiveSize * 
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE*. Therefore, an excess of 
> *DataStorageConfiguration#maxWalArchiveSize* occurs.





[jira] [Created] (IGNITE-14139) Incorrect initialize checkpoint-runner-cpu thread pool

2021-02-08 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-14139:
--

 Summary: Incorrect initialize checkpoint-runner-cpu thread pool
 Key: IGNITE-14139
 URL: https://issues.apache.org/jira/browse/IGNITE-14139
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov
Assignee: Vladislav Pyatkov


The initial setup of the CPU checkpoint thread pool is incorrect.
Look at the constructor of {{CheckpointWorkflow}}:
First, we initialize the pool:
{code:java}
this.checkpointCollectPagesInfoPool = initializeCheckpointPool();
{code}
and only afterwards do we set the size of the pool:
{code:java}
this.checkpointCollectInfoThreads = checkpointCollectInfoThreads;
{code}
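
The effect of this ordering can be reproduced with a small, self-contained Java sketch. It is not the Ignite source: CheckpointPoolInitSketch and its members are hypothetical, and it assumes that the pool factory reads the size from the instance field (which still holds the default value 0 at that point) and creates no pool when the size is 1 or less.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the constructor pattern described in the ticket.
final class CheckpointPoolInitSketch {
    private int collectInfoThreads;
    private ThreadPoolExecutor collectInfoPool; // stays null when no pool is created

    CheckpointPoolInitSketch(int collectInfoThreads, boolean assignSizeFirst) {
        if (assignSizeFirst) {
            // Corrected order: the field is set before the pool is built.
            this.collectInfoThreads = collectInfoThreads;
            this.collectInfoPool = initializePool();
        } else {
            // Order from the ticket: the pool is built while the field is still 0.
            this.collectInfoPool = initializePool();
            this.collectInfoThreads = collectInfoThreads;
        }
    }

    // Reads the instance field, not the constructor argument (assumption).
    private ThreadPoolExecutor initializePool() {
        if (collectInfoThreads <= 1)
            return null;

        return new ThreadPoolExecutor(collectInfoThreads, collectInfoThreads,
            0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    int poolSize() {
        return collectInfoPool == null ? 0 : collectInfoPool.getCorePoolSize();
    }

    void shutdown() {
        if (collectInfoPool != null)
            collectInfoPool.shutdown();
    }
}
```

With a requested size of 4, the second branch leaves the sketch without a pool, while the first branch produces a pool of 4 threads.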






[jira] [Updated] (IGNITE-14112) Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages

2021-02-08 Thread Alexey Goncharuk (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Goncharuk updated IGNITE-14112:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages
> 
>
> Key: IGNITE-14112
> URL: https://issues.apache.org/jira/browse/IGNITE-14112
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Alexey Goncharuk
>Priority: Major
> Fix For: 2.11
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a simple {{Runnable}} is passed to the {{runLocalSafe}} method, not only 
> will Ignite attempt to inject resources to the runnable, but it will also 
> make a call to deployment, which may have various side effects.
> Need to walk through the code and replace {{Runnable}} with 
> {{GridPlainRunnable}} in all places where injection is not needed/expected.
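
The idea can be illustrated with a minimal Java sketch. It assumes a marker-style check and uses hypothetical names throughout: PlainRunnable plays the role the ticket gives to {{GridPlainRunnable}}, and ClosureProcessorSketch only records the deployment and injection steps instead of performing them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker interface; it adds no methods to Runnable.
interface PlainRunnable extends Runnable {
}

// Hypothetical processor that records the extra work it would do.
final class ClosureProcessorSketch {
    final List<String> sideEffects = new ArrayList<>();

    void runLocalSafe(Runnable task) {
        // Assumption: only tasks without the marker go through deployment and injection.
        if (!(task instanceof PlainRunnable)) {
            sideEffects.add("deploy");
            sideEffects.add("inject");
        }

        task.run();
    }
}
```

Because the marker declares no methods of its own, call sites can switch from a Runnable lambda to a PlainRunnable lambda without changing the task body.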





[jira] [Commented] (IGNITE-13967) Refactor and improve performance of python thin client marshaller

2021-02-08 Thread Igor Sapego (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281035#comment-17281035
 ] 

Igor Sapego commented on IGNITE-13967:
--

[~ivandasch] reviewed. See my comments in PR.

> Refactor and improve performance of python thin client marshaller
> -
>
> Key: IGNITE-13967
> URL: https://issues.apache.org/jira/browse/IGNITE-13967
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Reporter: Ivan Daschinskiy
>Assignee: Ivan Daschinskiy
>Priority: Major
>  Labels: python
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current serialization implementation has a questionable design and 
> suffers from several problems:
> 1. It is tightly coupled to the Client object.
> 2. It does not use the protocol feature that the total length of a message is 
> in the header, so it repeatedly loads data from the Client instead of 
> iterating over a byte array.
> 3. It uses some tricky hacks, and sometimes a new connection is created when 
> deserializing an object.
> 4. It constantly allocates bytes (an immutable data structure).
> I suggest rewriting serialization and deserialization:
> 1. Pass a specific SerDe context plus a BytesIO to the corresponding methods.
> 2. The context can be sync or async and contains specific flags and methods 
> for loading/uploading binary object schemas.
> 3. Refactor Client to retrieve the full packet from the socket at once and 
> then pass the full packet further.
> These steps can significantly improve performance, reduce the number of 
> allocations, and lay the foundation for an asyncio version of the client.
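
The "retrieve the full packet at once" step can be sketched as follows. The ticket concerns the Python client; Java is used here only for illustration, and the sketch makes assumptions that the ticket does not state: a 4-byte little-endian length header, and the hypothetical name PacketReaderSketch.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: reads one length-prefixed packet into a single buffer.
final class PacketReaderSketch {
    static ByteBuffer readPacket(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);

            // Assumed header layout: 4-byte little-endian payload length.
            byte[] header = new byte[4];
            data.readFully(header);
            int len = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getInt();

            if (len < 0)
                throw new IllegalStateException("Negative packet length: " + len);

            // One allocation and one blocking read for the whole payload.
            byte[] body = new byte[len];
            data.readFully(body);

            // Deserializers then iterate over this buffer without touching the socket.
            return ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the packet is in a single buffer, every field read is a position advance over memory, which is the behavior item 3 of the proposal asks for.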





[jira] [Reopened] (IGNITE-9214) Uncomment 21 test classes in misc cache test suites (see inside)

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev reopened IGNITE-9214:
-
  Assignee: Ilya Kasnacheev

> Uncomment 21 test classes in misc cache test suites (see inside)
> 
>
> Key: IGNITE-9214
> URL: https://issues.apache.org/jira/browse/IGNITE-9214
> Project: Ignite
>  Issue Type: Sub-task
>  Components: binary, cache, data structures
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
>
> as per the following suites:
> {code}
> 2 
> modules/core/src/test/java/org/apache/ignite/internal/processors/cache/expiry/IgniteCacheExpiryPolicyTestSuite.java
> 4 
> modules/core/src/test/java/org/apache/ignite/internal/processors/cache/IgniteCacheInterceptorSelfTestSuite.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteBinaryCacheTestSuite.java
> 6 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteBinaryObjectsTestSuite.java
> 4 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheDataStructuresSelfTestSuite.java
> 2 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheFailoverTestSuite2.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheFullApiSelfTestSuite.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheRestartTestSuite2.java
> {code}





[jira] [Reopened] (IGNITE-9218) Uncomment 18 test classes in IgniteCacheTestSuite{3,5,6}

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev reopened IGNITE-9218:
-
  Assignee: Ilya Kasnacheev

> Uncomment 18 test classes in IgniteCacheTestSuite{3,5,6}
> 
>
> Key: IGNITE-9218
> URL: https://issues.apache.org/jira/browse/IGNITE-9218
> Project: Ignite
>  Issue Type: Sub-task
>  Components: cache
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
>
> Cache 3:
> {code}
> //suite.addTestSuite(GridCacheReplicatedEntrySetSelfTest.class);
> //suite.addTestSuite(GridCacheReplicatedMarshallerTxTest.class);
> //suite.addTestSuite(GridCacheReplicatedOnheapFullApiSelfTest.class);
> 
> //suite.addTestSuite(GridCacheReplicatedOnheapMultiNodeFullApiSelfTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxConcurrentGetTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxMultiNodeBasicTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxReadTest.class);
> //suite.addTestSuite(GridCacheDeploymentSelfTest.class);
> //suite.addTestSuite(GridCacheDeploymentOffHeapSelfTest.class);
> //suite.addTestSuite(GridCacheDeploymentOffHeapValuesSelfTest.class);
> {code}
> Cache 5:
> {code}
> //suite.addTestSuite(GridCacheAtomicPreloadSelfTest.class);
> 
> //suite.addTestSuite(IgniteCacheContainsKeyColocatedAtomicSelfTest.class);
> //suite.addTestSuite(IgniteCacheContainsKeyNearAtomicSelfTest.class);
> {code}
> Cache 6:
> {code}
> //suite.addTestSuite(IgniteOutOfMemoryPropagationTest.class);
> //suite.addTestSuite(CacheClientsConcurrentStartTest.class);
> //suite.addTestSuite(CacheTryLockMultithreadedTest.class);
> //suite.addTestSuite(GridCacheRebalancingOrderingTest.class);
> 
> //suite.addTestSuite(IgniteCacheClientMultiNodeUpdateTopologyLockTest.class);
> {code}





[jira] [Resolved] (IGNITE-9214) Uncomment 21 test classes in misc cache test suites (see inside)

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev resolved IGNITE-9214.
-
Resolution: Duplicate

> Uncomment 21 test classes in misc cache test suites (see inside)
> 
>
> Key: IGNITE-9214
> URL: https://issues.apache.org/jira/browse/IGNITE-9214
> Project: Ignite
>  Issue Type: Sub-task
>  Components: binary, cache, data structures
>Reporter: Ilya Kasnacheev
>Priority: Major
>
> as per the following suites:
> {code}
> 2 
> modules/core/src/test/java/org/apache/ignite/internal/processors/cache/expiry/IgniteCacheExpiryPolicyTestSuite.java
> 4 
> modules/core/src/test/java/org/apache/ignite/internal/processors/cache/IgniteCacheInterceptorSelfTestSuite.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteBinaryCacheTestSuite.java
> 6 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteBinaryObjectsTestSuite.java
> 4 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheDataStructuresSelfTestSuite.java
> 2 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheFailoverTestSuite2.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheFullApiSelfTestSuite.java
> 1 
> modules/core/src/test/java/org/apache/ignite/testsuites/IgniteCacheRestartTestSuite2.java
> {code}





[jira] [Resolved] (IGNITE-9218) Uncomment 18 test classes in IgniteCacheTestSuite{3,5,6}

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev resolved IGNITE-9218.
-

> Uncomment 18 test classes in IgniteCacheTestSuite{3,5,6}
> 
>
> Key: IGNITE-9218
> URL: https://issues.apache.org/jira/browse/IGNITE-9218
> Project: Ignite
>  Issue Type: Sub-task
>  Components: cache
>Reporter: Ilya Kasnacheev
>Priority: Major
>
> Cache 3:
> {code}
> //suite.addTestSuite(GridCacheReplicatedEntrySetSelfTest.class);
> //suite.addTestSuite(GridCacheReplicatedMarshallerTxTest.class);
> //suite.addTestSuite(GridCacheReplicatedOnheapFullApiSelfTest.class);
> 
> //suite.addTestSuite(GridCacheReplicatedOnheapMultiNodeFullApiSelfTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxConcurrentGetTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxMultiNodeBasicTest.class);
> //suite.addTestSuite(GridCacheReplicatedTxReadTest.class);
> //suite.addTestSuite(GridCacheDeploymentSelfTest.class);
> //suite.addTestSuite(GridCacheDeploymentOffHeapSelfTest.class);
> //suite.addTestSuite(GridCacheDeploymentOffHeapValuesSelfTest.class);
> {code}
> Cache 5:
> {code}
> //suite.addTestSuite(GridCacheAtomicPreloadSelfTest.class);
> 
> //suite.addTestSuite(IgniteCacheContainsKeyColocatedAtomicSelfTest.class);
> //suite.addTestSuite(IgniteCacheContainsKeyNearAtomicSelfTest.class);
> {code}
> Cache 6:
> {code}
> //suite.addTestSuite(IgniteOutOfMemoryPropagationTest.class);
> //suite.addTestSuite(CacheClientsConcurrentStartTest.class);
> //suite.addTestSuite(CacheTryLockMultithreadedTest.class);
> //suite.addTestSuite(GridCacheRebalancingOrderingTest.class);
> 
> //suite.addTestSuite(IgniteCacheClientMultiNodeUpdateTopologyLockTest.class);
> {code}





[jira] [Resolved] (IGNITE-9210) Uncomment tests in test suites, make sure they pass without hang-ups or errors.

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev resolved IGNITE-9210.
-
Resolution: Done

> Uncomment tests in test suites, make sure they pass without hang-ups or 
> errors.
> ---
>
> Key: IGNITE-9210
> URL: https://issues.apache.org/jira/browse/IGNITE-9210
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
>  Labels: MakeTeamcityGreenAgain
>
> This is a top-level issue.





[jira] [Resolved] (IGNITE-8585) Remove ignored-tests module, move those tests to regular suites in commented out form

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev resolved IGNITE-8585.
-
Resolution: Duplicate

> Remove ignored-tests module, move those tests to regular suites in commented 
> out form
> -
>
> Key: IGNITE-8585
> URL: https://issues.apache.org/jira/browse/IGNITE-8585
> Project: Ignite
>  Issue Type: Task
>Affects Versions: 2.6
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Minor
>  Labels: test
>
> Nobody either updates or runs this module and it's infeasible to use!





[jira] [Updated] (IGNITE-13230) Ignite duplicate key and NullPointerException

2021-02-08 Thread Andrey Mashenkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Mashenkov updated IGNITE-13230:
--
Priority: Minor  (was: Critical)

> Ignite duplicate key and NullPointerException
> -
>
> Key: IGNITE-13230
> URL: https://issues.apache.org/jira/browse/IGNITE-13230
> Project: Ignite
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.9.1
>Reporter: Abhay
>Priority: Minor
> Attachments: logFile.txt, patch3.txt
>
>
> [^logFile.txt]
> The following steps lead to a crash:
>  # Start an Ignite node with persistence enabled and use an ODBC client such as isql 
> or pyignite.
>  # Run a CREATE TABLE command, e.g.:
>  ## CREATE TABLE ct_countries(id bigint PRIMARY KEY NOT NULL,code VARCHAR(50) 
> DEFAULT '',name VARCHAR(100) DEFAULT '',timezonecheck VARCHAR DEFAULT 
> 'N',dstcheck VARCHAR DEFAULT 'N',phonecodelength VARCHAR(20) DEFAULT 
> '',status varchar(10) DEFAULT 'INACTIVE')WITH 
> "template=partitioned,backups=0,affinity_key=id";
>  # Create two indexes without specifying an index name:
>  ## CREATE INDEX ON ct_countries(code);
>  ## CREATE INDEX ON ct_countries(name);
> Restart Ignite and it will crash with the following log:
> java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:233)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:184)
>  at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
>  
> java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:233)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:184)
>  at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)





[jira] [Commented] (IGNITE-13230) Ignite duplicate key and NullPointerException

2021-02-08 Thread Andrey Mashenkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280994#comment-17280994
 ] 

Andrey Mashenkov commented on IGNITE-13230:
---

The CREATE INDEX command in the H2 dialect (as with other vendors: MySQL, MS, Postgres) 
expects a mandatory index name.
The example above looks like misuse.
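
A minimal sketch of the workaround implied by the comment above: give every index an explicit, distinct name. The class `NamedIndexDdl` and the `idx_<table>_<column>` naming scheme are assumptions made for illustration and are not part of Ignite. The sketch only builds the DDL strings; executing them against a node is left out.

```java
import java.util.Arrays;
import java.util.Locale;

/** Builds CREATE INDEX statements that always carry an explicit index name. */
final class NamedIndexDdl {
    private NamedIndexDdl() {
        // Utility class, no instances.
    }

    /** Hypothetical naming scheme: idx_<table>_<column>, lower-cased. */
    static String indexName(String table, String column) {
        return ("idx_" + table + "_" + column).toLowerCase(Locale.ROOT);
    }

    /** Returns a CREATE INDEX statement with the derived name filled in. */
    static String createIndex(String table, String column) {
        return "CREATE INDEX " + indexName(table, column)
            + " ON " + table + "(" + column + ")";
    }

    public static void main(String[] args) {
        // The two columns indexed in the report above.
        for (String column : Arrays.asList("code", "name"))
            System.out.println(createIndex("ct_countries", column));
    }
}
```

For the table in the report this yields `CREATE INDEX idx_ct_countries_code ON ct_countries(code)` and `CREATE INDEX idx_ct_countries_name ON ct_countries(name)`, so the two indexes no longer share an unnamed slot.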

> Ignite duplicate key and NullPointerException
> -
>
> Key: IGNITE-13230
> URL: https://issues.apache.org/jira/browse/IGNITE-13230
> Project: Ignite
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.9.1
>Reporter: Abhay
>Priority: Critical
> Attachments: logFile.txt, patch3.txt
>
>
> [^logFile.txt]
> The following steps lead to a crash:
>  # Start an Ignite node with persistence enabled and use an ODBC client such as isql 
> or pyignite.
>  # Run a CREATE TABLE command, e.g.:
>  ## CREATE TABLE ct_countries(id bigint PRIMARY KEY NOT NULL,code VARCHAR(50) 
> DEFAULT '',name VARCHAR(100) DEFAULT '',timezonecheck VARCHAR DEFAULT 
> 'N',dstcheck VARCHAR DEFAULT 'N',phonecodelength VARCHAR(20) DEFAULT 
> '',status varchar(10) DEFAULT 'INACTIVE')WITH 
> "template=partitioned,backups=0,affinity_key=id";
>  # Create two indexes without specifying an index name:
>  ## CREATE INDEX ON ct_countries(code);
>  ## CREATE INDEX ON ct_countries(name);
> Restart Ignite and it will crash with the following log:
> java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:233)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:184)
>  at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
>  
> java.lang.IllegalStateException: Duplicate key
>  at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:233)
>  at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:184)
>  at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)





[jira] [Updated] (IGNITE-14131) IgniteCompute tasks with same name, running from one node and different ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute call.

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev updated IGNITE-14131:
-
Reviewer: Stanilovsky Evgeny

> IgniteCompute tasks with same name, running from one node and different 
> ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute 
> call.
> ---
>
> Key: IGNITE-14131
> URL: https://issues.apache.org/jira/browse/IGNITE-14131
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute
>Affects Versions: 2.9.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The root cause of this problem is the assumption that one node can obtain 
> only one class loader per class name. As a result, calling multiple tasks with 
> different class loaders leads to huge cache growth on the server side and 
> eventually to an OOM in JVM metaspace. Additionally, we can't use p2p from multiple 
> threads, for example when an Ignite instance is shared as a Spring bean.
>  





[jira] [Updated] (IGNITE-14131) IgniteCompute tasks with same name, running from one node and different ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute call.

2021-02-08 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev updated IGNITE-14131:
-
Reviewer:   (was: Stanilovsky Evgeny)

> IgniteCompute tasks with same name, running from one node and different 
> ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute 
> call.
> ---
>
> Key: IGNITE-14131
> URL: https://issues.apache.org/jira/browse/IGNITE-14131
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute
>Affects Versions: 2.9.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The root cause of this problem is the assumption that one node can obtain 
> only one class loader per class name. As a result, calling multiple tasks with 
> different class loaders leads to huge cache growth on the server side and 
> eventually to an OOM in JVM metaspace. Additionally, we can't use p2p from multiple 
> threads, for example when an Ignite instance is shared as a Spring bean.
>  





[jira] [Commented] (IGNITE-14103) .NET Thin Client: Retrieve binary configuration from server

2021-02-08 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280986#comment-17280986
 ] 

Ignite TC Bot commented on IGNITE-14103:


{panel:title=Branch: [pull/8733/head] Base: [master] : Possible Blockers 
(50)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Java Client{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862609]]
* IgniteClientTestSuite: 
ClientTcpUnreachableMultiNodeSelfTest.testTopologyListener - Test has low fail 
rate in base branch 0,0% and is not flaky

{color:#d04437}JCache TCK 1.1{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862632]]

{color:#d04437}ZooKeeper (Discovery) 1{color} [[tests 3 Out Of Memory Error 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862627]]
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testWithPersistence1 - Test 
has low fail rate in base branch 0,0% and is not flaky
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testDuplicatedNodeId - Test 
has low fail rate in base branch 0,0% and is not flaky
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testLargeUserAttribute3 - Test 
has low fail rate in base branch 1,3% and is not flaky

{color:#d04437}Cache 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862646]]
* IgniteBinaryCacheTestSuite: 
CacheWithDifferentDataRegionConfigurationTest.firstNodeHasDefaultAndSecondWithTwoRegionsDefaultAndPersistenceAcceptable
 - Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}Thin Client: Java{color} [[tests 42 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5862612]]
* ClientTestSuite: 
ReliabilityTestPartitionAwareAsync.testReconnectionThrottling - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testPessimisticRepeatableReadsTransactionHoldsLock - Test has 
low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testPessimisticSerializableTransactionHoldsLock - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: FunctionalTest.testOptimitsticRepeatableReadUpdatesValue - 
Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: 
FunctionalTest.testOptimitsticSerializableTransactionHoldsLock - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestPartitionAware.testReconnectionThrottling - 
Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestPartitionAwareAsync.testSingleServerFailover 
- Test has low fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testSingleServerFailover - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTest.testTxWithIdIntersection - Test has low fail 
rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testTxWithIdIntersection - Test has low 
fail rate in base branch 0,0% and is not flaky
* ClientTestSuite: ReliabilityTestAsync.testFailover - Test has low fail rate 
in base branch 0,0% and is not flaky
... and 31 tests blockers

{panel}
{panel:title=Branch: [pull/8733/head] Base: [master] : New Tests 
(17)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}SPI{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=5862620]]
* {color:#013220}IgniteSpiTestSuite: 
TcpCommunicationHandshakeTimeoutTest.testSocketForcedClosedBecauseSlowReadFromSocket
 - PASSED{color}

{color:#8b}Platform .NET (Core Linux){color} [[tests 
8|https://ci.ignite.apache.org/viewLog.html?buildId=5862666]]
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterDisabledOnServerAutomaticallyDisablesOnClient
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCompactFooterEnabledOnServerDisabledOnClientProducesWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestBasicNameMapperSettingsMismatchProducesLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestExplicitDefaultConfigurationDoesNotChangeClientSettingsOrLogWarnings
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerAndClientProducesNoLogWarning
 - PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperOnServerProducesLogWarning 
- PASSED{color}
* {color:#013220}dll: 
BinaryConfigurationRetrievalTest.TestCustomNameMapperExtendingBasicMapperOnServerProducesLogWarning
 - PASSED{color}

{color:#8b}Platform .NET{color} [[tests 

[jira] [Updated] (IGNITE-14129) Documentation for .NET: Thin Client: Service invocation

2021-02-08 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-14129:
-
Fix Version/s: 2.10

> Documentation for .NET: Thin Client: Service invocation
> ---
>
> Key: IGNITE-14129
> URL: https://issues.apache.org/jira/browse/IGNITE-14129
> Project: Ignite
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.10
>Reporter: Nikita Safonov
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: docuentation
> Fix For: 2.10
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We need to add a section on the new functionality, including the following 
> info:
>  - State that services can only be called from a thin client, never 
> deployed.
>  - Provide a call example.
>  - Highlight that the called service can be either a .NET or a Java service.





[jira] [Commented] (IGNITE-14073) False alarm to lose all transaction nodes

2021-02-08 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280972#comment-17280972
 ] 

Maxim Muzafarov commented on IGNITE-14073:
--

Cherry-picked to 2.10

> False alarm to lose all transaction nodes
> -
>
> Key: IGNITE-14073
> URL: https://issues.apache.org/jira/browse/IGNITE-14073
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This exception is thrown when the primary node and one other node are lost during 
> the transaction.
> But it may not be true, because the transaction can continue on 
> backups (if they are still alive).
> {noformat}
> [2021-01-23 
> 22:32:50,584][ERROR][test-runner-#1%near.IgniteTxExceptionNodeFailTest%][root]
>  Transaction was not committed.
> class org.apache.ignite.IgniteException: Failed to commit a transaction (all 
> partition owners have left the grid, partition data has been lost) 
> [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1096)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.commit(TransactionProxyImpl.java:323)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.IgniteTxExceptionNodeFailTest.cacheWithBackups(IgniteTxExceptionNodeFailTest.java:280)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2367)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class 
> org.apache.ignite.internal.processors.cache.CacheInvalidStateException: 
> Failed to commit a transaction (all partition owners have left the grid, 
> partition data has been lost) [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture$FinishMiniFuture.onNodeLeft(GridNearTxFinishFuture.java:993)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onNodeLeft(GridNearTxFinishFuture.java:167)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:265)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1393)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:888)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:873)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:349)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:312)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2948)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:3164)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2968)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   ... 1 more
> {noformat}
> It will frighten users, because it looks like data loss.
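
The condition described above can be sketched as a simple check: partition data should only be reported as lost when no owner, primary or backup, is still alive. This is an illustrative model, not Ignite's implementation; the class `PartitionOwnersCheck`, the method `allOwnersLeft`, and the node names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative model of the "all partition owners have left" decision. */
final class PartitionOwnersCheck {
    private PartitionOwnersCheck() {
        // Utility class, no instances.
    }

    /**
     * Returns true only if none of the partition owners (primary and backups)
     * is among the nodes that are still alive.
     */
    static boolean allOwnersLeft(List<String> owners, Set<String> aliveNodes) {
        for (String owner : owners) {
            if (aliveNodes.contains(owner))
                return false; // At least one owner survives: data is not lost.
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical topology: one primary and two backups own the partition.
        List<String> owners = Arrays.asList("primary", "backup-1", "backup-2");

        // The primary and one backup fail, the second backup stays alive.
        Set<String> alive = new HashSet<>(Arrays.asList("backup-2", "client"));

        System.out.println("data lost: " + allOwnersLeft(owners, alive));
    }
}
```

In the scenario from the report (the primary and one other node are lost while a backup survives) the check returns `false`, so a "partition data has been lost" message would be a false alarm.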





[jira] [Updated] (IGNITE-14073) False alarm to lose all transaction nodes

2021-02-08 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-14073:
-
Fix Version/s: (was: 2.11)
   2.10

> False alarm to lose all transaction nodes
> -
>
> Key: IGNITE-14073
> URL: https://issues.apache.org/jira/browse/IGNITE-14073
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
> Fix For: 2.10
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This exception is thrown when the primary node and one other node are lost during 
> the transaction.
> But it may not be true, because the transaction can continue on 
> backups (if they are still alive).
> {noformat}
> [2021-01-23 
> 22:32:50,584][ERROR][test-runner-#1%near.IgniteTxExceptionNodeFailTest%][root]
>  Transaction was not committed.
> class org.apache.ignite.IgniteException: Failed to commit a transaction (all 
> partition owners have left the grid, partition data has been lost) 
> [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1096)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.commit(TransactionProxyImpl.java:323)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.IgniteTxExceptionNodeFailTest.cacheWithBackups(IgniteTxExceptionNodeFailTest.java:280)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2367)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class 
> org.apache.ignite.internal.processors.cache.CacheInvalidStateException: 
> Failed to commit a transaction (all partition owners have left the grid, 
> partition data has been lost) [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture$FinishMiniFuture.onNodeLeft(GridNearTxFinishFuture.java:993)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onNodeLeft(GridNearTxFinishFuture.java:167)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:265)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1393)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:888)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:873)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:349)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:312)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2948)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:3164)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2968)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   ... 1 more
> {noformat}
> It will frighten users, because it looks like data loss.





[jira] [Updated] (IGNITE-13605) Ducktests test: PDS compatibility for ignite versions

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13605:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2  (was: Ducktape Sprint 1, 
Ducktape Sprint 2, Ducktape Sprint 3)

> Ducktests test: PDS compatibility for ignite versions
> -
>
> Key: IGNITE-13605
> URL: https://issues.apache.org/jira/browse/IGNITE-13605
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Mikhail Filatov
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-14138:
--

Assignee: Vladislav Pyatkov

> Historical rebalance kills cluster
> --
>
> Key: IGNITE-14138
> URL: https://issues.apache.org/jira/browse/IGNITE-14138
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>
> {noformat}
> [2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error 
> detected. Will be handled accordingly to configured handler 
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
> o.a.i.IgniteCheckedException: Failed to continue supplying 
> [grp=SQL_USAGES_EPE, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, 
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1
> org.apache.ignite.IgniteCheckedException: Failed to continue supplying 
> [grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, 
> topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157)
>  [ignite-core.jar]
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629)
>  [ignite-core.jar]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.apache.ignite.IgniteCheckedException: Could not find start 
> pointer for partition [part=4, partCntrSince=1115]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557)
>  ~[ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121)
>  ~[ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195)
>  ~[ignite-core.jar]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322)
>  ~[ignite-core.jar]
>   ... 16 more
> {noformat}
> I believe that it should throw IgniteHistoricalIteratorException instead of 
> IgniteCheckedException, so that it can be handled properly and rebalance can fall 
> back to a full rebalance instead of killing nodes.





[jira] [Created] (IGNITE-14138) Historical rebalance kills cluster

2021-02-08 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-14138:
--

 Summary: Historical rebalance kills cluster
 Key: IGNITE-14138
 URL: https://issues.apache.org/jira/browse/IGNITE-14138
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


{noformat}
[2021-01-12T05:11:02,142][ERROR][rebalance-#508%---%][] Critical system error 
detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class 
o.a.i.IgniteCheckedException: Failed to continue supplying [grp=SQL_USAGES_EPE, 
demander=48254935-7aa9-4ab5-b398-fdaec334fab7, topVer=AffinityTopologyVersion 
[topVer=3, minorTopVer=1
org.apache.ignite.IgniteCheckedException: Failed to continue supplying 
[grp=SQL_1, demander=48254935-7aa9-4ab5-b398-fdaec334fab7, 
topVer=AffinityTopologyVersion [topVer=3, minorTopVer=1]]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:571)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:398)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:489)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:474)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$800(GridCacheIoManager.java:109)
 [ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1707)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1721)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:157)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:3011)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1662)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:157)
 [ignite-core.jar]
at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1629)
 [ignite-core.jar]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: org.apache.ignite.IgniteCheckedException: Could not find start 
pointer for partition [part=4, partCntrSince=1115]
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchEarliestWalPointer(CheckpointHistory.java:557)
 ~[ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.historicalIterator(GridCacheOffheapManager.java:1121)
 ~[ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.rebalanceIterator(IgniteCacheOffheapManagerImpl.java:1195)
 ~[ignite-core.jar]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:322)
 ~[ignite-core.jar]
... 16 more
{noformat}
I believe it should throw IgniteHistoricalIteratorException instead of 
IgniteCheckedException, so that it can be handled properly and rebalance can 
fall back to a full rebalance instead of killing nodes.
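
The fallback proposed above can be sketched as follows. This is a minimal illustration, not Ignite's supplier code: the class RebalanceFallbackSketch, the nested HistoricalIteratorException, and the supply method are hypothetical stand-ins. It only shows why a dedicated exception type lets the caller tell "history is unavailable" apart from a critical error.

```java
final class RebalanceFallbackSketch {
    /** Hypothetical stand-in for the dedicated exception type named in the ticket. */
    static final class HistoricalIteratorException extends RuntimeException {
        HistoricalIteratorException(String msg) {
            super(msg);
        }
    }

    /** Which kind of rebalance ended up being used. */
    enum Mode { HISTORICAL, FULL }

    /**
     * Tries the historical path first. Only the dedicated exception triggers
     * the fallback to a full rebalance; any other exception still propagates
     * to the caller (and, in a real node, to the failure handler).
     */
    static Mode supply(Runnable historicalRebalance) {
        try {
            historicalRebalance.run();
            return Mode.HISTORICAL;
        } catch (HistoricalIteratorException e) {
            // WAL history is missing for the requested counter: degrade gracefully.
            return Mode.FULL;
        }
    }
}
```

With a generic checked exception the caller cannot make this distinction, which matches the stack trace above, where the failure is reported as CRITICAL_ERROR.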





[jira] [Updated] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-14137:
--
Priority: Critical  (was: Major)

> Detect and fix failures reasons (nightly runs fails)
> 
>
> Key: IGNITE-14137
> URL: https://issues.apache.org/jira/browse/IGNITE-14137
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Critical
>
> Jenkins runs fail; 1-4 ... 60 tests affected.





[jira] [Assigned] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov reassigned IGNITE-14137:
-

Assignee: Mikhail Filatov

> Detect and fix failures reasons (nightly runs fails)
> 
>
> Key: IGNITE-14137
> URL: https://issues.apache.org/jira/browse/IGNITE-14137
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Vinogradov
>Assignee: Mikhail Filatov
>Priority: Major
>
> Jenkins runs fail; 1-4 ... 60 tests affected.





[jira] [Commented] (IGNITE-14073) False alarm to lose all transaction nodes

2021-02-08 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280967#comment-17280967
 ] 

Maxim Muzafarov commented on IGNITE-14073:
--

Folks, it seems the commit message has the wrong issue number.
The commit 9af1eb4bf9b3425232a6b9a5109af35077e8548d references 
IGNITE-14703. 

> False alarm to lose all transaction nodes
> -
>
> Key: IGNITE-14073
> URL: https://issues.apache.org/jira/browse/IGNITE-14073
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
> Fix For: 2.11
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This exception is thrown when the primary node and one other node are lost 
> during the transaction.
> But it may not be true, because the transaction can continue on the backups 
> (if they are still alive).
> {noformat}
> [2021-01-23 
> 22:32:50,584][ERROR][test-runner-#1%near.IgniteTxExceptionNodeFailTest%][root]
>  Transaction was not committed.
> class org.apache.ignite.IgniteException: Failed to commit a transaction (all 
> partition owners have left the grid, partition data has been lost) 
> [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1096)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.commit(TransactionProxyImpl.java:323)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.IgniteTxExceptionNodeFailTest.cacheWithBackups(IgniteTxExceptionNodeFailTest.java:280)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2367)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class 
> org.apache.ignite.internal.processors.cache.CacheInvalidStateException: 
> Failed to commit a transaction (all partition owners have left the grid, 
> partition data has been lost) [cacheName=default, partition=3, key=386050343]
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture$FinishMiniFuture.onNodeLeft(GridNearTxFinishFuture.java:993)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.onNodeLeft(GridNearTxFinishFuture.java:167)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:265)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1393)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:888)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:873)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:349)
>   at 
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:312)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2948)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:3164)
>   at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2968)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   ... 1 more
> {noformat}
> It will frighten users, because it looks like data loss.
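
The condition described above can be illustrated with a small check. This is a sketch under assumptions, not the logic in GridNearTxFinishFuture: the class PartitionLossCheck, the method allOwnersLeft, and the string node IDs are hypothetical. The point is only that "partition data has been lost" should be reported when every owner, backups included, has left, not when the primary and one other node have left.

```java
import java.util.Set;

final class PartitionLossCheck {
    /**
     * Returns true only when every owner of the partition (the primary and
     * all backups) is among the nodes that left the grid. If at least one
     * backup is still alive, the transaction may be able to continue there.
     */
    static boolean allOwnersLeft(Set<String> owners, Set<String> leftNodes) {
        return leftNodes.containsAll(owners);
    }
}
```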





[jira] [Updated] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-14137:
--
Description: Jenkins runs fail; 1-4 ... 60 tests affected.

> Detect and fix failures reasons (nightly runs fails)
> 
>
> Key: IGNITE-14137
> URL: https://issues.apache.org/jira/browse/IGNITE-14137
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Vinogradov
>Priority: Major
>
> Jenkins runs fail; 1-4 ... 60 tests affected.





[jira] [Updated] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-14137:
--
Sprint: Ducktape Sprint 3

> Detect and fix failures reasons (nightly runs fails)
> 
>
> Key: IGNITE-14137
> URL: https://issues.apache.org/jira/browse/IGNITE-14137
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Vinogradov
>Priority: Major
>






[jira] [Created] (IGNITE-14137) Detect and fix failures reasons (nightly runs fails)

2021-02-08 Thread Anton Vinogradov (Jira)
Anton Vinogradov created IGNITE-14137:
-

 Summary: Detect and fix failures reasons (nightly runs fails)
 Key: IGNITE-14137
 URL: https://issues.apache.org/jira/browse/IGNITE-14137
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Vinogradov








[jira] [Commented] (IGNITE-13835) Improve discovery ducktape test to research small timeouts and behavior on large cluster.

2021-02-08 Thread Anton Vinogradov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280952#comment-17280952
 ] 

Anton Vinogradov commented on IGNITE-13835:
---

Merged into ignite-ducktape.
Thanks for your contribution.

> Improve discovery ducktape test to research small timeouts and behavior on 
> large cluster.
> -
>
> Key: IGNITE-13835
> URL: https://issues.apache.org/jira/browse/IGNITE-13835
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> Improve the discovery ducktape test to research the cluster behavior with a 
> larger number of nodes and smaller timeouts. 





[jira] [Updated] (IGNITE-13446) Configs inside Ducktests logs should be pretty printed

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13446:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Configs inside Ducktests logs should be pretty printed
> --
>
> Key: IGNITE-13446
> URL: https://issues.apache.org/jira/browse/IGNITE-13446
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov (Obsolete, actual is "av")
>Assignee: Sergei Ryzhov
>Priority: Critical
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, I see the following inside the logs
> {code}
> http://www.springframework.org/schema/beans;
>    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;
>    xsi:schemaLocation="http://www.springframework.org/schema/beans
> http://www.springframework.org/schema/beans/spring-beans.xsd;>
>  value="/mnt/service/ignite-log4j.xml"/>
>  class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
> ducker02
> ducker03
>  class="org.apache.ignite.configuration.CacheConfiguration">
>  class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
>  class="org.apache.ignite.internal.ducktest.tests.cellular_affinity_test.CellularAffinityBackupFilter">
> {code}
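
One way to pretty-print an XML config before logging it is sketched below. It is an illustration in Java using the JDK's javax.xml.transform API, not the ducktests implementation; the class name XmlPrettyPrinter and the indent width of 4 are assumptions. The indent-amount output property is implementation-specific, although the JDK's built-in transformer recognizes it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlPrettyPrinter {
    /** Re-serializes the given XML with one element per line and indentation. */
    static String format(String xml) {
        try {
            // An identity transformer copies the input; output properties control layout.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Implementation-specific key understood by the JDK's default transformer.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }
}
```

Calling XmlPrettyPrinter.format on a single-line beans definition yields one element per line, which is easier to read in a test log.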





[jira] [Updated] (IGNITE-13492) Basic snapshot test for ducktape

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13492:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Basic snapshot test for ducktape
> 
>
> Key: IGNITE-13492
> URL: https://issues.apache.org/jira/browse/IGNITE-13492
> Project: Ignite
>  Issue Type: Task
>Reporter: Sergei Ryzhov
>Assignee: Sergei Ryzhov
>Priority: Minor
>  Labels: iep-28, snapshots
>  Time Spent: 30.5h
>  Remaining Estimate: 0h
>
> Basic snapshot test for ducktape





[jira] [Updated] (IGNITE-13835) Improve discovery ducktape test to research small timeouts and behavior on large cluster.

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13835:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Improve discovery ducktape test to research small timeouts and behavior on 
> large cluster.
> -
>
> Key: IGNITE-13835
> URL: https://issues.apache.org/jira/browse/IGNITE-13835
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> Improve the discovery ducktape test to research the cluster behavior with a 
> larger number of nodes and smaller timeouts. 





[jira] [Updated] (IGNITE-13969) Thin client test [umbrella]

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13969:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Thin client test [umbrella]
> ---
>
> Key: IGNITE-13969
> URL: https://issues.apache.org/jira/browse/IGNITE-13969
> Project: Ignite
>  Issue Type: Wish
>Reporter: Anton Vinogradov
>Assignee: Evgeniya Vdovets
>Priority: Major
>
> Ensure Thin client works.
> Check the whole TC API.





[jira] [Updated] (IGNITE-14023) Password based authentication support in ducktape tests

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-14023:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Password based authentication support in ducktape tests
> ---
>
> Key: IGNITE-14023
> URL: https://issues.apache.org/jira/browse/IGNITE-14023
> Project: Ignite
>  Issue Type: Task
>Reporter: Mikhail Filatov
>Assignee: Mikhail Filatov
>Priority: Minor
>
> [~map7000], please fill the description.





[jira] [Updated] (IGNITE-13605) Ducktests test: PDS compatibility for ignite versions

2021-02-08 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13605:
--
Sprint: Ducktape Sprint 1, Ducktape Sprint 2, Ducktape Sprint 3  (was: 
Ducktape Sprint 1, Ducktape Sprint 2)

> Ducktests test: PDS compatibility for ignite versions
> -
>
> Key: IGNITE-13605
> URL: https://issues.apache.org/jira/browse/IGNITE-13605
> Project: Ignite
>  Issue Type: New Feature
>Reporter: Maksim Timonin
>Assignee: Mikhail Filatov
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (IGNITE-13902) Add IgniteClient Spring Bean wrapper

2021-02-08 Thread Mikhail Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Petrov updated IGNITE-13902:

Description: An IgniteClient wrapper is needed to provide a convenient way to 
start a thin client as a Spring bean through XML configuration. The same 
functionality already exists for the Ignite node: IgniteSpringBean.  (was: 
An IgniteClient wrapper is needed to provide a convenient way to start a 
thin client as a Spring bean. The same functionality already exists for the 
Ignite node: IgniteSpringBean.)

> Add IgniteClient Spring Bean wrapper
> 
>
> Key: IGNITE-13902
> URL: https://issues.apache.org/jira/browse/IGNITE-13902
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Mikhail Petrov
>Assignee: Mikhail Petrov
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> An IgniteClient wrapper is needed to provide a convenient way to start a 
> thin client as a Spring bean through XML configuration. The same functionality 
> already exists for the Ignite node: IgniteSpringBean.





[jira] [Commented] (IGNITE-14131) IgniteCompute tasks with same name, running from one node and different ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute call.

2021-02-08 Thread Ilya Kasnacheev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280940#comment-17280940
 ] 

Ilya Kasnacheev commented on IGNITE-14131:
--

I don't think I have enough context to review this.

> IgniteCompute tasks with same name, running from one node and different 
> ClassLoaders can lead to OOM. Fix problems with concurrent ignite.compute 
> call.
> ---
>
> Key: IGNITE-14131
> URL: https://issues.apache.org/jira/browse/IGNITE-14131
> Project: Ignite
>  Issue Type: Improvement
>  Components: compute
>Affects Versions: 2.9.1
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The root cause is the assumption that one node can have only one class 
> loader per class name. Running multiple tasks with different class loaders 
> therefore makes the server-side cache grow hugely and eventually leads to an 
> OOM in the JVM metaspace. Additionally, we can't use P2P from multiple 
> threads, for example when an Ignite instance is shared as a Spring bean.
>  





[jira] [Commented] (IGNITE-10073) .NET: Add NuGet package without embedded Ignite JARs

2021-02-08 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280882#comment-17280882
 ] 

Pavel Tupitsyn commented on IGNITE-10073:
-

[~isapego] please have a look as well

> .NET: Add NuGet package without embedded Ignite JARs
> 
>
> Key: IGNITE-10073
> URL: https://issues.apache.org/jira/browse/IGNITE-10073
> Project: Ignite
>  Issue Type: Improvement
>  Components: documentation, platforms
>Affects Versions: 2.6
>Reporter: Alexey Kukushkin
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET, sbcf
> Attachments: ignite-10073-vs-2.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The existing Apache.Ignite NuGet package includes Ignite JARs deployed into 
> the "libs" directory in the .NET project output directory upon the package 
> installation.
> We prefer using external Ignite JARs from $IGNITE_HOME/libs instead of the 
> JARs in the local libs directory.
> Right now we have to manually remove the local "libs" directory after every 
> Apache.Ignite package installation or upgrade.
> It would help us to have another Ignite NuGet package without the embedded 
> Ignite JARs.





[jira] [Updated] (IGNITE-14112) Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages

2021-02-08 Thread Alexey Goncharuk (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Goncharuk updated IGNITE-14112:
--
Fix Version/s: 2.11

> Revisit GridClosureProcessor#runLocalSafe(Runnable, byte) usages
> 
>
> Key: IGNITE-14112
> URL: https://issues.apache.org/jira/browse/IGNITE-14112
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Alexey Goncharuk
>Priority: Major
> Fix For: 2.11
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a simple {{Runnable}} is passed to the {{runLocalSafe}} method, not only 
> will Ignite attempt to inject resources to the runnable, but it will also 
> make a call to deployment, which may have various side effects.
> Need to walk through the code and replace {{Runnable}} with 
> {{GridPlainRunnable}} in all places where injection is not needed/expected.
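
The difference between the two kinds of runnable can be sketched with a marker interface. This is a simplified illustration and not the code of GridClosureProcessor: the class PlainRunnableSketch, the nested PlainRunnable interface, and the runLocal method are hypothetical, and the real injection and deployment steps are reduced to a comment.

```java
final class PlainRunnableSketch {
    /** Hypothetical marker: tasks implementing it opt out of extra processing. */
    interface PlainRunnable extends Runnable {
    }

    /**
     * Runs the task locally and reports whether the extra processing
     * (resource injection and deployment lookup) was applied to it.
     */
    static boolean runLocal(Runnable task) {
        boolean needsProcessing = !(task instanceof PlainRunnable);

        if (needsProcessing) {
            // A real processor would inject resources and consult deployment
            // here, which is where the side effects mentioned above come from.
        }

        task.run();
        return needsProcessing;
    }
}
```

A lambda passed as an ordinary Runnable goes through the extra steps, while the same lambda typed as the marker interface skips them.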





[jira] [Updated] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-02-08 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14055:
---
Release Note: Fixed a possible cluster failure when a socket read hangs 
while establishing a new communication connection.

> Deadlock in timeoutObjectProcessor between 'send message' & 'handshake 
> timeout'
> ---
>
> Key: IGNITE-14055
> URL: https://issues.apache.org/jira/browse/IGNITE-14055
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.11
>
> Attachments: StartServerWithTxPuts (1).java, freeze (1).sh
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Cluster hangs after JVM pauses on one of the server nodes.
>  Scenario:
>  1. Start three server nodes with put operations using StartServerWithTxPuts.
>  2. Emulate JVM freezes on one server node by running the attached script:
>  {{*sh freeze.sh *}}
>  3. Wait until the script has finished.
> Result:
>  The cluster hangs on tx put operations.
> The first server node continuously prints:
> {noformat}
> [2020-11-03 09:36:01,719][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57714][2020-11-03 09:36:01,720][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:01,922][INFO 
> ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57716][2020-11-03 09:36:01,922][INFO 
> ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,124][INFO 
> ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57718][2020-11-03 09:36:02,125][INFO 
> ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,326][INFO 
> ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57720][2020-11-03 09:36:02,327][INFO 
> ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,528][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57722][2020-11-03 09:36:02,529][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]}}
> {noformat}
>  The second node prints long-running transactions in the prepared state, 
> ignoring the default tx timeout:
>  
> {noformat}
> [2020-11-03 09:36:46,199][WARN 
> ][sys-#83%56b4f715-82d6-4d63-ba99-441ffcd673b4%][diagnostic] >>> Future 
> [startTime=09:33:08.496, curTime=09:36:46.181, fut=GridNearTxFinishFuture 
> [futId=425decc8571-4ce98554-8c56-4daf-a7a9-5b9bff52fa08, tx=GridNearTxLocal 
> [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping 
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey 
> [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], 
> cacheId=-923393186], val=TxEntryValueHolder [val=CacheObjectByteArrayImpl 
> [arrLen=1048576], op=CREATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], 
> oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, 
> ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, 
> dhtVer=null, 

[jira] [Updated] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-02-08 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14055:
---
Fix Version/s: (was: 2.10)
   2.11

> Deadlock in timeoutObjectProcessor between 'send message' & 'handshake 
> timeout'
> ---
>
> Key: IGNITE-14055
> URL: https://issues.apache.org/jira/browse/IGNITE-14055
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.11
>
> Attachments: StartServerWithTxPuts (1).java, freeze (1).sh
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Cluster hangs after JVM pauses on one of the server nodes.
>  Scenario:
>  1. Start three server nodes with put operations using StartServerWithTxPuts.
>  2. Emulate JVM freezes on one server node by running the attached script:
>  {{*sh freeze.sh *}}
>  3. Wait until the script has finished.
> Result:
>  The cluster hangs on tx put operations.
> The first server node continuously prints:
> {noformat}
> [2020-11-03 09:36:01,719][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57714][2020-11-03 09:36:01,720][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:01,922][INFO 
> ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57716][2020-11-03 09:36:01,922][INFO 
> ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,124][INFO 
> ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57718][2020-11-03 09:36:02,125][INFO 
> ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,326][INFO 
> ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57720][2020-11-03 09:36:02,327][INFO 
> ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3][2020-11-03 
> 09:36:02,528][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Accepted incoming communication connection [locAddr=/127.0.0.1:47100, 
> rmtAddr=/127.0.0.1:57722][2020-11-03 09:36:02,529][INFO 
> ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] 
> Received incoming connection from remote node while connecting to this node, 
> rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, 
> rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]}}
> {noformat}
>  The second node prints long-running transactions in the prepared state, 
> ignoring the default tx timeout:
>  
> {noformat}
> [2020-11-03 09:36:46,199][WARN 
> ][sys-#83%56b4f715-82d6-4d63-ba99-441ffcd673b4%][diagnostic] >>> Future 
> [startTime=09:33:08.496, curTime=09:36:46.181, fut=GridNearTxFinishFuture 
> [futId=425decc8571-4ce98554-8c56-4daf-a7a9-5b9bff52fa08, tx=GridNearTxLocal 
> [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping 
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey 
> [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], 
> cacheId=-923393186], val=TxEntryValueHolder [val=CacheObjectByteArrayImpl 
> [arrLen=1048576], op=CREATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], 
> oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, 
> ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, 
> dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, 
> filtersSet=true, 

[jira] [Commented] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-02-08 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280853#comment-17280853
 ] 

Ivan Bessonov commented on IGNITE-14055:


[~akalashnikov] looks good, I'll merge it right now.

> Deadlock in timeoutObjectProcessor between 'send message' & 'handshake 
> timeout'
> ---
>
> Key: IGNITE-14055
> URL: https://issues.apache.org/jira/browse/IGNITE-14055
> Project: Ignite
>  Issue Type: Bug
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Attachments: StartServerWithTxPuts (1).java, freeze (1).sh
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The cluster hangs after JVM pauses on one of the server nodes.
>  Scenario:
>  1. Start three server nodes with put operations using StartServerWithTxPuts.
>  2. Emulate JVM freezes on one server node by running the attached script:
>  {{*sh freeze.sh *}}
>  3. Wait until the script has finished.
> Result:
>  The cluster hangs on tx put operations.
> The first server node continuously prints:
> {noformat}
> [2020-11-03 09:36:01,719][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57714]
> [2020-11-03 09:36:01,720][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:01,922][INFO ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57716]
> [2020-11-03 09:36:01,922][INFO ][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,124][INFO ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57718]
> [2020-11-03 09:36:02,125][INFO ][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,326][INFO ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57720]
> [2020-11-03 09:36:02,327][INFO ][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> [2020-11-03 09:36:02,528][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57722]
> [2020-11-03 09:36:02,529][INFO ][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
> {noformat}
>  The second node prints long-running transactions in the prepared state, 
> ignoring the default tx timeout:
>  
> {noformat}
> [2020-11-03 09:36:46,199][WARN 
> ][sys-#83%56b4f715-82d6-4d63-ba99-441ffcd673b4%][diagnostic] >>> Future 
> [startTime=09:33:08.496, curTime=09:36:46.181, fut=GridNearTxFinishFuture 
> [futId=425decc8571-4ce98554-8c56-4daf-a7a9-5b9bff52fa08, tx=GridNearTxLocal 
> [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping 
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey 
> [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], 
> cacheId=-923393186], val=TxEntryValueHolder [val=CacheObjectByteArrayImpl 
> [arrLen=1048576], op=CREATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], 
> oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, 
> ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, 
> dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, 
> filtersSet=true, 

[jira] [Commented] (IGNITE-14132) Add configuration to separate Integration and Unit tests

2021-02-08 Thread Peter Ivanov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280847#comment-17280847
 ] 

Peter Ivanov commented on IGNITE-14132:
---

Prepared 
[https://ci.ignite.apache.org/buildConfiguration/ignite3_Tests_IntegrationTests?mode=builds]

> Add configuration to separate Integration and Unit tests
> 
>
> Key: IGNITE-14132
> URL: https://issues.apache.org/jira/browse/IGNITE-14132
> Project: Ignite
>  Issue Type: Bug
>Reporter: Peter Ivanov
>Assignee: Peter Ivanov
>Priority: Major
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After IGNITE-14123, Unit and Integration tests are separated.
> However, the Maven goal {{integration-test}} still runs unit tests too.
> A possible fix is an external property that skips only the unit tests; it 
> should be added to {{parent/pom.xml}}.





[jira] [Commented] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-02-08 Thread Anton Kalashnikov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280843#comment-17280843
 ] 

Anton Kalashnikov commented on IGNITE-14055:


[~ibessonov] can you take a look and merge please?


[jira] [Assigned] (IGNITE-11972) Jepsen tests should check consistency

2021-02-08 Thread Mikhail Filatov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Filatov reassigned IGNITE-11972:


Assignee: (was: Mikhail Filatov)

> Jepsen tests should check consistency
> -
>
> Key: IGNITE-11972
> URL: https://issues.apache.org/jira/browse/IGNITE-11972
> Project: Ignite
>  Issue Type: Task
>Reporter: Anton Vinogradov (Obsolete, actual is "av")
>Priority: Major
>  Labels: iep-31
>
> We have to check that data is consistent during and after the tests.
> A good approach is to use:
> - idle_verify at test finish
> - ReadRepair 
> -- during the test (some (all?) gets should go through the RR proxy) 
> -- after the test finishes.





[jira] [Created] (IGNITE-14136) ServicesTest.testServiceTimeout is flaky

2021-02-08 Thread Aleksey Plekhanov (Jira)
Aleksey Plekhanov created IGNITE-14136:
--

 Summary: ServicesTest.testServiceTimeout is flaky
 Key: IGNITE-14136
 URL: https://issues.apache.org/jira/browse/IGNITE-14136
 Project: Ignite
  Issue Type: Bug
  Components: thin client
Reporter: Aleksey Plekhanov
Assignee: Aleksey Plekhanov


The test is flaky because the timeout worker sometimes starts only after the 
task with the given sleep time has completed. In this case the task returns a 
success response instead of a timeout response.
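
The race described above can be illustrated with a minimal, self-contained Java sketch. It is not the ServicesTest code or the thin client implementation: the class TimeoutRaceSketch, the method run and all delay values are hypothetical and chosen only to make the ordering visible. A task and a timeout watcher compete to complete one shared response, and the watcher's start can be delayed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: not Ignite code.
class TimeoutRaceSketch {
    /**
     * Starts a task that sleeps for taskSleepMs and a timeout watcher that
     * starts watcherDelayMs late and then waits timeoutMs.
     * Whichever completes the shared future first decides the response.
     */
    static String run(long taskSleepMs, long timeoutMs, long watcherDelayMs) {
        CompletableFuture<String> response = new CompletableFuture<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);

        try {
            // The "service" task: reports success after sleeping.
            pool.execute(() -> {
                if (sleep(taskSleepMs))
                    response.complete("SUCCESS");
            });

            // The timeout watcher: may start late, then fires after the timeout.
            pool.execute(() -> {
                if (sleep(watcherDelayMs) && sleep(timeoutMs))
                    response.complete("TIMEOUT");
            });

            return response.get(10, TimeUnit.SECONDS);
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
        finally {
            // Interrupts whichever worker is still sleeping.
            pool.shutdownNow();
        }
    }

    /** Sleeps for the given time; returns false if interrupted. */
    private static boolean sleep(long ms) {
        try {
            Thread.sleep(ms);
            return true;
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Watcher starts on time: the 100 ms timeout fires before the 1000 ms task ends.
        System.out.println(run(1000, 100, 0));
        // Watcher starts 1000 ms late: the 100 ms task has already completed.
        System.out.println(run(100, 50, 1000));
    }
}
```

In the second call the task finishes before the watcher has even begun waiting, so the success response wins, which is the kind of ordering the test does not expect.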


