[jira] [Commented] (IGNITE-8719) Index left partially built if a node crashes during index create or rebuild

2021-05-12 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343715#comment-17343715
 ] 

Kirill Tkalenko commented on IGNITE-8719:
-

[~zstan] Please review the code.

> Index left partially built if a node crashes during index create or rebuild
> ---
>
> Key: IGNITE-8719
> URL: https://issues.apache.org/jira/browse/IGNITE-8719
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Alexey Goncharuk
>Assignee: Kirill Tkalenko
>Priority: Critical
> Fix For: 2.11
>
> Attachments: IndexRebuildAfterNodeCrashTest.java, 
> IndexRebuildingTest.java
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, we do not have any state associated with the index tree. Consider 
> the following scenario:
> 1) Start a node, put some data
> 2) Start a CREATE INDEX operation
> 3) Wait for a checkpoint and stop the node before index creation finishes
> 4) Restart the node
> Since the checkpoint finished, the new index tree will be persisted to 
> disk, but not all data will be present in the index.
> We should store information that the index tree is still being initialized and 
> mark it valid only after all data is indexed. This state should be persisted as well.
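
One way to read the proposal in the description above, as a minimal sketch rather than the actual fix: keep a per-index state record that is written as BUILDING before the first row is indexed and flipped to VALID only after the last one, so that a restart can detect an interrupted build. The names IndexBuildStateStore, IndexState and needsRebuild are hypothetical and not taken from the Ignite code base, and an in-memory map stands in for durable storage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: tracks whether an index tree has been fully built. */
class IndexBuildStateStore {
    /** Hypothetical states; not taken from the Ignite code base. */
    enum IndexState { BUILDING, VALID }

    /** Stand-in for durable storage; a real implementation would persist this map. */
    private final Map<String, IndexState> states = new ConcurrentHashMap<>();

    /** Record that the build started, before any row is added to the index tree. */
    void onBuildStarted(String idxName) {
        states.put(idxName, IndexState.BUILDING);
    }

    /** Mark the index valid only after all data has been indexed. */
    void onBuildFinished(String idxName) {
        states.put(idxName, IndexState.VALID);
    }

    /** After a restart, an index still marked BUILDING was interrupted and must be rebuilt. */
    boolean needsRebuild(String idxName) {
        return states.get(idxName) == IndexState.BUILDING;
    }
}
```

The key design point is ordering: the BUILDING marker has to reach durable storage before the tree itself can be checkpointed, otherwise the scenario in steps 1 to 4 is still possible.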



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-8719) Index left partially built if a node crashes during index create or rebuild

2021-05-12 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343714#comment-17343714
 ] 

Ignite TC Bot commented on IGNITE-8719:
---

{panel:title=Branch: [pull/9090/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/9090/head] Base: [master] : New Tests 
(8)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}PDS (Indexing){color} [[tests 
8|https://ci.ignite.apache.org/viewLog.html?buildId=6004413]]
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testTwoNodeRestart - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testSingleNodeRestart - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testTwoNodeReactivation - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testDeleteIndexRebuildStateOnDestroyCache - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testSingleNodeReactivation - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testNormalFlowIndexRebuildStateStorage - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testErrorFlowIndexRebuildStateStorage - PASSED{color}
* {color:#013220}IgnitePdsWithIndexingTestSuite: 
ResumeRebuildIndexTest.testRestartNodeFlowIndexRebuildStateStorage - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=6004446buildTypeId=IgniteTests24Java8_RunAll]

> Index left partially built if a node crashes during index create or rebuild
> ---
>
> Key: IGNITE-8719
> URL: https://issues.apache.org/jira/browse/IGNITE-8719
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Alexey Goncharuk
>Assignee: Kirill Tkalenko
>Priority: Critical
> Fix For: 2.11
>
> Attachments: IndexRebuildAfterNodeCrashTest.java, 
> IndexRebuildingTest.java
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, we do not have any state associated with the index tree. Consider 
> the following scenario:
> 1) Start a node, put some data
> 2) Start a CREATE INDEX operation
> 3) Wait for a checkpoint and stop the node before index creation finishes
> 4) Restart the node
> Since the checkpoint finished, the new index tree will be persisted to 
> disk, but not all data will be present in the index.
> We should store information that the index tree is still being initialized and 
> mark it valid only after all data is indexed. This state should be persisted as well.





[jira] [Commented] (IGNITE-14490) cache.invoke() triggers failure handler and freezes if entry processor is not urideployed

2021-05-12 Thread Stanilovsky Evgeny (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343711#comment-17343711
 ] 

Stanilovsky Evgeny commented on IGNITE-14490:
-

Looks good, please proceed.

> cache.invoke() triggers failure handler and freezes if entry processor is not 
> urideployed
> -
>
> Key: IGNITE-14490
> URL: https://issues.apache.org/jira/browse/IGNITE-14490
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.10
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
> Attachments: LoclerMain.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If URI deployment is specified
> Caused by: java.lang.ClassNotFoundException: [Ljava.lang.StackTraceElement;"





[jira] [Assigned] (IGNITE-14702) Inconsistency of the new index when the node falls / deactivates

2021-05-12 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko reassigned IGNITE-14702:


Assignee: Kirill Tkalenko

> Inconsistency of the new index when the node falls / deactivates
> 
>
> Key: IGNITE-14702
> URL: https://issues.apache.org/jira/browse/IGNITE-14702
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence, sql
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
> Fix For: 2.11
>
>
> At the moment, if a node crashes or is deactivated in the middle of building a 
> new index, the index will be inconsistent and may cause errors.
> We need to either add new index creation to DurableBackgroundTask, or add / 
> modify a separate mechanism that allows the creation of a new index to be resumed.





[jira] [Assigned] (IGNITE-14649) Introduce annotation processor for message serializers/deserializers

2021-05-12 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev reassigned IGNITE-14649:


Assignee: Aleksandr Polovtcev

> Introduce annotation processor for message serializers/deserializers
> 
>
> Key: IGNITE-14649
> URL: https://issues.apache.org/jira/browse/IGNITE-14649
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Semyon Danilov
>Assignee: Aleksandr Polovtcev
>Priority: Major
>






[jira] [Updated] (IGNITE-14713) Calcite. Explain failed for subquery expression.

2021-05-12 Thread Stanilovsky Evgeny (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanilovsky Evgeny updated IGNITE-14713:

Issue Type: Bug  (was: Improvement)

> Calcite. Explain failed for subquery expression.
> 
>
> Key: IGNITE-14713
> URL: https://issues.apache.org/jira/browse/IGNITE-14713
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Stanilovsky Evgeny
>Priority: Major
>  Labels: calcite
>
> Failing test:
> {code:java}
>@Test
> public void test0() throws Exception {
>     IgniteCache cache = grid(0).getOrCreateCache(
>         new CacheConfiguration("test")
>             .setBackups(1)
>             .setIndexedTypes(Integer.class, Integer.class)
>     );
>     for (int i = 0; i < 100; i++)
>         cache.put(i, i);
>     awaitPartitionMapExchange();
>     // Correlated INNER join.
>     checkQuery("explain plan for SELECT t._val FROM \"test\".Integer t WHERE t._val < 5 AND " +
>         "t._key in (SELECT x FROM table(system_range(t._val, t._val))) ")
>         .check();
> }
> {code}





[jira] [Created] (IGNITE-14713) Calcite. Explain failed for subquery expression.

2021-05-12 Thread Stanilovsky Evgeny (Jira)
Stanilovsky Evgeny created IGNITE-14713:
---

 Summary: Calcite. Explain failed for subquery expression.
 Key: IGNITE-14713
 URL: https://issues.apache.org/jira/browse/IGNITE-14713
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Stanilovsky Evgeny


Failing test:
{code:java}
   @Test
public void test0() throws Exception {
    IgniteCache cache = grid(0).getOrCreateCache(
        new CacheConfiguration("test")
            .setBackups(1)
            .setIndexedTypes(Integer.class, Integer.class)
    );

    for (int i = 0; i < 100; i++)
        cache.put(i, i);

    awaitPartitionMapExchange();

    // Correlated INNER join.
    checkQuery("explain plan for SELECT t._val FROM \"test\".Integer t WHERE t._val < 5 AND " +
        "t._key in (SELECT x FROM table(system_range(t._val, t._val))) ")
        .check();
}
{code}






[jira] [Commented] (IGNITE-14088) Implement scalecube transport API over netty

2021-05-12 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343310#comment-17343310
 ] 

Ivan Bessonov commented on IGNITE-14088:


Reverted due to flaky failures. [~sdanilov], please fix everything and we'll try 
one more time; I restored your branch.

> Implement scalecube transport API over netty
> 
>
> Key: IGNITE-14088
> URL: https://issues.apache.org/jira/browse/IGNITE-14088
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: iep-66, ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> ScaleCube has its own Netty inside, but the idea is to integrate our expanded 
> Netty into it. This will help us support more features, such as our own 
> handshake, marshalling, etc.





[jira] [Reopened] (IGNITE-14088) Implement scalecube transport API over netty

2021-05-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov reopened IGNITE-14088:


> Implement scalecube transport API over netty
> 
>
> Key: IGNITE-14088
> URL: https://issues.apache.org/jira/browse/IGNITE-14088
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: iep-66, ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> ScaleCube has its own Netty inside, but the idea is to integrate our expanded 
> Netty into it. This will help us support more features, such as our own 
> handshake, marshalling, etc.





[jira] [Updated] (IGNITE-14712) Fix MetaStorageServiceTest after IGNITE-14088

2021-05-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14712:
---
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Fix MetaStorageServiceTest after IGNITE-14088 
> --
>
> Key: IGNITE-14712
> URL: https://issues.apache.org/jira/browse/IGNITE-14712
> Project: Ignite
>  Issue Type: Bug
>Reporter: Semyon Danilov
>Assignee: Semyon Danilov
>Priority: Blocker
>  Labels: ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (IGNITE-14712) Fix MetaStorageServiceTest after IGNITE-14088

2021-05-12 Thread Semyon Danilov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Semyon Danilov updated IGNITE-14712:

Description: IGNITE-14088 introduced a ScaleCube transport over the direct 
marshaller, so the ScaleCube message serialization factory should be registered 
if a ScaleCube cluster is used. 

> Fix MetaStorageServiceTest after IGNITE-14088 
> --
>
> Key: IGNITE-14712
> URL: https://issues.apache.org/jira/browse/IGNITE-14712
> Project: Ignite
>  Issue Type: Bug
>Reporter: Semyon Danilov
>Assignee: Semyon Danilov
>Priority: Blocker
>  Labels: ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> IGNITE-14088 introduced a ScaleCube transport over the direct marshaller, so 
> the ScaleCube message serialization factory should be registered if a ScaleCube 
> cluster is used. 





[jira] [Commented] (IGNITE-14490) cache.invoke() triggers failure handler and freezes if entry processor is not urideployed

2021-05-12 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343256#comment-17343256
 ] 

Ignite TC Bot commented on IGNITE-14490:


{panel:title=Branch: [pull/8976/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/8976/head] Base: [master] : New Tests 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}SPI (URI Deploy){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001617]]
* {color:#013220}IgniteUriDeploymentTestSuite: 
UriDeploymentAbsentProcessorClassTest.test - PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=6001688buildTypeId=IgniteTests24Java8_RunAll]

> cache.invoke() triggers failure handler and freezes if entry processor is not 
> urideployed
> -
>
> Key: IGNITE-14490
> URL: https://issues.apache.org/jira/browse/IGNITE-14490
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.10
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
> Attachments: LoclerMain.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If URI deployment is specified
> Caused by: java.lang.ClassNotFoundException: [Ljava.lang.StackTraceElement;"





[jira] [Issue Comment Deleted] (IGNITE-14490) cache.invoke() triggers failure handler and freezes if entry processor is not urideployed

2021-05-12 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev updated IGNITE-14490:
-
Comment: was deleted

(was: {panel:title=Branch: [pull/8976/head] Base: [master] : Possible Blockers 
(8)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}JDBC Driver{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=6001609]]

{color:#d04437}RDD{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=6001611]]

{color:#d04437}Cache 9{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001650]]
* IgniteCacheTestSuite9: 
TxPartitionCounterStateConsistencyVolatileRebalanceTest.testPartitionConsistencyDuringRebalanceAndConcurrentUpdates_TxDuringPME
 - Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}ZooKeeper (Discovery) 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001623]]
* ZookeeperDiscoverySpiTestSuite1: 
ZookeeperDiscoveryCommunicationFailureTest.testCommunicationFailureResolve_ConcurrentMultinode
 - Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}Cache 5{color} [[tests 
2|https://ci.ignite.apache.org/viewLog.html?buildId=6001646]]
* IgniteCacheTestSuite5: 
CacheSerializableTransactionsTest.testTxConflictReadEntry1 - Test has low fail 
rate in base branch 0,0% and is not flaky
* IgniteCacheTestSuite5: IgniteCacheAtomicProtocolTest.testFullAsyncPutRemap - 
Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}PDS (Indexing){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001655]]
* IgnitePdsWithIndexingCoreTestSuite: 
IgniteLogicalRecoveryWithParamsTest.testPartiallyCommitedTx_TwoNode_WithoutCpOnNodeStop_SingleNodeTx[nodesCnt=2,
 singleNodeTx=true, backups=0] - Test has low fail rate in base branch 0,0% and 
is not flaky

{color:#d04437}Cache 1{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001642]]
* IgniteBinaryCacheTestSuite: 
GridCacheStopSelfTest.testStopImplicitMvccTransactionsReplicated - Test has low 
fail rate in base branch 0,0% and is not flaky

{panel}
{panel:title=Branch: [pull/8976/head] Base: [master] : New Tests 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}SPI (URI Deploy){color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=6001617]]
* {color:#013220}IgniteUriDeploymentTestSuite: 
UriDeploymentAbsentProcessorClassTest.test - PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=6001688buildTypeId=IgniteTests24Java8_RunAll])

> cache.invoke() triggers failure handler and freezes if entry processor is not 
> urideployed
> -
>
> Key: IGNITE-14490
> URL: https://issues.apache.org/jira/browse/IGNITE-14490
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.10
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Major
> Attachments: LoclerMain.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If URI deployment is specified
> Caused by: java.lang.ClassNotFoundException: [Ljava.lang.StackTraceElement;"





[jira] [Created] (IGNITE-14712) Fix MetaStorageServiceTest after IGNITE-14088

2021-05-12 Thread Semyon Danilov (Jira)
Semyon Danilov created IGNITE-14712:
---

 Summary: Fix MetaStorageServiceTest after IGNITE-14088 
 Key: IGNITE-14712
 URL: https://issues.apache.org/jira/browse/IGNITE-14712
 Project: Ignite
  Issue Type: Bug
Reporter: Semyon Danilov
Assignee: Semyon Danilov








[jira] [Assigned] (IGNITE-14388) Add affinity key support.

2021-05-12 Thread Andrey Mashenkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Mashenkov reassigned IGNITE-14388:
-

Assignee: Andrey Mashenkov

> Add affinity key support.
> -
>
> Key: IGNITE-14388
> URL: https://issues.apache.org/jira/browse/IGNITE-14388
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Andrey Mashenkov
>Assignee: Andrey Mashenkov
>Priority: Major
>  Labels: iep-54, ignite-3
>
> For now, we do not calculate the Row hash at all; it is always equal to zero.
>  Let's calculate the Row hash over the affinity columns only, while assembling 
> the row in RowAssembler.
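
As a rough illustration of hashing the affinity columns only, here is a minimal sketch. The class AffinityRowHash, the Object[] row representation and the 31-based combining scheme are all assumptions made for this example; they are not the RowAssembler API or the hash function Ignite uses.

```java
import java.util.Objects;

/** Hypothetical sketch: hashes only the affinity columns of a row. */
class AffinityRowHash {
    /**
     * Combines the hash codes of the affinity columns; all other columns are ignored,
     * so rows that share affinity column values get the same hash.
     */
    static int hash(Object[] row, int[] affinityCols) {
        int h = 0;

        for (int col : affinityCols)
            h = 31 * h + Objects.hashCode(row[col]);

        return h;
    }
}
```

With this shape, two rows that differ only in non-affinity columns hash identically, which is the property needed for them to map to the same partition.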





[jira] [Updated] (IGNITE-14088) Implement scalecube transport API over netty

2021-05-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14088:
---
Ignite Flags: Release Notes Required  (was: Docs Required,Release Notes 
Required)

> Implement scalecube transport API over netty
> 
>
> Key: IGNITE-14088
> URL: https://issues.apache.org/jira/browse/IGNITE-14088
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: iep-66, ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> ScaleCube has its own Netty inside, but the idea is to integrate our expanded 
> Netty into it. This will help us support more features, such as our own 
> handshake, marshalling, etc.





[jira] [Updated] (IGNITE-14088) Implement scalecube transport API over netty

2021-05-12 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-14088:
---
Ignite Flags:   (was: Release Notes Required)

> Implement scalecube transport API over netty
> 
>
> Key: IGNITE-14088
> URL: https://issues.apache.org/jira/browse/IGNITE-14088
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: iep-66, ignite-3
> Fix For: 3.0.0-alpha2
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> ScaleCube has its own Netty inside, but the idea is to integrate our expanded 
> Netty into it. This will help us support more features, such as our own 
> handshake, marshalling, etc.





[jira] [Commented] (IGNITE-14088) Implement scalecube transport API over netty

2021-05-12 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343230#comment-17343230
 ] 

Ivan Bessonov commented on IGNITE-14088:


[~sdanilov] thank you for the contribution; I'll merge your changes.

> Implement scalecube transport API over netty
> 
>
> Key: IGNITE-14088
> URL: https://issues.apache.org/jira/browse/IGNITE-14088
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Kalashnikov
>Assignee: Semyon Danilov
>Priority: Major
>  Labels: iep-66, ignite-3
>  Time Spent: 16.5h
>  Remaining Estimate: 0h
>
> ScaleCube has its own Netty inside, but the idea is to integrate our expanded 
> Netty into it. This will help us support more features, such as our own 
> handshake, marshalling, etc.





[jira] [Commented] (IGNITE-14542) Calcite engine. Need to support TableFunctions / SYSTEM_RANGE dynamic table

2021-05-12 Thread Konstantin Orlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343199#comment-17343199
 ] 

Konstantin Orlov commented on IGNITE-14542:
---

[~alex_pl], LGTM!

> Calcite engine. Need to support TableFunctions / SYSTEM_RANGE dynamic table
> ---
>
> Key: IGNITE-14542
> URL: https://issues.apache.org/jira/browse/IGNITE-14542
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Taras Ledkov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Many cases require a dynamic range data source.
> Tests:
> {{aggregate/aggregates/test_perfect_ht.test}}
> {{aggregate/aggregates/test_string_agg_many_groups.test_slow}}
> {{aggregate/aggregates/test_sum.test}}





[jira] [Commented] (IGNITE-8719) Index left partially built if a node crashes during index create or rebuild

2021-05-12 Thread Stanilovsky Evgeny (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343198#comment-17343198
 ] 

Stanilovsky Evgeny commented on IGNITE-8719:


[~ktkale...@gridgain.com] please check my comments. If you are OK with my 
proposals, I would like to review it once more after the fixes. Thanks!

> Index left partially built if a node crashes during index create or rebuild
> ---
>
> Key: IGNITE-8719
> URL: https://issues.apache.org/jira/browse/IGNITE-8719
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Reporter: Alexey Goncharuk
>Assignee: Kirill Tkalenko
>Priority: Critical
> Fix For: 2.11
>
> Attachments: IndexRebuildAfterNodeCrashTest.java, 
> IndexRebuildingTest.java
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, we do not have any state associated with the index tree. Consider 
> the following scenario:
> 1) Start a node, put some data
> 2) Start a CREATE INDEX operation
> 3) Wait for a checkpoint and stop the node before index creation finishes
> 4) Restart the node
> Since the checkpoint finished, the new index tree will be persisted to 
> disk, but not all data will be present in the index.
> We should store information that the index tree is still being initialized and 
> mark it valid only after all data is indexed. This state should be persisted as well.





[jira] [Updated] (IGNITE-14711) Client discovery thread interrupt/stop causes endless communication reconnect attempt

2021-05-12 Thread Ilya Kasnacheev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Kasnacheev updated IGNITE-14711:
-
Attachment: IgniteDiscoveryThreadKillingTest.java

> Client discovery thread interrupt/stop causes endless communication reconnect 
> attempt
> -
>
> Key: IGNITE-14711
> URL: https://issues.apache.org/jira/browse/IGNITE-14711
> Project: Ignite
>  Issue Type: Bug
>  Components: networking
>Affects Versions: 2.10
>Reporter: Ilya Kasnacheev
>Priority: Major
> Attachments: IgniteDiscoveryThreadKillingTest.java
>
>
> Original issue: if the tcp-client-disco-sock-reader thread dies on a client 
> node, the node will never disconnect from the cluster despite NODE_FAILED, and 
> will endlessly try to open communication connections to the server while getting 
> "Remote node does not observe current node in topology" exceptions on the client 
> and "Close incoming connection, unknown node" on the server.
> Generalized issue: stop()ing or interrupt()ing discovery threads causes the 
> cluster to hang in many cases, where it is expected that any such node will:
> * Restart the thread and continue normally
> * Disconnect from the cluster to re-establish discovery connection
> * Stop and close all remaining threads.
> See the attached reproducer.





[jira] [Created] (IGNITE-14711) Client discovery thread interrupt/stop causes endless communication reconnect attempt

2021-05-12 Thread Ilya Kasnacheev (Jira)
Ilya Kasnacheev created IGNITE-14711:


 Summary: Client discovery thread interrupt/stop causes endless 
communication reconnect attempt
 Key: IGNITE-14711
 URL: https://issues.apache.org/jira/browse/IGNITE-14711
 Project: Ignite
  Issue Type: Bug
  Components: networking
Affects Versions: 2.10
Reporter: Ilya Kasnacheev


Original issue: if the tcp-client-disco-sock-reader thread dies on a client 
node, the node will never disconnect from the cluster despite NODE_FAILED, and 
will endlessly try to open communication connections to the server while getting 
"Remote node does not observe current node in topology" exceptions on the client 
and "Close incoming connection, unknown node" on the server.

Generalized issue: stop()ing or interrupt()ing discovery threads causes the 
cluster to hang in many cases, where it is expected that any such node will:
* Restart the thread and continue normally
* Disconnect from the cluster to re-establish discovery connection
* Stop and close all remaining threads.

See the attached reproducer.





[jira] [Created] (IGNITE-14710) Cache API operations throw IgniteClientDisconnectedException

2021-05-12 Thread Ilya Kasnacheev (Jira)
Ilya Kasnacheev created IGNITE-14710:


 Summary: Cache API operations throw 
IgniteClientDisconnectedException
 Key: IGNITE-14710
 URL: https://issues.apache.org/jira/browse/IGNITE-14710
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 2.10
Reporter: Ilya Kasnacheev


It is possible for a Cache.put() operation to throw a raw 
IgniteClientDisconnectedException:

{code}
class org.apache.ignite.IgniteClientDisconnectedException: Client node disconnected: discovery.IgniteDiscoveryThreadInterruptTest1
    at org.apache.ignite.internal.GridKernalGatewayImpl.readLock(GridKernalGatewayImpl.java:93)
    at org.apache.ignite.internal.IgniteKernal.guard(IgniteKernal.java:4163)
    at org.apache.ignite.internal.IgniteKernal.transactions(IgniteKernal.java:3173)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.checkAtomicOpsInTx(GridCacheGateway.java:363)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.onEnter(GridCacheGateway.java:262)
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.enter(GridCacheGateway.java:177)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1625)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:853)
    at org.apache.ignite.spi.discovery.IgniteDiscoveryThreadInterruptTest.run(IgniteDiscoveryThreadInterruptTest.java:117)
    at org.apache.ignite.spi.discovery.IgniteDiscoveryThreadInterruptTest.testStopClientSockWriter(IgniteDiscoveryThreadInterruptTest.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2428)
    at java.lang.Thread.run(Thread.java:748)
{code}

This is incorrect behavior. Usually cache.put() throws only CacheException and 
it should always throw CacheException with IgniteClientDisconnectedException in 
getCause().

Currently we are both violating JSR107 and forcing users to handle both 
CacheException and IgniteClientDisconnectedException, then check the former to 
see if its cause is the latter, and duplicate future recovery code.
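
Until the behavior is unified, callers could funnel both cases through one helper that walks the cause chain, so the recovery code is written once. This is a minimal sketch: the class Causes and the method findCause are hypothetical names, and the code relies only on java.lang.Throwable#getCause, so it assumes nothing about the Ignite API beyond the exception class names mentioned above.

```java
/** Hypothetical helper: finds an exception of a given type anywhere in a cause chain. */
class Causes {
    /**
     * Returns the first throwable of the given type in the chain that starts at t
     * (t itself included), or null if there is none.
     */
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (type.isInstance(cur))
                return type.cast(cur);
        }

        return null;
    }
}
```

A caller would catch RuntimeException around cache.put() and pass it to Causes.findCause with IgniteClientDisconnectedException.class; a non-null result covers both the raw and the wrapped form.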





[jira] [Assigned] (IGNITE-14292) Change permissions required to create/destroy caches in GridRestProcessor

2021-05-12 Thread Sergei Ryzhov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergei Ryzhov reassigned IGNITE-14292:
--

Assignee: Sergei Ryzhov

> Change permissions required to create/destroy caches in GridRestProcessor
> -
>
> Key: IGNITE-14292
> URL: https://issues.apache.org/jira/browse/IGNITE-14292
> Project: Ignite
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.9.1
>Reporter: Andrey Kuznetsov
>Assignee: Sergei Ryzhov
>Priority: Major
>
> {{GridRestProcessor}} authorizes {{ADMIN_CACHE}} permission before cache 
> creation/destruction. This is inconsistent with thin client connector 
> behavior and looks counterintuitive. {{ADMIN_CACHE}} should be replaced with 
> {{CACHE_CREATE}} and {{CACHE_DESTROY}}.





[jira] [Created] (IGNITE-14709) Make ConfigurationStorage#readAll async

2021-05-12 Thread Mirza Aliev (Jira)
Mirza Aliev created IGNITE-14709:


 Summary: Make ConfigurationStorage#readAll async
 Key: IGNITE-14709
 URL: https://issues.apache.org/jira/browse/IGNITE-14709
 Project: Ignite
  Issue Type: Improvement
Reporter: Mirza Aliev


Currently, a node hangs during startup at the phase where we register 
DistributedConfigurationStorage. This happens because 
ConfigurationChanger#register calls ConfigurationStorage#readAll, which in turn 
calls MetaStorageServiceImpl#range, and range waits on a future until cluster 
init happens, so the whole node startup hangs. 
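
The shape of an asynchronous readAll can be sketched with CompletableFuture. Everything here is an assumption for illustration: the class AsyncStorageSketch, the clusterInit future, and the String-to-String data model are hypothetical and do not mirror the real ConfigurationStorage or MetaStorageServiceImpl signatures.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a storage whose readAll does not block the starting thread. */
class AsyncStorageSketch {
    /** Completed once the cluster is initialized; stands in for the future that range waits on. */
    final CompletableFuture<Map<String, String>> clusterInit = new CompletableFuture<>();

    /** Returns immediately with a future instead of waiting for cluster init. */
    CompletableFuture<Map<String, String>> readAllAsync() {
        return clusterInit.thenApply(data -> Collections.unmodifiableMap(data));
    }

    /** Registration chains on the future, so node startup can continue in the meantime. */
    CompletableFuture<Integer> register() {
        return readAllAsync().thenApply(Map::size);
    }
}
```

In this sketch register() returns an incomplete future before cluster init and completes only once clusterInit is completed, which is the opposite of the blocking call chain described above.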






[jira] [Updated] (IGNITE-14709) Make ConfigurationStorage#readAll async

2021-05-12 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-14709:
-
Description: 
Currently, node startup hangs at the phase where 
{{DistributedConfigurationStorage}} is registered. This happens because 
{{ConfigurationChanger#register}} requires {{ConfigurationStorage#readAll}}, 
which in turn calls {{MetaStorageServiceImpl#range}}; {{range}} waits on a 
future that completes only after cluster init, so the whole node start 
process hangs.


  was:
Currently, node startup hangs at the phase where 
DistributedConfigurationStorage is registered. This happens because 
ConfigurationChanger#register requires ConfigurationStorage#readAll, which in 
turn calls MetaStorageServiceImpl#range; range waits on a future that 
completes only after cluster init, so the whole node start process hangs.



> Make ConfigurationStorage#readAll async
> ---
>
> Key: IGNITE-14709
> URL: https://issues.apache.org/jira/browse/IGNITE-14709
> Project: Ignite
> Issue Type: Improvement
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> Currently, node startup hangs at the phase where 
> {{DistributedConfigurationStorage}} is registered. This happens because 
> {{ConfigurationChanger#register}} requires {{ConfigurationStorage#readAll}}, 
> which in turn calls {{MetaStorageServiceImpl#range}}; {{range}} waits on a 
> future that completes only after cluster init, so the whole node start 
> process hangs.





[jira] [Assigned] (IGNITE-14708) Add an ability to track request handling completion in GridRestProcessor

2021-05-12 Thread Sergei Ryzhov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergei Ryzhov reassigned IGNITE-14708:
--

Assignee: Sergei Ryzhov

> Add an ability to track request handling completion in GridRestProcessor
> 
>
> Key: IGNITE-14708
> URL: https://issues.apache.org/jira/browse/IGNITE-14708
> Project: Ignite
> Issue Type: Improvement
> Components: rest
> Affects Versions: 2.10
> Reporter: Andrey Kuznetsov
> Assignee: Sergei Ryzhov
> Priority: Major
> Fix For: 2.11
>
>
> It would be useful to have a way to perform some actions when request 
> handling in {{GridRestProcessor}} completes, either normally or exceptionally. 
> This would allow third-party plugins to add tracking/monitoring behavior to 
> request processing.
> Probable implementation: allow adding a custom listener for 
> {{handleAsync0()}} results. The listener should accept both the request and 
> the result (a response or an exception).





[jira] [Created] (IGNITE-14708) Add an ability to track request handling completion in GridRestProcessor

2021-05-12 Thread Andrey Kuznetsov (Jira)
Andrey Kuznetsov created IGNITE-14708:
-

 Summary: Add an ability to track request handling completion in GridRestProcessor
 Key: IGNITE-14708
 URL: https://issues.apache.org/jira/browse/IGNITE-14708
 Project: Ignite
 Issue Type: Improvement
 Components: rest
 Affects Versions: 2.10
 Reporter: Andrey Kuznetsov
 Fix For: 2.11


It would be useful to have a way to perform some actions when request handling 
in {{GridRestProcessor}} completes, either normally or exceptionally. This 
would allow third-party plugins to add tracking/monitoring behavior to request 
processing.

Probable implementation: allow adding a custom listener for {{handleAsync0()}} 
results. The listener should accept both the request and the result (a 
response or an exception).
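
One possible shape for such a listener is sketched below. It is an illustration under stated assumptions and not the GridRestProcessor API: RequestCompletionListener, RestHandlerSketch, addListener and handleAsync are hypothetical names, requests and responses are plain strings, and java.util.concurrent.CompletableFuture stands in for Ignite's internal future type.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener: gets the request plus either a response or an exception. */
interface RequestCompletionListener {
    /** Exactly one of {@code res} and {@code err} is expected to be non-null. */
    void onComplete(String req, String res, Throwable err);
}

/** Hypothetical, simplified request handler that notifies listeners on completion. */
class RestHandlerSketch {
    private final List<RequestCompletionListener> lsnrs = new CopyOnWriteArrayList<>();

    void addListener(RequestCompletionListener lsnr) {
        lsnrs.add(lsnr);
    }

    /** Handles the request and notifies every listener once the result is known. */
    CompletableFuture<String> handleAsync(String req) {
        return process(req).whenComplete((res, err) -> {
            for (RequestCompletionListener lsnr : lsnrs) {
                try {
                    lsnr.onComplete(req, res, err);
                }
                catch (RuntimeException e) {
                    // A faulty listener must not change the outcome of the request.
                    System.err.println("Listener failed: " + e);
                }
            }
        });
    }

    /** Stand-in for real processing: empty requests fail, all others succeed. */
    private CompletableFuture<String> process(String req) {
        CompletableFuture<String> fut = new CompletableFuture<>();

        if (req.isEmpty())
            fut.completeExceptionally(new IllegalArgumentException("Empty request"));
        else
            fut.complete("ok:" + req);

        return fut;
    }
}

class RestHandlerDemo {
    public static void main(String[] args) {
        RestHandlerSketch hnd = new RestHandlerSketch();

        hnd.addListener((req, res, err) ->
            System.out.println("req=" + req + ", res=" + res + ", failed=" + (err != null)));

        hnd.handleAsync("ping");
        hnd.handleAsync("");
    }
}
```

Passing the request together with the result lets a monitoring plugin correlate the two without keeping its own per-request state.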





[jira] [Updated] (IGNITE-14684) Stopping node at the end of checkpoint can cause "Critical system error"

2021-05-12 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-14684:
-
Release Note: Fixed a node failure caused by deleting DurableBackgroundTasks 
at the end of a checkpoint while the node is stopping.

> Stopping node at the end of checkpoint can cause "Critical system error"
> 
>
> Key: IGNITE-14684
> URL: https://issues.apache.org/jira/browse/IGNITE-14684
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Maria Makedonskaya
> Assignee: Kirill Tkalenko
> Priority: Major
> Fix For: 2.11
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The checkpoint listener 
> org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd,
> which is triggered at the end of the checkpoint process, cannot take the 
> checkpoint read lock while the node is stopping.
> Run the test (see the exception in the log): 
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest#testDestroyTaskLifecycle
> {noformat}
> [2021-05-05 
> 15:41:10,907][ERROR][db-checkpoint-thread-#87%db.LongDestroyDurableBackgroundTaskTest0%][root]
>  Critical system error detected. Will be handled accordingly to configured 
> handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler 
> [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
> SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
> [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Failed to 
> perform cache update: node is stopping.]]
> class org.apache.ignite.IgniteException: Failed to perform cache update: node 
> is stopping.
>   at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:127)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1583)
>   at 
> org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.metaStorageOperation(DurableBackgroundTasksProcessor.java:335)
>   at 
> org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.afterCheckpointEnd(DurableBackgroundTasksProcessor.java:152)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointEnd(CheckpointWorkflow.java:606)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:479)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:282)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class org.apache.ignite.internal.NodeStoppingException: Failed to 
> perform cache update: node is stopping.
>   ... 9 more
> {noformat}
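
The failure mode in the trace above can be illustrated with a small sketch in which a post-checkpoint listener tolerates a stopping node instead of letting the exception escape to the failure handler. This is an assumption about one way to handle the situation, not the fix that was committed: CheckpointListenerSketch, NodeStoppingSketchException and all method bodies are hypothetical stand-ins.

```java
/** Hypothetical stand-in for the exception thrown when the lock is requested on a stopping node. */
class NodeStoppingSketchException extends RuntimeException {
    NodeStoppingSketchException(String msg) {
        super(msg);
    }
}

/** Hypothetical, simplified listener invoked at the end of a checkpoint. */
class CheckpointListenerSketch {
    /** Set once node stop has been requested. */
    private volatile boolean stopping;

    /** Number of cleanups actually performed. */
    int cleaned;

    void stop() {
        stopping = true;
    }

    /** Simulated lock: refuses to be taken once the node is stopping. */
    private void checkpointReadLock() {
        if (stopping)
            throw new NodeStoppingSketchException("Failed to perform cache update: node is stopping.");
    }

    private void checkpointReadUnlock() {
        // No-op in this sketch.
    }

    /**
     * Runs the cleanup under the lock.
     *
     * @return true if the cleanup ran, false if it was skipped because the node is stopping.
     */
    boolean afterCheckpointEnd() {
        try {
            checkpointReadLock();
        }
        catch (NodeStoppingSketchException e) {
            // Skip instead of propagating: the work can be redone after restart.
            return false;
        }

        try {
            cleaned++;

            return true;
        }
        finally {
            checkpointReadUnlock();
        }
    }

    public static void main(String[] args) {
        CheckpointListenerSketch lsnr = new CheckpointListenerSketch();

        System.out.println(lsnr.afterCheckpointEnd());

        lsnr.stop();

        System.out.println(lsnr.afterCheckpointEnd());
    }
}
```

Skipping is only safe here on the assumption that the cleanup is durable and is retried on the next start.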





[jira] [Commented] (IGNITE-14684) Stopping node at the end of checkpoint can cause "Critical system error"

2021-05-12 Thread Ivan Bessonov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343040#comment-17343040
 ] 

Ivan Bessonov commented on IGNITE-14684:


[~ktkale...@gridgain.com] looks good, thank you for the contribution!



[jira] [Updated] (IGNITE-14684) Stopping node at the end of checkpoint can cause "Critical system error"

2021-05-12 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-14684:
-
Reviewer: Ivan Bessonov



[jira] [Commented] (IGNITE-14684) Stopping node at the end of checkpoint can cause "Critical system error"

2021-05-12 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17343039#comment-17343039
 ] 

Kirill Tkalenko commented on IGNITE-14684:
--

[~ibessonov] Please review the code.



[jira] [Updated] (IGNITE-14684) Stopping node at the end of checkpoint can cause "Critical system error"

2021-05-12 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-14684:
-
Fix Version/s: 2.11
