[jira] [Updated] (IGNITE-20916) API future of table creation may be completed incorrectly

2023-11-22 Thread Denis Chudov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov updated IGNITE-20916:
--
Description: 
After some changes, the map TableManager#tableCreateFuts is used in a single 
place only, TableManager#completeApiCreateFuture, which makes no sense. Either 
it should be removed, or (more probably) we have some races on table creation 
from the user API's point of view.

Basically, there should be a rework of partition start, moving responsibility 
for partitions from tables to zones; the table start process will have to look 
completely different and will be split between different meta storage revisions 
(unlike now, when it happens within a single revision update).

  was:After some changes, the map TableManager#tableCreateFuts is used in a 
single place only, TableManager#completeApiCreateFuture, which makes no sense. 
Either it should be removed, or (more probably) we have some races on table 
creation from the user API's point of view.


> API future of table creation may be completed incorrectly
> -
>
> Key: IGNITE-20916
> URL: https://issues.apache.org/jira/browse/IGNITE-20916
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>
> After some changes, the map TableManager#tableCreateFuts is used in a single 
> place only, TableManager#completeApiCreateFuture, which makes no sense. 
> Either it should be removed, or (more probably) we have some races on table 
> creation from the user API's point of view.
> Basically, there should be a rework of partition start, moving responsibility 
> for partitions from tables to zones; the table start process will have to 
> look completely different and will be split between different meta storage 
> revisions (unlike now, when it happens within a single revision update).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20935) Remove MetaStorageManagerImpl#getService method

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev updated IGNITE-20935:
-
Fix Version/s: 3.0.0-beta2

> Remove MetaStorageManagerImpl#getService method
> ---
>
> Key: IGNITE-20935
> URL: https://issues.apache.org/jira/browse/IGNITE-20935
> Project: Ignite
>  Issue Type: Task
>Reporter: Aleksandr Polovtcev
>Assignee: Aleksandr Polovtcev
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It is duplicated by the {{MetaStorageManagerImpl#metaStorageServiceFuture}} 
> method.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20938) Extract SQL tests from runner module to ignite-sql-engine

2023-11-22 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-20938:
---
Description: 
The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.

Currently, integration tests for SQL functionality are located in the 
ignite-runner module. They need to be extracted to the ignite-sql-engine module 
via the runner test-fixtures support to decrease the execution time of the 
runner module's tests.

IGNITE-20670 can be used as a reference for such activities.

  was:
The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.
Currently, integration tests for SQL functionality are located in the 
ignite-runner module. They need to be extracted to the ignite-sql-engine module 
via the runner test-fixtures support to decrease the execution time of the 
runner module's tests.


> Extract SQL tests from runner module to ignite-sql-engine
> -
>
> Key: IGNITE-20938
> URL: https://issues.apache.org/jira/browse/IGNITE-20938
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>
> The runner module incorporates a large number of tests related to other 
> modules, which leads to a long running time for the suite on TC.
> Currently, integration tests for SQL functionality are located in the 
> ignite-runner module. They need to be extracted to the ignite-sql-engine 
> module via the runner test-fixtures support to decrease the execution time of 
> the runner module's tests.
> IGNITE-20670 can be used as a reference for such activities.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20937) Extract JDBC tests from runner module to ignite-jdbc

2023-11-22 Thread Yury Gerzhedovich (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yury Gerzhedovich updated IGNITE-20937:
---
Description: 
The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.

Currently, integration tests for JDBC functionality are located in the 
ignite-runner module. They need to be extracted to the JDBC module via the 
runner test-fixtures support to decrease the execution time of the runner 
module's tests.

IGNITE-20670 can be used as a reference for such activities.

  was:
The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.
Currently, integration tests for JDBC functionality are located in the 
ignite-runner module. They need to be extracted to the JDBC module via the 
runner test-fixtures support to decrease the execution time of the runner 
module's tests.


> Extract JDBC tests from runner module to ignite-jdbc
> 
>
> Key: IGNITE-20937
> URL: https://issues.apache.org/jira/browse/IGNITE-20937
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Yury Gerzhedovich
>Priority: Major
>  Labels: ignite-3
>
> The runner module incorporates a large number of tests related to other 
> modules, which leads to a long running time for the suite on TC.
> Currently, integration tests for JDBC functionality are located in the 
> ignite-runner module. They need to be extracted to the JDBC module via the 
> runner test-fixtures support to decrease the execution time of the runner 
> module's tests.
> IGNITE-20670 can be used as a reference for such activities.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20938) Extract SQL tests from runner module to ignite-sql-engine

2023-11-22 Thread Yury Gerzhedovich (Jira)
Yury Gerzhedovich created IGNITE-20938:
--

 Summary: Extract SQL tests from runner module to ignite-sql-engine
 Key: IGNITE-20938
 URL: https://issues.apache.org/jira/browse/IGNITE-20938
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Yury Gerzhedovich


The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.
Currently, integration tests for SQL functionality are located in the 
ignite-runner module. They need to be extracted to the ignite-sql-engine module 
via the runner test-fixtures support to decrease the execution time of the 
runner module's tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20937) Extract JDBC tests from runner module to ignite-jdbc

2023-11-22 Thread Yury Gerzhedovich (Jira)
Yury Gerzhedovich created IGNITE-20937:
--

 Summary: Extract JDBC tests from runner module to ignite-jdbc
 Key: IGNITE-20937
 URL: https://issues.apache.org/jira/browse/IGNITE-20937
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Yury Gerzhedovich


The runner module incorporates a large number of tests related to other 
modules, which leads to a long running time for the suite on TC.
Currently, integration tests for JDBC functionality are located in the 
ignite-runner module. They need to be extracted to the JDBC module via the 
runner test-fixtures support to decrease the execution time of the runner 
module's tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20863) Selecting indexes when performing update operations for partition

2023-11-22 Thread Roman Puchkovskiy (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789002#comment-17789002
 ] 

Roman Puchkovskiy commented on IGNITE-20863:


The patch looks good to me

> Selecting indexes when performing update operations for partition
> -
>
> Key: IGNITE-20863
> URL: https://issues.apache.org/jira/browse/IGNITE-20863
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> To implement IGNITE-20125, we need to correctly select the indexes into which 
> we will insert data during RW transaction update operations for partitions.
> It is enough for us to collect all available and registered indexes at the 
> time of the operation, as well as all dropped available indexes that we can 
> know about.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20936) Sql. Introduce container for dynamic parameters.

2023-11-22 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-20936:
--
Labels: ignite-3  (was: )

> Sql. Introduce container for dynamic parameters.
> 
>
> Key: IGNITE-20936
> URL: https://issues.apache.org/jira/browse/IGNITE-20936
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Maksim Zhuravkov
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> At the moment the SQL validator uses an array to store the values of dynamic 
> parameters. 
> Let's replace the array of dynamic parameters with a container to make it 
> possible to leave some parameters unspecified.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20936) Sql. Introduce container for dynamic parameters.

2023-11-22 Thread Maksim Zhuravkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov reassigned IGNITE-20936:
-

Assignee: Maksim Zhuravkov

> Sql. Introduce container for dynamic parameters.
> 
>
> Key: IGNITE-20936
> URL: https://issues.apache.org/jira/browse/IGNITE-20936
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Maksim Zhuravkov
>Assignee: Maksim Zhuravkov
>Priority: Minor
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> At the moment the SQL validator uses an array to store the values of dynamic 
> parameters. 
> Let's replace the array of dynamic parameters with a container to make it 
> possible to leave some parameters unspecified.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20936) Sql. Introduce container for dynamic parameters.

2023-11-22 Thread Maksim Zhuravkov (Jira)
Maksim Zhuravkov created IGNITE-20936:
-

 Summary: Sql. Introduce container for dynamic parameters.
 Key: IGNITE-20936
 URL: https://issues.apache.org/jira/browse/IGNITE-20936
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Reporter: Maksim Zhuravkov
 Fix For: 3.0.0-beta2


At the moment the SQL validator uses an array to store the values of dynamic 
parameters. 
Let's replace the array of dynamic parameters with a container to make it 
possible to leave some parameters unspecified, as sketched below.
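
A hedged sketch of what such a container could look like, using JDK types only; 
the class and method names here are illustrative assumptions, not the actual 
design:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hedged sketch of a dynamic-parameters container that, unlike a plain array,
// can distinguish "set to null" from "not specified". All names are illustrative.
class DynamicParameters {
    private final Map<Integer, Object> values = new HashMap<>();

    /** Sets the value of the parameter at the given index (nulls are allowed). */
    void set(int index, Object value) {
        values.put(index, value);
    }

    /** Returns {@code true} if the parameter was explicitly specified. */
    boolean isSpecified(int index) {
        return values.containsKey(index);
    }

    /** Returns the parameter value, failing if it was never specified. */
    Object get(int index) {
        if (!values.containsKey(index))
            throw new NoSuchElementException("Dynamic parameter $" + index + " is not specified");

        return values.get(index);
    }
}
{code}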



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-19905) Race between stable assignments application and table removal

2023-11-22 Thread Kirill Gusakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-19905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788939#comment-17788939
 ] 

Kirill Gusakov commented on IGNITE-19905:
-

Actually, we have no better way than a simple direct check for null. Moreover, 
the other cleanup operations:
- ReplicaManager.stopReplica
- Loza.stopRaftNodes
are ready for stopping a non-existent entity.

So, under this ticket we just want to clean the TODOs out of the source code. A 
sketch of the direct check is below.
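
A minimal sketch of that direct check, with hypothetical names for the table 
registry and the cleanup method (the real accessors may differ):

{code:java}
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch: skip raft group / storage cleanup entirely when the table is
// already removed at the current revision. "tablesById" and
// "destroyPartitionStorages" are illustrative stand-ins.
class StableAssignmentsHandler {
    private final Map<Integer, Object /* TableImpl */> tablesById = new ConcurrentHashMap<>();

    CompletableFuture<Void> handleChangeStableAssignmentEvent(int tableId) {
        Object table = tablesById.get(tableId);

        if (table == null) {
            // Table no longer exists: do not execute the cleanup logic at all,
            // so no NPE and no need for the IGNITE-19906 workaround.
            return CompletableFuture.completedFuture(null);
        }

        return destroyPartitionStorages(table);
    }

    private CompletableFuture<Void> destroyPartitionStorages(Object table) {
        return CompletableFuture.completedFuture(null); // actual cleanup elided
    }
}
{code}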

> Race between stable assignments application and table removal
> -
>
> Key: IGNITE-19905
> URL: https://issues.apache.org/jira/browse/IGNITE-19905
> Project: Ignite
>  Issue Type: Bug
>Reporter: Roman Puchkovskiy
>Assignee: Kirill Gusakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> *Motivation*
> It is possible that handleChangeStableAssignmentEvent() gets an event such 
> that for the corresponding revision the table is already removed, so an NPE 
> happens when trying to destroy a partition storage (at the moment it handled 
> by workaround IGNITE-19906).
> *Definition of done*
> - The logic about partition raft group and storage cleanup is not event 
> executed, if the table is not exist already.
> *Implementation notes*
> - We must check if table still exists at the current revision firstly and 
> only then execute the cleanup logic 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20737) Idle Verify fails with "Cluster not idle" error when log level is set to DEBUG

2023-11-22 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788865#comment-17788865
 ] 

Ignite TC Bot commented on IGNITE-20737:


{panel:title=Branch: [pull/11052/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/11052/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7617485&buildTypeId=IgniteTests24Java8_RunAll]

> Idle Verify fails with "Cluster not idle" error when log level is set to DEBUG
> --
>
> Key: IGNITE-20737
> URL: https://issues.apache.org/jira/browse/IGNITE-20737
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.15
>Reporter: Valery Shorin
>Assignee: Egor Fomin
>Priority: Major
>  Labels: ise
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If idle verify is executed with the DEBUG log level, it fails with the error:
>  
> {code:java}
> Exception: org.apache.ignite.IgniteException Cluster not idle. Modifications 
> found in caches or groups: [grpName=default, grpId=1544803905, partId=30] 
> changed during size calculation [updCntrBefore=Counter [init=0, val=1], 
> updCntrAfter=Counter [init=0, val=1]]
>  {code}
>  
> This issue can be reproduced by the following test (should be added to 
> {{GridCommandHandlerTest}}):
>  
> {code:java}
> @Test
> public void testCacheIdleVerifyLogLevelDebug() throws Exception {
> IgniteEx ignite = startGrids(3);
> ignite.cluster().state(ACTIVE);
> IgniteCache cache = ignite.createCache(new 
> CacheConfiguration<>(DEFAULT_CACHE_NAME)
> .setAffinity(new RendezvousAffinityFunction(false, 32))
> .setBackups(1));
> cache.put("key", "value");
> injectTestSystemOut();
> setLoggerDebugLevel();
> assertEquals(EXIT_CODE_OK, execute("--cache", "idle_verify"));
> assertContains(log, testOut.toString(), "no conflicts have been found");
> } {code}
>  
>  
> The reason for this failure is that the {{equals()}} method is not defined 
> for the {{PartitionUpdateCounterDebugWrapper}} class.
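
A hedged sketch of a possible fix: make the debug wrapper delegate equality to 
the wrapped counter. The field name "delegate" is an assumption for 
illustration; the real wrapper may hold the underlying counter differently.

{code:java}
// Hedged sketch: equality must compare the wrapped counters, not wrapper
// identity, so that before/after snapshots taken during size calculation match.
class PartitionUpdateCounterDebugWrapper {
    private final Object delegate; // the wrapped PartitionUpdateCounter

    PartitionUpdateCounterDebugWrapper(Object delegate) {
        this.delegate = delegate;
    }

    @Override public boolean equals(Object obj) {
        if (this == obj)
            return true;

        if (!(obj instanceof PartitionUpdateCounterDebugWrapper))
            return false;

        return delegate.equals(((PartitionUpdateCounterDebugWrapper)obj).delegate);
    }

    @Override public int hashCode() {
        return delegate.hashCode();
    }
}
{code}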



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20935) Remove MetaStorageManagerImpl#getService method

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev updated IGNITE-20935:
-
Description: It is duplicated by the {{metaStorageServiceFuture

> Remove MetaStorageManagerImpl#getService method
> ---
>
> Key: IGNITE-20935
> URL: https://issues.apache.org/jira/browse/IGNITE-20935
> Project: Ignite
>  Issue Type: Task
>Reporter: Aleksandr Polovtcev
>Assignee: Aleksandr Polovtcev
>Priority: Major
>  Labels: ignite-3
>
> It is duplicated by the {{metaStorageServiceFuture



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20935) Remove MetaStorageManagerImpl#getService method

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev updated IGNITE-20935:
-
Description: It is duplicated by the 
{{MetaStorageManagerImpl#metaStorageServiceFuture}} method.  (was: It is 
duplicated by the {{metaStorageServiceFuture)

> Remove MetaStorageManagerImpl#getService method
> ---
>
> Key: IGNITE-20935
> URL: https://issues.apache.org/jira/browse/IGNITE-20935
> Project: Ignite
>  Issue Type: Task
>Reporter: Aleksandr Polovtcev
>Assignee: Aleksandr Polovtcev
>Priority: Major
>  Labels: ignite-3
>
> It is duplicated by the {{MetaStorageManagerImpl#metaStorageServiceFuture}} 
> method.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20935) Remove MetaStorageManagerImpl#getService method

2023-11-22 Thread Aleksandr Polovtcev (Jira)
Aleksandr Polovtcev created IGNITE-20935:


 Summary: Remove MetaStorageManagerImpl#getService method
 Key: IGNITE-20935
 URL: https://issues.apache.org/jira/browse/IGNITE-20935
 Project: Ignite
  Issue Type: Task
Reporter: Aleksandr Polovtcev
Assignee: Aleksandr Polovtcev






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20934) Worry about deleted indexes that will no longer be used until they are destroyed

2023-11-22 Thread Kirill Tkalenko (Jira)
Kirill Tkalenko created IGNITE-20934:


 Summary: Worry about deleted indexes that will no longer be used 
until they are destroyed
 Key: IGNITE-20934
 URL: https://issues.apache.org/jira/browse/IGNITE-20934
 Project: Ignite
  Issue Type: Improvement
Reporter: Kirill Tkalenko


At the moment, when executing an RW transaction, we write to the existing 
(available and registered) indexes and to all dropped available indexes that 
existed in the past. This is necessary so as not to abort RW transactions that 
caught the index dropping event.

It won't be good if we keep writing to old dropped available indexes for a very 
long time; we need to stop writing to them as quickly as possible. At the same 
time, long RW transactions that saw the dropping of the index should not be 
broken.

We could rely on the low watermark mechanism, but if it is made infinite, it 
will bring us a lot of pain.

We need to think carefully about solving this problem. I have the following 
thoughts: locally, each node will monitor the progress of its RW transactions, 
and as soon as the last RW transaction that caught the index dropping event 
disappears, it will immediately remove that index from the selection for new RW 
transactions. A sketch of this idea follows.
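
A hedged sketch of the local tracking idea, using JDK types only; the class 
name, the notification points, and the identifiers are assumptions made for 
illustration:

{code:java}
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch: per dropped index, track the RW transactions that caught the
// drop event; once the last one finishes, stop writing to that index locally.
class DroppedIndexTracker {
    private final Map<Integer, Set<UUID>> txsByDroppedIndex = new ConcurrentHashMap<>();

    /** Called when an RW transaction observes the index dropping event. */
    void onIndexDropCaught(int indexId, UUID txId) {
        txsByDroppedIndex.computeIfAbsent(indexId, k -> ConcurrentHashMap.newKeySet()).add(txId);
    }

    /** Called when an RW transaction commits or rolls back. */
    void onTxFinished(UUID txId) {
        txsByDroppedIndex.entrySet().removeIf(e -> {
            e.getValue().remove(txId);

            // The last interested transaction is gone: the dropped index can
            // be excluded from the selection for new RW transactions at once.
            return e.getValue().isEmpty();
        });
    }

    /** Whether new RW transactions must still write to this dropped index. */
    boolean stillWriteTo(int indexId) {
        return txsByDroppedIndex.containsKey(indexId);
    }
}
{code}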



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20933) Remove InternalTableImpl.assignments

2023-11-22 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-20933:

Description: After IGNITE-20701 and IGNITE-19619 we no longer need 
*InternalTableImpl.assignments*. Remove it.  (was: After IGNITE-20701 and 
IGNITE-19619 we no longer need )

> Remove InternalTableImpl.assignments
> 
>
> Key: IGNITE-20933
> URL: https://issues.apache.org/jira/browse/IGNITE-20933
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> After IGNITE-20701 and IGNITE-19619 we no longer need 
> *InternalTableImpl.assignments*. Remove it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20933) Remove InternalTableImpl.assignments

2023-11-22 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-20933:

Description: After IGNITE-20701 and IGNITE-19619 we no longer need 

> Remove InternalTableImpl.assignments
> 
>
> Key: IGNITE-20933
> URL: https://issues.apache.org/jira/browse/IGNITE-20933
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> After IGNITE-20701 and IGNITE-19619 we no longer need 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20933) Remove InternalTableImpl.assignments

2023-11-22 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-20933:
---

 Summary: Remove InternalTableImpl.assignments
 Key: IGNITE-20933
 URL: https://issues.apache.org/jira/browse/IGNITE-20933
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 3.0.0-beta1
Reporter: Pavel Tupitsyn
 Fix For: 3.0.0-beta2






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20910) Add a test for inserting data after restart

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev updated IGNITE-20910:
-
Fix Version/s: 3.0.0-beta2

> Add a test for inserting data after restart
> ---
>
> Key: IGNITE-20910
> URL: https://issues.apache.org/jira/browse/IGNITE-20910
> Project: Ignite
>  Issue Type: Task
>Reporter: Aleksandr Polovtcev
>Assignee: Aleksandr Polovtcev
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When working on a different task, I discovered that inserting around 2000 
> tuples into a single Ignite node doesn't work if the node has been restarted 
> before. 
> This ticket is only about adding a test that reproduces this issue; fixes 
> should be implemented as part of other tickets.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20932) Update Linux packages version to 2.16

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20932:


 Summary: Update Linux packages version to 2.16
 Key: IGNITE-20932
 URL: https://issues.apache.org/jira/browse/IGNITE-20932
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Update Linux packages version to 2.16



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20928) Update ignite-2.16 branch version to 2.16.0

2023-11-22 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-20928:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Update ignite-2.16 branch version to 2.16.0
> ---
>
> Key: IGNITE-20928
> URL: https://issues.apache.org/jira/browse/IGNITE-20928
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update ignite-2.16 branch version to 2.16.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20910) Add a test for inserting data after restart

2023-11-22 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788824#comment-17788824
 ] 

Vladislav Pyatkov commented on IGNITE-20910:


LGTM

> Add a test for inserting data after restart
> ---
>
> Key: IGNITE-20910
> URL: https://issues.apache.org/jira/browse/IGNITE-20910
> Project: Ignite
>  Issue Type: Task
>Reporter: Aleksandr Polovtcev
>Assignee: Aleksandr Polovtcev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When working on a different task, I discovered that inserting around 2000 
> tuples into a single Ignite node doesn't work if the node has been restarted 
> before. 
> This ticket is only about adding a test that reproduces this issue; fixes 
> should be implemented as part of other tickets.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-20928) Update ignite-2.16 branch version to 2.16.0

2023-11-22 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev resolved IGNITE-20928.
--
Resolution: Fixed

Merged into the 2.16 branch.

[~PetrovMikhail], thank you for the review.

> Update ignite-2.16 branch version to 2.16.0
> ---
>
> Key: IGNITE-20928
> URL: https://issues.apache.org/jira/browse/IGNITE-20928
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update ignite-2.16 branch version to 2.16.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20929) Update Ignite version to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788820#comment-17788820
 ] 

Nikita Amelchev commented on IGNITE-20929:
--

Merged into master.

[~PetrovMikhail], thank you for the review.

> Update Ignite version to 2.17.0-SNAPSHOT
> 
>
> Key: IGNITE-20929
> URL: https://issues.apache.org/jira/browse/IGNITE-20929
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update Ignite version to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-20930) Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev resolved IGNITE-20930.
--
Resolution: Fixed

Merged into master.

[~PetrovMikhail], thank you for the review.

> Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT
> -
>
> Key: IGNITE-20930
> URL: https://issues.apache.org/jira/browse/IGNITE-20930
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>
> Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20930) Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-20930:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT
> -
>
> Key: IGNITE-20930
> URL: https://issues.apache.org/jira/browse/IGNITE-20930
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>
> Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20929) Update Ignite version to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-20929:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Update Ignite version to 2.17.0-SNAPSHOT
> 
>
> Key: IGNITE-20929
> URL: https://issues.apache.org/jira/browse/IGNITE-20929
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Nikita Amelchev
>Assignee: Nikita Amelchev
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update Ignite version to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20931) ignite-website: Add documentation, release notes and download links for 2.16.0 release

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20931:


 Summary: ignite-website: Add documentation, release notes and 
download links for 2.16.0 release
 Key: IGNITE-20931
 URL: https://issues.apache.org/jira/browse/IGNITE-20931
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


ignite-website: Add documentation, release notes and download links for 2.16.0 
release



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20930) Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20930:


 Summary: Ignite-extensions: Change version of Ignite dependency to 
2.17.0-SNAPSHOT
 Key: IGNITE-20930
 URL: https://issues.apache.org/jira/browse/IGNITE-20930
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Ignite-extensions: Change version of Ignite dependency to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20929) Update Ignite version to 2.17.0-SNAPSHOT

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20929:


 Summary: Update Ignite version to 2.17.0-SNAPSHOT
 Key: IGNITE-20929
 URL: https://issues.apache.org/jira/browse/IGNITE-20929
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Update Ignite version to 2.17.0-SNAPSHOT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20927) Update IgniteReleasedVersion for compatibility tests to 2.16.0

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20927:


 Summary: Update IgniteReleasedVersion for compatibility tests to 
2.16.0
 Key: IGNITE-20927
 URL: https://issues.apache.org/jira/browse/IGNITE-20927
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Update IgniteReleasedVersion for compatibility tests to 2.16.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20926) Update Apache Ignite 2.16 release notes

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20926:


 Summary: Update Apache Ignite 2.16 release notes
 Key: IGNITE-20926
 URL: https://issues.apache.org/jira/browse/IGNITE-20926
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Update Apache Ignite 2.16 release notes



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20928) Update ignite-2.16 branch version to 2.16.0

2023-11-22 Thread Nikita Amelchev (Jira)
Nikita Amelchev created IGNITE-20928:


 Summary: Update ignite-2.16 branch version to 2.16.0
 Key: IGNITE-20928
 URL: https://issues.apache.org/jira/browse/IGNITE-20928
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikita Amelchev
Assignee: Nikita Amelchev


Update ignite-2.16 branch version to 2.16.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20925) Sql. Await primary replica only once per transaction, avoid of terms in enlist.

2023-11-22 Thread Evgeny Stanilovsky (Jira)
Evgeny Stanilovsky created IGNITE-20925:
---

 Summary: Sql. Await primary replica only once per transaction, 
avoid of terms in enlist.
 Key: IGNITE-20925
 URL: https://issues.apache.org/jira/browse/IGNITE-20925
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Affects Versions: 3.0.0-beta1
Reporter: Evgeny Stanilovsky


After [1], for statements like:
START TRANSACTION;
INSERT INTO T1 VALUES(1);
INSERT INTO T1 VALUES(2);
COMMIT;
we will call the primary partition detection code 
(SqlQueryProcessor#primaryReplicas) as many times as insert operations are 
issued. This is weird because:
1. If the transaction is already enlisted into NODE1 for TABLE1_PARTITION1, 
then in the scope of this transaction it can't be enlisted into a different 
NODE2 for TABLE1_PARTITION1 due to lease expiration or node failure scenarios.
2. The repeated calls add useless time consumption.
Additionally, it seems we can call [2] for primary partitions that have already 
been awaited.

Thus the proposal is: 
1. Call primaryReplicas only once per source and store the result only for 
further mapping planning.
2. Change the semantics of IgniteRelShuttle#enlist(int tableId, List 
assignments) to IgniteRelShuttle#enlist(int tableId, HybridTimestamp clock): 
there seems to be no need to transfer assignments here, since they can be 
obtained from [2]. Also, we can check tx.enlistedNodeAndTerm(tablePartId) 
against ReplicaMeta#getStartTime for term equality; if the terms are not equal, 
a fast exception can be raised, otherwise we would have to wait for such an 
exception from the transaction's erroneous enlisting logic. A sketch of this 
check is below.

[1] https://issues.apache.org/jira/browse/IGNITE-19619
[2] PlacementDriver#getPrimaryReplica
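
A hedged sketch of the fast term equality check from point 2; only 
tx.enlistedNodeAndTerm and ReplicaMeta#getStartTime come from the proposal 
above, the record shapes and names are illustrative assumptions:

{code:java}
// Hedged sketch of the proposed fast check. The shapes of the enlistment info
// and of ReplicaMeta are assumed for illustration.
final class EnlistCheck {
    /** Hypothetical view of what the transaction remembered at enlist time. */
    record EnlistedNodeAndTerm(String nodeName, long term) {}

    /** Hypothetical subset of ReplicaMeta. */
    record ReplicaMeta(String leaseholder, long startTime) {}

    static void checkEnlistment(EnlistedNodeAndTerm enlisted, ReplicaMeta currentPrimary) {
        // If the term recorded at enlist time no longer matches the current
        // primary replica, fail fast instead of waiting for the same error to
        // surface later from the transaction's erroneous enlisting logic.
        if (enlisted.term() != currentPrimary.startTime()) {
            throw new IllegalStateException("Primary replica changed since enlistment"
                + " [expected=" + enlisted.term()
                + ", actual=" + currentPrimary.startTime() + ']');
        }
    }
}
{code}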



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20896) Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness

2023-11-22 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-20896:
---
Description: 
Test fails with an error:
{code:java}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}

We are failing because of an incorrect result set of the index query (extra 
test output was added in PR [1]):
{code}
> [Entry [key=1, val=Person [id=0, secId=1, descId=1]], Entry [key=1, 
> val=Person [id=0, secId=1, descId=1]], Entry [key=3, val=Person [id=1, 
> secId=1, descId=1]]]
{code}
or
{code}
>> [Entry [key=3, val=Person [id=1, secId=1, descId=1]]]
{code}

# https://github.com/apache/ignite/pull/11062

  was:
Test fails with an error:
{code:java}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}

We are failing because of an incorrect result set of the index query (extra 
test output was added in PR [1]):
{code}
> [Entry [key=1, val=Person [id=0, secId=1, descId=1]], Entry [key=1, 
> val=Person [id=0, secId=1, descId=1]], Entry [key=3, val=Person [id=1, 
> secId=1, descId=1]]]
{code}
or
{code}

{code}

# https://github.com/apache/ignite/pull/11062


> Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness
> --
>
> Key: IGNITE-20896
> URL: https://issues.apache.org/jira/browse/IGNITE-20896
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test fails with an error:
> {code:java}
> java.lang.AssertionError: 
> Expected :2
> Actual   :3
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
>   at 
> org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
> {code}
> We are failing because of an incorrect result set of the index query (extra 
> test output was added in PR [1]):
> {code}
> > [Entry [key=1, val=Person [id=0, secId=1, descId=1]], Entry [key=1, 
> > val=Person [id=0, secId=1, descId=1]], Entry [key=3, val=Person [id=1, 
> > secId=1, descId=1]]]
> {code}
> or
> {code}
> >> [Entry [key=3, val=Person [id=1, secId=1, descId=1]]]
> {code}
> # https://github.com/apache/ignite/pull/11062



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20896) Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness

2023-11-22 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-20896:
---
Description: 
Test fails with an error:
{code:java}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}

We are failing because of an incorrect result set of the index query (extra 
test output was added in PR [1]):
{code}
> [Entry [key=1, val=Person [id=0, secId=1, descId=1]], Entry [key=1, 
> val=Person [id=0, secId=1, descId=1]], Entry [key=3, val=Person [id=1, 
> secId=1, descId=1]]]
{code}
or
{code}

{code}

# https://github.com/apache/ignite/pull/11062

  was:
Test fails with an error:
{code}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}




> Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness
> --
>
> Key: IGNITE-20896
> URL: https://issues.apache.org/jira/browse/IGNITE-20896
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test fails with an error:
> {code:java}
> java.lang.AssertionError: 
> Expected :2
> Actual   :3
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
>   at 
> org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
> {code}
> We are failing because of an incorrect result set of the index query (extra 
> test output was added in PR [1]):
> {code}
> > [Entry [key=1, val=Person [id=0, secId=1, descId=1]], Entry [key=1, 
> > val=Person [id=0, secId=1, descId=1]], Entry [key=3, val=Person [id=1, 
> > secId=1, descId=1]]]
> {code}
> or
> {code}
> {code}
> # https://github.com/apache/ignite/pull/11062



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20918) Leases expire after a node has been restarted

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev reassigned IGNITE-20918:


Assignee: Vladislav Pyatkov  (was: Alexander Lapin)

> Leases expire after a node has been restarted
> -
>
> Key: IGNITE-20918
> URL: https://issues.apache.org/jira/browse/IGNITE-20918
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksandr Polovtcev
>Assignee: Vladislav Pyatkov
>Priority: Critical
>  Labels: ignite-3
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> IGNITE-20910 introduces a test that inserts some data after restarting a 
> node. For some reason, after some time, I can see the following messages in 
> the log:
> {noformat}
> [2023-11-22T10:00:17,056][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_19]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_0]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_9]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_10]
> {noformat}
> After that, the test fails with a {{PrimaryReplicaMissException}}. The 
> problem here is that a single node is expected to never have expired leases; 
> they should be prolonged automatically. I think this happens because the 
> initial lease that was issued before the node was restarted is still accepted 
> by the node after the restart.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20896) Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness

2023-11-22 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-20896:
---
Description: 
Test fails with an error:
{code}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}



  was:
Test fails with an error:
{code}
java.lang.AssertionError: 
Expected :2
Actual   :3

at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
at 
org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
{code}


> Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness
> --
>
> Key: IGNITE-20896
> URL: https://issues.apache.org/jira/browse/IGNITE-20896
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>
> Test fails with an error:
> {code}
> java.lang.AssertionError: 
> Expected :2
> Actual   :3
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
>   at 
> org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20896) Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness

2023-11-22 Thread Ilya Shishkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Shishkov updated IGNITE-20896:
---
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Fix MultifieldIndexQueryTest#testCheckBoundaries flakiness
> --
>
> Key: IGNITE-20896
> URL: https://issues.apache.org/jira/browse/IGNITE-20896
> Project: Ignite
>  Issue Type: Test
>Reporter: Ilya Shishkov
>Priority: Trivial
>  Labels: ise
>
> Test fails with an error:
> {code}
> java.lang.AssertionError: 
> Expected :2
> Actual   :3
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:95)
>   at 
> org.apache.ignite.cache.query.MultifieldIndexQueryTest.testCheckBoundaries(MultifieldIndexQueryTest.java:170)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20924) Add tests for parallel building of a new index and table updates

2023-11-22 Thread Kirill Tkalenko (Jira)
Kirill Tkalenko created IGNITE-20924:


 Summary: Add tests for parallel building of a new index and table 
updates
 Key: IGNITE-20924
 URL: https://issues.apache.org/jira/browse/IGNITE-20924
 Project: Ignite
  Issue Type: Improvement
Reporter: Kirill Tkalenko
Assignee: Kirill Tkalenko
 Fix For: 3.0.0-beta2


I discovered that we do not have tests in which, while an index is being built, 
we make some changes to the table, such as insertions or updates; we need to 
add such integration tests. An outline of such a test is below.

It would also be good to move the integration tests associated with building 
indexes into the index module.
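
A hedged outline of such a test; the sql() and createIndexAsync() helpers below 
are stubs standing in for the real test framework, and the DDL and counts are 
arbitrary:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hedged outline of the missing scenario: mutate the table while an index is
// being built. Only the interleaving matters; the helpers are placeholders.
abstract class IndexBuildConcurrencyTestOutline {
    abstract List<List<Object>> sql(String query);
    abstract CompletableFuture<Void> createIndexAsync(String ddl);

    void testUpdatesWhileIndexIsBuilding() {
        sql("CREATE TABLE people (id INT PRIMARY KEY, name VARCHAR)");
        for (int i = 0; i < 10_000; i++)                        // pre-populate
            sql("INSERT INTO people VALUES (" + i + ", 'name-" + i + "')");

        CompletableFuture<Void> build =
            createIndexAsync("CREATE INDEX name_idx ON people (name)");

        // Concurrent inserts and updates while the index is building.
        for (int i = 10_000; i < 11_000; i++)
            sql("INSERT INTO people VALUES (" + i + ", 'name-" + i + "')");
        sql("UPDATE people SET name = 'updated' WHERE id < 100");

        build.join();

        // A lookup through the new index must see rows written during the build.
        assert !sql("SELECT id FROM people WHERE name = 'name-10500'").isEmpty();
    }
}
{code}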



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20923) Possible optimization for sending row IDs during index building

2023-11-22 Thread Kirill Tkalenko (Jira)
Kirill Tkalenko created IGNITE-20923:


 Summary: Possible optimization for sending row IDs during index 
building
 Key: IGNITE-20923
 URL: https://issues.apache.org/jira/browse/IGNITE-20923
 Project: Ignite
  Issue Type: Improvement
Reporter: Kirill Tkalenko


Now, when building an index, we send all row IDs from the row store, starting 
from the smallest to the end of the store, and we build index entries for all 
of them.

I think that we do not need to send all row IDs, but only those whose version 
chains have versions committed at or before the time the index was created. 
Newer ones will be handled during the normal operation of the cluster by the 
replication protocol or by full rebalancing. A sketch of the filter is below.
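
A hedged sketch of that filter, with assumed types; the comparison of 
version-chain commit timestamps with the index creation time is the point, 
everything else is illustrative:

{code:java}
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

// Hedged sketch: during index building, send only the row IDs whose version
// chain has at least one version committed at or before the index creation
// time. Newer rows are covered by replication or full rebalance anyway.
final class IndexBuildRowFilter {
    /** Illustrative stand-in for the real version-chain view of a row. */
    record RowVersions(UUID rowId, List<Long> commitTimestamps) {}

    static List<UUID> rowIdsToIndex(List<RowVersions> rows, long indexCreationTs) {
        return rows.stream()
            .filter(r -> r.commitTimestamps().stream().anyMatch(ts -> ts <= indexCreationTs))
            .map(RowVersions::rowId)
            .collect(Collectors.toList());
    }
}
{code}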



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20863) Selecting indexes when performing update operations for partition

2023-11-22 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-20863:
-
Description: 
To implement IGNITE-20125, we need to correctly select the indexes into which 
we will insert data during RW transaction update operations for partitions.

It is enough for us to collect all available and registered indexes at the time 
of the operation, as well as all dropped available indexes that we can know 
about.

  was:
To implement IGNITE-20125, we need to correctly select the indexes into which 
we will insert data during RW transaction update operations for partitions.

What do we need for this:
# Catalog version of the transaction by beginTs of the txId.
# Catalog version of the update operation for the transaction.

An approximate index selection algorithm:
# Start with the catalog version at the beginning of the transaction.
# Selecting all indexes by catalog version for a specific table.
## If the index is not in the resulting selection, then simply add it.
## If the index is already present in the resulting selection, then we do 
nothing.
## If the index is present in the resulting selection but not in the catalog 
version, then depending on its status it will be:
### registered - the index will be removed from the resulting selection.
### available - will remain in the resulting selection.
# Increase the catalog version; if it is greater than the catalog version of 
the update operation, then we complete the index selection, else go to p. 2.


> Selecting indexes when performing update operations for partition
> -
>
> Key: IGNITE-20863
> URL: https://issues.apache.org/jira/browse/IGNITE-20863
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> To implement IGNITE-20125, we need to correctly select the indexes into which 
> we will insert data during RW transaction update operations for partitions.
> It is enough for us to collect all available and registered indexes at the 
> time of the operation, as well as all dropped available indexes that we can 
> know about.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20863) Selecting indexes when performing update operations for partition

2023-11-22 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-20863:
-
Description: 
To implement IGNITE-20125, we need to correctly select the indexes into which 
we will insert data during RW transaction update operations for partitions.

What do we need for this:
# Catalog version of the transaction by beginTs of the txId.
# Catalog version of the update operation for the transaction.

An approximate index selection algorithm:
# Start with the catalog version at the beginning of the transaction.
# Selecting all indexes by catalog version for a specific table.
## If the index is not in the resulting selection, then simply add it.
## If the index is already present in the resulting selection, then we do 
nothing.
## If the index is present in the resulting selection but not in the catalog 
version, then depending on its status it will be:
### registered - the index will be removed from the resulting selection.
### available - will remain in the resulting selection.
# Increase the catalog version; if it is greater than the catalog version of 
the update operation, then we complete the index selection, else go to p. 2.

  was:
To implement IGNITE-20125, we need to correctly select the indexes into which 
we will insert data during update operations for partitions.

What do we need for this:
# Catalog version of the transaction by beginTs of the txId.
# Catalog version of the update operation for the transaction.

An approximate index selection algorithm:
# Start with the catalog version at the beginning of the transaction.
# Selecting all indexes by catalog version for a specific table.
## If the index is not in the resulting selection, then simply add it.
## If the index is already present in the resulting selection, then we do 
nothing.
## If the index is present in the resulting selection but not in the catalog 
version, then depending on its status it will be:
### registered - the index will be removed from the resulting selection.
### available - will remain in the resulting selection.
# Increase the catalog version; if it is greater than the catalog version of 
the update operation, then we complete the index selection, else go to p. 2.


> Selecting indexes when performing update operations for partition
> -
>
> Key: IGNITE-20863
> URL: https://issues.apache.org/jira/browse/IGNITE-20863
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> To implement IGNITE-20125, we need to correctly select the indexes into which 
> we will insert data during RW transaction update operations for partitions.
> What do we need for this:
> # Catalog version of the transaction by beginTs of the txId.
> # Catalog version of the update operation for the transaction.
> An approximate index selection algorithm:
> # Start with the catalog version at the beginning of the transaction.
> # Selecting all indexes by catalog version for a specific table.
> ## If the index is not in the resulting selection, then simply add it.
> ## If the index is already present in the resulting selection, then we do 
> nothing.
> ## If the index is present in the resulting selection but not in the catalog 
> version, then depending on its status it will be:
> ### registered - the index will be removed from the resulting selection.
> ### available - will remain in the resulting selection.
> # Increase the catalog version; if it is greater than the catalog version of 
> the update operation, then we complete the index selection, else go to p. 2.
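
A hedged sketch of the approximate selection algorithm above, with assumed 
catalog accessors and index descriptor shape; only the stepping over catalog 
versions and the registered/available handling come from the description:

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch of the approximate index selection algorithm. The Catalog
// interface and the Index record are illustrative stand-ins.
final class IndexSelection {
    enum Status { REGISTERED, AVAILABLE }

    record Index(int id, Status status) {}

    interface Catalog {
        List<Index> indexes(int catalogVersion, int tableId);
    }

    static List<Index> selectIndexes(Catalog catalog, int tableId, int txBeginVer, int opVer) {
        Map<Integer, Index> result = new LinkedHashMap<>();

        // p.1: start with the catalog version at the beginning of the transaction,
        // and stop once the version exceeds the one of the update operation.
        for (int ver = txBeginVer; ver <= opVer; ver++) {
            List<Index> inVersion = catalog.indexes(ver, tableId);

            // p.2: indexes not yet in the selection are added; indexes already
            // present are left as they are.
            for (Index idx : inVersion)
                result.putIfAbsent(idx.id(), idx);

            // An index present in the selection but missing from this catalog
            // version is removed if it was only registered; an available index
            // remains in the selection.
            result.values().removeIf(idx ->
                inVersion.stream().noneMatch(i -> i.id() == idx.id())
                    && idx.status() == Status.REGISTERED);
        }

        return List.copyOf(result.values());
    }
}
{code}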



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-20922:
--
Description: 
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with nodes A, B, C, D, F.
2) A service is deployed on nodes A, B, C.
3) Service Awareness is enabled.
4) The user limits the service instance node set with 
`IgniteClient.services('node A', 'node B')`, skipping node C.
5) The thin client sends the service request to randomly chosen node C, because 
it has a service instance too.
6) Node C redirects the invocation to node A or to node B, as the user 
required.


We should prevent the redirection call at #6 and call the service only on A or 
B. This would also help the user to choose nodes for load distribution 
purposes.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e., the client can 
exclude node C from the options where to send the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 
  was:
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with nodes A, B, C, D, F.
2) A service is deployed on nodes A, B, C.
3) Service Awareness is enabled.
4) The user limits the service instance node set with 
`IgniteClient.services('node A', 'node B')`, skipping node C.
5) The thin client sends the service request to randomly chosen node C, because 
it has a service instance too.
6) Node C redirects the invocation to node A or to node B, as the user 
required.


We should prevent the redirection call at #6 and call the service only on A or 
B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e., the client can 
exclude node C from the options where to send the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 


> Java Thin: direct service invocation with ClientClusterGroup and Service 
> Awareness
> --
>
> Key: IGNITE-20922
> URL: https://issues.apache.org/jira/browse/IGNITE-20922
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Once we have implemented Service Awareness for the Java Thin Client, we might 
> improve the service invocation with `IgniteClient.services(ClientClusterGroup 
> grp)`.
> Consider:
> 1) There is a cluster with nodes A, B, C, D, F.
> 2) A service is deployed on nodes A, B, C.
> 3) Service Awareness is enabled.
> 4) The user limits the service instance node set with 
> `IgniteClient.services('node A', 'node B')`, skipping node C.
> 5) The thin client sends the service request to randomly chosen node C, 
> because it has a service instance too.
> 6) Node C redirects the invocation to node A or to node B, as the user 
> required.
> We should prevent the redirection call at #6 and call the service only on A 
> or B. This would also help the user to choose nodes for load distribution 
> purposes.
> It looks simple to intersect the passed `ClientClusterGroup` with the known 
> set of service instance nodes before calling the service. I.e., the client 
> can exclude node C from the options where to send the invocation request.
> If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
> service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-20922:
--
Description: 
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with 
`IgniteClient.services('node A', 'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 

  was:
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 


> Java Thin: direct service invocation with ClientClusterGroup and Service 
> Awareness
> --
>
> Key: IGNITE-20922
> URL: https://issues.apache.org/jira/browse/IGNITE-20922
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Once we have implemented Service Awareness for the Java Thin Client, we might 
> improve the service invocation with `IgniteClient.services(ClientClusterGroup 
> grp)`.
> Consider:
> 1) There is a cluster with some nodes A, B, C, D, F, and a service is 
> deployed on nodes A, B, C.
> 2) Service Awareness is enabled.
> 3) The user limits the set of service instance nodes with 
> `IgniteClient.services('node A', 'node B')`, skipping node C.
> 4) The thin client requests the service on node C (chosen at random) because 
> it has a service instance too.
> 5) Node C redirects the invocation to node A or node B, as the user required.
> We should prevent the redirection call at #5 and call the service only on A 
> or B.
> It looks simple to intersect the passed `ClientClusterGroup` with the known 
> set of service instance nodes before calling the service. I.e. the client can 
> exclude node C from the candidate nodes for the invocation request.
> If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
> service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-20922:
--
Description: 
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with 
`IgniteClient.services('node A', 'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or to node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 

  was:
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with 
`IgniteClient.services('node A', 'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 


> Java Thin: direct service invocation with ClientClusterGroup and Service 
> Awareness
> --
>
> Key: IGNITE-20922
> URL: https://issues.apache.org/jira/browse/IGNITE-20922
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Once we have implemented Service Awareness for the Java Thin Client, we might 
> improve the service invocation with `IgniteClient.services(ClientClusterGroup 
> grp)`.
> Consider:
> 1) There is a cluster with some nodes A, B, C, D, F, and a service is 
> deployed on nodes A, B, C.
> 2) Service Awareness is enabled.
> 3) The user limits the set of service instance nodes with 
> `IgniteClient.services('node A', 'node B')`, skipping node C.
> 4) The thin client requests the service on node C (chosen at random) because 
> it has a service instance too.
> 5) Node C redirects the invocation to node A or to node B, as the user 
> required.
> We should prevent the redirection call at #5 and call the service only on A 
> or B.
> It looks simple to intersect the passed `ClientClusterGroup` with the known 
> set of service instance nodes before calling the service. I.e. the client can 
> exclude node C from the candidate nodes for the invocation request.
> If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
> service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-20922:
--
Description: 
Once we have implemented Service Awareness for the Java Thin Client, we might 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 

  was:
Once we have implemented Service Awareness for the Java Thin Client, we can 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 


> Java Thin: direct service invocation with ClientClusterGroup and Service 
> Awareness
> --
>
> Key: IGNITE-20922
> URL: https://issues.apache.org/jira/browse/IGNITE-20922
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Once we have implemented Service Awareness for the Java Thin Client, we might 
> improve the service invocation with `IgniteClient.services(ClientClusterGroup 
> grp)`.
> Consider:
> 1) There is a cluster with some nodes A, B, C, D, F, and a service is 
> deployed on nodes A, B, C.
> 2) Service Awareness is enabled.
> 3) The user limits the set of service instance nodes with `services('node A', 
> 'node B')`, skipping node C.
> 4) The thin client requests the service on node C (chosen at random) because 
> it has a service instance too.
> 5) Node C redirects the invocation to node A or node B, as the user required.
> We should prevent the redirection call at #5 and call the service only on A 
> or B.
> It looks simple to intersect the passed `ClientClusterGroup` with the known 
> set of service instance nodes before calling the service. I.e. the client can 
> exclude node C from the candidate nodes for the invocation request.
> If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
> service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-20922:
--
Description: 
Once we have implemented Service Awareness for the Java Thin Client, we can 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.


We should prevent the redirection call at #5 and call the service only on A or B.


It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 

  was:
Once we have implemented Service Awareness for the Java Thin Client, we can 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.

We should prevent the redirection call at #5 and call the service only on A or B.

It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 


> Java Thin: direct service invocation with ClientClusterGroup and Service 
> Awareness
> --
>
> Key: IGNITE-20922
> URL: https://issues.apache.org/jira/browse/IGNITE-20922
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Priority: Major
>
> Once we implemented Service Awareness for Java Thin Client, we can improve 
> the service invocation with `IgniteClient.services(ClientClusterGroup grp)`
> Consider:
> 1) There is a cluster with some nodes A, B, C, D, F
> 1) A service is deployed on nodes A,B,C
> 2) Service Awareness is enabled.
> 3) User limits the service instance nodes set with `services('node A', 'node 
> B')` skipping node C.
> 4) The thin client requests the service randomly on node C because it has the 
> service instance too.
> 5) Node C redirects the invocation to node A or node B as user required.
> We should prevent the redirection call at #5 and call service only on A or B.
> It looks simple to intersect passed `ClientClusterGroup` and known set of 
> service instance nodes before the service calling. I.e. the client can 
> exclude node C from the options where to send the invocation request.
> If user chosses `services('node A', 'node B', 'node F')`, where F has no 
> service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20922) Java Thin: direct service invocation with ClientClusterGroup and Service Awareness

2023-11-22 Thread Vladimir Steshin (Jira)
Vladimir Steshin created IGNITE-20922:
-

 Summary: Java Thin: direct service invocation with 
ClientClusterGroup and Service Awareness
 Key: IGNITE-20922
 URL: https://issues.apache.org/jira/browse/IGNITE-20922
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladimir Steshin


Once we have implemented Service Awareness for the Java Thin Client, we can 
improve the service invocation with `IgniteClient.services(ClientClusterGroup 
grp)`.

Consider:
1) There is a cluster with some nodes A, B, C, D, F, and a service is deployed 
on nodes A, B, C.
2) Service Awareness is enabled.
3) The user limits the set of service instance nodes with `services('node A', 
'node B')`, skipping node C.
4) The thin client requests the service on node C (chosen at random) because it 
has a service instance too.
5) Node C redirects the invocation to node A or node B, as the user required.

We should prevent the redirection call at #5 and call the service only on A or B.

It looks simple to intersect the passed `ClientClusterGroup` with the known set 
of service instance nodes before calling the service. I.e. the client can 
exclude node C from the candidate nodes for the invocation request.

If the user chooses `services('node A', 'node B', 'node F')`, where F has no 
service instance, we do not invoke F. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20569) .NET: Thin 3.0: Revise logging

2023-11-22 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-20569:

Description: 
The .NET client in Ignite 3.x uses a custom *IIgniteLogger* interface. This 
approach was copied from the 2.x client; it is outdated and inefficient. 

Revise it:
* Should we add a dependency on 
[Microsoft.Extensions.Logging.Abstractions|https://www.nuget.org/packages/Microsoft.Extensions.Logging.Abstractions/6.0.4#dependencies-body-tab],
 and use standard *ILogger* instead (accept *ILoggerFactory* in config)?
* Without that, how do we support structured logging, templates, code-generated 
loggers?
* How do we integrate with popular loggers? With *ILogger* it comes out of the 
box.


See *Logging guidance for .NET library authors*: 
https://learn.microsoft.com/en-us/dotnet/core/extensions/logging-library-authors

  was:
The .NET client in Ignite 3.x uses a custom *IIgniteLogger* interface. This 
approach was copied from the 2.x client; it is outdated and inefficient. 

Revise it:
* Should we add a dependency on 
[Microsoft.Extensions.Logging.Abstractions|https://www.nuget.org/packages/Microsoft.Extensions.Logging.Abstractions/6.0.4#dependencies-body-tab],
 and use standard *ILogger* instead (accept *ILoggerFactory* in config)?
* Without that, how do we support structured logging, templates, code-generated 
loggers?
* How do we integrate with popular loggers? With *ILogger* it comes out of the 
box.


> .NET: Thin 3.0: Revise logging
> --
>
> Key: IGNITE-20569
> URL: https://issues.apache.org/jira/browse/IGNITE-20569
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms, thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Blocker
>  Labels: .NET, ignite-3
> Fix For: 3.0.0-beta2
>
>
> The .NET client in Ignite 3.x uses a custom *IIgniteLogger* interface. This 
> approach was copied from the 2.x client; it is outdated and inefficient. 
> Revise it:
> * Should we add a dependency on 
> [Microsoft.Extensions.Logging.Abstractions|https://www.nuget.org/packages/Microsoft.Extensions.Logging.Abstractions/6.0.4#dependencies-body-tab],
>  and use standard *ILogger* instead (accept *ILoggerFactory* in config)?
> * Without that, how do we support structured logging, templates, 
> code-generated loggers?
> * How do we integrate with popular loggers? With *ILogger* it comes out of 
> the box.
> See *Logging guidance for .NET library authors*: 
> https://learn.microsoft.com/en-us/dotnet/core/extensions/logging-library-authors



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20874) Node cleanup procedure

2023-11-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov reassigned IGNITE-20874:
--

Assignee:  Kirill Sizov

> Node cleanup procedure
> --
>
> Key: IGNITE-20874
> URL: https://issues.apache.org/jira/browse/IGNITE-20874
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> In the final stage, an RW transaction sends cleanup messages to all enlisted 
> replication groups in the transaction. Since several of these groups may 
> reside on the same node, a node would be notified several times. Besides, the 
> lock release procedure makes sense only once for a specific transaction on 
> the node.
> h3. Definition of done
> Implement a node-wide cleanup. This procedure should be triggered by a direct 
> message to a particular node and release all locks for a specific transaction 
> synchronously (before response). The request also triggers replication 
> cleanup, which fixes all write intents for the specific transaction.
> h3. Implementation notes
> * Add a new message that ought to be named LockReleaseMessage.
> * The new message has a pure network nature (not a replication request).
> * The message should contain a list of replication groups (the list might be 
> empty) and a transaction ID.
> * When the message is received, the node should release all locks held by 
> the transaction, update the lock transaction state map, start the cleanup 
> process over all replication groups in the list, and reply immediately (do 
> not wait for cleanup on the replication groups).
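
A minimal sketch of the receive-side handling described in the notes above. 
`NodeCleanupHandler`, `LockManager`, and `triggerReplicationCleanup` are 
illustrative names, not the actual Ignite internals:

{code}
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch of the receive-side handling; all names are hypothetical. */
final class NodeCleanupHandler {
    interface LockManager {
        void releaseAll(UUID txId); // synchronous lock release for the transaction
    }

    private final LockManager lockManager;

    NodeCleanupHandler(LockManager lockManager) {
        this.lockManager = lockManager;
    }

    /** Handles a LockReleaseMessage: release locks synchronously, then reply. */
    CompletableFuture<Void> onLockRelease(UUID txId, List<String> replicationGroupIds) {
        // 1. Release all locks held by the transaction before responding.
        lockManager.releaseAll(txId);

        // 2. Kick off write-intent cleanup on each group, but do not wait for it.
        for (String groupId : replicationGroupIds) {
            triggerReplicationCleanup(groupId, txId);
        }

        // 3. Reply immediately.
        return CompletableFuture.completedFuture(null);
    }

    private void triggerReplicationCleanup(String groupId, UUID txId) {
        // Placeholder: asynchronously starts cleanup of write intents for txId.
    }
}
{code}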



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20921) Add stopGuard and busyLock to client connector classes

2023-11-22 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-20921:

Description: As noted in [PR 
review|https://github.com/apache/ignite-3/pull/2825#discussion_r1395248456], we 
should have *stopGuard* and *busyLock* in client connector classes, such as 
*ClientHandlerModule*, *ClientPrimaryReplicaTracker*, and potentially some 
others.
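
For context, a minimal sketch of what the stopGuard/busyLock pattern typically 
looks like, here using only JDK primitives (the actual Ignite classes and 
method names may differ):

{code}
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Minimal sketch of the stopGuard/busyLock pattern; not the actual Ignite code. */
final class GuardedComponent {
    /** Ensures stop() runs its shutdown logic at most once. */
    private final AtomicBoolean stopGuard = new AtomicBoolean();

    /** Read side is taken by operations; write side is taken once on stop. */
    private final ReadWriteLock busyLock = new ReentrantReadWriteLock();

    void doWork() {
        busyLock.readLock().lock();
        try {
            if (stopGuard.get()) {
                throw new IllegalStateException("Component is stopped");
            }
            // ... actual work, safe against a concurrent stop() ...
        } finally {
            busyLock.readLock().unlock();
        }
    }

    void stop() {
        if (!stopGuard.compareAndSet(false, true)) {
            return; // already stopped
        }
        busyLock.writeLock().lock(); // waits for in-flight operations to finish
        try {
            // ... release resources ...
        } finally {
            busyLock.writeLock().unlock();
        }
    }
}
{code}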

> Add stopGuard and busyLock to client connector classes
> --
>
> Key: IGNITE-20921
> URL: https://issues.apache.org/jira/browse/IGNITE-20921
> Project: Ignite
>  Issue Type: Improvement
>  Components: thin client
>Affects Versions: 3.0.0-beta1
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> As noted in [PR 
> review|https://github.com/apache/ignite-3/pull/2825#discussion_r1395248456], 
> we should have *stopGuard* and *busyLock* in client connector classes, such 
> as *ClientHandlerModule*, *ClientPrimaryReplicaTracker*, and potentially some 
> others.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20921) Add stopGuard and busyLock to client connector classes

2023-11-22 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-20921:
---

 Summary: Add stopGuard and busyLock to client connector classes
 Key: IGNITE-20921
 URL: https://issues.apache.org/jira/browse/IGNITE-20921
 Project: Ignite
  Issue Type: Improvement
  Components: thin client
Affects Versions: 3.0.0-beta1
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn
 Fix For: 3.0.0-beta2






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-20919) "TransactionException: Replication is timed out" after inserting 9M entries

2023-11-22 Thread Ivan Artiukhov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788672#comment-17788672
 ] 

Ivan Artiukhov edited comment on IGNITE-20919 at 11/22/23 8:29 AM:
---

A similar error occurs on reads via the key-value API. See the same [^logs.zip].

 
{noformat}
Command line: -db site.ycsb.db.ignite3.IgniteClient -t -P 
/opt/pubagent/poc/config/ycsb/workloads/workloadc -threads 1 -p 
operationcount=25 -p recordcount=25 -p warmupops=5 -p 
dataintegrity=true -p measurementtype=timeseries -p status.interval=1 -p 
hosts=192.168.1.107 -p recordcount=2500 -p operationcount=2500 -s
YCSB Client 2023.8
Loading workload...
Data integrity is enabled.
Starting test.
2023-11-21 19:06:29:570 [WARM-UP] 0 sec: 0 operations; est completion in 0 
second 
2023-11-21 19:06:29:570 [WARM-UP] 0 sec: 0 operations; est completion in 0 
second 
[19:06:29][INFO ][Thread-2] Create table request: CREATE TABLE IF NOT EXISTS 
usertable (yscb_key VARCHAR PRIMARY KEY, field0 VARCHAR, field1 VARCHAR, field2 
VARCHAR, field3 VARCHAR, field4 VARCHAR, field5 VARCHAR, field6 VARCHAR, field7 
VARCHAR, field8 VARCHAR, field9 VARCHAR)
DBWrapper: report latency for each error is false and specific error codes to 
track for latency are: []
2023-11-21 19:06:30:568 [WARM-UP] 1 sec: 1447 operations; 1447 current ops/sec; 
est completion in 4 hours 48 minutes 
2023-11-21 19:06:30:568 [WARM-UP] 1 sec: 1447 operations; 1447 current ops/sec; 
est completion in 4 hours 48 minutes 
...
...
2023-11-21 19:50:37:567 [PAYLOAD] 2648 sec: 11867826 operations; 4573 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=246.56] [VERIFY 
AverageLatency(us)=2.87] [READ-FAILED AverageLatency(us)=191.82] 
2023-11-21 19:50:37:567 [PAYLOAD] 2648 sec: 11867826 operations; 4573 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=246.56] [VERIFY 
AverageLatency(us)=2.87] [READ-FAILED AverageLatency(us)=191.82] 
2023-11-21 19:50:38:567 [PAYLOAD] 2649 sec: 11868355 operations; 529 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=518.38] [VERIFY 
AverageLatency(us)=3.23] [READ-FAILED AverageLatency(us)=695.92] 
2023-11-21 19:50:38:567 [PAYLOAD] 2649 sec: 11868355 operations; 529 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=518.38] [VERIFY 
AverageLatency(us)=3.23] [READ-FAILED AverageLatency(us)=695.92] 
2023-11-21 19:50:39:567 [PAYLOAD] 2650 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:39:567 [PAYLOAD] 2650 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:40:567 [PAYLOAD] 2651 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:40:567 [PAYLOAD] 2651 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
...
...
2023-11-21 19:51:37:567 [PAYLOAD] 2708 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:38:567 [PAYLOAD] 2709 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:38:567 [PAYLOAD] 2709 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:39:567 [PAYLOAD] 2710 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:39:567 [PAYLOAD] 2710 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
[19:51:40][ERROR][Thread-2] Error reading key: user3512309439425897761
org.apache.ignite.lang.IgniteException: The primary replica await timed out 
[replicationGroupId=5_part_19, referenceTimestamp=HybridTimestamp 
[physical=2023-11-21 19:51:10:466 +0300, logical=7, 
composite=111449569392459783], currentLease=Lease 
[leaseholder=poc-tester-SERVER-192.168.1.107-id-0, 
leaseholderId=0ea1764c-7664-4b61-83da-b7ed592082ae, accepted=false, 
startTime=HybridTimestamp [physical=2023-11-21 19:51:09:453 +0300, logical=46, 
composite=111449569326071854], expirationTime=HybridTimestamp 
[physical=2023-11-21 19:53:09:453 +0300, logical=0, 
composite=111449577190391808], prolongable=false, replicationGroupId=5_part_19]]
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) 
~[?:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:754) 
~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:688)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.client.ClientUtils.copyExceptionWithCauseIfPossible(ClientUtils.java:73)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
    at 

[jira] [Commented] (IGNITE-20919) "TransactionException: Replication is timed out" after inserting 9M entries

2023-11-22 Thread Ivan Artiukhov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788672#comment-17788672
 ] 

Ivan Artiukhov commented on IGNITE-20919:
-

A similar error occurs on reads via the key-value API. See the same [^logs.zip].

 
{noformat}
Command line: -db site.ycsb.db.ignite3.IgniteClient -t -P 
/opt/pubagent/poc/config/ycsb/workloads/workloadc -threads 1 -p 
operationcount=25 -p recordcount=25 -p warmupops=5 -p 
dataintegrity=true -p measurementtype=timeseries -p status.interval=1 -p 
hosts=192.168.1.107 -p recordcount=2500 -p operationcount=2500 -s
YCSB Client 2023.8
Loading workload...
Data integrity is enabled.
Starting test.
2023-11-21 19:06:29:570 [WARM-UP] 0 sec: 0 operations; est completion in 0 
second 
2023-11-21 19:06:29:570 [WARM-UP] 0 sec: 0 operations; est completion in 0 
second 
[19:06:29][INFO ][Thread-2] Create table request: CREATE TABLE IF NOT EXISTS 
usertable (yscb_key VARCHAR PRIMARY KEY, field0 VARCHAR, field1 VARCHAR, field2 
VARCHAR, field3 VARCHAR, field4 VARCHAR, field5 VARCHAR, field6 VARCHAR, field7 
VARCHAR, field8 VARCHAR, field9 VARCHAR)
DBWrapper: report latency for each error is false and specific error codes to 
track for latency are: []
2023-11-21 19:06:30:568 [WARM-UP] 1 sec: 1447 operations; 1447 current ops/sec; 
est completion in 4 hours 48 minutes 
2023-11-21 19:06:30:568 [WARM-UP] 1 sec: 1447 operations; 1447 current ops/sec; 
est completion in 4 hours 48 minutes 
...
...
2023-11-21 19:50:37:567 [PAYLOAD] 2648 sec: 11867826 operations; 4573 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=246.56] [VERIFY 
AverageLatency(us)=2.87] [READ-FAILED AverageLatency(us)=191.82] 
2023-11-21 19:50:37:567 [PAYLOAD] 2648 sec: 11867826 operations; 4573 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=246.56] [VERIFY 
AverageLatency(us)=2.87] [READ-FAILED AverageLatency(us)=191.82] 
2023-11-21 19:50:38:567 [PAYLOAD] 2649 sec: 11868355 operations; 529 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=518.38] [VERIFY 
AverageLatency(us)=3.23] [READ-FAILED AverageLatency(us)=695.92] 
2023-11-21 19:50:38:567 [PAYLOAD] 2649 sec: 11868355 operations; 529 current 
ops/sec; est completion in 49 minutes [READ AverageLatency(us)=518.38] [VERIFY 
AverageLatency(us)=3.23] [READ-FAILED AverageLatency(us)=695.92] 
2023-11-21 19:50:39:567 [PAYLOAD] 2650 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:39:567 [PAYLOAD] 2650 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:40:567 [PAYLOAD] 2651 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
2023-11-21 19:50:40:567 [PAYLOAD] 2651 sec: 11868355 operations; 0 current 
ops/sec; est completion in 49 minutes    
...
...
2023-11-21 19:51:37:567 [PAYLOAD] 2708 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:38:567 [PAYLOAD] 2709 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:38:567 [PAYLOAD] 2709 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:39:567 [PAYLOAD] 2710 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
2023-11-21 19:51:39:567 [PAYLOAD] 2710 sec: 11868394 operations; 0 current 
ops/sec; est completion in 50 minutes    
[19:51:40][ERROR][Thread-2] Error reading key: user3512309439425897761
org.apache.ignite.lang.IgniteException: The primary replica await timed out 
[replicationGroupId=5_part_19, referenceTimestamp=HybridTimestamp 
[physical=2023-11-21 19:51:10:466 +0300, logical=7, 
composite=111449569392459783], currentLease=Lease 
[leaseholder=poc-tester-SERVER-192.168.1.107-id-0, 
leaseholderId=0ea1764c-7664-4b61-83da-b7ed592082ae, accepted=false, 
startTime=HybridTimestamp [physical=2023-11-21 19:51:09:453 +0300, logical=46, 
composite=111449569326071854], expirationTime=HybridTimestamp 
[physical=2023-11-21 19:53:09:453 +0300, logical=0, 
composite=111449577190391808], prolongable=false, replicationGroupId=5_part_19]]
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) 
~[?:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:754) 
~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:688)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.client.ClientUtils.copyExceptionWithCauseIfPossible(ClientUtils.java:73)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
    at 
org.apache.ignite.internal.client.ClientUtils.ensurePublicException(ClientUtils.java:54)
 

[jira] [Updated] (IGNITE-20920) .NET: TestPutRoutesRequestToPrimaryNode is flaky

2023-11-22 Thread Pavel Tupitsyn (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Tupitsyn updated IGNITE-20920:

Description: 
Test history: 
https://ci.ignite.apache.org/test/6220606330906984960?currentProjectId=ApacheIgnite3xGradle_Test=true

Error:
{code}
Apache.Ignite.IgniteClientConnectionException : Exception while reading from 
socket, connection closed: Connection lost (failed to read data from socket)
  > Apache.Ignite.IgniteClientConnectionException : Connection lost (failed 
to read data from socket)
  > System.Net.Sockets.SocketException : Software caused connection abort
   at 
Apache.Ignite.Internal.ClientFailoverSocket.DoOutInOpAndGetSocketAsync(ClientOp 
clientOp, Transaction tx, PooledArrayBuffer request, PreferredNode 
preferredNode, IRetryPolicy retryPolicyOverride) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/ClientFailoverSocket.cs:line
 195
   at Apache.Ignite.Internal.Table.RecordView`1.DoOutInOpAsync(ClientOp 
clientOp, Transaction tx, PooledArrayBuffer request, PreferredNode 
preferredNode, IRetryPolicy retryPolicyOverride) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/Table/RecordView.cs:line
 400
   at Apache.Ignite.Internal.Table.RecordView`1.DoRecordOutOpAsync(ClientOp op, 
ITransaction transaction, T record, Boolean keyOnly, Nullable`1 
schemaVersionOverride) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/Table/RecordView.cs:line
 425
   at Apache.Ignite.Internal.Table.RecordView`1.UpsertAsync(ITransaction 
transaction, T record) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/Table/RecordView.cs:line
 172
   at 
Apache.Ignite.Tests.PartitionAwarenessRealClusterTests.TestPutRoutesRequestToPrimaryNode()
 in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite.Tests/PartitionAwarenessRealClusterTests.cs:line
 65
   at 
NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
   at 
NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter
 awaiter)
   at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
   at 
NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext
 context)
   at 
NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext
 context)
   at 
NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.b__0()
   at 
NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext
 context, Action action)
--IgniteClientConnectionException
   at Apache.Ignite.Internal.ClientSocket.ReceiveBytesAsync(Stream stream, 
Byte[] buffer, Int32 size, CancellationToken cancellationToken) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/ClientSocket.cs:line
 498
   at Apache.Ignite.Internal.ClientSocket.ReadMessageSizeAsync(Stream stream, 
Byte[] buffer, CancellationToken cancellationToken) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/ClientSocket.cs:line
 478
   at Apache.Ignite.Internal.ClientSocket.ReadResponseAsync(Stream stream, 
Byte[] messageSizeBytes, CancellationToken cancellationToken) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/ClientSocket.cs:line
 452
   at Apache.Ignite.Internal.ClientSocket.RunReceiveLoop(CancellationToken 
cancellationToken) in 
/opt/buildagent/work/b8d4df1365f1f1e5/modules/platforms/dotnet/Apache.Ignite/Internal/ClientSocket.cs:line
 700
{code}

> .NET: TestPutRoutesRequestToPrimaryNode is flaky
> 
>
> Key: IGNITE-20920
> URL: https://issues.apache.org/jira/browse/IGNITE-20920
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms, thin client
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> Test history: 
> https://ci.ignite.apache.org/test/6220606330906984960?currentProjectId=ApacheIgnite3xGradle_Test=true
> Error:
> {code}
> Apache.Ignite.IgniteClientConnectionException : Exception while reading from 
> socket, connection closed: Connection lost (failed to read data from socket)
>   > Apache.Ignite.IgniteClientConnectionException : Connection lost 
> (failed to read data from socket)
>   > System.Net.Sockets.SocketException : Software caused connection abort
>at 
> Apache.Ignite.Internal.ClientFailoverSocket.DoOutInOpAndGetSocketAsync(ClientOp
>  clientOp, Transaction tx, PooledArrayBuffer request, PreferredNode 
> preferredNode, IRetryPolicy retryPolicyOverride) in 
> 

[jira] [Created] (IGNITE-20920) .NET: TestPutRoutesRequestToPrimaryNode is flaky

2023-11-22 Thread Pavel Tupitsyn (Jira)
Pavel Tupitsyn created IGNITE-20920:
---

 Summary: .NET: TestPutRoutesRequestToPrimaryNode is flaky
 Key: IGNITE-20920
 URL: https://issues.apache.org/jira/browse/IGNITE-20920
 Project: Ignite
  Issue Type: Bug
  Components: platforms, thin client
Reporter: Pavel Tupitsyn
Assignee: Pavel Tupitsyn
 Fix For: 3.0.0-beta2






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-20919) "TransactionException: Replication is timed out" after inserting 9M entries

2023-11-22 Thread Ivan Artiukhov (Jira)
Ivan Artiukhov created IGNITE-20919:
---

 Summary: "TransactionException: Replication is timed out" after 
inserting 9M entries
 Key: IGNITE-20919
 URL: https://issues.apache.org/jira/browse/IGNITE-20919
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Reporter: Ivan Artiukhov
 Attachments: logs.zip

AI3, rev. e1c9b1c4cf589c71aecc4815a3c8a14ae8fbf2f3 (Nov 21 2023)

 

Benchmark: 
[https://github.com/gridgain/YCSB/blob/ycsb-2023.8/ignite3/src/main/java/site/ycsb/db/ignite3/IgniteClient.java]
 

The benchmark uses the key-value API to put/get entries. 
h1. Setup

1 server node

aipersist, 25 partitions, raft.fsync=false
h1. Steps

Run a single instance of the benchmark in preload mode with 1 thread. Number of 
unique entries – 25 million.
{noformat}
Command line: -db site.ycsb.db.ignite3.IgniteClient -load -P 
/opt/pubagent/poc/config/ycsb/workloads/workloadc -threads 1 -p 
recordcount=25 -p warmupops=5 -p dataintegrity=true -p 
measurementtype=timeseries -p status.interval=1 -p hosts=192.168.1.107 -p 
recordcount=2500 -p operationcount=2500 -s
{noformat}
h1. Expected result

All 25 million entries were loaded without errors. 
h1. Actual result

{{TransactionException: Replication is timed out}} after inserting 9.1 million 
entries:
{noformat}
Starting test.
2023-11-21 18:11:29:396 [WARM-UP] 0 sec: 0 operations; est completion in 0 
second 
...
2023-11-21 19:06:21:394 [PAYLOAD] 3292 sec: 9151900 operations; 1880 current 
ops/sec; est completion in 1 hour 35 minutes [INSERT AverageLatency(us)=522.76] 
2023-11-21 19:06:21:394 [PAYLOAD] 3292 sec: 9151900 operations; 1880 current 
ops/sec; est completion in 1 hour 35 minutes [INSERT AverageLatency(us)=522.76] 
2023-11-21 19:06:22:394 [PAYLOAD] 3293 sec: 9153100 operations; 1200 current 
ops/sec; est completion in 1 hour 35 minutes [INSERT AverageLatency(us)=531.8] 
2023-11-21 19:06:22:394 [PAYLOAD] 3293 sec: 9153100 operations; 1200 current 
ops/sec; est completion in 1 hour 35 minutes [INSERT AverageLatency(us)=531.8] 
2023-11-21 19:06:23:394 [PAYLOAD] 3294 sec: 9153100 operations; 0 current 
ops/sec; est completion in 1 hour 35 minutes  
2023-11-21 19:06:23:394 [PAYLOAD] 3294 sec: 9153100 operations; 0 current 
ops/sec; est completion in 1 hour 35 minutes  
2023-11-21 19:06:24:394 [PAYLOAD] 3295 sec: 9153100 operations; 0 current 
ops/sec; est completion in 1 hour 35 minutes  
2023-11-21 19:06:24:394 [PAYLOAD] 3295 sec: 9153100 operations; 0 current 
ops/sec; est completion in 1 hour 35 minutes  
[19:06:25][ERROR][Thread-2] Error inserting key: user3519618738173805240
org.apache.ignite.tx.TransactionException: Replication is timed out 
[replicaGrpId=5_part_19]
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:754) 
~[ignite-core-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:688)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.client.ClientUtils.copyExceptionWithCauseIfPossible(ClientUtils.java:73)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.client.ClientUtils.ensurePublicException(ClientUtils.java:54)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.client.ClientUtils.sync(ClientUtils.java:97) 
~[ignite-client-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.client.table.ClientKeyValueBinaryView.put(ClientKeyValueBinaryView.java:168)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.client.table.ClientKeyValueBinaryView.put(ClientKeyValueBinaryView.java:47)
 ~[ignite-client-3.0.0-SNAPSHOT.jar:?]
at site.ycsb.db.ignite3.IgniteClient.insert(IgniteClient.java:127) 
[ignite3-binding-2023.8.jar:?]
at site.ycsb.DBWrapper.insert(DBWrapper.java:237) [core-2023.8.jar:?]
at site.ycsb.workloads.CoreWorkload.doInsert(CoreWorkload.java:623) 
[core-2023.8.jar:?]
at site.ycsb.ClientThread.run(ClientThread.java:167) [core-2023.8.jar:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.apache.ignite.tx.TransactionException: Replication is timed out 
[replicaGrpId=5_part_19]
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:754) 
~[ignite-core-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:688)
 ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
at 

[jira] [Assigned] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-20723:
--

Assignee: (was: Vladislav Pyatkov)

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
> ... 7 more
> at 
> app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
> at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:147)
> at 
> app//org.apache.ignite.internal.sqllogic.Statement.execute(Statement.java:112)
> at 
> app//org.apache.ignite.internal.sqllogic.SqlScriptRunner.run(SqlScriptRunner.java:70)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$assertTimeoutPreemptively$0(AssertTimeoutPreemptively.java:48)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$submitTask$3(AssertTimeoutPreemptively.java:95)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base@11.0.17/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 Replication is timed 
> out [replicaGrpId=291_part_3]
> at 
> java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
> at 
> 

[jira] [Assigned] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-20723:
--

Assignee: Vladislav Pyatkov

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
> ... 7 more
> at 
> app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
> at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:147)
> at 
> app//org.apache.ignite.internal.sqllogic.Statement.execute(Statement.java:112)
> at 
> app//org.apache.ignite.internal.sqllogic.SqlScriptRunner.run(SqlScriptRunner.java:70)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$assertTimeoutPreemptively$0(AssertTimeoutPreemptively.java:48)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$submitTask$3(AssertTimeoutPreemptively.java:95)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base@11.0.17/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 Replication is timed 
> out [replicaGrpId=291_part_3]
> at 
> java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
> at 
> 

[jira] [Assigned] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-20723:
--

Assignee: (was: Vladislav Pyatkov)

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
> ... 7 more
> at 
> app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
> at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:147)
> at 
> app//org.apache.ignite.internal.sqllogic.Statement.execute(Statement.java:112)
> at 
> app//org.apache.ignite.internal.sqllogic.SqlScriptRunner.run(SqlScriptRunner.java:70)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$assertTimeoutPreemptively$0(AssertTimeoutPreemptively.java:48)
> at 
> app//org.junit.jupiter.api.AssertTimeoutPreemptively.lambda$submitTask$3(AssertTimeoutPreemptively.java:95)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base@11.0.17/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base@11.0.17/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.ignite.tx.TransactionException: IGN-REP-3 
> TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 
> org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: 
> IGN-REP-3 TraceId:0002850d-2b24-4356-a0dc-2c25af4202a1 Replication is timed 
> out [replicaGrpId=291_part_3]
> at 
> java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
> at 
> 

[jira] [Assigned] (IGNITE-20918) Leases expire after a node has been restarted

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev reassigned IGNITE-20918:


Assignee: Alexander Lapin

> Leases expire after a node has been restarted
> -
>
> Key: IGNITE-20918
> URL: https://issues.apache.org/jira/browse/IGNITE-20918
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksandr Polovtcev
>Assignee: Alexander Lapin
>Priority: Critical
>  Labels: ignite-3
>
> IGNITE-20910 introduces a test that inserts some data after restarting a 
> node. For some reason, after some time, I can see the following messages in 
> the log:
> {noformat}
> [2023-11-22T10:00:17,056][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_19]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_0]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_9]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_10]
> {noformat}
> After that, the test fails with a {{PrimaryReplicaMissException}}. The 
> problem here is that a single node is expected to never have expired 
> leases; they should be prolonged automatically. I think this happens 
> because the initial lease that was issued before the node was restarted is 
> still accepted by the node after restart.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788665#comment-17788665
 ] 

Vladislav Pyatkov edited comment on IGNITE-20723 at 11/22/23 8:12 AM:
--

The test failure is not related to the placement driver, because cluster nodes 
cannot join due to a problem in the discovery procedure. The root cause of the 
issue is in ScaleCube.

{noformat}
[2023-10-06T13:09:05,844][WARN ][sc-cluster-3345-2][MetadataStore] 
[default:sqllogic1:6745409cb23745f0@10.233.107.199:3345][a4e654b1-b252-4076-bef6-c244f6043163]
 Timeout getting GetMetadataResp from 10.233.107.199:3344 within 1000 ms, 
cause: java.util.concurrent.TimeoutException: Did not observe any item or 
terminal signal within 1000ms in 'source(MonoDefer)' (
and no fallback has been configured)
[2023-10-06T13:09:05,844][WARN ][sc-cluster-3345-2][MembershipProtocol] 
[default:sqllogic1:6745409cb23745f0@10.233.107.199:3345][updateMembership][MEMBERSHIP_GOSSIP]
 Skipping to add/update member: {m: 
default:sqllogic0:1c57e22f10fd47df@10.233.107.199:3344, s: ALIVE, inc: 6}, due 
to failed fetchMetadata call (cause: java.util.concurrent.TimeoutException: Did 
not
 observe any item or terminal signal within 1000ms in 'source(MonoDefer)' (and 
no fallback has been configured))
{noformat}

[~rpuch] is aware of the issue and participated in the investigation.


was (Author: v.pyatkov):
The test failure is not related to the placement driver, because cluster nodes 
cannot join due to a problem in the discovery procedure. The root cause of the 
issue is in ScaleCube.

{noformat}
[2023-11-17T00:20:22,152][WARN ][sc-cluster-3345-2][MetadataStore] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][0ccd29d5-2fc2-449c-942c-ae42760adf7d]
 Timeout getting GetMetadataResp from 10.233.107.205:3344 within 1000 ms, 
cause: java.util.concurrent.TimeoutException: Did not observe any item or 
terminal signal within 1000ms in 'source(MonoDefer)' (a
nd no fallback has been configured)
[2023-11-17T00:20:22,153][WARN ][sc-cluster-3345-2][MembershipProtocol] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][updateMembership][MEMBERSHIP_GOSSIP]
 Skipping to add/update member: {m: 
default:sqllogic0:6a78c57fcd0a496d@10.233.107.205:3344, s: ALIVE, inc: 9}, due 
to failed fetchMetadata call (cause: java.util.concurrent.TimeoutException: Did 
not observe any item or terminal signal within 1000ms in 'source(MonoDefer)' 
(and no fallback has been configured))
{noformat}

[~rpuch] is aware of the issue and participated in the investigation.

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> 

[jira] [Commented] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788665#comment-17788665
 ] 

Vladislav Pyatkov commented on IGNITE-20723:


The test failure is not related to the placement driver, because cluster nodes 
cannot join due to a problem in the discovery procedure. The root cause of the 
issue is in ScaleCube.

{noformat}
[2023-11-17T00:20:22,152][WARN ][sc-cluster-3345-2][MetadataStore] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][0ccd29d5-2fc2-449c-942c-ae42760adf7d]
 Timeout getting GetMetadataResp from 10.233.107.205:3344 within 1000 ms, 
cause: java.util.concurrent.TimeoutException: Did not observe any item or 
terminal signal within 1000ms in 'source(MonoDefer)' (a
nd no fallback has been configured)
[2023-11-17T00:20:22,153][WARN ][sc-cluster-3345-2][MembershipProtocol] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][updateMembership][MEMBERSHIP_GOSSIP]
 Skipping to add/update member: {m: 
default:sqllogic0:6a78c57fcd0a496d@10.233.107.205:3344, s: ALIVE, inc: 9}, due 
to failed fetchMetadata call (cause: java.util.concurrent.TimeoutException: Did 
not observe any item or terminal signal within 1000ms in 'source(MonoDefer)' 
(and no fallback has been configured))

{noformat}

[~rpuch] is aware of the issue and participated in the investigation.

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
> at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
> ... 7 more
> at 
> app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:42)
> at app//org.junit.jupiter.api.Assertions.fail(Assertions.java:147)
> at 
> app//org.apache.ignite.internal.sqllogic.Statement.execute(Statement.java:112)
> at 
> 

[jira] [Updated] (IGNITE-20918) Leases expire after a node has been restarted

2023-11-22 Thread Aleksandr Polovtcev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Polovtcev updated IGNITE-20918:
-
Description: 
IGNITE-20910 introduces a test that inserts some data after restarting a node. 
For some reason, after some time, I can see the following messages in the log:

{noformat}
[2023-11-22T10:00:17,056][INFO 
][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary 
replica expired [grp=5_part_19]
[2023-11-22T10:00:17,057][INFO 
][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary 
replica expired [grp=5_part_0]
[2023-11-22T10:00:17,057][INFO 
][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary 
replica expired [grp=5_part_9]
[2023-11-22T10:00:17,057][INFO 
][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary 
replica expired [grp=5_part_10]
{noformat}

After that, the test fails with a {{PrimaryReplicaMissException}}. The problem 
here is that a single node is expected to never have expired leases; they 
should be prolonged automatically. I think this happens because the initial 
lease that was issued before the node was restarted is still accepted by the 
node after restart.
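
As a rough illustration of the suspected problem, here is a minimal sketch of 
a restart-aware lease check ({{LeaseAcceptanceGuard}}, {{nodeStartTime}} and 
{{shouldAccept}} are hypothetical names, not the actual ReplicaManager API):

{code:java}
import org.apache.ignite.internal.hlc.HybridTimestamp;

// Hypothetical guard for the lease-acceptance path: a lease whose start time
// precedes this node's (re)start should not be accepted, otherwise the stale
// lease masks the need for a fresh grant and later expires without being
// prolonged.
class LeaseAcceptanceGuard {
    private final HybridTimestamp nodeStartTime; // captured once on node startup

    LeaseAcceptanceGuard(HybridTimestamp nodeStartTime) {
        this.nodeStartTime = nodeStartTime;
    }

    boolean shouldAccept(HybridTimestamp leaseStartTime) {
        // Accept only leases granted after this node started.
        return leaseStartTime.compareTo(nodeStartTime) > 0;
    }
}
{code}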

> Leases expire after a node has been restarted
> -
>
> Key: IGNITE-20918
> URL: https://issues.apache.org/jira/browse/IGNITE-20918
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksandr Polovtcev
>Priority: Critical
>  Labels: ignite-3
>
> IGNITE-20910 introduces a test that inserts some data after restarting a 
> node. For some reason, after some time, I can see the following messages in 
> the log:
> {noformat}
> [2023-11-22T10:00:17,056][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_19]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_0]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_9]
> [2023-11-22T10:00:17,057][INFO 
> ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] 
> Primary replica expired [grp=5_part_10]
> {noformat}
> After that, the test fails with a {{PrimaryReplicaMissException}}. The 
> problem here is that a single node is expected to never have expired 
> leases; they should be prolonged automatically. I think this happens 
> because the initial lease that was issued before the node was restarted is 
> still accepted by the node after restart.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-20723) Tests fail on TC because a primary replica is not assigned or does not respond

2023-11-22 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788665#comment-17788665
 ] 

Vladislav Pyatkov edited comment on IGNITE-20723 at 11/22/23 8:10 AM:
--

The test failure is not related to the placement driver, because cluster nodes 
cannot join due to a problem in the discovery procedure. The root cause of the 
issue is in ScaleCube.

{noformat}
[2023-11-17T00:20:22,152][WARN ][sc-cluster-3345-2][MetadataStore] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][0ccd29d5-2fc2-449c-942c-ae42760adf7d]
 Timeout getting GetMetadataResp from 10.233.107.205:3344 within 1000 ms, 
cause: java.util.concurrent.TimeoutException: Did not observe any item or 
terminal signal within 1000ms in 'source(MonoDefer)' (a
nd no fallback has been configured)
[2023-11-17T00:20:22,153][WARN ][sc-cluster-3345-2][MembershipProtocol] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][updateMembership][MEMBERSHIP_GOSSIP]
 Skipping to add/update member: {m: 
default:sqllogic0:6a78c57fcd0a496d@10.233.107.205:3344, s: ALIVE, inc: 9}, due 
to failed fetchMetadata call (cause: java.util.concurrent.TimeoutException: Did 
not observe any item or terminal signal within 1000ms in 'source(MonoDefer)' 
(and no fallback has been configured))
{noformat}

[~rpuch] is aware of the issue and participated in the investigation.


was (Author: v.pyatkov):
The test failure is not related to the placement driver, because cluster nodes 
cannot join due to a problem in the discovery procedure. The root cause of the 
issue is in ScaleCube.

{noformat}
[2023-11-17T00:20:22,152][WARN ][sc-cluster-3345-2][MetadataStore] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][0ccd29d5-2fc2-449c-942c-ae42760adf7d]
 Timeout getting GetMetadataResp from 10.233.107.205:3344 within 1000 ms, 
cause: java.util.concurrent.TimeoutException: Did not observe any item or 
terminal signal within 1000ms in 'source(MonoDefer)' (a
nd no fallback has been configured)
[2023-11-17T00:20:22,153][WARN ][sc-cluster-3345-2][MembershipProtocol] 
[default:sqllogic1:1ca7b2f5308489d@10.233.107.205:3345][updateMembership][MEMBERSHIP_GOSSIP]
 Skipping to add/update member: {m: 
default:sqllogic0:6a78c57fcd0a496d@10.233.107.205:3344, s: ALIVE, inc: 9}, due 
to failed fetchMetadata call (cause: java.util.concurrent.TimeoutException: Did 
not observe any item or terminal signal within 1000ms in 'source(MonoDefer)' 
(and no fallback has been configured))

{noformat}

[~rpuch] is aware of the issue and participated in the investigation.

> Tests fail on TC because a primary replica is not assigned or does not respond
> --
>
> Key: IGNITE-20723
> URL: https://issues.apache.org/jira/browse/IGNITE-20723
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_SQL_Logic_11804.log.zip
>
>
> TC run is available 
> [here|https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7584713?hideProblemsFromDependencies=false=false=true=true=true].
> By my brief analysis, the issue is somewhere in the assignments:
> {noformat}
> [2023-10-06T13:03:51,231][INFO 
> ][%sqllogic0%Raft-Group-Client-1][PlacementDriverManager] Placement driver 
> active actor is starting.
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=291_part_3, force=false
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%MessagingService-inbound--0][ReplicaManager] Waiting for actual 
> storage state, group=291_part_3
> [2023-10-06T13:08:38,981][INFO 
> ][%sqllogic1%JRaft-Request-Processor-5][ReplicaManager] Lease accepted, 
> group=291_part_3, leaseStartTime=HybridTimestamp [time=88228107862067, 
> physical=1696597718931, logical=51], leaseExpirationTime=HybridTimestamp 
> [time=88235972182016, physical=1696597838931, logical=0]
> [2023-10-06T13:08:50,256][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request type=ActionRequestImpl occurred (will be retried on the 
> randomly selected node):
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
> at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
> at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
> at 
> 

[jira] [Updated] (IGNITE-20603) Restore logical topology change event on a node restart

2023-11-22 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20603:
-
Description: 
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap and other 
states in {{DistributionZoneManager#createMetastorageTopologyListener}}. That 
means that the augmentation that must be added to 
{{zone.topologyAugmentationMap}} wasn't added and we need to recover this 
information, or nodesAttributes weren't propagated to the MS.

h3. *Definition of done*
On a node restart, all states that were going to be updated during the watch 
event in {{DistributionZoneManager#createMetastorageTopologyListener}} must be 
recovered.

h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and add them 
to the topologyAugmentationMap. As a revision (key for topologyAugmentationMap) 
take {{MS.local.logicalTopology.revision}}. It is safe to take this revision, 
because if some node was added to the {{ms.topology}} after immediate data 
nodes recalculation, this added node must restore this immediate data nodes' 
recalculation intent.
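
A minimal sketch of the counter-based reconstruction described above (it 
illustrates these now-outdated notes; the class and parameter names are 
hypothetical, not the actual DistributionZoneManager structures):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Rebuilds lastSeenTopology: start from the node counters read from
// MS.local.dataNodes, replay the augmentation-map deltas recorded after
// max(scaleUpTriggerKey, scaleDownTriggerKey), then keep only the nodes
// whose resulting counter is positive.
final class LastSeenTopology {
    static Set<String> compute(
            Map<String, Integer> dataNodeCounters,           // from MS.local.dataNodes
            Map<Long, Map<String, Integer>> augmentationMap, // revision -> node deltas (+1 add, -1 remove)
            long leftBound                                   // max(scaleUpTriggerKey, scaleDownTriggerKey)
    ) {
        Map<String, Integer> counters = new HashMap<>(dataNodeCounters);

        augmentationMap.forEach((revision, deltas) -> {
            if (revision > leftBound) {
                deltas.forEach((node, delta) -> counters.merge(node, delta, Integer::sum));
            }
        });

        // Only nodes with a positive counter are considered present.
        return counters.entrySet().stream()
                .filter(e -> e.getValue() > 0)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }
}
{code}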

UPD: the implementation notes above are outdated; we've implemented a slightly 
different approach. Now we save the last handled topology to the MS, and on 
restart we restore the global states from the local metastorage and check 
whether the current ms.logicalTopology differs from the one that was handled 
in DistributionZoneManager#createMetastorageTopologyListener (we compare the 
revisions of these events); if it does, we repeat the logic from 
DistributionZoneManager#createMetastorageTopologyListener with the new logical 
topology from ms.logicalTopology.
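
A rough sketch of this recovery check (the snapshot type and the 
onLogicalTopologyChanged hook are hypothetical names standing in for the real 
DistributionZoneManager machinery):

{code:java}
import java.util.Set;

// Hypothetical sketch of the restart recovery step described above.
class LogicalTopologyRecovery {
    // Snapshot: metastorage revision of the topology event plus its node set.
    record Snapshot(long revision, Set<String> nodes) { }

    // On restart: if ms.logicalTopology advanced past the last handled
    // revision, replay the topology-listener logic for the newer topology.
    void recoverLogicalTopologyEvent(Snapshot lastHandled, Snapshot current) {
        if (current.revision() > lastHandled.revision()) {
            onLogicalTopologyChanged(current);
        }
    }

    // Stand-in for the logic of
    // DistributionZoneManager#createMetastorageTopologyListener.
    void onLogicalTopologyChanged(Snapshot topology) {
        // schedule scale-up/scale-down timers, update topologyAugmentationMap, ...
    }
}
{code}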


  was:
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap and other 
states in {{DistributionZoneManager#createMetastorageTopologyListener}}. That 
means that the augmentation that must be added to 
{{zone.topologyAugmentationMap}} wasn't added and we need to recover this 
information, or nodesAttributes weren't propagated to the MS.

h3. *Definition of done*
On a node restart, all states that were going to be updated during the watch 
event in {{DistributionZoneManager#createMetastorageTopologyListener}} must be 
recovered.

h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and 

[jira] [Updated] (IGNITE-20603) Restore logical topology change event on a node restart

2023-11-22 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20603:
-
Description: 
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap and other 
states in {{DistributionZoneManager#createMetastorageTopologyListener}}. That 
means that the augmentation that must be added to 
{{zone.topologyAugmentationMap}} wasn't added and we need to recover this 
information, or nodesAttributes weren't propagated to the MS.

h3. *Definition of done*
On a node restart, all states that were going to be updated during the watch 
event in {{DistributionZoneManager#createMetastorageTopologyListener}} must be 
recovered.

h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and add them 
to the topologyAugmentationMap. As a revision (key for topologyAugmentationMap) 
take {{MS.local.logicalTopology.revision}}. It is safe to take this revision, 
because if some node was added to the {{ms.topology}} after immediate data 
nodes recalculation, this added node must restore this immediate data nodes' 
recalculation intent.

UPD: the implementation notes above are outdated; we've implemented a slightly 
different approach. Now we save the last handled topology to the MS, and on 
restart we restore the global states from the local metastorage and check 
whether the current ms.logicalTopology differs from the one that was handled 
in DistributionZoneManager#createMetastorageTopologyListener (we compare the 
revisions of these events); if it does, we repeat the logic from 
DistributionZoneManager#createMetastorageTopologyListener with the new logical 
topology from ms.logicalTopology.


  was:
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap and other 
states in {{DistributionZoneManager#createMetastorageTopologyListener}}. That 
means that the augmentation that must be added to 
{{zone.topologyAugmentationMap}} wasn't added and we need to recover this 
information, or nodesAttributes weren't propagated to the MS.

h3. *Definition of done*
On a node restart, all states that were going to be updated during the watch 
event in {{DistributionZoneManager#createMetastorageTopologyListener}} must be 
recovered.

h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and 

[jira] [Updated] (IGNITE-20603) Restore logical topology change event on a node restart

2023-11-22 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20603:
-
Description: 
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap and other 
states in {{DistributionZoneManager#createMetastorageTopologyListener}}. That 
means that the augmentation that must be added to 
{{zone.topologyAugmentationMap}} wasn't added and we need to recover this 
information, or nodesAttributes weren't propagated to the MS.

h3. *Definition of done*
On a node restart, all states that were going to be updated during the watch 
event in {{DistributionZoneManager#createMetastorageTopologyListener}} must be 
recovered.

h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and add them 
to the topologyAugmentationMap. As a revision (key for topologyAugmentationMap) 
take {{MS.local.logicalTopology.revision}}. It is safe to take this revision, 
because if some node was added to the {{ms.topology}} after immediate data 
nodes recalculation, this added node must restore this immediate data nodes' 
recalculation intent.

UPD: the implementation notes above are outdated; we've implemented a slightly 
different approach. Now we save the last handled topology to the MS, and on 
restart we check whether the current ms.logicalTopology differs from the one 
that was handled in DistributionZoneManager#createMetastorageTopologyListener 
(we compare the revisions of these events); if it does, we repeat the logic 
from DistributionZoneManager#createMetastorageTopologyListener with the new 
logical topology from ms.logicalTopology.


  was:
h3. *Motivation*
It is possible that some events were propagated to {{ms.logicalTopology}}, but 
a restart happened while we were updating topologyAugmentationMap in 
{{DistributionZoneManager#createMetastorageTopologyListener}}. That means that 
the augmentation that must be added to {{zone.topologyAugmentationMap}} wasn't 
added and we need to recover this information.

h3. *Definition of done*
On a node restart, topologyAugmentationMap must be correctly restored according 
to {{ms.logicalTopology}} state.


h3. *Implementation notes*

(outdated, see UPD)
For every zone, compare {{MS.local.logicalTopology.revision}} with 
max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
topology changes haven't been propagated to topologyAugmentationMap before 
restart and appropriate timers haven't been scheduled. To fill the gap in 
topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
{{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the nodes 
that did not have time to be propagated to topologyAugmentationMap before 
restart. {{lastSeenTopology}} is calculated in the following way: we read 
{{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
scaleDownTriggerKey) and retrieve all additions and removals of nodes from the 
topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) as 
the left bound. After that, apply these changes to the map of node counters 
from {{MS.local.dataNodes}} and keep only the nodes with positive counters. 
This is the lastSeenTopology. Comparing it with {{MS.local.logicalTopology}} 
will tell us which nodes were not added or removed and weren't propagated to 
topologyAugmentationMap before restart. We take these differences and add them 
to the topologyAugmentationMap. As a revision (key for topologyAugmentationMap) 
take {{MS.local.logicalTopology.revision}}. It is safe to take this revision, 
because if 

[jira] [Updated] (IGNITE-20603) Restore logical topology change event on a node restart

2023-11-22 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20603:
-
Summary: Restore logical topology change event on a node restart  (was: 
Restore topologyAugmentationMap on a node restart)

> Restore logical topology change event on a node restart
> ---
>
> Key: IGNITE-20603
> URL: https://issues.apache.org/jira/browse/IGNITE-20603
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. *Motivation*
> It is possible that some events were propagated to {{ms.logicalTopology}}, 
> but a restart happened while we were updating topologyAugmentationMap in 
> {{DistributionZoneManager#createMetastorageTopologyListener}}. That means 
> that the augmentation that must be added to {{zone.topologyAugmentationMap}} 
> wasn't added and we need to recover this information.
> h3. *Definition of done*
> On a node restart, topologyAugmentationMap must be correctly restored 
> according to {{ms.logicalTopology}} state.
> h3. *Implementation notes*
> (outdated, see UPD)
> For every zone, compare {{MS.local.logicalTopology.revision}} with 
> max(maxScUpFromMap, maxScDownFromMap). If {{logicalTopology.revision}} is 
> greater than max(maxScUpFromMap, maxScDownFromMap), that means that some 
> topology changes haven't been propagated to topologyAugmentationMap before 
> restart and appropriate timers haven't been scheduled. To fill the gap in 
> topologyAugmentationMap, compare {{MS.local.logicalTopology}} with 
> {{lastSeenLogicalTopology}} and enhance topologyAugmentationMap with the 
> nodes that did not have time to be propagated to topologyAugmentationMap 
> before restart. {{lastSeenTopology}} is calculated in the following way: we 
> read {{MS.local.dataNodes}}, also we take max(scaleUpTriggerKey, 
> scaleDownTriggerKey) and retrieve all additions and removals of nodes from 
> the topologyAugmentationMap using max(scaleUpTriggerKey, scaleDownTriggerKey) 
> as the left bound. After that, apply these changes to the map of node 
> counters from {{MS.local.dataNodes}} and keep only the nodes with positive 
> counters. This is the lastSeenTopology. Comparing it with 
> {{MS.local.logicalTopology}} will tell us which nodes were not added or 
> removed and weren't propagated to topologyAugmentationMap before restart. We 
> take these differences and add them to the topologyAugmentationMap. As a 
> revision (key for topologyAugmentationMap) take 
> {{MS.local.logicalTopology.revision}}. It is safe to take this revision, 
> because if some node was added to the {{ms.topology}} after immediate data 
> nodes recalculation, this added node must restore this immediate data nodes' 
> recalculation intent.
> UPD: the implementation notes above are outdated; we've implemented a 
> slightly different approach. Now we save the last handled topology to the 
> MS, and on restart we check whether the current ms.logicalTopology differs 
> from the one that was handled in 
> DistributionZoneManager#createMetastorageTopologyListener (we compare the 
> revisions of these events); if it does, we repeat the logic from 
> DistributionZoneManager#createMetastorageTopologyListener with the new 
> logical topology from ms.logicalTopology.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)