[jira] [Resolved] (IGNITE-20492) NPE in PartitionReplicaListener's primary replica retrieval

2023-09-29 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov resolved IGNITE-20492.

Resolution: Duplicate

The issue is a duplicate of IGNITE-20484.

> NPE in PartitionReplicaListener's primary replica retrieval
> ---
>
> Key: IGNITE-20492
> URL: https://issues.apache.org/jira/browse/IGNITE-20492
> Project: Ignite
> Issue Type: Bug
> Reporter: Kirill Sizov
> Priority: Blocker
> Labels: ignite-3
>
> PartitionReplicaListener.ensureReplicaIsPrimary has the following block of 
> code
> {code:java}
> if (expectedTerm != null) {
>     return placementDriver.getPrimaryReplica(replicationGroupId, now)
>             .thenCompose(primaryReplica -> {
>                 long currentEnlistmentConsistencyToken = primaryReplica.getStartTime().longValue();
>  {code}
> However, according to the placementDriver's contract, {{getPrimaryReplica}} 
> can complete with null:
> {quote}
> Same as awaitPrimaryReplica(ReplicationGroupId, HybridTimestamp) despite the 
> fact that given method await logic is bounded. It will wait for a primary 
> replica for a reasonable period of time, and complete a future with null if a 
> matching lease isn't found. Generally speaking reasonable here means enough 
> for distribution across cluster nodes.
> {quote}
> In that case ensureReplicaIsPrimary will crash with NPE:
> {noformat}
>   ... 3 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$155(PartitionReplicaListener.java:2397)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
> ~[?:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.WatchProcessor.invokeOnRevisionCallback(WatchProcessor.java:247)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$notifyWatches$2(WatchProcessor.java:148)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
>  ~[?:?]
> {noformat}
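
One way to handle the null completion described above is to check the lease inside the composed stage and fail the future explicitly. The following is only a minimal sketch under stated assumptions: {{Lease}}, {{PrimaryReplicaMissException}} and {{enlistmentToken}} are hypothetical stand-ins and not the actual Ignite types; the fix tracked in IGNITE-20484 may look different.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Hypothetical stand-ins; none of these types are the real Ignite classes. */
final class PrimaryReplicaCheck {
    /** Minimal lease model: only the start time read by the quoted snippet. */
    record Lease(long startTime) {}

    /** Signals that no matching lease was found within the bounded wait. */
    static final class PrimaryReplicaMissException extends RuntimeException {
        PrimaryReplicaMissException(String msg) {
            super(msg);
        }
    }

    /**
     * Composes on a lease future that may complete with null and turns the
     * null case into an explicit failure instead of a NullPointerException.
     */
    static CompletableFuture<Long> enlistmentToken(CompletableFuture<Lease> leaseFuture) {
        return leaseFuture.thenCompose(lease -> {
            if (lease == null) {
                return CompletableFuture.<Long>failedFuture(
                        new PrimaryReplicaMissException("No primary replica lease found"));
            }
            return CompletableFuture.completedFuture(lease.startTime());
        });
    }

    public static void main(String[] args) {
        // Lease present: the token is simply the lease start time.
        System.out.println(enlistmentToken(CompletableFuture.completedFuture(new Lease(42L))).join());

        // Lease absent: the caller sees a descriptive failure, not an NPE.
        try {
            enlistmentToken(CompletableFuture.completedFuture(null)).join();
        } catch (CompletionException e) {
            System.out.println("Failed with: " + e.getCause());
        }
    }
}
```

With this shape the caller can tell "no primary replica yet" apart from a programming error and decide whether to retry.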



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20515) MappedFileMemoryProvider doesn't work while running on JDK 17

2023-09-29 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770570#comment-17770570
 ] 

Ignite TC Bot commented on IGNITE-20515:


{panel:title=Branch: [pull/10961/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET (Core Linux){color} [[tests 0 TIMEOUT , Exit Code 
, TC_SERVICE_MESSAGE 
|https://ci2.ignite.apache.org/viewLog.html?buildId=7356042]]

{panel}
{panel:title=Branch: [pull/10961/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7354789&buildTypeId=IgniteTests24Java8_RunAll]

> MappedFileMemoryProvider doesn't work while running on JDK 17
> -
>
> Key: IGNITE-20515
> URL: https://issues.apache.org/jira/browse/IGNITE-20515
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.15
> Reporter: Ivan Daschinsky
> Assignee: Ivan Daschinsky
> Priority: Major
> Labels: ise
> Fix For: 2.16
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> {code}
> Cannot invoke "java.lang.reflect.Method.invoke(Object, Object[])" because 
> "o.a.i.i.mem.file.MappedFile.map0" is null
>  class org.apache.ignite.IgniteCheckedException: Cannot invoke 
> "java.lang.reflect.Method.invoke(Object, Object[])" because 
> "org.apache.ignite.internal.mem.file.MappedFile.map0" is null
>at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7929)
>at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:261)
>at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:210)
>at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:161)
>at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3376)
>at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3182)
>at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>at java.base/java.lang.Thread.run(Thread.java:833)
>  Caused by: java.lang.NullPointerException: Cannot invoke 
> "java.lang.reflect.Method.invoke(Object, Object[])" because 
> "org.apache.ignite.internal.mem.file.MappedFile.map0" is null
>at org.apache.ignite.internal.mem.file.MappedFile.map(MappedFile.java:126)
>at 
> org.apache.ignite.internal.mem.file.MappedFile.<init>(MappedFile.java:65)
>at 
> org.apache.ignite.internal.mem.file.MappedFileMemoryProvider.nextRegion(MappedFileMemoryProvider.java:134)
>at 
> org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager$3.nextRegion(IgniteCacheDatabaseSharedManager.java:1419)
>at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.addSegment(PageMemoryNoStoreImpl.java:716)
>at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.start(PageMemoryNoStoreImpl.java:279)
> {code}
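
The trace above shows a reflective {{Method}} reference ({{map0}}) that is still null at the time it is invoked. The message does not state why the lookup produced no method on JDK 17. As a minimal sketch, assuming the lookup is done by name and parameter types, a reflective handle can be resolved so that a missing method is reported immediately with a clear message instead of surfacing later as an NPE. {{ReflectiveLookup}}, {{findDeclared}} and {{require}} are hypothetical helpers, not Ignite code.

```java
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical helper; it is not part of Ignite. */
final class ReflectiveLookup {
    /**
     * Looks up a declared method and returns an empty Optional, rather than
     * leaving a null Method behind, when the signature does not exist.
     */
    static Optional<Method> findDeclared(Class<?> owner, String name, Class<?>... paramTypes) {
        try {
            return Optional.of(owner.getDeclaredMethod(name, paramTypes));
        } catch (NoSuchMethodException e) {
            return Optional.empty();
        }
    }

    /** Fails fast with a descriptive message when the method is unavailable. */
    static Method require(Class<?> owner, String name, Class<?>... paramTypes) {
        return findDeclared(owner, name, paramTypes).orElseThrow(() ->
                new IllegalStateException("Method " + owner.getName() + "#" + name
                        + " is not available on Java " + System.getProperty("java.version")));
    }

    public static void main(String[] args) {
        // A method that exists on every JDK.
        System.out.println(require(String.class, "length"));

        // A method that does not exist: the failure names the missing method.
        try {
            require(String.class, "noSuchMethodForSketch");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```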





[jira] [Updated] (IGNITE-20471) Timeout exception from org.apache.ignite.sql.Session#execute() could be printed to log ambiguously

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20471:
-
Description: 
*Motivation*
The following code prints two different logs for the same exception:


{code:java}
sql("CREATE TABLE TEST(ID INT PRIMARY KEY, VAL0 INT)");

IgniteSql sql = igniteSql();
Session ses = sql.sessionBuilder().build();

try {
    ses.execute(null, "INSERT INTO TEST VALUES (?, ?)", 1, 1);
} catch (Exception e) {
    log.error("EXCEPTION", e);

    throw e;
}
{code}

This log is printed when we call {{log.error("EXCEPTION", e);}}

{noformat}
[2023-09-29T17:58:48,717][ERROR][main][ItSqlAsynchronousApiTest] EXCEPTION
 org.apache.ignite.lang.IgniteException: null
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772) 
~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63) 
~[main/:?]
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
 ~[integrationTest/:?]
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
 ~[main/:?]
...
Caused by: org.apache.ignite.lang.IgniteException
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.wrapIfNecessary(AsyncSqlCursorImpl.java:100)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$0(AsyncSqlCursorImpl.java:76)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930) 
~[?:?]
at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
 ~[?:?]
...
Caused by: java.util.concurrent.TimeoutException
at 
org.apache.ignite.internal.sql.engine.exec.ResolvedDependencies.fetchColocationGroup(ResolvedDependencies.java:60)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.fetchColocationGroups(ExecutionServiceImpl.java:982)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.mapFragments(ExecutionServiceImpl.java:850)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$execute$11(ExecutionServiceImpl.java:654)
 ~[main/:?]
...
{noformat}

This one is printed after we {{throw e}}

{noformat}
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
at 
java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
...
Caused by: org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
at 

[jira] [Updated] (IGNITE-20519) Add causality token of the last update of catalog descriptors to CatalogObjectDescriptor

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20519:
-
Summary: Add causality token of the last update of catalog descriptors to 
CatalogObjectDescriptor  (was: Add causality token of the last update of 
catalog descriptors )

> Add causality token of the last update of catalog descriptors to 
> CatalogObjectDescriptor
> 
>
> Key: IGNITE-20519
> URL: https://issues.apache.org/jira/browse/IGNITE-20519
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> It could be useful to add the causality token of the last update to
> {{CatalogObjectDescriptor}}. For example, this would let us call
> {{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for
> the specified {{CatalogZoneDescriptor}}, so that we receive data nodes in
> accordance with the correct version of the filter from
> {{CatalogZoneDescriptor}}.
> *Implementation notes*
> This could be done by enriching {{UpdateEntry#applyUpdate(Catalog
> catalog)}} with {{causalityToken}}, so that we can propagate
> {{causalityToken}} to every {{UpdateEntry}} where we recreate a
> {{CatalogObjectDescriptor}} and create a new version of {{Catalog}}.
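
The implementation notes above could be sketched roughly as follows. This is an illustration only: {{ZoneDescriptor}}, {{Catalog}}, {{UpdateEntry}} and {{AlterZoneFilter}} below are simplified, hypothetical models and do not mirror the real catalog classes; the only idea taken from the ticket is that {{applyUpdate}} receives a {{causalityToken}} and stamps it on the descriptors it recreates.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the idea; these are not the real catalog classes. */
final class CausalityTokenSketch {
    /** Descriptor that remembers the causality token of its last update. */
    record ZoneDescriptor(int id, String filter, long updateToken) {}

    /** Immutable catalog version holding zone descriptors by id. */
    record Catalog(int version, Map<Integer, ZoneDescriptor> zones) {}

    /** Update entry whose applyUpdate is enriched with the causality token. */
    interface UpdateEntry {
        Catalog applyUpdate(Catalog catalog, long causalityToken);
    }

    /** Recreates one zone descriptor with a new filter and stamps it with the token. */
    record AlterZoneFilter(int zoneId, String newFilter) implements UpdateEntry {
        @Override
        public Catalog applyUpdate(Catalog catalog, long causalityToken) {
            Map<Integer, ZoneDescriptor> zones = new HashMap<>(catalog.zones());
            zones.put(zoneId, new ZoneDescriptor(zoneId, newFilter, causalityToken));
            return new Catalog(catalog.version() + 1, Map.copyOf(zones));
        }
    }

    public static void main(String[] args) {
        Catalog v1 = new Catalog(1, Map.of(0, new ZoneDescriptor(0, "old-filter", 5L)));
        Catalog v2 = new AlterZoneFilter(0, "new-filter").applyUpdate(v1, 7L);

        // The recreated descriptor carries the token of the update that produced it,
        // so a caller could pass it on to a dataNodes(causalityToken, zoneId) style lookup.
        System.out.println(v2.zones().get(0));
    }
}
```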





[jira] [Updated] (IGNITE-20492) NPE in PartitionReplicaListener's primary replica retrieval

2023-09-29 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-20492:
-
Priority: Blocker  (was: Major)

> NPE in PartitionReplicaListener's primary replica retrieval
> ---
>
> Key: IGNITE-20492
> URL: https://issues.apache.org/jira/browse/IGNITE-20492
> Project: Ignite
> Issue Type: Bug
> Reporter: Kirill Sizov
> Priority: Blocker
> Labels: ignite-3
>
> PartitionReplicaListener.ensureReplicaIsPrimary has the following block of 
> code
> {code:java}
> if (expectedTerm != null) {
>     return placementDriver.getPrimaryReplica(replicationGroupId, now)
>             .thenCompose(primaryReplica -> {
>                 long currentEnlistmentConsistencyToken = primaryReplica.getStartTime().longValue();
>  {code}
> However, according to the placementDriver's contract, {{getPrimaryReplica}} 
> can complete with null:
> {quote}
> Same as awaitPrimaryReplica(ReplicationGroupId, HybridTimestamp) despite the 
> fact that given method await logic is bounded. It will wait for a primary 
> replica for a reasonable period of time, and complete a future with null if a 
> matching lease isn't found. Generally speaking reasonable here means enough 
> for distribution across cluster nodes.
> {quote}
> In that case ensureReplicaIsPrimary will crash with NPE:
> {noformat}
>   ... 3 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$155(PartitionReplicaListener.java:2397)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
> ~[?:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
>  ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.WatchProcessor.invokeOnRevisionCallback(WatchProcessor.java:247)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$notifyWatches$2(WatchProcessor.java:148)
>  ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
>  ~[?:?]
> {noformat}





[jira] [Updated] (IGNITE-20519) Add causality token of the last update of catalog descriptors

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20519:
-
Description: 
*Motivation*

It could be useful to add the causality token of the last update to
{{CatalogObjectDescriptor}}. For example, this would let us call
{{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for the
specified {{CatalogZoneDescriptor}}, so that we receive data nodes in
accordance with the correct version of the filter from
{{CatalogZoneDescriptor}}.

*Implementation notes*

This could be done by enriching {{UpdateEntry#applyUpdate(Catalog
catalog)}} with {{causalityToken}}, so that we can propagate
{{causalityToken}} to every {{UpdateEntry}} where we recreate a
{{CatalogObjectDescriptor}} and create a new version of {{Catalog}}.

  was:
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
{{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for the 
specified {{CatalogZoneDescriptor}} 

*Implementation notes*

This could be done with enriching {{UpdateEntry#applyUpdate(Catalog catalog)}} 
with {{causalityToken}}, so we could propagate {{causalityToken}} to all 
{{UpdateEntry}}, where we recreate {{CatalogObjectDescriptor}} and create new 
version of {{Catalog}}


> Add causality token of the last update of catalog descriptors 
> --
>
> Key: IGNITE-20519
> URL: https://issues.apache.org/jira/browse/IGNITE-20519
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> It could be useful to add the causality token of the last update to
> {{CatalogObjectDescriptor}}. For example, this would let us call
> {{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for
> the specified {{CatalogZoneDescriptor}}, so that we receive data nodes in
> accordance with the correct version of the filter from
> {{CatalogZoneDescriptor}}.
> *Implementation notes*
> This could be done by enriching {{UpdateEntry#applyUpdate(Catalog
> catalog)}} with {{causalityToken}}, so that we can propagate
> {{causalityToken}} to every {{UpdateEntry}} where we recreate a
> {{CatalogObjectDescriptor}} and create a new version of {{Catalog}}.





[jira] [Updated] (IGNITE-20519) Add causality token of the last update of catalog descriptors

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20519:
-
Description: 
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
{{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for the 
specified {{CatalogZoneDescriptor}} 

*Implementation notes*

This could be done with enriching {{UpdateEntry#applyUpdate(Catalog catalog)}} 
with {{causalityToken}}, so we could propagate {{causalityToken}} to all 
{{UpdateEntry}}, where we recreate {{CatalogObjectDescriptor}} and create new 
version of {{Catalog}}

  was:
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
{{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for the 
specified {{CatalogZoneDescriptor}} 


> Add causality token of the last update of catalog descriptors 
> --
>
> Key: IGNITE-20519
> URL: https://issues.apache.org/jira/browse/IGNITE-20519
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> It could be useful to add causality token of the last update of 
> {{CatalogObjectDescriptor}}. For example, this will help us to call
> {{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for 
> the specified {{CatalogZoneDescriptor}} 
> *Implementation notes*
> This could be done with enriching {{UpdateEntry#applyUpdate(Catalog 
> catalog)}} with {{causalityToken}}, so we could propagate {{causalityToken}} 
> to all {{UpdateEntry}}, where we recreate {{CatalogObjectDescriptor}} and 
> create new version of {{Catalog}}





[jira] [Updated] (IGNITE-20519) Add causality token of the last update of catalog descriptors

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20519:
-
Description: 
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
{{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for the 
specified {{CatalogZoneDescriptor}} 

  was:
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
dataNodes(int causalityToken) {{DistributionZoneManager#dataNodes(long 
causalityToken, int zoneId)}} for the specified {{CatalogZoneDescriptor}} 


> Add causality token of the last update of catalog descriptors 
> --
>
> Key: IGNITE-20519
> URL: https://issues.apache.org/jira/browse/IGNITE-20519
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> It could be useful to add causality token of the last update of 
> {{CatalogObjectDescriptor}}. For example, this will help us to call
> {{DistributionZoneManager#dataNodes(long causalityToken, int zoneId)}} for 
> the specified {{CatalogZoneDescriptor}} 





[jira] [Updated] (IGNITE-20519) Add causality token of the last update of catalog descriptors

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20519:
-
Description: 
*Motivation*

It could be useful to add causality token of the last update of 
{{CatalogObjectDescriptor}}. For example, this will help us to call
dataNodes(int causalityToken) {{DistributionZoneManager#dataNodes(long 
causalityToken, int zoneId)}} for the specified {{CatalogZoneDescriptor}} 

> Add causality token of the last update of catalog descriptors 
> --
>
> Key: IGNITE-20519
> URL: https://issues.apache.org/jira/browse/IGNITE-20519
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Priority: Major
> Labels: ignite-3
>
> *Motivation*
> It could be useful to add causality token of the last update of 
> {{CatalogObjectDescriptor}}. For example, this will help us to call
> dataNodes(int causalityToken) {{DistributionZoneManager#dataNodes(long 
> causalityToken, int zoneId)}} for the specified {{CatalogZoneDescriptor}} 





[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported

2023-09-29 Thread Sergey Uttsel (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Uttsel updated IGNITE-20397:
---
Description: 
h3. Motivation
{code:java}
  java.lang.AssertionError: Group of the event is unsupported 
[nodeId=<11_part_18/isaat_n_2>, 
event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) 
~[disruptor-3.3.7.jar:?]
at java.lang.Thread.run(Thread.java:834) ~[?:?] {code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]

The root cause:
 # The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from
StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
 # In some cases the `subscribers` map is cleared by an invocation of
StripedDisruptor.StripeEntryHandler#unsubscribe (for example, when a table is
dropped), and then StripeEntryHandler receives an event with
SafeTimeSyncCommandImpl.
 # This produces an assertion error: `assert handler != null`

The issue is not caused by the catalog feature changes.

The issue reproduces when I run ItSqlAsynchronousApiTest#batchIncomplete
with the RepeatedTest annotation. In this case the cluster is not restarted
after each test. It can be reproduced frequently by adding a Thread.sleep call
in StripeEntryHandler#onEvent.
h3. Implementation notes

We decided that we can use LOG.warn() instead of an assert because it is safe
to skip this event if the table was dropped.
{code:java}
if (handler != null) {
    handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && !supportsBatches);
} else {
    LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", event.nodeId(), event));
} {code}
*Definition of done*

No assertion error is thrown if the handler is null.
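
The race described in the root cause and the proposed behaviour can be reproduced in isolation with a small model. This is a sketch under assumptions, not the StripedDisruptor code: {{StripeHandlerSketch}}, its {{Handler}} interface and the in-memory {{warnings}} list (standing in for LOG.warn) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, single-class model of the subscribe/unsubscribe/onEvent interplay. */
final class StripeHandlerSketch {
    /** Stand-in for a per-group event handler. */
    interface Handler {
        void onEvent(String event);
    }

    private final Map<String, Handler> subscribers = new ConcurrentHashMap<>();

    /** Collected warnings; a real implementation would call the logger instead. */
    final List<String> warnings = new ArrayList<>();

    void subscribe(String nodeId, Handler handler) {
        subscribers.put(nodeId, handler);
    }

    void unsubscribe(String nodeId) {
        subscribers.remove(nodeId);
    }

    /**
     * Dispatches the event to the subscriber of the given node.
     * Returns false and records a warning, instead of asserting, when the
     * subscriber is already gone (for example, because the table was dropped).
     */
    boolean onEvent(String nodeId, String event) {
        Handler handler = subscribers.get(nodeId);
        if (handler != null) {
            handler.onEvent(event);
            return true;
        }
        warnings.add("Group of the event is unsupported [nodeId=" + nodeId + ", event=" + event + "]");
        return false;
    }

    public static void main(String[] args) {
        StripeHandlerSketch stripe = new StripeHandlerSketch();
        stripe.subscribe("node-1", e -> System.out.println("handled " + e));

        stripe.onEvent("node-1", "write");         // delivered to the subscriber
        stripe.unsubscribe("node-1");              // e.g. the table is dropped
        stripe.onEvent("node-1", "safe-time-sync"); // late event is skipped with a warning

        System.out.println(stripe.warnings);
    }
}
```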

  was:
{code:java}
  java.lang.AssertionError: Group of the event is unsupported 
[nodeId=<11_part_18/isaat_n_2>, 
event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at 
org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191)
 ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) 
~[disruptor-3.3.7.jar:?]
at java.lang.Thread.run(Thread.java:834) ~[?:?] {code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]

The root cause:
 # StripedDisruptor.StripeEntryHandler#onEvent method gets handler from 
StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
 # In some cases the `subscribers` map is cleared by invocation of 
StripedDisruptor.StripeEntryHandler#unsubscribe (for example on table 
dropping), and then StripeEntryHandler receives event with 
SafeTimeSyncCommandImpl.
 # It produces an assertion error: `assert handler != null`

The issue is not caused by the catalog feature changes.

The issue is reproduced when I run the ItSqlAsynchronousApiTest#batchIncomplete 
with RepeatedTest annotation. In this case the cluster is not restarted after 
each tests. It possible to reproduced it frequently if add Thread.sleep in 
StripeEntryHandler#onEvent.

We decided that we can use LOG.warn() instead of an assert:
{code:java}
if (handler != null) {
handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && 
!supportsBatches);
} else {
LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", 
event.nodeId(), event));
} {code}


> java.lang.AssertionError: Group of the event is unsupported
> ---
>
> Key: IGNITE-20397
> URL: https://issues.apache.org/jira/browse/IGNITE-20397
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3
>
> h3. Motivation
> {code:java}
>   java.lang.AssertionError: Group of the event is unsupported 
> [nodeId=<11_part_18/isaat_n_2>, 
> event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
> at 
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
> at 
> 

[jira] [Updated] (IGNITE-20471) Timeout exception from org.apache.ignite.sql.Session#execute() could be printed to log ambiguously

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20471:
-
Description: 
*Motivation*
The following code prints two different logs for the same exception:


{code:java}
sql("CREATE TABLE TEST(ID INT PRIMARY KEY, VAL0 INT)");

IgniteSql sql = igniteSql();
Session ses = sql.sessionBuilder().build();

try {
    ses.execute(null, "INSERT INTO TEST VALUES (?, ?)", 1, 1);
} catch (Exception e) {
    log.error("EXCEPTION", e);

    throw e;
}
{code}

This log is printed when we call {{log.error("EXCEPTION", e);}}

{noformat}
[2023-09-29T17:58:48,717][ERROR][main][ItSqlAsynchronousApiTest] EXCEPTION
 org.apache.ignite.lang.IgniteException: null
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772) 
~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63) 
~[main/:?]
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
 ~[integrationTest/:?]
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
 ~[main/:?]
...
Caused by: org.apache.ignite.lang.IgniteException
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.wrapIfNecessary(AsyncSqlCursorImpl.java:100)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$0(AsyncSqlCursorImpl.java:76)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930) 
~[?:?]
at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
 ~[?:?]
...
Caused by: java.util.concurrent.TimeoutException
at 
org.apache.ignite.internal.sql.engine.exec.ResolvedDependencies.fetchColocationGroup(ResolvedDependencies.java:60)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.fetchColocationGroups(ExecutionServiceImpl.java:982)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.mapFragments(ExecutionServiceImpl.java:850)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$execute$11(ExecutionServiceImpl.java:654)
 ~[main/:?]
...
{noformat}


{noformat}
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
at 
java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
...
Caused by: org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
at 

[jira] [Updated] (IGNITE-20471) Timeout exception from org.apache.ignite.sql.Session#execute() could be printed to log ambiguously

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20471:
-
Description: 
*Motivation*
The following code produces two different log outputs:


{code:java}
sql("CREATE TABLE TEST(ID INT PRIMARY KEY, VAL0 INT)");

IgniteSql sql = igniteSql();
Session ses = sql.sessionBuilder().build();

try {
ses.execute(null, "INSERT INTO TEST VALUES (?, ?)", 1, 1);
} catch (Exception e) {
log.error("EXCEPTION", e);

throw e;
}
{code}

This log is printed by the {{log.error("EXCEPTION", e);}} call:

{noformat}
[2023-09-29T17:58:48,717][ERROR][main][ItSqlAsynchronousApiTest] EXCEPTION
 org.apache.ignite.lang.IgniteException: null
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772) 
~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63) 
~[main/:?]
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
 ~[integrationTest/:?]
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
 ~[main/:?]
...
Caused by: org.apache.ignite.lang.IgniteException
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.wrapIfNecessary(AsyncSqlCursorImpl.java:100)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$0(AsyncSqlCursorImpl.java:76)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930) 
~[?:?]
at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
 ~[?:?]
...
Caused by: java.util.concurrent.TimeoutException
at 
org.apache.ignite.internal.sql.engine.exec.ResolvedDependencies.fetchColocationGroup(ResolvedDependencies.java:60)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.fetchColocationGroups(ExecutionServiceImpl.java:982)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.mapFragments(ExecutionServiceImpl.java:850)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$execute$11(ExecutionServiceImpl.java:654)
 ~[main/:?]
...
{noformat}


{noformat}
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
at 
java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
...
Caused by: org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
at 

[jira] [Updated] (IGNITE-20471) Timeout exception from org.apache.ignite.sql.Session#execute() could be printed to log ambiguously

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20471:
-
Description: 
*Motivation*


{noformat}
[2023-09-29T17:58:48,717][ERROR][main][ItSqlAsynchronousApiTest] EXCEPTION
 org.apache.ignite.lang.IgniteException: null
at 
java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) ~[?:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772) 
~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
 ~[main/:?]
at 
org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63) 
~[main/:?]
at 
org.apache.ignite.internal.sql.api.ItSqlAsynchronousApiTest.select(ItSqlAsynchronousApiTest.java:458)
 ~[integrationTest/:?]
...
Caused by: java.util.concurrent.CompletionException: 
org.apache.ignite.lang.IgniteException: IGN-CMN-65535 
TraceId:54d81fd9-6453-4adb-863d-6e82b9c0cb08
at 
org.apache.ignite.internal.sql.api.SessionImpl.lambda$executeAsync$3(SessionImpl.java:208)
 ~[main/:?]
...
Caused by: org.apache.ignite.lang.IgniteException
at 
org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:110)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.wrapIfNecessary(AsyncSqlCursorImpl.java:100)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$0(AsyncSqlCursorImpl.java:76)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930) 
~[?:?]
at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
 ~[?:?]
...
Caused by: java.util.concurrent.TimeoutException
at 
org.apache.ignite.internal.sql.engine.exec.ResolvedDependencies.fetchColocationGroup(ResolvedDependencies.java:60)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.fetchColocationGroups(ExecutionServiceImpl.java:982)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.mapFragments(ExecutionServiceImpl.java:850)
 ~[main/:?]
at 
org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$execute$11(ExecutionServiceImpl.java:654)
 ~[main/:?]
...
{noformat}
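
The trace above ends in a {{java.util.concurrent.TimeoutException}} that sits several "Caused by" levels below the top-level exception, whose own message is null. As a minimal sketch (not the Ignite implementation), a caller could walk the cause chain to find the underlying failure. The class and method names below are hypothetical, and plain {{RuntimeException}} stands in for {{IgniteException}}.

```java
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeoutException;

final class RootCauseSketch {
    /** Follows getCause() links until the deepest exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain; also guard against a self-referencing cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /** Returns true when the deepest cause is a TimeoutException. */
    static boolean isCausedByTimeout(Throwable t) {
        return rootCause(t) instanceof TimeoutException;
    }

    public static void main(String[] args) {
        // Same nesting shape as the log: wrapper -> CompletionException -> wrapper -> timeout.
        // RuntimeException is a stand-in for IgniteException (assumption for illustration).
        Throwable e = new RuntimeException(
                new CompletionException(
                        new RuntimeException(new TimeoutException())));

        System.out.println(rootCause(e).getClass().getName());
        System.out.println(isCausedByTimeout(e));
    }
}
```

This only helps a reader of the exception find the timeout; it does not change what the top-level message prints.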


  was:
*Motivation*
According to the invocation logic of {{TxManagerImpl#finish}}, the 
{{recipientNode}} passed to {{finish}} can be {{null}}. 
Later in the {{finish}} method we call 
{{replicaService.invoke(recipientNode)}}, which could lead to a 
{{NullPointerException}}. 

UPD1: 

I may have been wrong: we might not even reach the 
{{replicaService.invoke(recipientNode)}} call, because {{groups.isEmpty()}} is 
checked first and it seems that we take the other branch.

Need to investigate why I got {{null}} when running 
{{ItTableRaftSnapshotsTest#entriesKeepAppendedAfterSnapshotInstallation}}


> Timeout exception from org.apache.ignite.sql.Session#execute() could be 
> printed to log ambiguously
> --
>
> Key: IGNITE-20471
> URL: https://issues.apache.org/jira/browse/IGNITE-20471
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> {noformat}
> [2023-09-29T17:58:48,717][ERROR][main][ItSqlAsynchronousApiTest] EXCEPTION
>  org.apache.ignite.lang.IgniteException: null
>   at 
> java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710) 
> ~[?:?]
>   at 
> org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
>  ~[main/:?]
>   at 
> org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
>  ~[main/:?]
>   at 
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
>  ~[main/:?]
>   at 
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
>  ~[main/:?]
>   at 
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
>  ~[main/:?]
>   at 
> 

[jira] [Updated] (IGNITE-20471) Timeout exception from org.apache.ignite.sql.Session#execute() could be printed to log ambiguously

2023-09-29 Thread Mirza Aliev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-20471:
-
Summary: Timeout exception from org.apache.ignite.sql.Session#execute() 
could be printed to log ambiguously  (was: Handle TxManagerImpl#finish 
correctly when recipientNode is null)

> Timeout exception from org.apache.ignite.sql.Session#execute() could be 
> printed to log ambiguously
> --
>
> Key: IGNITE-20471
> URL: https://issues.apache.org/jira/browse/IGNITE-20471
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> According to the invocation logic of {{TxManagerImpl#finish}}, the 
> {{recipientNode}} passed to {{finish}} can be {{null}}. Later in the 
> {{finish}} method we call {{replicaService.invoke(recipientNode)}}, which 
> could lead to a {{NullPointerException}}. 
> UPD1: 
> I may have been wrong: we might not even reach the 
> {{replicaService.invoke(recipientNode)}} call, because {{groups.isEmpty()}} 
> is checked first and it seems that we take the other branch.
> Need to investigate why I got {{null}} when running 
> {{ItTableRaftSnapshotsTest#entriesKeepAppendedAfterSnapshotInstallation}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-20485) Allow to configure lease interval

2023-09-29 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-20485:
---
Description: 
*Motivation*
Currently, the lease interval depends on the lease update frequency and is 
calculated like this:
{code:title=LeaseUpdater.java}
private static final long LEASE_INTERVAL = 10 * UPDATE_LEASE_MS;
{code}
The interval cannot be configured, which makes tests run longer than they 
would with a short lease interval.

*Implementation notes*

* Create a new configuration root for the property. The property has to be 
specified for the placement driver only (PlacementDriverConfigurationSchema).
* The property should be configured for the entire cluster 
(ConfigurationType#DISTRIBUTED).
* The property may change on a live cluster. The placement driver manager 
has to handle configuration updates.
* The property should determine two parameters: the lease interval and the long 
lease interval (LeaseUpdater#LEASE_INTERVAL, LeaseUpdater#longLeaseInterval). 
The frequency of checking leases is derived from the lease interval 
(LeaseUpdater#UPDATE_LEASE_MS).
* In addition to tuning through the configuration framework, we should provide 
configuration through system properties for use in tests. Add a system 
property for the lease interval only, because one already exists for the long 
lease interval.
* Do not forget to check TODOs.

*Definition of done*
The lease interval can be configured, at least through system properties for 
use in tests.
It should also be configurable through an Ignite property.
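
The system-property part of the notes above could look roughly like the sketch below. It is an illustration under stated assumptions, not the LeaseUpdater code: the property name and the default value are hypothetical, and only the factor of 10 between the lease interval and the update frequency comes from the ticket.

```java
final class LeaseIntervalSketch {
    /** Hypothetical system property name; the ticket does not specify one. */
    static final String LEASE_INTERVAL_PROPERTY = "example.lease.interval.ms";

    /** Assumed default, chosen for illustration only. */
    static final long DEFAULT_LEASE_INTERVAL_MS = 5_000L;

    /** Ratio taken from the ticket: LEASE_INTERVAL = 10 * UPDATE_LEASE_MS. */
    static final long INTERVAL_TO_UPDATE_RATIO = 10L;

    /** Reads the lease interval from the system property, falling back to the default. */
    static long leaseIntervalMs() {
        long value = Long.getLong(LEASE_INTERVAL_PROPERTY, DEFAULT_LEASE_INTERVAL_MS);
        if (value < INTERVAL_TO_UPDATE_RATIO) {
            // A smaller interval would yield a zero update period.
            throw new IllegalArgumentException("Lease interval too short: " + value);
        }
        return value;
    }

    /** Derives the lease check frequency from the lease interval. */
    static long updateLeaseMs(long leaseIntervalMs) {
        return leaseIntervalMs / INTERVAL_TO_UPDATE_RATIO;
    }

    public static void main(String[] args) {
        long interval = leaseIntervalMs();
        System.out.println("lease interval ms = " + interval);
        System.out.println("update period ms  = " + updateLeaseMs(interval));
    }
}
```

A test could then pass, for example, -Dexample.lease.interval.ms=1000 to shorten the lease; handling of updates on a live cluster through the configuration framework is not covered by this sketch.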

  was:
*Motivation*
Currently, the lease interval depends on the lease update frequency and is 
calculated like this:
{code:title=LeaseUpdater.java}
private static final long LEASE_INTERVAL = 10 * UPDATE_LEASE_MS;
{code}
The interval cannot be configured, which makes tests run longer than they 
would with a short lease interval.

*Implementation notes*
Do not forget to check TODOs.

*Definition of done*
The lease interval can be configured, at least through system properties for 
use in tests.
It should also be configurable through an Ignite property.


> Allow to configure lease interval
> -
>
> Key: IGNITE-20485
> URL: https://issues.apache.org/jira/browse/IGNITE-20485
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> Currently, the lease interval depends on the lease update frequency and is 
> calculated like this:
> {code:title=LeaseUpdater.java}
> private static final long LEASE_INTERVAL = 10 * UPDATE_LEASE_MS;
> {code}
> The interval cannot be configured, which makes tests run longer than they 
> would with a short lease interval.
> *Implementation notes*
> * Create a new configuration root for the property. The property has to be 
> specified for the placement driver only (PlacementDriverConfigurationSchema).
> * The property should be configured for the entire cluster 
> (ConfigurationType#DISTRIBUTED).
> * The property may change on a live cluster. The placement driver manager 
> has to handle configuration updates.
> * The property should determine two parameters: the lease interval and the 
> long lease interval (LeaseUpdater#LEASE_INTERVAL, 
> LeaseUpdater#longLeaseInterval). The frequency of checking leases is 
> derived from the lease interval (LeaseUpdater#UPDATE_LEASE_MS).
> * In addition to tuning through the configuration framework, we should 
> provide configuration through system properties for use in tests. Add a 
> system property for the lease interval only, because one already exists for 
> the long lease interval.
> * Do not forget to check TODOs.
> *Definition of done*
> The lease interval can be configured, at least through system properties 
> for use in tests.
> It should also be configurable through an Ignite property.





[jira] [Updated] (IGNITE-20484) NPE when some operation occurs when the primary replica is changing

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20484:
---
Description: 
*Motivation*
It can happen that the primary replica is on this node when the request is 
created, but by the time the request is executed on the replica, the node has 
already lost the primary role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1081)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$7(WatchProcessor.java:269)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
 [?:?]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$161(PartitionReplicaListener.java:2415)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
 ~[?:?]
... 15 more
{noformat}

*Definition of done*
In this case, we should throw a proper exception, because the request can no 
longer be handled on this replica, and the corresponding transaction will be 
rolled back.

*Implementation notes*
Do not forget to check all places where the issue is mentioned (especially in 
TODO section).

As discussed with [~sanpwc]:
This exception is likely to be thrown when 
- we successfully get a primary replica on one node
- we send a message and the message is slightly slow to be delivered
- we handle the received message on the recipient node and run 
{{placementDriver.getPrimaryReplica}}. 

If the previous lease has expired by the time we handle the message, the call 
to {{placementDriver}} will result in a {{null}} value instead of a 
{{ReplicaMeta}} instance. Hence the NPE.
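
A minimal sketch of the kind of guard this implies, assuming the future may complete with {{null}} as described above. {{LeaseStub}} and {{StaleLeaseException}} are hypothetical stand-ins, not Ignite classes, and the real fix may differ.

```java
import java.util.concurrent.CompletableFuture;

final class PrimaryReplicaCheckSketch {
    /** Hypothetical stand-in for the lease metadata returned by the placement driver. */
    static final class LeaseStub {
        final long startTime;

        LeaseStub(long startTime) {
            this.startTime = startTime;
        }
    }

    /** Hypothetical exception signalling that this node cannot serve the request as primary. */
    static final class StaleLeaseException extends RuntimeException {
        StaleLeaseException(String message) {
            super(message);
        }
    }

    /**
     * Completes normally only if a lease exists and its start time matches the expected term.
     * A null lease fails the future explicitly instead of causing a NullPointerException.
     */
    static CompletableFuture<Void> ensureReplicaIsPrimary(
            CompletableFuture<LeaseStub> primaryReplicaFuture, long expectedTerm) {
        return primaryReplicaFuture.thenCompose(primaryReplica -> {
            if (primaryReplica == null) {
                // No matching lease was found, e.g. the previous one has expired.
                return CompletableFuture.<Void>failedFuture(
                        new StaleLeaseException("No primary replica lease found"));
            }

            if (primaryReplica.startTime != expectedTerm) {
                return CompletableFuture.<Void>failedFuture(
                        new StaleLeaseException("Primary replica has changed"));
            }

            return CompletableFuture.<Void>completedFuture(null);
        });
    }

    public static void main(String[] args) {
        CompletableFuture<Void> lost =
                ensureReplicaIsPrimary(CompletableFuture.completedFuture(null), 42L);
        CompletableFuture<Void> held =
                ensureReplicaIsPrimary(CompletableFuture.completedFuture(new LeaseStub(42L)), 42L);

        System.out.println("null lease failed: " + lost.isCompletedExceptionally());
        System.out.println("matching lease failed: " + held.isCompletedExceptionally());
    }
}
```

Failing the future with a dedicated exception lets the caller distinguish a lost lease from a programming error and roll the transaction back.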

  was:
*Motivation*
It can happen that the primary replica is on this node when the request is 
created, but by the time the request is executed on the replica, the node has 
already lost the primary role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 

[jira] [Updated] (IGNITE-20484) NPE when some operation occurs when the primary replica is changing

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20484:
---
Description: 
*Motivation*
It can happen that the primary replica is on this node when the request is 
created, but by the time the request is executed on the replica, the node has 
already lost the primary role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1081)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$7(WatchProcessor.java:269)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
 [?:?]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$161(PartitionReplicaListener.java:2415)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
 ~[?:?]
... 15 more
{noformat}

*Definition of done*
In this case, we should throw the correct exception because the request cannot 
be handled in this replica anymore, and the matched transaction will be rolled 
back.

*Implementation notes*
Do not forget to check all places where the issue is mentioned (especially in 
TODO section).

As discussed with [~sanpwc]:
This exception is likely to be thrown when 
- we successfully get a primary replica on one node
- send a message and the message is slightly slow to be delivered
- we handle the received message on the recipient node and run 
{{placementDriver.getPrimaryReplica}}. 

If the previous lease has expired by the time we handle the message, the call 
to {{placementDriver}} will result in a {{null}} value instead of a 
{{ReplicaMeta}} instance.
Any call with no null check on it may end up with NPE.
Calling  

  was:
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 

[jira] [Updated] (IGNITE-20484) NPE when some operation occurs when the primary replica is changing

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20484:
---
Description: 
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1081)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$7(WatchProcessor.java:269)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
 [?:?]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$161(PartitionReplicaListener.java:2415)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
 ~[?:?]
... 15 more
{noformat}

*Definition of done*
In this case, we should throw the correct exception because the request cannot 
be handled in this replica anymore, and the matched transaction will be rolled 
back.

*Implementation notes*
Do not forget to check all places where the issue is mentioned (especially in 
TODO section).

As discussed with [~sanpwc]:
This exception is likely to be thrown when 
- we successfully get a primary replica on one node
- send a message and the message is slightly slow to be delivered
- we handle the received message on the recipient node and run 
{{placementDriver.getPrimaryReplica}}. 
If the previous lease has expired by the time we handle the message, the call 
to {{placementDriver}} will result in a {{null}} value instead of 
{{ReplicaMeta}}.
Any call with no null check on it may end up with NPE.
Calling  

  was:
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 

[jira] [Updated] (IGNITE-20484) NPE when some operation occurs when the primary replica is changing

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20484:
---
Description: 
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1081)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$7(WatchProcessor.java:269)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
 [?:?]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$161(PartitionReplicaListener.java:2415)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
 ~[?:?]
... 15 more
{noformat}

*Definition of done*
In this case, we should throw the correct exception because the request cannot 
be handled in this replica anymore, and the matched transaction will be rolled 
back.

*Implementation notes*
Do not forget to check all places where the issue is mentioned (especially in 
TODO section).

As discussed with [~sanpwc]:
This exception is likely to be thrown when 
- we successfully get a primary replica on one node
- send a message and the message is slightly slow to be delivered
- we handle the received message on the recipient node and run 
{{placementDriver.getPrimaryReplica}}. 

If the previous lease has expired by the time we handle the message, the call 
to {{placementDriver}} will result in a {{null}} value instead of 
{{ReplicaMeta}}.
Any call with no null check on it may end up with NPE.
Calling  

  was:
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 

[jira] [Updated] (IGNITE-20484) NPE when some operation occurs when the primary replica is changing

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20484:
---
Description: 
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1081)
 ~[?:?]
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) 
~[?:?]
at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) 
~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
 ~[main/:?]
at 
org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456)
 ~[main/:?]
at 
org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$7(WatchProcessor.java:269)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
 [?:?]
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
 [?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) 
[?:?]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) 
[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.NullPointerException
at 
org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$161(PartitionReplicaListener.java:2415)
 ~[main/:?]
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
 ~[?:?]
... 15 more
{noformat}

*Definition of done*
In this case, we should throw the correct exception because the request cannot 
be handled in this replica anymore, and the matched transaction will be rolled 
back.

*Implementation notes*
Do not forget to check all places where the issue is mentioned (especially in 
TODO section).

As discussed with [~sanpwc]:
This exception is likely to be thrown when 
- we get primary replica on one node
- send a message and the message is slightly slow to be delivered
- we handle the received message on a node and run 
{{placementDriver.getPrimaryReplica}}. 
If the previous lease has expired by the time we handle the message, the call 
to {{placementDriver}} will result in a {{null}} value instead of 
{{ReplicaMeta}}.
Any call with no null check on it may end up with NPE.
Calling  

  was:
*Motivation*
It happens that when the request is created, the primary replica is in this 
node, but when the request is executed in the replica, it has already lost its 
role.

{noformat}
[2023-09-25T11:03:24,408][WARN 
][%iprct_tpclh_2%metastorage-watch-executor-2][ReplicaManager] Failed to 
process replica request [request=ReadWriteSingleRowReplicaRequestImpl 
[binaryRowMessage=BinaryRowMessageImpl 
[binaryTuple=java.nio.HeapByteBuffer[pos=0 lim=9 cap=9], schemaVersion=1], 
commitPartitionId=TablePartitionIdMessageImpl [partitionId=0, tableId=4], 
full=true, groupId=4_part_0, requestType=RW_UPSERT, term=24742070009862, 
timestampLong=24742430588928, 
transactionId=018acb5d-4e54-0006--705db0b1]]
 java.util.concurrent.CompletionException: java.lang.NullPointerException
at 

[jira] [Updated] (IGNITE-20408) Replace tx coordinator non-consistent ID with coordinator ClusterNode in local tx state map

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20408:
---
Description: 
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as the recipient node. 

*Implementation details*
h6. Details of the issue:
# a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}
resolveTxStateFromTxCoordinator(txId, localMeta.txCoordinatorId(), commitGrpId, 
timestamp0, txMetaFuture);
{code}

If the node was restarted after it had successfully delivered a 
{{NetworkMessage}} but before #1, the code from #1 may return a different 
sender id:
{noformat}
coordinator (localId = A, consistentId = 1): send message M0 (id = 1) --> 
primary: receive message M0 (id = 1)
coordinator (localId = A, consistentId = 1): restart
coordinator (localId = B, consistentId = 1): the node with the same consistent 
id has now a different local id, previous volatile state is lost
primary: Find coordinator for write intent resolution for consistent id = 1. We 
get node B with no state.
{noformat}
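
The race above can be reproduced with a small model. This is only a sketch under stated assumptions: {{NodeStub}}, {{TopologyStub}} and {{isSameIncarnation}} are hypothetical names, not the real {{ClusterNode}} or topology service API. It shows why resolving the sender by consistent id after a restart yields the new incarnation, and how keeping the whole node (local id plus consistent id) captured at receive time lets us detect the restart.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative model of the restart race; all types are hypothetical stand-ins. */
final class CoordinatorRestartSketch {
    /** Stand-in for a cluster node: a local (non-consistent) id plus a consistent id. */
    static final class NodeStub {
        final String id;
        final String consistentId;

        NodeStub(String id, String consistentId) {
            this.id = id;
            this.consistentId = consistentId;
        }
    }

    /** Stand-in for a topology service keyed by consistent id. */
    static final class TopologyStub {
        private final Map<String, NodeStub> byConsistentId = new ConcurrentHashMap<>();

        /** A node joining with an existing consistent id replaces the previous incarnation. */
        void join(NodeStub node) {
            byConsistentId.put(node.consistentId, node);
        }

        NodeStub getByConsistentId(String consistentId) {
            return byConsistentId.get(consistentId);
        }
    }

    /** True only if the recorded node is still the incarnation present in the topology. */
    static boolean isSameIncarnation(TopologyStub topology, NodeStub recorded) {
        NodeStub current = topology.getByConsistentId(recorded.consistentId);

        return current != null && current.id.equals(recorded.id);
    }

    public static void main(String[] args) {
        TopologyStub topology = new TopologyStub();

        // Coordinator (localId = A, consistentId = 1) sends M0; the sender node is recorded as-is.
        NodeStub sender = new NodeStub("A", "1");
        topology.join(sender);

        // Coordinator restarts: same consistent id, new local id, volatile state lost.
        topology.join(new NodeStub("B", "1"));

        // A late lookup by consistent id silently returns the new incarnation.
        System.out.println("Late lookup id: " + topology.getByConsistentId("1").id);

        // The recorded node lets the primary notice that the coordinator has restarted.
        System.out.println("Same incarnation: " + isSameIncarnation(topology, sender));
    }
}
```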

  was:
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as the recipient node. 

*Implementation details*
First, the issue.
# a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}

[jira] [Updated] (IGNITE-20408) Replace tx coordinator non-consistent ID with coordinator ClusterNode in local tx state map

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20408:
---
Description: 
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as the recipient node. 

*Implementation details*
First, the issue.
# a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}
resolveTxStateFromTxCoordinator(txId, localMeta.txCoordinatorId(), commitGrpId, 
timestamp0, txMetaFuture);
{code}

If the node was restarted after it had successfully delivered a 
{{NetworkMessage}} but before #1, the code from #1 may return a different 
sender id:
{noformat}
coordinator (localId = A, consistentId = 1): send message M0 (id = 1) --> 
primary: receive message M0 (id = 1)
coordinator (localId = A, consistentId = 1): restart
coordinator (localId = B, consistentId = 1): the node with the same consistent 
id has now a different local id, previous volatile state is lost
primary: Find coordinator for write intent resolution for consistent id = 1. We 
get node B with no state.
{noformat}

  was:
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as the recipient node. 

*Implementation details*
First, the issue.
# a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}

[jira] [Updated] (IGNITE-20408) Replace tx coordinator non-consistent ID with coordinator ClusterNode in local tx state map

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20408:
---
Description: 
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as the recipient node. 

*Implementation details*
First, the issue.
# a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}
resolveTxStateFromTxCoordinator(txId, localMeta.txCoordinatorId(), commitGrpId, 
timestamp0, txMetaFuture);
{code}

If the node was restarted after it had successfully delivered a 
{{NetworkMessage}} but before #1, the code from #1 may return a different 
sender id:
{noformat}
coordinator (localId = A, consistentId = 1): send message M0 (id = 1) --> 
primary: receive message M0 (id = 1)
coordinator (localId = A, consistentId = 1): restart
coordinator (localId = B, consistentId = 1): the same node now has a different 
local id, previous volatile state is lost
primary: Find coordinator for write intent resolution for consistent id = 1. We 
get node B with no state.
{noformat}
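
The timeline above can be replayed in isolation. The following is a minimal sketch, not Ignite code: {{Node}}, {{Topology}} and {{localIdResolvedAfterRestart}} are hypothetical stand-ins for ClusterNode and the topology service. The only assumption is that a lookup by consistent id returns whichever node is currently registered under it.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: hypothetical stand-ins, not the Ignite classes. */
final class CoordinatorRestartRace {
    /** A node with a volatile local id and a stable consistent id. */
    static final class Node {
        final String localId;
        final String consistentId;

        Node(String localId, String consistentId) {
            this.localId = localId;
            this.consistentId = consistentId;
        }
    }

    /** Hypothetical topology: consistent id -> node currently registered under it. */
    static final class Topology {
        private final Map<String, Node> byConsistentId = new HashMap<>();

        void join(Node node) {
            byConsistentId.put(node.consistentId, node);
        }

        Node getByConsistentId(String consistentId) {
            return byConsistentId.get(consistentId);
        }
    }

    /** Replays the timeline and returns the local id that the primary would store. */
    static String localIdResolvedAfterRestart() {
        Topology topology = new Topology();

        topology.join(new Node("A", "1"));  // coordinator is up with local id A
        String senderConsistentId = "1";    // M0 arrives carrying only the consistent id
        topology.join(new Node("B", "1"));  // coordinator restarts before the handler runs

        // The lookup happens only now, so it returns the restarted node.
        return topology.getByConsistentId(senderConsistentId).localId;
    }

    public static void main(String[] args) {
        System.out.println(localIdResolvedAfterRestart()); // prints "B"
    }
}
```

The primary ends up holding local id B, although the transaction was started by incarnation A, whose volatile state no longer exists.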

  was:
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as a recipient node. 

*Implementation details*
First, the issue.
# When a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get the sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}
resolveTxStateFromTxCoordinator(txId, 

[jira] [Updated] (IGNITE-20408) Replace tx coordinator non-consistent ID with coordinator ClusterNode in local tx state map

2023-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IGNITE-20408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-20408:
---
Description: 
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as a recipient node. 

*Implementation details*
First, the issue.
# When a {{NetworkMessage}} is processed in 
{{ReplicaManager.onReplicaMessageReceived}}, we get the sender id (which is a 
non-consistent id) from the parameter {{senderConsistentId}}:
{code}
String senderId = 
clusterNetSvc.topologyService().getByConsistentId(senderConsistentId).id();
{code}
# {{senderId}} is then stored in {{TxStateMeta}} when 
{{PartitionReplicaListener}} calls {{txManager.updateTxMeta}} with it.
# Later when we perform write intent resolution in 
{{TransactionStateResolver.resolveDistributiveTxState}} we take the previously 
stored sender id as the id of a coordinator node and run 
{code}
resolveTxStateFromTxCoordinator(txId, localMeta.txCoordinatorId(), commitGrpId, 
timestamp0, txMetaFuture);
{code}

If the node was restarted after it has successfully delivered a 
{{NetworkMessage}} but before #1, the code from #1 may return a different 
sender id:
{noformat}
coordinator (localId = A, consistentId = 1): send message M0 (id = 1) --> 
primary: receive message M0 (id = 1)
coordinator (localId = A, consistentId = 1): restart
coordinator (localId = B, consistentId = 1): the same node now has a different 
local id, previous volatile state is lost
primary: Find coordinator for write intent resolution for consistent id = 1. We 
get node B with no state.
{noformat}
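
One way to picture the definition of done is a local tx state map that keeps the whole sender node as it was delivered with the message, so a later check can tell the original incarnation from a restarted one. This is a hedged sketch only: {{Node}}, {{onMessage}} and {{coordinatorStillAlive}} are hypothetical names, and comparing local ids to detect a restart is an assumption, not the confirmed Ignite implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustration only: hypothetical names, not the Ignite API. */
final class TxCoordinatorTracking {
    /** Stand-in for ClusterNode: volatile local id plus stable consistent id. */
    static final class Node {
        final String localId;
        final String consistentId;

        Node(String localId, String consistentId) {
            this.localId = localId;
            this.consistentId = consistentId;
        }
    }

    /** Local tx state map, reduced to txId -> coordinator node seen at receive time. */
    private final Map<String, Node> coordinatorByTx = new ConcurrentHashMap<>();

    /** Records the sender exactly as delivered with the message; the first one wins. */
    void onMessage(String txId, Node sender) {
        coordinatorByTx.putIfAbsent(txId, sender);
    }

    /**
     * Returns true only if the node currently registered under the coordinator's
     * consistent id is the same incarnation that started the transaction.
     */
    boolean coordinatorStillAlive(String txId, Node currentInTopology) {
        Node stored = coordinatorByTx.get(txId);

        return stored != null
                && currentInTopology != null
                && stored.consistentId.equals(currentInTopology.consistentId)
                && stored.localId.equals(currentInTopology.localId);
    }
}
```

With the node captured at receive time, a topology lookup that returns incarnation B for a transaction started by incarnation A is recognised as a restart instead of being silently accepted.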

  was:
*Motivation*

Local map of transaction states (local tx state map) contains non-consistent id 
of a transaction coordinator node. When trying to resolve write intents using 
coordinator path, we need to check whether the coordinator is still present in 
cluster and has not restarted (because if it has restarted it means it lost its 
volatile state, including local tx state map). But we can't get the 
coordinator's non-consistent id in the message handler, and can't send the 
message to the node using its non-consistent id, so the following race is 
possible:
 * we receive message from coordinator with its consistent id,
 * try to resolve its non-consistent id to save it in the local tx state map, 
but we get the id of restarted node from topology service, so this 
non-consistent id is no longer valid.

There is a ticket for the improvement that will allow us to get ClusterNode 
containing non-consistent id in the message handler: IGNITE-20296 . After that 
improvement we will be able to get ClusterNode as a sender and will have to 
replace coordinator id with ClusterNode representing coordinator in tx local 
state map.

*Definition of done*

Local map of transaction states contains ClusterNode representing the 
coordinator instead of its non-consistent id, and the message to the 
coordinator is sent using this ClusterNode as a recipient node. 


> Replace tx coordinator non-consistent ID with coordinator ClusterNode in 
> local tx state map
> ---
>
> Key: IGNITE-20408
> URL: https://issues.apache.org/jira/browse/IGNITE-20408
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>
> *Motivation*
> Local map of transaction states (local tx state map) contains non-consistent 
> id of a transaction coordinator node. When trying to resolve write intents 
> using coordinator path, we need to check whether the 

[jira] [Commented] (IGNITE-20055) Durable txCleanupReplicaRequest send from the commit partition

2023-09-29 Thread Alexander Lapin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770430#comment-17770430
 ] 

Alexander Lapin commented on IGNITE-20055:
--

[~ksizov] LGTM!

> Durable txCleanupReplicaRequest send from the commit partition
> --
>
> Key: IGNITE-20055
> URL: https://issues.apache.org/jira/browse/IGNITE-20055
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3, transaction3_recovery, transactions
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> h3. Motivation
> It's required to continuously send txCleanupReplicaRequest to the primary 
> replica. The suggested flow is as follows.
> h3. Definition of Done
>  # Resend exactly the same type of finish output that was initially evaluated, 
> meaning that commit will be resent infinitely even if previous 
> txCleanupReplicaRequest returns an exception.
>  # Await commit partition primary replica appearance in case of initially 
> enlisted recipient failure.
>  
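
The first item of the definition of done can be sketched as a retry loop that fixes the commit/abort decision once and re-sends it until the request succeeds. This is an illustration under stated assumptions: {{durableCleanup}} and {{sendCleanup}} are hypothetical names, the request is modelled as a function returning a CompletableFuture, and waiting for a new primary replica (the second item) is left out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Illustration only: hypothetical names, not the Ignite API. */
final class DurableCleanupSketch {
    /**
     * Sends the cleanup request with the initially evaluated outcome and keeps
     * re-sending the very same outcome until one attempt completes normally.
     */
    static CompletableFuture<Void> durableCleanup(
            boolean commit,
            Function<Boolean, CompletableFuture<Void>> sendCleanup) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        attempt(commit, sendCleanup, result);
        return result;
    }

    private static void attempt(
            boolean commit,
            Function<Boolean, CompletableFuture<Void>> sendCleanup,
            CompletableFuture<Void> result) {
        sendCleanup.apply(commit).whenComplete((ignored, err) -> {
            if (err == null) {
                result.complete(null);
            } else {
                // The outcome is never re-evaluated: a failed commit is retried as a commit.
                // A real implementation would likely add a delay between attempts.
                attempt(commit, sendCleanup, result);
            }
        });
    }
}
```

The sketch retries immediately and without a limit, which matches the "resent infinitely" wording but would need backoff and a shutdown hook in practice.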



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-20367) ItTableRaftSnapshotsTest times out with high flaky rate

2023-09-29 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20367:


Assignee: Alexander Lapin

> ItTableRaftSnapshotsTest times out with high flaky rate
> ---
>
> Key: IGNITE-20367
> URL: https://issues.apache.org/jira/browse/IGNITE-20367
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Blocker
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 TraceId:f1535407-3cf9-48cd-9091-825ecf308526
>   at java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
>   at app//org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
>   at app//org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
>   at app//org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
>   at app//org.apache.ignite.internal.SessionUtils.executeUpdate(SessionUtils.java:38)
>   at app//org.apache.ignite.internal.SessionUtils.executeUpdate(SessionUtils.java:50)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.lambda$executeDmlWithRetry$1(ItTableRaftSnapshotsTest.java:231)
>   at app//org.apache.ignite.internal.Cluster.doInSession(Cluster.java:448)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.lambda$executeDmlWithRetry$2(ItTableRaftSnapshotsTest.java:230)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.withRetry(ItTableRaftSnapshotsTest.java:184)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.executeDmlWithRetry(ItTableRaftSnapshotsTest.java:229)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.prepareClusterForInstallingSnapshotToNode2(ItTableRaftSnapshotsTest.java:351)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.snapshotInstallTimeoutDoesNotBreakSubsequentInstallsWhenSecondAttemptIsIdenticalToFirst(ItTableRaftSnapshotsTest.java:685)
>   at java.base@11.0.17/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base@11.0.17/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base@11.0.17/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base@11.0.17/java.lang.reflect.Method.invoke(Method.java:566)
>   at app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>   at app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>   at app//org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
>   at 
> 

[jira] [Updated] (IGNITE-20367) ItTableRaftSnapshotsTest times out with high flaky rate

2023-09-29 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-20367:
-
Priority: Blocker  (was: Major)

> ItTableRaftSnapshotsTest times out with high flaky rate
> ---
>
> Key: IGNITE-20367
> URL: https://issues.apache.org/jira/browse/IGNITE-20367
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Priority: Blocker
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 TraceId:f1535407-3cf9-48cd-9091-825ecf308526
>   at java.base@11.0.17/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
>   at app//org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:772)
>   at app//org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:706)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:543)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:641)
>   at app//org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:494)
>   at app//org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
>   at app//org.apache.ignite.internal.SessionUtils.executeUpdate(SessionUtils.java:38)
>   at app//org.apache.ignite.internal.SessionUtils.executeUpdate(SessionUtils.java:50)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.lambda$executeDmlWithRetry$1(ItTableRaftSnapshotsTest.java:231)
>   at app//org.apache.ignite.internal.Cluster.doInSession(Cluster.java:448)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.lambda$executeDmlWithRetry$2(ItTableRaftSnapshotsTest.java:230)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.withRetry(ItTableRaftSnapshotsTest.java:184)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.executeDmlWithRetry(ItTableRaftSnapshotsTest.java:229)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.prepareClusterForInstallingSnapshotToNode2(ItTableRaftSnapshotsTest.java:351)
>   at app//org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.snapshotInstallTimeoutDoesNotBreakSubsequentInstallsWhenSecondAttemptIsIdenticalToFirst(ItTableRaftSnapshotsTest.java:685)
>   at java.base@11.0.17/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base@11.0.17/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base@11.0.17/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base@11.0.17/java.lang.reflect.Method.invoke(Method.java:566)
>   at app//org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>   at app//org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>   at app//org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
>   at app//org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
>   at app//org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
>   at app//org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)

[jira] [Commented] (IGNITE-20502) Sql. Rework fragment mapping

2023-09-29 Thread Yury Gerzhedovich (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770408#comment-17770408
 ] 

Yury Gerzhedovich commented on IGNITE-20502:


[~korlov] LGTM

> Sql. Rework fragment mapping
> 
>
> Key: IGNITE-20502
> URL: https://issues.apache.org/jira/browse/IGNITE-20502
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Konstantin Orlov
>Assignee: Konstantin Orlov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, fragment mapping supports two strategies: some nodes from list and 
> exact mapping for partitioned sources. To integrate System Views, we need to 
> support two more strategies: all nodes from a given list (for node views) and 
> exactly one node from given list (for cluster views).
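
The three list-based strategies differ only in which subsets of the candidate list are acceptable, which a small sketch can make concrete. The enum and the names {{SOME_OF}}, {{ALL_OF}}, {{ONE_OF}} and {{accepts}} are hypothetical and chosen for illustration; the exact mapping for partitioned sources is omitted because it works per partition rather than per node list.

```java
import java.util.HashSet;
import java.util.List;

/** Illustration only: hypothetical names, not the Ignite SQL engine API. */
enum MappingStrategySketch {
    /** Existing strategy: any non-empty subset of the candidates. */
    SOME_OF {
        @Override
        boolean accepts(List<String> candidates, List<String> chosen) {
            return !chosen.isEmpty() && candidates.containsAll(chosen);
        }
    },

    /** Proposed for node views: every candidate must execute the fragment. */
    ALL_OF {
        @Override
        boolean accepts(List<String> candidates, List<String> chosen) {
            return new HashSet<>(chosen).equals(new HashSet<>(candidates));
        }
    },

    /** Proposed for cluster views: exactly one candidate executes the fragment. */
    ONE_OF {
        @Override
        boolean accepts(List<String> candidates, List<String> chosen) {
            return chosen.size() == 1 && candidates.contains(chosen.get(0));
        }
    };

    /** Checks whether the chosen nodes are a valid mapping for the given candidates. */
    abstract boolean accepts(List<String> candidates, List<String> chosen);
}
```

For candidates n1, n2 and n3, the mapping {n2} satisfies {{SOME_OF}} and {{ONE_OF}} but not {{ALL_OF}}, while {n1, n2, n3} satisfies {{SOME_OF}} and {{ALL_OF}} but not {{ONE_OF}}.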





[jira] [Updated] (IGNITE-20418) Command 'indexes_force_rebuild' should work with several certain nodes.

2023-09-29 Thread Nikita Amelchev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikita Amelchev updated IGNITE-20418:
-
Fix Version/s: 2.16

> Command 'indexes_force_rebuild' should work with several certain nodes.
> ---
>
> Key: IGNITE-20418
> URL: https://issues.apache.org/jira/browse/IGNITE-20418
> Project: Ignite
>  Issue Type: Task
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: ise
> Fix For: 2.16
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, control.sh's command 'indexes_force_rebuild' has no ability to 
> launch index rebuild on several certain nodes. Only one node is accepted as 
> command parameter (--node-id). It would be handy to pass several nodes to 
> execute on like '--nodes'.





[jira] [Commented] (IGNITE-20466) Investigate running sonar checks from fork repositories

2023-09-29 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770397#comment-17770397
 ] 

Maxim Muzafarov commented on IGNITE-20466:
--

References:

pull_request_target
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target

Checking out a merge commit in pull_request_target workflows #518
https://github.com/actions/checkout/issues/518

Feature Request |trigger action on "Pull Request Approved" #25372
https://github.com/orgs/community/discussions/25372

Run Sonar scan for PRs from forks
https://stackoverflow.com/questions/76528833/run-sonar-scan-for-prs-from-forks

How to use SonarCloud with a forked repository on GitHub?
https://community.sonarsource.com/t/how-to-use-sonarcloud-with-a-forked-repository-on-github/7363/30

> Investigate running sonar checks from fork repositories
> ---
>
> Key: IGNITE-20466
> URL: https://issues.apache.org/jira/browse/IGNITE-20466
> Project: Ignite
>  Issue Type: Task
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>
> Investigate running sonar checks from fork repositories.
> See the discussion here:
> https://github.com/actions/checkout/issues/518
> Additionally, we can run checks after a pull-request has been approved by a 
> maintainer:
> https://github.com/orgs/community/discussions/25372





[jira] [Commented] (IGNITE-20508) DeadlockDetectionManager removal

2023-09-29 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770396#comment-17770396
 ] 

Ignite TC Bot commented on IGNITE-20508:


{panel:title=Branch: [pull/10959/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10959/head] Base: [master] : No new tests 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7355049&buildTypeId=IgniteTests24Java8_RunAll]

> DeadlockDetectionManager removal
> 
>
> Key: IGNITE-20508
> URL: https://issues.apache.org/jira/browse/IGNITE-20508
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Anton Vinogradov
>Assignee: Anton Vinogradov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (IGNITE-20358) Make distributed node storage config local

2023-09-29 Thread Kirill Gusakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Gusakov updated IGNITE-20358:

Issue Type: Improvement  (was: Task)

> Make distributed node storage config local
> --
>
> Key: IGNITE-20358
> URL: https://issues.apache.org/jira/browse/IGNITE-20358
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Kirill Gusakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment, all {{*StorageEngineConfigurationSchema}} classes have the 
> {{ConfigurationType.DISTRIBUTED}} type. This is no longer appropriate: under 
> the new design, each node can have its own storage configuration.
> *Definition of done*
> - All {{*StorageEngineConfigurationSchema}} configurations moved to the 
> {{ConfigurationType.LOCAL}} scope.





[jira] [Updated] (IGNITE-20478) Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.

2023-09-29 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-20478:
--
Description: 
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.

Currently, this placeholder sets to row when the search bound is open (that is, 
when the RexNode is null in the list when creating a scalar).
{{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should be 
no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements must 
not be null).

After reworking {{expandBounds}} the {{searchRow}} that comes to 
{{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.

The code {{ExpressionFactoryImpl#comparator}} that uses this placeholder does 
not appear to be executing and can be removed.

  was:
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.

Currently, this placeholder sets to row when the search bound is open (that is, 
when the RexNode is null in the list when creating a scalar).
{{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should be 
no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements must 
not be null).

After reworking {{expandBounds}} the {{searchRow}} that comes to 
{{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.

In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
used and this code can be removed.


> Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.
> 
>
> Key: IGNITE-20478
> URL: https://issues.apache.org/jira/browse/IGNITE-20478
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>
> Currently, when scanning an index, we set a special value called 
> "UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches 
> the bound (more details in IGNITE-16443).
> To be able to complete the transition to using a binary tuple, we need to 
> rework this approach and try to avoid storing non-conforming schema values in 
> row.
> Currently, this placeholder sets to row when the search bound is open (that 
> is, when the RexNode is null in the list when creating a scalar).
> {{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should 
> be no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements 
> must not be null).
> After reworking {{expandBounds}} the {{searchRow}} that comes to 
> {{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.
> The code {{ExpressionFactoryImpl#comparator}} that uses this placeholder does 
> not appear to be executing and can be removed.
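
The idea of passing a prefix instead of a full-width row with placeholders can be illustrated with a comparison that looks only at the columns the prefix actually specifies. This is a sketch under stated assumptions: {{compareToPrefix}} is a hypothetical helper over plain {{Integer}} columns and does not reflect how {{RowConverter#toBinaryTuplePrefix}} or the binary tuple comparison are implemented.

```java
/** Illustration only: a hypothetical helper, not the Ignite row or tuple API. */
final class PrefixBoundSketch {
    /**
     * Compares an index row with a search bound given as a prefix of the index key.
     * Only the first prefix.length columns take part, so no placeholder value is
     * needed for the open (unspecified) trailing columns.
     * Assumes prefix.length <= row.length and no null columns.
     */
    static int compareToPrefix(Integer[] row, Integer[] prefix) {
        for (int i = 0; i < prefix.length; i++) {
            int cmp = row[i].compareTo(prefix[i]);

            if (cmp != 0) {
                return cmp;
            }
        }

        // All prefix columns are equal: the remaining columns are unconstrained.
        return 0;
    }
}
```

A row (1, 5, 9) compares equal to the prefix (1), below the prefix (1, 7), and a row (2, 0, 0) compares above both, without any column of the bound holding a value that violates the schema.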





[jira] [Updated] (IGNITE-20478) Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.

2023-09-29 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-20478:
--
Description: 
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.

Currently, this placeholder is set to row when the search bound is open (that 
is, when the RexNode is null in the list when creating a scalar).
{{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should be 
no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements must 
not be null).

After reworking {{expandBounds}} the {{searchRow}} that comes to 
{{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.

In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
used and this code can be removed.

  was:
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.


> Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.
> 
>
> Key: IGNITE-20478
> URL: https://issues.apache.org/jira/browse/IGNITE-20478
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>
> Currently, when scanning an index, we set a special value called 
> "UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches 
> the bound (more details in IGNITE-16443).
> To be able to complete the transition to using a binary tuple, we need to 
> rework this approach and try to avoid storing non-conforming schema values in 
> row.
> Currently, this placeholder is set to row when the search bound is open (that 
> is, when the RexNode is null in the list when creating a scalar).
> {{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should 
> be no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements 
> must not be null).
> After reworking {{expandBounds}} the {{searchRow}} that comes to 
> {{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.
> In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
> used and this code can be removed.





[jira] [Updated] (IGNITE-20478) Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.

2023-09-29 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-20478:
--
Description: 
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.

Currently, this placeholder sets to row when the search bound is open (that is, 
when the RexNode is null in the list when creating a scalar).
{{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should be 
no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements must 
not be null).

After reworking {{expandBounds}} the {{searchRow}} that comes to 
{{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.

In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
used and this code can be removed.

  was:
Currently, when scanning an index, we set a special value called 
"UNSPECIFIED_VALUE_PLACEHOLDER" to row. Which means that any value matches the 
bound (more details in IGNITE-16443).

To be able to complete the transition to using a binary tuple, we need to 
rework this approach and try to avoid storing non-conforming schema values in 
row.

Currently, this placeholder is set to row when the search bound is open (that 
is, when the RexNode is null in the list when creating a scalar).
{{ExpressionFactoryImpl#expandBounds}} needs to be reworked and there should be 
no open bounds (see {{ExpressionFactoryImpl#compile}} all nodes elements must 
not be null).

After reworking {{expandBounds}} the {{searchRow}} that comes to 
{{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.

In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
used and this code can be removed.


> Sql. Rework use of UNSPECIFIED_VALUE_PLACEHOLDER in row.
> 
>
> Key: IGNITE-20478
> URL: https://issues.apache.org/jira/browse/IGNITE-20478
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>
> Currently, when scanning an index, we store a special value called 
> "UNSPECIFIED_VALUE_PLACEHOLDER" in the row, which means that any value 
> matches the bound (more details in IGNITE-16443).
> To be able to complete the transition to using a binary tuple, we need to 
> rework this approach and avoid storing values that do not conform to the 
> schema in the row.
> Currently, this placeholder is set in the row when the search bound is open 
> (that is, when the RexNode is null in the list when creating a scalar).
> {{ExpressionFactoryImpl#expandBounds}} needs to be reworked so that there are 
> no open bounds (see {{ExpressionFactoryImpl#compile}}: none of the elements 
> of nodes may be null).
> After reworking {{expandBounds}} the {{searchRow}} that comes to 
> {{RowConverter#toBinaryTuplePrefix}} should already contain a prefix only.
> In {{ExpressionFactoryImpl#comparator}} this placeholder does not seem to be 
> used and this code can be removed.
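
The idea of passing "a prefix only" instead of a full row padded with placeholders can be sketched as follows. This is a minimal illustration, not the Ignite implementation: the class {{SearchPrefixSketch}}, the method {{toPrefix}}, and the use of an empty {{Optional}} to model an open bound are all assumptions made for this example, and SQL NULL bound values are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class SearchPrefixSketch {
    /**
     * Returns the leading bound values that are specified and stops at the first open bound.
     * An empty Optional models an open bound (hypothetical representation, not Ignite's).
     */
    static List<Object> toPrefix(List<Optional<Object>> bounds) {
        List<Object> prefix = new ArrayList<>();
        for (Optional<Object> bound : bounds) {
            if (!bound.isPresent()) {
                // Everything after the first open bound cannot narrow a prefix search.
                break;
            }
            prefix.add(bound.get());
        }
        return prefix;
    }

    public static void main(String[] args) {
        // Index on (c1, c2, c3) with bounds for c1 and c2 only.
        List<Optional<Object>> bounds = List.of(Optional.of(1), Optional.of("a"), Optional.empty());
        System.out.println(toPrefix(bounds));
    }
}
```

With this shape, the consumer never sees a value that violates the column type, because the open positions are simply absent from the prefix.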



[jira] [Updated] (IGNITE-20358) Make distributed node storage config local

2023-09-29 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-20358:
---
Fix Version/s: 3.0.0-beta2
 Reviewer: Ivan Bessonov

> Make distributed node storage config local
> --
>
> Key: IGNITE-20358
> URL: https://issues.apache.org/jira/browse/IGNITE-20358
> Project: Ignite
>  Issue Type: Task
>Reporter: Kirill Gusakov
>Assignee: Kirill Gusakov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> *Motivation*
> At the moment, all {{*StorageEngineConfigurationSchema}} classes have the 
> {{ConfigurationType.DISTRIBUTED}} type. This is no longer appropriate: under 
> the new design, each node can have a different storage configuration.
> *Definition of done*
> - All {{*StorageEngineConfigurationSchema}} configurations are moved to the 
> {{ConfigurationType.LOCAL}} scope.



[jira] [Created] (IGNITE-20519) Add causality token of the last update of catalog descriptors

2023-09-29 Thread Mirza Aliev (Jira)
Mirza Aliev created IGNITE-20519:


 Summary: Add causality token of the last update of catalog 
descriptors 
 Key: IGNITE-20519
 URL: https://issues.apache.org/jira/browse/IGNITE-20519
 Project: Ignite
  Issue Type: Bug
Reporter: Mirza Aliev






[jira] [Updated] (IGNITE-20493) Ignite website shows downloading version 2.11 as latest version

2023-09-29 Thread Erlan Aytpaev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erlan Aytpaev updated IGNITE-20493:
---
Component/s: website

> Ignite website shows downloading version 2.11 as latest version
> ---
>
> Key: IGNITE-20493
> URL: https://issues.apache.org/jira/browse/IGNITE-20493
> Project: Ignite
>  Issue Type: Task
>  Components: website
>Reporter: Erlan Aytpaev
>Assignee: Erlan Aytpaev
>Priority: Major
>
> !https://lists.apache.org/api/email.lua?attachment=true=1c01nt4nol691fxz5k71zpwd5r60d0ql=08cc11e094cb73962012551428c510b4c62b6064ee6e2e07737241c552039874!



[jira] [Closed] (IGNITE-17841) Update list of PMC members and Committers

2023-09-29 Thread Erlan Aytpaev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erlan Aytpaev closed IGNITE-17841.
--

> Update list of PMC members and Committers
> -
>
> Key: IGNITE-17841
> URL: https://issues.apache.org/jira/browse/IGNITE-17841
> Project: Ignite
>  Issue Type: Task
>  Components: website
>Reporter: Kseniya Romanova
>Assignee: Erlan Aytpaev
>Priority: Trivial
>
> Please add to the page 
> [https://ignite.apache.org/our-community.html#community]
>  
> 1. new PMC member: 
> Ivan Daschinsky [https://whimsy.apache.org/roster/committer/ivandasch] 
> [https://github.com/ivandasch] 
>  
> 2. new Committers: 
> Kirill Tkalenko [https://whimsy.apache.org/roster/committer/tkalkirill] 
> [https://github.com/tkalkirill] 
> Mikhail Petrov [https://whimsy.apache.org/roster/committer/mpetrov] 
>  
> Thanks!



[jira] [Updated] (IGNITE-20492) NPE in PartitionReplicaListener's primary replica retrieval

2023-09-29 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-20492:
-
Ignite Flags:   (was: Docs Required,Release Notes Required)

> NPE in PartitionReplicaListener's primary replica retrieval
> ---
>
> Key: IGNITE-20492
> URL: https://issues.apache.org/jira/browse/IGNITE-20492
> Project: Ignite
>  Issue Type: Bug
>Reporter:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>
> PartitionReplicaListener.ensureReplicaIsPrimary has the following block of 
> code
> {code:java}
> if (expectedTerm != null) {
>     return placementDriver.getPrimaryReplica(replicationGroupId, now)
>             .thenCompose(primaryReplica -> {
>                 long currentEnlistmentConsistencyToken = primaryReplica.getStartTime().longValue();
>  {code}
> However, according to the placementDriver's contract, {{getPrimaryReplica}} 
> can complete with null:
> {quote}
> Same as awaitPrimaryReplica(ReplicationGroupId, HybridTimestamp) despite the 
> fact that given method await logic is bounded. It will wait for a primary 
> replica for a reasonable period of time, and complete a future with null if a 
> matching lease isn't found. Generally speaking reasonable here means enough 
> for distribution across cluster nodes.
> {quote}
> In that case ensureReplicaIsPrimary will crash with NPE:
> {noformat}
>   ... 3 more
> Caused by: java.lang.NullPointerException
>   at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$ensureReplicaIsPrimary$155(PartitionReplicaListener.java:2397) ~[ignite-table-3.0.0-SNAPSHOT.jar:?]
>   at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) ~[?:?]
>   at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
>   at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073) ~[?:?]
>   at org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169) ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122) ~[?:?]
>   at org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169) ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103) ~[ignite-core-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146) ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:849) ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:456) ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.metastorage.server.WatchProcessor.invokeOnRevisionCallback(WatchProcessor.java:247) ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$notifyWatches$2(WatchProcessor.java:148) ~[ignite-metastorage-3.0.0-SNAPSHOT.jar:?]
>   at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) ~[?:?]
>   at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) ~[?:?]
> {noformat}
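
The failure mode described above, a future that legitimately completes with null being dereferenced inside {{thenCompose}}, can be guarded against roughly as follows. This is only a sketch under stated assumptions: {{Lease}}, {{PrimaryReplicaMissException}} and {{currentToken}} are hypothetical names invented for the example, and failing the future is just one possible reaction; it is not claimed to be the fix chosen in Ignite.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class PrimaryReplicaGuardSketch {
    /** Hypothetical stand-in for the lease metadata returned by the placement driver. */
    static final class Lease {
        final long startTime;

        Lease(long startTime) {
            this.startTime = startTime;
        }
    }

    /** Hypothetical exception signalling that no matching lease was found in time. */
    static final class PrimaryReplicaMissException extends RuntimeException {
        PrimaryReplicaMissException(String message) {
            super(message);
        }
    }

    /** Extracts the token from the lease, failing the future explicitly when the lease is null. */
    static CompletableFuture<Long> currentToken(CompletableFuture<Lease> primaryReplicaFuture) {
        return primaryReplicaFuture.thenCompose(primaryReplica -> {
            if (primaryReplica == null) {
                // The contract allows null, so turn it into a descriptive failure instead of an NPE.
                return CompletableFuture.failedFuture(new PrimaryReplicaMissException("No matching lease found"));
            }
            return CompletableFuture.completedFuture(primaryReplica.startTime);
        });
    }

    public static void main(String[] args) {
        System.out.println(currentToken(CompletableFuture.completedFuture(new Lease(42L))).join());

        CompletableFuture<Lease> noLease = CompletableFuture.completedFuture(null);
        try {
            currentToken(noLease).join();
        } catch (CompletionException e) {
            System.out.println("Failed with: " + e.getCause().getMessage());
        }
    }
}
```

The caller then sees a domain-specific failure it can retry or report, rather than a NullPointerException raised on an unrelated thread that happened to complete the future.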



[jira] [Commented] (IGNITE-20418) Command 'indexes_force_rebuild' should work with several certain nodes.

2023-09-29 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770346#comment-17770346
 ] 

Ignite TC Bot commented on IGNITE-20418:


{panel:title=Branch: [pull/10941/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/10941/head] Base: [master] : New Tests 
(10)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Control Utility 2{color} [[tests 
10|https://ci2.ignite.apache.org/viewLog.html?buildId=7353263]]
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testWithNodeFilter[cmdHnd=jmx] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testIndexRebuildAllNodes[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testWithNodeFilter[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testIndexRebuildAllNodes[cmdHnd=jmx] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testIndexRebuildOutputTwoNodes[cmdHnd=cli]
 - PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testEmptyResultTwoNodes[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testInvalidArgumentGroups[cmdHnd=cli] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testInvalidArgumentGroups[cmdHnd=jmx] - 
PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testIndexRebuildOutputTwoNodes[cmdHnd=jmx]
 - PASSED{color}
* {color:#013220}IgniteControlUtilityTestSuite2: 
GridCommandHandlerIndexForceRebuildTest.testEmptyResultTwoNodes[cmdHnd=jmx] - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7353267buildTypeId=IgniteTests24Java8_RunAll]

> Command 'indexes_force_rebuild' should work with several certain nodes.
> ---
>
> Key: IGNITE-20418
> URL: https://issues.apache.org/jira/browse/IGNITE-20418
> Project: Ignite
>  Issue Type: Task
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: ise
>
> Currently, control.sh's command 'indexes_force_rebuild' has no ability to 
> launch an index rebuild on several specific nodes. Only one node is accepted 
> as a command parameter (--node-id). It would be handy to pass several nodes 
> to execute on, for example via '--nodes'.



[jira] [Assigned] (IGNITE-20435) Preserve key order in InternalTableImpl#deleteAll

2023-09-29 Thread Vyacheslav Koptilin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin reassigned IGNITE-20435:


Assignee: Vladislav Pyatkov

> Preserve key order in InternalTableImpl#deleteAll
> -
>
> Key: IGNITE-20435
> URL: https://issues.apache.org/jira/browse/IGNITE-20435
> Project: Ignite
>  Issue Type: Bug
>Reporter: Igor Sapego
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> IGNITE-16004 fixed ordering for most of the multi-key methods, but not for 
> the removeAll methods.
> For example, removeAll(1, 2, 3) should return 1, 3 if values for 1 and 3 
> don't exist, but in practice this order may be broken.
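
One way such an ordering guarantee can be kept when keys are processed per partition is sketched below. Everything here is hypothetical and only illustrates the expected behaviour: {{OrderedMissedKeysSketch}}, its {{removeAll}} method, the in-memory map standing in for storage, and the modulo-based partitioning are assumptions for the example, not Ignite code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class OrderedMissedKeysSketch {
    /** Removes the keys and returns those that were absent, in the order the caller passed them. */
    static List<Integer> removeAll(List<Integer> keys, Map<Integer, String> storage, int partitions) {
        // Group keys by partition, as a distributed table would before sending batches.
        Map<Integer, List<Integer>> byPartition = new TreeMap<>();
        for (Integer key : keys) {
            byPartition.computeIfAbsent(Math.floorMod(key, partitions), p -> new ArrayList<>()).add(key);
        }

        // Each partition reports the keys it could not remove; this order follows partitions, not the request.
        Set<Integer> missed = new HashSet<>();
        for (List<Integer> batch : byPartition.values()) {
            for (Integer key : batch) {
                if (storage.remove(key) == null) {
                    missed.add(key);
                }
            }
        }

        // Restore the caller's order by walking the original key list.
        List<Integer> result = new ArrayList<>();
        for (Integer key : keys) {
            if (missed.contains(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> storage = new HashMap<>();
        storage.put(2, "two");
        System.out.println(removeAll(List.of(1, 2, 3), storage, 2));
    }
}
```

The final pass over the original key list is what decouples the returned order from the order in which partitions respond.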



[jira] [Updated] (IGNITE-20425) Corrupted Raft FSM state after restart

2023-09-29 Thread Alexander Lapin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Lapin updated IGNITE-20425:
-
Description: 
According to the protocol, there are several numeric indexes in the Log / FSM:
 * {{lastLogIndex}} - index of the last logged log entry.
 * {{committedIndex}} - index of last committed log entry. {{{}committedIndex 
<= lastLogIndex{}}}.
 * {{appliedIndex}} - index of last log entry, processed by the state machine. 
{{appliedIndex <= }}{{{}committedIndex{}}}.

If the committed index is less than the last index, Raft can invoke the 
"truncate suffix" procedure and delete the uncommitted tail of the log. This is 
a valid thing to do.

Now, imagine the following scenario:
 * {{{}lastIndex == 12{}}}, {{committedIndex == 11}}
 * Node is restarted
 * Upon recovery, we replay the entire log. Now {{appliedIndex == 12}}
 * After recovery, we join the group and receive a "truncate suffix" command in 
order to delete uncommitted entries.
 * We must delete entry 12, but it's already applied. Peer is broken.

The reason is that we don't use the default recovery procedure: 
{{org.apache.ignite.raft.jraft.core.NodeImpl#init}}

Canonical Raft doesn't replay the log before the join is complete.

A down-to-earth scenario that shows this situation in practice:
 * Start a group with 3 nodes: A, B, and C.
 * We assume that A is the leader.
 * Shut down A; leader re-election is triggered.
 * We assume that B votes for C.
 * C receives the grant from B and proceeds to write the new configuration 
into its local log.
 * Shut down B before it writes the same log entry (easily reproducible race).
 * Shut down C.
 * Restart the cluster.

Resulting states:

A - [1: initial cfg]

B - [1: initial cfg]

C - [1: initial cfg, 2: re-election]
h3. How to fix

Option a. Recover the log after join. This is not optimal; it's like performing 
local recovery after cluster activation in Ignite 2. We fixed that behavior 
a long time ago.

Option b. Somehow track the committed index and perform a partial recovery that 
guarantees safety. We could write the committed index into log storage periodically.
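
A minimal sketch of what option b could look like, assuming the committed index is persisted somewhere and read back on restart. The class {{BoundedReplaySketch}} and its fields are hypothetical and model the log as an in-memory list; this is an illustration of the idea, not JRaft or Ignite code.

```java
import java.util.ArrayList;
import java.util.List;

final class BoundedReplaySketch {
    /** Toy in-memory log; entry with 1-based index i is stored at list position i - 1. */
    final List<String> log = new ArrayList<>();
    /** Entries applied to the toy state machine so far. */
    final List<String> stateMachine = new ArrayList<>();
    /** Last committed index that was persisted before the restart (hypothetical field). */
    long persistedCommittedIndex;
    /** Index of the last entry applied to the state machine. */
    long appliedIndex;

    /** Replays only entries known to be committed, never the possibly uncommitted tail. */
    void recover() {
        long upTo = Math.min(persistedCommittedIndex, log.size());
        while (appliedIndex < upTo) {
            stateMachine.add(log.get((int) appliedIndex));
            appliedIndex++;
        }
    }

    /** Drops all entries after lastIndexKept; refuses to drop entries that were already applied. */
    void truncateSuffix(long lastIndexKept) {
        if (lastIndexKept < appliedIndex) {
            throw new IllegalStateException("Cannot truncate applied entries");
        }
        while (log.size() > lastIndexKept) {
            log.remove(log.size() - 1);
        }
    }

    public static void main(String[] args) {
        BoundedReplaySketch node = new BoundedReplaySketch();
        for (int i = 1; i <= 12; i++) {
            node.log.add("entry-" + i);
        }
        node.persistedCommittedIndex = 11; // lastIndex == 12, committedIndex == 11

        node.recover();
        node.truncateSuffix(11); // safe: entry 12 was never applied
        System.out.println("applied=" + node.appliedIndex + ", logSize=" + node.log.size());
    }
}
```

Because the persisted value may lag behind the real committed index when it is written only periodically, it acts as a lower bound: recovery may apply fewer entries than it could, but never an entry that can still be truncated.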

"b" is better, but maybe there are other ways as well.
h3. Upd #1

Most likely we can simply remove all the code that awaits log replay on raft 
node start, because it is no longer needed. It was originally introduced to 
enable direct storage reads on the primary replica, which are now covered 
properly by
{code:java}
/**
 * Tries to read index from group leader and wait for this index to appear in local storage. Can possibly return failed future with
 * timeout exception, and in this case, replica would not answer to placement driver, because the response is useless. Placement driver
 * should handle this.
 *
 * @param expirationTime Lease expiration time.
 * @return Future that is completed when local storage catches up the index that is actual for leader on the moment of request.
 */
private CompletableFuture waitForActualState(long expirationTime) {
    LOG.info("Waiting for actual storage state, group=" + groupId());

    long timeout = expirationTime - currentTimeMillis();
    if (timeout <= 0) {
        return failedFuture(new TimeoutException());
    }

    return retryOperationUntilSuccess(raftClient::readIndex, e -> currentTimeMillis() > expirationTime, executor)
            .orTimeout(timeout, TimeUnit.MILLISECONDS)
            .thenCompose(storageIndexTracker::waitFor);
}
{code}
The same applies to RO access: we await the safeTime that has a happens-before 
(HB) relation with the corresponding storage updates.



[jira] [Commented] (IGNITE-20470) Ducktape to check dump performance

2023-09-29 Thread Nikolay Izhikov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770340#comment-17770340
 ] 

Nikolay Izhikov commented on IGNITE-20470:
--

https://github.com/apache/ignite/pull/10953 - cache_dumps is a feature branch 
to add test

> Ducktape to check dump performance
> --
>
> Key: IGNITE-20470
> URL: https://issues.apache.org/jira/browse/IGNITE-20470
> Project: Ignite
>  Issue Type: Task
>Reporter: Nikolay Izhikov
>Priority: Major
>  Labels: IEP-109, ise
>
> Dump creation can affect transaction performance through its change listener 
> and disc operations. We must create a ducktape test to check this.
> Example test scenario:
> * Start nodes
> * Start transaction operations: insert, update, remove.
> * Create dump
> * Check dump consistency.
> Measure 
> * Transaction performance penalty.
> * GC utilization.
> * Disc utilization.



[jira] [Assigned] (IGNITE-20356) Sql. Rework RowHandler "set" operation.

2023-09-29 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-20356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin reassigned IGNITE-20356:
-

Assignee: (was: Pavel Pereslegin)

> Sql. Rework RowHandler "set" operation.
> ---
>
> Key: IGNITE-20356
> URL: https://issues.apache.org/jira/browse/IGNITE-20356
> Project: Ignite
>  Issue Type: Improvement
>  Components: sql
>Reporter: Pavel Pereslegin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In IGNITE-19791, a wrapper over {{BinaryTuple}} was added.
> This wrapper ({{BinaryTupleRowWrapper}}) does not support the "{{set()}}" 
> method.
> Instead of using the {{set(int, RowT, Object)}} method, we can use the 
> {{append(RowT, Object)}} method to add field values sequentially.
> We need to:
>  * Add a new RowFactory method that returns a builder that lets you append 
> values to a row and build the row.
>  * Remove the {{RowHandler#set()}} method and rework all related code/tests 
> to use the builder.
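
The builder described in the list above might look roughly like this. It is a sketch only: {{RowBuilderSketch}}, {{RowBuilder}}, {{addField}} and the {{Object[]}} row representation are hypothetical choices for the example, not the API that was eventually added to RowFactory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RowBuilderSketch {
    /** Append-only builder: fields are added strictly in schema order, never set by index. */
    static final class RowBuilder {
        private final int fieldCount;
        private final List<Object> values = new ArrayList<>();

        RowBuilder(int fieldCount) {
            this.fieldCount = fieldCount;
        }

        /** Appends the next field value; fails if the row is already complete. */
        RowBuilder addField(Object value) {
            if (values.size() == fieldCount) {
                throw new IllegalStateException("Row already has " + fieldCount + " fields");
            }
            values.add(value);
            return this;
        }

        /** Builds the row; fails if some fields were not appended. */
        Object[] build() {
            if (values.size() != fieldCount) {
                throw new IllegalStateException("Expected " + fieldCount + " fields, got " + values.size());
            }
            return values.toArray();
        }
    }

    public static void main(String[] args) {
        Object[] row = new RowBuilder(2).addField(1).addField("name").build();
        System.out.println(Arrays.toString(row));
    }
}
```

An append-only interface fits a binary tuple representation, where values are serialized sequentially and random-access writes are not available.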



[jira] [Created] (IGNITE-20518) Use CatalogService in JdbcMetadataCatalog

2023-09-29 Thread Roman Puchkovskiy (Jira)
Roman Puchkovskiy created IGNITE-20518:
--

 Summary: Use CatalogService in JdbcMetadataCatalog
 Key: IGNITE-20518
 URL: https://issues.apache.org/jira/browse/IGNITE-20518
 Project: Ignite
  Issue Type: Improvement
Reporter: Roman Puchkovskiy
Assignee: Roman Puchkovskiy
 Fix For: 3.0.0-beta2


Currently, {{JdbcMetadataCatalog}} uses {{TableManager}} to get tables' 
metadata. Using {{CatalogService}} is sufficient; it is also more suitable 
because it allows getting a consistent snapshot thanks to timestamp support.
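
The benefit of timestamp-based lookups can be illustrated with a toy versioned catalog. This is an assumption-laden sketch: {{CatalogSnapshotSketch}}, {{publish}} and {{tablesAt}} are hypothetical names and do not reflect the actual {{CatalogService}} interface.

```java
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class CatalogSnapshotSketch {
    /** Catalog versions keyed by the timestamp at which each version became active. */
    private final NavigableMap<Long, List<String>> versions = new TreeMap<>();

    /** Registers a new catalog version that is active from the given timestamp onwards. */
    void publish(long activationTimestamp, List<String> tableNames) {
        versions.put(activationTimestamp, List.copyOf(tableNames));
    }

    /** Returns the table names visible at the given timestamp, or an empty list before the first version. */
    List<String> tablesAt(long timestamp) {
        Map.Entry<Long, List<String>> entry = versions.floorEntry(timestamp);
        return entry == null ? List.of() : entry.getValue();
    }

    public static void main(String[] args) {
        CatalogSnapshotSketch catalog = new CatalogSnapshotSketch();
        catalog.publish(10L, List.of("PERSON"));
        catalog.publish(20L, List.of("PERSON", "CITY"));

        // All metadata calls made with timestamp 15 see the same version, even if a newer one exists.
        System.out.println(catalog.tablesAt(15L));
    }
}
```

Several metadata requests issued with the same timestamp observe the same catalog version, which is the consistency property the issue refers to.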


