[jira] [Created] (IGNITE-22129) Partition, CMG and metastorage should not share threads

2024-04-26 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22129:
--

 Summary: Partition, CMG and metastorage should not share threads
 Key: IGNITE-22129
 URL: https://issues.apache.org/jira/browse/IGNITE-22129
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
These three subsystems have different purposes, so sharing the same threads 
might lead to starvation. For the same reason, we already have a separate FSM 
caller disruptor for Metastorage, but the other disruptors are still shared.
{code:java}
NodeImpl#ownFsmCallerExecutorDisruptorConfig
{code}

h3. Definition of done
At the very least, all partition disruptor threads have to be different from the 
threads that are used by Metastorage and CMG.
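
The isolation described above can be illustrated with plain executors. This is only a minimal sketch: it does not use the disruptor classes, and the names {{SubsystemExecutors}}, {{named}} and {{pool}} are hypothetical. It shows one dedicated, recognizably named thread pool per subsystem, so a busy partition pool cannot starve Metastorage or CMG work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: creates one dedicated, named thread pool per subsystem. */
final class SubsystemExecutors {
    /** Thread factory whose threads carry the subsystem prefix, e.g. "partition-0". */
    static ThreadFactory named(String prefix) {
        AtomicInteger seq = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + seq.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
    }

    /** Fixed-size pool that is never shared with another subsystem. */
    static ExecutorService pool(String prefix, int threads) {
        return Executors.newFixedThreadPool(threads, named(prefix));
    }
}
```

With separate pools, the thread name alone tells which subsystem a stuck or overloaded thread belongs to, which also helps when reading thread dumps.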



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-22128) Balancing partitions across stripes

2024-04-26 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22128:
---
Description: 
h3. Motivation
Right now, we use a hash to balance partitions.
{code:java}
public int getStripe(NodeId nodeId) {
  return Math.abs(nodeId.hashCode() % stripes);
}
{code}
This approach might lead to a skew.

h3. Definition of done
Partitions are distributed statically across stripes by a fair round-robin algorithm.

  was:
h3. Motivation
Right now, we use a hash to balance partitions.
{code:java}
public int getStripe(NodeId nodeId) {
  return Math.abs(nodeId.hashCode() % stripes);
}
{code}
This approach might lead to a skew.

h3. Definition of done
Partition is distributed by the round-robin algorithm.


> Balancing partitions across stripes
> ---
>
> Key: IGNITE-22128
> URL: https://issues.apache.org/jira/browse/IGNITE-22128
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Right now, we use a hash to balance partitions.
> {code:java}
> public int getStripe(NodeId nodeId) {
>   return Math.abs(nodeId.hashCode() % stripes);
> }
> {code}
> This approach might lead to a skew.
> h3. Definition of done
> Partitions are distributed statically across stripes by a fair round-robin algorithm.





[jira] [Created] (IGNITE-22128) Balancing partitions across stripes

2024-04-26 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22128:
--

 Summary: Balancing partitions across stripes
 Key: IGNITE-22128
 URL: https://issues.apache.org/jira/browse/IGNITE-22128
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
Right now, we use a hash to balance partitions.
{code:java}
public int getStripe(NodeId nodeId) {
  return Math.abs(nodeId.hashCode() % stripes);
}
{code}
This approach might lead to a skew.

h3. Definition of done
Partitions are distributed across stripes by a round-robin algorithm.
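
A minimal sketch of what a round-robin assignment could look like, as opposed to the hash-based one above. The class {{RoundRobinStripes}} is hypothetical and not part of the code base: each new key takes the next stripe in turn, and the choice is remembered so that later lookups for the same key return the same stripe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: assigns each new key to the next stripe in turn and remembers the choice. */
final class RoundRobinStripes<K> {
    private final int stripes;
    private final AtomicInteger next = new AtomicInteger();
    private final Map<K, Integer> assigned = new ConcurrentHashMap<>();

    RoundRobinStripes(int stripes) {
        if (stripes <= 0) {
            throw new IllegalArgumentException("stripes must be positive");
        }
        this.stripes = stripes;
    }

    /** Returns the stripe for the key; the first call fixes it, later calls return the same value. */
    int getStripe(K key) {
        // floorMod keeps the result non-negative even after the counter overflows.
        return assigned.computeIfAbsent(key, k -> Math.floorMod(next.getAndIncrement(), stripes));
    }
}
```

With N keys and S stripes, the stripe sizes differ by at most one, whereas a hash can place several keys on the same stripe by chance. The sketch does not rebalance when a key is removed; that would need additional bookkeeping.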





[jira] [Created] (IGNITE-22127) Partition listener does not use batch update

2024-04-26 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22127:
--

 Summary: Partition listener does not use batch update
 Key: IGNITE-22127
 URL: https://issues.apache.org/jira/browse/IGNITE-22127
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
RAFT commands are batched in the FSM caller disruptor. The batch is passed as a 
collection to the partition replica listener, but each command is handled 
as if it were a single one.

h3. Definition of done
All commands in the command iterator have to be handled as a single batch storage 
update.
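
The idea can be sketched as follows. All types here ({{CountingStorage}}, {{BatchingListener}}, commands modeled as key/value string pairs) are hypothetical stand-ins and not the real listener or storage API: the listener drains the whole command iterator first and then issues one storage update for the batch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a partition storage that counts how many write batches it receives. */
final class CountingStorage {
    final Map<String, String> data = new TreeMap<>();
    int batchesApplied;

    /** Applies all updates as one unit; a real storage would use a single write batch here. */
    void applyBatch(List<String[]> updates) {
        for (String[] keyValue : updates) {
            data.put(keyValue[0], keyValue[1]);
        }
        batchesApplied++;
    }
}

/** Hypothetical listener: collects every command from the iterator, then updates the storage once. */
final class BatchingListener {
    private final CountingStorage storage;

    BatchingListener(CountingStorage storage) {
        this.storage = storage;
    }

    void onWrite(Iterator<String[]> commands) {
        List<String[]> batch = new ArrayList<>();
        commands.forEachRemaining(batch::add);
        if (!batch.isEmpty()) {
            storage.applyBatch(batch);
        }
    }
}
```

Commands keep their iterator order inside the batch, so a later update to the same key still wins.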





[jira] [Updated] (IGNITE-21908) Add metrics of distribution among stripes in disruptor

2024-04-26 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21908:
---
Description: 
h3. Motivation
The metrics are useful to estimate the uniformity of the distribution.

h3. Implementation notes
These metrics can be implemented using the common approach, which is based on 
the {{MetricSource}} interface.

h3. Definition of done
Metrics that become available:
* histogram of batch sizes
* number of operations processed


  was:
h3. Motivation
The metrics are useful to estimate the uniformity of the distribution.

h3. Implementation notes
These metrics can be implemented using the common approach, which is based on 
{{MetricSource}} interface.

h3. Definition of done
Metrics that become available:
* avarage bath size
* operations were processed



> Add metrics of distribution among stripes in disruptor
> --
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> The metrics are useful to estimate the uniformity of the distribution.
> h3. Implementation notes
> These metrics can be implemented using the common approach, which is based on 
> the {{MetricSource}} interface.
> h3. Definition of done
> Metrics that become available:
> * histogram of batch sizes
> * number of operations processed
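
A minimal sketch of the two metrics listed above, kept independent of the {{MetricSource}} interface. The class {{StripeBatchStats}} and its bucket bounds are assumptions made for illustration. One such object per stripe would record the size of every processed batch.

```java
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-stripe statistics; this is not the MetricSource API. */
final class StripeBatchStats {
    /** Inclusive upper bound of each bucket, in ascending order. */
    private final long[] upperBounds;
    /** One counter per bound plus a final bucket for values above the last bound. */
    private final AtomicLongArray buckets;
    private final LongAdder operations = new LongAdder();

    StripeBatchStats(long... upperBounds) {
        this.upperBounds = upperBounds.clone();
        this.buckets = new AtomicLongArray(upperBounds.length + 1);
    }

    /** Records one processed batch: bumps its histogram bucket and the operation counter. */
    void recordBatch(int batchSize) {
        int i = 0;
        while (i < upperBounds.length && batchSize > upperBounds[i]) {
            i++;
        }
        buckets.incrementAndGet(i);
        operations.add(batchSize);
    }

    long bucketCount(int index) {
        return buckets.get(index);
    }

    long operationsProcessed() {
        return operations.sum();
    }
}
```

Comparing the operation counters of all stripes shows how uniform the distribution is, and the histogram shows whether batching is effective on each stripe.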





[jira] [Updated] (IGNITE-21908) Add metrics of distribution among stripes in disruptor

2024-04-23 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21908:
---
Reviewer: Slava Koptilin

> Add metrics of distribution among stripes in disruptor
> --
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> The metrics are useful to estimate the uniformity of the distribution.
> h3. Implementation notes
> These metrics can be implemented using the common approach, which is based on 
> the {{MetricSource}} interface.
> h3. Definition of done
> Metrics that become available:
> * average batch size
> * number of operations processed





[jira] [Commented] (IGNITE-22024) ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is flaky

2024-04-23 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840071#comment-17840071
 ] 

Vladislav Pyatkov commented on IGNITE-22024:


Merged c57c1a8c383858828717580db13736e708095292

> ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is 
> flaky
> ---
>
> Key: IGNITE-22024
> URL: https://issues.apache.org/jira/browse/IGNITE-22024
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
> Attachments: screenshot-1.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Motivation
> Only one commit is a base transaction guarantee. The test shows this 
> guarantee is violated for thin clients.
> {noformat}
> java.lang.AssertionError: Exception has not been thrown.
>  
> at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.assertThrowsWithCode(IgniteTestUtils.java:314)
> at 
> org.apache.ignite.internal.sql.api.ItSqlApiBaseTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlApiBaseTest.java:648)
> at 
> org.apache.ignite.internal.sql.api.ItSqlClientSynchronousApiTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlClientSynchronousApiTest.java:65)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> at 

[jira] [Commented] (IGNITE-22062) RO transaction does not close cursor when exception is thrown

2024-04-22 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839537#comment-17839537
 ] 

Vladislav Pyatkov commented on IGNITE-22062:


Merged 0978749e4ef96a9717c34968f7707f538b9c20bc

> RO transaction does not close cursor when exception is thrown
> -
>
> Key: IGNITE-22062
> URL: https://issues.apache.org/jira/browse/IGNITE-22062
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Motivation
> If an RO transaction starts and executes a scan operation that, for some 
> reason, throws an exception, the scan cursor is not closed until the 
> transaction reaches a final state. The behavior is different for RW 
> transactions because they go to the final state in case of any exception.
> Although an RO transaction does not have to be closed in the case of an 
> exception in an operation, a scan cursor that was opened during the operation 
> is useless and can be closed.
> h3. Definition of done
> If an RO transaction gets an exception during a scan operation, the scan 
> cursor has to be closed after the exception is caught by the end user code.





[jira] [Assigned] (IGNITE-22062) RO transaction does not close cursor when exception is thrown

2024-04-18 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-22062:
--

Assignee: Vladislav Pyatkov

> RO transaction does not close cursor when exception is thrown
> -
>
> Key: IGNITE-22062
> URL: https://issues.apache.org/jira/browse/IGNITE-22062
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> If an RO transaction starts and executes a scan operation that, for some 
> reason, throws an exception, the scan cursor is not closed until the 
> transaction reaches a final state. The behavior is different for RW 
> transactions because they go to the final state in case of any exception.
> Although an RO transaction does not have to be closed in the case of an 
> exception in an operation, a scan cursor that was opened during the operation 
> is useless and can be closed.
> h3. Definition of done
> If an RO transaction gets an exception during a scan operation, the scan 
> cursor has to be closed after the exception is caught by the end user code.





[jira] [Updated] (IGNITE-22062) RO transaction does not close cursor when exception is thrown

2024-04-17 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22062:
---
Description: 
h3. Motivation
If an RO transaction starts and executes a scan operation that, for some 
reason, throws an exception, the scan cursor is not closed until the 
transaction reaches a final state. The behavior is different for RW 
transactions because they go to the final state in case of any exception.
Although an RO transaction does not have to be closed in the case of an 
exception in an operation, a scan cursor that was opened during the operation 
is useless and can be closed.

h3. Definition of done
If an RO transaction gets an exception during a scan operation, the scan cursor 
has to be closed after the exception is caught by the end user code.


  was:
h3. Motivation
If an RO transaction starts and executes some scan operation than by some 
reason gets an exception, the sacn's cursor does not be close until the 
transaction is in closed state. The behaior is different for RW transaction, 
because RW transactions go to final state in case of any exception.
Althoug RO transaction does not have to be closed in case of exception in an 
operation, a scan cursor, that is opened driung the operation, is useless and 
can be closed.

h3. Definition of done
If RO transaction gets an exception during scan operation, the scan cursor have 
to be close after the exception is catched by the end user code.



> RO transaction does not close cursor when exception is thrown
> -
>
> Key: IGNITE-22062
> URL: https://issues.apache.org/jira/browse/IGNITE-22062
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> If an RO transaction starts and executes a scan operation that, for some 
> reason, throws an exception, the scan cursor is not closed until the 
> transaction reaches a final state. The behavior is different for RW 
> transactions because they go to the final state in case of any exception.
> Although an RO transaction does not have to be closed in the case of an 
> exception in an operation, a scan cursor that was opened during the operation 
> is useless and can be closed.
> h3. Definition of done
> If an RO transaction gets an exception during a scan operation, the scan 
> cursor has to be closed after the exception is caught by the end user code.





[jira] [Created] (IGNITE-22062) RO transaction does not close cursor when exception is thrown

2024-04-17 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22062:
--

 Summary: RO transaction does not close cursor when exception is 
thrown
 Key: IGNITE-22062
 URL: https://issues.apache.org/jira/browse/IGNITE-22062
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3. Motivation
If an RO transaction starts and executes a scan operation that, for some 
reason, throws an exception, the scan cursor is not closed until the 
transaction reaches a final state. The behavior is different for RW 
transactions because they go to the final state in case of any exception.
Although an RO transaction does not have to be closed in the case of an 
exception in an operation, a scan cursor that was opened during the operation 
is useless and can be closed.

h3. Definition of done
If an RO transaction gets an exception during a scan operation, the scan cursor 
has to be closed after the exception is caught by the end user code.
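
The expected behavior can be sketched like this. {{TrackedCursor}} and {{RoScan}} are hypothetical names and not the real transaction or cursor API. When the scan fails, the cursor is closed before the exception reaches the caller, while the transaction itself is left open.

```java
import java.util.Iterator;
import java.util.function.Consumer;

/** Hypothetical cursor that records whether it has been closed. */
final class TrackedCursor<T> implements AutoCloseable {
    private final Iterator<T> rows;
    private boolean closed;

    TrackedCursor(Iterator<T> rows) {
        this.rows = rows;
    }

    boolean hasNext() {
        return rows.hasNext();
    }

    T next() {
        return rows.next();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical scan helper for a read-only transaction. */
final class RoScan {
    /** Feeds rows to the consumer; on failure the cursor is closed and the exception is rethrown. */
    static <T> void scan(TrackedCursor<T> cursor, Consumer<T> consumer) {
        try {
            while (cursor.hasNext()) {
                consumer.accept(cursor.next());
            }
        } catch (RuntimeException e) {
            // The transaction stays open, but the cursor is of no further use.
            cursor.close();
            throw e;
        }
    }
}
```

Closing on normal completion is left out of the sketch, since this issue only concerns the exceptional path.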






[jira] [Commented] (IGNITE-22029) Change default configuration according to default RAFT option

2024-04-16 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837730#comment-17837730
 ] 

Vladislav Pyatkov commented on IGNITE-22029:


Merged 46f09f5c15128e59cdef4de9f4d275921f5e4c21

> Change default configuration according to default RAFT option
> -
>
> Key: IGNITE-22029
> URL: https://issues.apache.org/jira/browse/IGNITE-22029
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Motivation
> The default number of stripes was changed in IGNITE-21907, but it was not 
> updated in RaftConfigurationSchema.
> h3. Definition of done
> The default in the RAFT configuration needs to be changed to match the RAFT options.





[jira] [Updated] (IGNITE-21908) Add metrics of distribution among stripes in disruptor

2024-04-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21908:
---
Description: 
h3. Motivation
The metrics are useful to estimate the uniformity of the distribution.

h3. Implementation notes
These metrics can be implemented using the common approach, which is based on 
the {{MetricSource}} interface.

h3. Definition of done
Metrics that become available:
* average batch size
* number of operations processed


  was:
h3. Motivation



> Add metrics of distribution among stripes in disruptor
> --
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> The metrics are useful to estimate the uniformity of the distribution.
> h3. Implementation notes
> These metrics can be implemented using the common approach, which is based on 
> the {{MetricSource}} interface.
> h3. Definition of done
> Metrics that become available:
> * average batch size
> * number of operations processed





[jira] [Updated] (IGNITE-21908) Add metrics of distribution among stripes in disruptor

2024-04-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21908:
---
Summary: Add metrics of distribution among stripes in disruptor  (was: 
Alignment of distribution among stripes in disruptor)

> Add metrics of distribution among stripes in disruptor
> --
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>






[jira] [Updated] (IGNITE-21908) Add metrics of distribution among stripes in disruptor

2024-04-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21908:
---
Description: 
h3. Motivation


> Add metrics of distribution among stripes in disruptor
> --
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation





[jira] [Updated] (IGNITE-22043) Remove assignment from Meta storage

2024-04-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22043:
---
Description: 
h3. Motivation
When the assignments moved from the table to the zone description, we simply 
removed the code that removed the assignments from Meta storage.
{code}
Set assignmentKeys = IntStream.range(0, partitions)
        .mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, p)))
        .collect(toSet());
metaStorageMgr.removeAll(assignmentKeys);
{code}
Of course, this is incorrect because the assignments are now stored indefinitely.

h3. Definition of done
Restore the removal code so that it runs when a zone is deleted.
All assignments have to be removed from Meta storage after the zone is deleted.

  was:
h3. Motivation
When the assignments moved from the table to the zone description, we just 
removed a code that was removing the assignment from Meta storage.
{code}
Set assignmentKeys = IntStream.range(0, partitions)
.mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, 
p)))
.collect(toSet());
metaStorageMgr.removeAll(assignmentKeys);
{code}
Of course, this is incorrect because the assignment is stored indefinitely now.

h3. Definition of done
Restoring the code when a zone is deleted.


> Remove assignment from Meta storage
> ---
>
> Key: IGNITE-22043
> URL: https://issues.apache.org/jira/browse/IGNITE-22043
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> When the assignments moved from the table to the zone description, we simply 
> removed the code that removed the assignments from Meta storage.
> {code}
> Set assignmentKeys = IntStream.range(0, partitions)
>         .mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, p)))
>         .collect(toSet());
> metaStorageMgr.removeAll(assignmentKeys);
> {code}
> Of course, this is incorrect because the assignments are now stored 
> indefinitely.
> h3. Definition of done
> Restore the removal code so that it runs when a zone is deleted.
> All assignments have to be removed from Meta storage after the zone is deleted.





[jira] [Created] (IGNITE-22043) Remove assignment from Meta storage

2024-04-15 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22043:
--

 Summary: Remove assignment from Meta storage
 Key: IGNITE-22043
 URL: https://issues.apache.org/jira/browse/IGNITE-22043
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3. Motivation
When the assignments moved from the table to the zone description, we simply 
removed the code that removed the assignments from Meta storage.
{code}
Set assignmentKeys = IntStream.range(0, partitions)
        .mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, p)))
        .collect(toSet());
metaStorageMgr.removeAll(assignmentKeys);
{code}
Of course, this is incorrect because the assignments are now stored indefinitely.

h3. Definition of done
Restore the removal code so that it runs when a zone is deleted.





[jira] [Updated] (IGNITE-22043) Remove assignment from Meta storage

2024-04-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22043:
---
Description: 
h3. Motivation
When the assignments moved from the table to the zone description, we simply 
removed the code that removed the assignments from Meta storage.
{code}
Set assignmentKeys = IntStream.range(0, partitions)
        .mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, p)))
        .collect(toSet());
metaStorageMgr.removeAll(assignmentKeys);
{code}
Of course, this is incorrect because the assignments are now stored indefinitely.

h3. Definition of done
Restore the removal code so that it runs when a zone is deleted.

  was:
h3. Motivation
When the assignments moved from the table to the zone description, we just 
removed a code that was removing the assignment from Meta storage.
{code}
Set assignmentKeys = IntStream.range(0, partitions)
.mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, 
p)))
.collect(toSet());
metaStorageMgr.removeAll(assignmentKeys);
{code}
Of course, this is incorrect because the assignment is stored indefinitely now.

h3. Definition of don
Restoring the code when a zone is deleted.


> Remove assignment from Meta storage
> ---
>
> Key: IGNITE-22043
> URL: https://issues.apache.org/jira/browse/IGNITE-22043
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> When the assignments moved from the table to the zone description, we simply 
> removed the code that removed the assignments from Meta storage.
> {code}
> Set assignmentKeys = IntStream.range(0, partitions)
>         .mapToObj(p -> stablePartAssignmentsKey(new TablePartitionId(tableId, p)))
>         .collect(toSet());
> metaStorageMgr.removeAll(assignmentKeys);
> {code}
> Of course, this is incorrect because the assignments are now stored 
> indefinitely.
> h3. Definition of done
> Restore the removal code so that it runs when a zone is deleted.





[jira] [Updated] (IGNITE-22034) ItRebalanceTest#testRebalanceTablesCounterForZone is flacky

2024-04-12 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22034:
---
Description: 
{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{noformat}

The reason for this NPE is that {{lastAssignmentsHolderForLog[0]}} is {{null}}: 
the test does not wait for the key to appear in the metastorage.

  was:
{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{noformat}


> ItRebalanceTest#testRebalanceTablesCounterForZone is flacky
> ---
>
> Key: IGNITE-22034
> URL: https://issues.apache.org/jira/browse/IGNITE-22034
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> {noformat}
> The reason for this NPE is that {{lastAssignmentsHolderForLog[0]}} is {{null}}: 
> the test does not wait for the key to appear in the metastorage.
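
One way to avoid this kind of race in a test is to poll until the key is visible before reading it. The helper below is a minimal sketch with a hypothetical name ({{Await.waitForCondition}}); a test would pass a condition that checks the metastorage for the key and proceed only once it returns {{true}}.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: polls a condition until it holds or the timeout expires. */
final class Await {
    /** Returns true as soon as the condition holds, false on timeout or interruption. */
    static boolean waitForCondition(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The test should assert on the returned value, so a missing key shows up as a clear timeout failure instead of an NPE.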





[jira] [Updated] (IGNITE-22034) ItRebalanceTest#testRebalanceTablesCounterForZone is flacky

2024-04-12 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22034:
---
Description: 
{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{noformat}

  was:
{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{nofomrat}


> ItRebalanceTest#testRebalanceTablesCounterForZone is flacky
> ---
>
> Key: IGNITE-22034
> URL: https://issues.apache.org/jira/browse/IGNITE-22034
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> {noformat}





[jira] [Updated] (IGNITE-22034) ItRebalanceTest#testRebalanceTablesCounterForZone is flacky

2024-04-12 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22034:
---
Description: 
{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
at 
org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{noformat}

> ItRebalanceTest#testRebalanceTablesCounterForZone is flaky
> ---
>
> Key: IGNITE-22034
> URL: https://issues.apache.org/jira/browse/IGNITE-22034
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.waitForTablesCounterInMetastore(ItRebalanceTest.java:261)
>   at 
> org.apache.ignite.internal.rebalance.ItRebalanceTest.testRebalanceTablesCounterForZone(ItRebalanceTest.java:199)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> {noformat}





[jira] [Created] (IGNITE-22034) ItRebalanceTest#testRebalanceTablesCounterForZone is flaky

2024-04-12 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22034:
--

 Summary: ItRebalanceTest#testRebalanceTablesCounterForZone is 
flaky
 Key: IGNITE-22034
 URL: https://issues.apache.org/jira/browse/IGNITE-22034
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov








[jira] [Assigned] (IGNITE-22029) Change default configuration according to default RAFT option

2024-04-11 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-22029:
--

Assignee: Vladislav Pyatkov

> Change default configuration according to default RAFT option
> -
>
> Key: IGNITE-22029
> URL: https://issues.apache.org/jira/browse/IGNITE-22029
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> The default number of stripes was changed in IGNITE-21907, but it was not 
> changed in RaftConfigurationSchema.
> h3. Definition of done
> The default in the RAFT configuration must be changed to match the default 
> in the RAFT options.





[jira] [Created] (IGNITE-22029) Change default configuration according to default RAFT option

2024-04-11 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22029:
--

 Summary: Change default configuration according to default RAFT 
option
 Key: IGNITE-22029
 URL: https://issues.apache.org/jira/browse/IGNITE-22029
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3. Motivation
The default number of stripes was changed in IGNITE-21907, but it was not 
changed in RaftConfigurationSchema.

h3. Definition of done
The default in the RAFT configuration must be changed to match the default in 
the RAFT options.





[jira] [Updated] (IGNITE-22024) ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is flaky

2024-04-10 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22024:
---
Labels: ignite-3  (was: )

> ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is 
> flaky
> ---
>
> Key: IGNITE-22024
> URL: https://issues.apache.org/jira/browse/IGNITE-22024
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Only one commit is a base transaction guarantee. The test shows this 
> guarantee is violated for thin clients.
> {noformat}
> java.lang.AssertionError: Exception has not been thrown.
>  
> at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.assertThrowsWithCode(IgniteTestUtils.java:314)
> at 
> org.apache.ignite.internal.sql.api.ItSqlApiBaseTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlApiBaseTest.java:648)
> at 
> org.apache.ignite.internal.sql.api.ItSqlClientSynchronousApiTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlClientSynchronousApiTest.java:65)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> {noformat}
> h3. Definition of done
> Any transaction operation must notify the user that the transaction is 
> already finished if the previous operation is finished with an 

[jira] [Updated] (IGNITE-22024) ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is flaky

2024-04-10 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-22024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-22024:
---
Attachment: screenshot-1.png

> ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is 
> flaky
> ---
>
> Key: IGNITE-22024
> URL: https://issues.apache.org/jira/browse/IGNITE-22024
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: screenshot-1.png
>
>
> h3. Motivation
> Only one commit is a base transaction guarantee. The test shows this 
> guarantee is violated for thin clients.
> {noformat}
> java.lang.AssertionError: Exception has not been thrown.
>  
> at 
> org.apache.ignite.internal.testframework.IgniteTestUtils.assertThrowsWithCode(IgniteTestUtils.java:314)
> at 
> org.apache.ignite.internal.sql.api.ItSqlApiBaseTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlApiBaseTest.java:648)
> at 
> org.apache.ignite.internal.sql.api.ItSqlClientSynchronousApiTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlClientSynchronousApiTest.java:65)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at 
> java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
> at 
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> at 
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> at 
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at 
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at 
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
> {noformat}
> h3. Definition of done
> Any transaction operation must notify the user that the transaction is 
> already finished if the 

[jira] [Created] (IGNITE-22024) ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is flaky

2024-04-10 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-22024:
--

 Summary: 
ItSqlClientSynchronousApiTest#runtimeErrorInDmlCausesTransactionToFail is flaky
 Key: IGNITE-22024
 URL: https://issues.apache.org/jira/browse/IGNITE-22024
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3. Motivation

Only one commit is a base transaction guarantee. The test shows this guarantee 
is violated for thin clients.
{noformat}
java.lang.AssertionError: Exception has not been thrown.
 
at 
org.apache.ignite.internal.testframework.IgniteTestUtils.assertThrowsWithCode(IgniteTestUtils.java:314)
at 
org.apache.ignite.internal.sql.api.ItSqlApiBaseTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlApiBaseTest.java:648)
at 
org.apache.ignite.internal.sql.api.ItSqlClientSynchronousApiTest.runtimeErrorInDmlCausesTransactionToFail(ItSqlClientSynchronousApiTest.java:65)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at 
java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at 
java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274)
at 
java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at 
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at 
java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at 
java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at 
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
{noformat}
h3. Definition of done
Any transaction operation must notify the user that the transaction is already 
finished if the previous operation is finished with an exception.
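
The requirement above can be illustrated with a minimal, single-threaded sketch. The class `GuardedTx` and its methods are hypothetical names introduced only for this illustration; they are not the Ignite client API. The sketch assumes that a failed operation finishes (rolls back) the transaction, so every later call is rejected with an "already finished" error.

```java
/** Hypothetical transaction wrapper for illustration; not the Ignite API. */
class GuardedTx {
    /** Set once the transaction is committed or a failed operation rolled it back. */
    private boolean finished;

    /** Runs an operation inside the transaction; a failure finishes the transaction. */
    void execute(Runnable operation) {
        ensureActive();
        try {
            operation.run();
        } catch (RuntimeException e) {
            finished = true; // assumption: a failed operation rolls the transaction back
            throw e;
        }
    }

    /** Commits the transaction; allowed at most once and only while it is active. */
    void commit() {
        ensureActive();
        finished = true;
    }

    /** Notifies the caller if the transaction can no longer be used. */
    private void ensureActive() {
        if (finished) {
            throw new IllegalStateException("Transaction is already finished");
        }
    }
}
```

With this guard, a `commit()` issued after a failed `execute(...)` throws instead of silently succeeding, which is the behavior the test expects.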





[jira] [Commented] (IGNITE-21382) Test ItPrimaryReplicaChoiceTest.testPrimaryChangeLongHandling is flaky

2024-04-09 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835352#comment-17835352
 ] 

Vladislav Pyatkov commented on IGNITE-21382:


Merged 6f297164260089910a08a297c6515713c191e383

> Test ItPrimaryReplicaChoiceTest.testPrimaryChangeLongHandling is flaky
> --
>
> Key: IGNITE-21382
> URL: https://issues.apache.org/jira/browse/IGNITE-21382
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The test fails while waiting for the primary replica change. The issue also 
> reproduces locally, at least once per five runs.
> {code}
> assertThat(primaryChangeTask, willCompleteSuccessfully());
> {code}
> {noformat}
> java.lang.AssertionError: java.util.concurrent.TimeoutException
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>   at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:67)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
>   at 
> org.apache.ignite.internal.placementdriver.ItPrimaryReplicaChoiceTest.testPrimaryChangeLongHandling(ItPrimaryReplicaChoiceTest.java:179)
> {noformat}
> This test will be muted on TC to prevent future failures.
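
For readers unfamiliar with the failing assertion: the stack trace shows the matcher turning a `TimeoutException` into an `AssertionError`. A rough plain-JDK sketch of that kind of wait is shown below. `FutureWait` and `completesWithin` are hypothetical names, and the real `CompletableFutureMatcher` may behave differently.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not the Ignite test framework. */
final class FutureWait {
    /** Returns true only if the future completes normally within the timeout. */
    static boolean completesWithin(CompletableFuture<?> future, long timeout, TimeUnit unit) {
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException | ExecutionException | CancellationException e) {
            // Timed out, completed exceptionally, or was cancelled.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

A future that is never completed, as in the flaky run, makes `completesWithin` return `false` once the timeout elapses.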





[jira] [Commented] (IGNITE-22008) Fix double-checked locking in ReadWriteTransactionImpl

2024-04-09 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835344#comment-17835344
 ] 

Vladislav Pyatkov commented on IGNITE-22008:


Merged 6fda2047aca54c2e674544b829891c201548812d

> Fix double-checked locking in ReadWriteTransactionImpl
> --
>
> Key: IGNITE-22008
> URL: https://issues.apache.org/jira/browse/IGNITE-22008
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{finishFuture}} needs to be made volatile
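
As background, a minimal sketch of double-checked locking with a volatile field follows. `LazyFinishExample`, `mux`, and the body of `finish()` are hypothetical and are not taken from `ReadWriteTransactionImpl`; only the field name `finishFuture` comes from the issue. Without `volatile`, the unsynchronized first read is not guaranteed to observe the write made under the lock.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for illustration; not the real ReadWriteTransactionImpl. */
class LazyFinishExample {
    /** volatile makes the write done under the lock visible to the lock-free first check. */
    private volatile CompletableFuture<Void> finishFuture;

    private final Object mux = new Object();

    /** Returns the same future on every call, creating it at most once. */
    CompletableFuture<Void> finish() {
        CompletableFuture<Void> fut = finishFuture; // first check, no lock
        if (fut == null) {
            synchronized (mux) {
                fut = finishFuture; // second check, under the lock
                if (fut == null) {
                    // Placeholder for the real finish logic (an assumption of this sketch).
                    fut = CompletableFuture.completedFuture(null);
                    finishFuture = fut;
                }
            }
        }
        return fut;
    }
}
```

Reading the field into a local variable keeps the fast path to a single volatile read.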





[jira] [Comment Edited] (IGNITE-22008) Fix double-checked locking in ReadWriteTransactionImpl

2024-04-09 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-22008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835344#comment-17835344
 ] 

Vladislav Pyatkov edited comment on IGNITE-22008 at 4/9/24 10:38 AM:
-

Merged 6fda2047aca54c2e674544b829891c201548812d


was (Author: v.pyatkov):
Meregd 6fda2047aca54c2e674544b829891c201548812d

> Fix double-checked locking in ReadWriteTransactionImpl
> --
>
> Key: IGNITE-22008
> URL: https://issues.apache.org/jira/browse/IGNITE-22008
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Roman Puchkovskiy
>Assignee: Roman Puchkovskiy
>Priority: Major
> Fix For: 3.0.0-beta2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{finishFuture}} needs to be made volatile





[jira] [Commented] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-04 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833913#comment-17833913
 ] 

Vladislav Pyatkov commented on IGNITE-21907:


Merged 58f43c8f058f40f1ec5eaad44c6f69ca5a17aa9d

> Change thread count for RAFT disruptors to improve performance
> --
>
> Key: IGNITE-21907
> URL: https://issues.apache.org/jira/browse/IGNITE-21907
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> h3. Motivation
> Right now, we use the stripe count to configure the number of threads. The 
> default number of threads is double the number of processors.
> {code:java|title=NodeOptions.java}
> private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
> {code}
> The total number of threads is more than enough. Our throughput tests showed 
> that a lower number of threads leads to better performance.
> h3. Definition of done
> The default stripe count is _Utils.cpus()_
> The default number of stripes for the RAFT log manager is _4_
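
The change in defaults can be sketched as follows. `StripeDefaults` and its members are hypothetical names; the real constant lives in `NodeOptions`. The sketch assumes that `Utils.cpus()` reports the number of available processors, for which `Runtime.getRuntime().availableProcessors()` is used as a stand-in.

```java
/** Hypothetical holder for illustration; the real defaults live in NodeOptions. */
final class StripeDefaults {
    /** Proposed default number of stripes for the RAFT log manager. */
    static final int DEFAULT_LOG_MANAGER_STRIPES = 4;

    /** Stand-in for Utils.cpus(); an assumption of this sketch. */
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Previous default: two stripes (threads) per processor. */
    static int oldDefaultStripes(int cpus) {
        return cpus * 2;
    }

    /** Proposed default: one stripe (thread) per processor. */
    static int newDefaultStripes(int cpus) {
        return cpus;
    }
}
```

On an 8-processor machine this halves the default from 16 stripes to 8 per disruptor.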





[jira] [Assigned] (IGNITE-21908) Alignment of distribution among stripes in disruptor

2024-04-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-21908:
--

Assignee: Vladislav Pyatkov

> Alignment of distribution among stripes in disruptor
> 
>
> Key: IGNITE-21908
> URL: https://issues.apache.org/jira/browse/IGNITE-21908
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>






[jira] [Created] (IGNITE-21908) Alignment of distribution among stripes in disruptor

2024-04-02 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21908:
--

 Summary: Alignment of distribution among stripes in disruptor
 Key: IGNITE-21908
 URL: https://issues.apache.org/jira/browse/IGNITE-21908
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov








[jira] [Updated] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21907:
---
Description: 
h3. Motivation

Right now, we use the stripe count to configure the number of threads. The 
default number of threads is double the number of processors.
{code:java|title=NodeOptions.java}
private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
{code}
The total number of threads is more than enough. Our throughput tests showed 
that a lower number of threads leads to better performance.
h3. Definition of done

The default stripe count is _Utils.cpus()_

The default number of stripes for the RAFT log manager is _4_

  was:
h3. Motivation

Right now, we use a stripe count to configure count of thread. The default 
number of threads is double the number of processors.
{code:java|title=NodeOptions.java}
private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
{code}
h3. Defenition of done

Default stripes couint is _Utils.cpus()_

Default number of stripes for RAFT log manager is _4_


> Change thread count for RAFT disruptors to improve performance
> --
>
> Key: IGNITE-21907
> URL: https://issues.apache.org/jira/browse/IGNITE-21907
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Right now, we use the stripe count to configure the number of threads. The 
> default number of threads is double the number of processors.
> {code:java|title=NodeOptions.java}
> private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
> {code}
> The total number of threads is more than enough. Our throughput tests showed 
> that a lower number of threads leads to better performance.
> h3. Definition of done
> The default stripe count is _Utils.cpus()_
> The default number of stripes for the RAFT log manager is _4_





[jira] [Assigned] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-21907:
--

Assignee: Vladislav Pyatkov

> Change thread count for RAFT disruptors to improve performance
> --
>
> Key: IGNITE-21907
> URL: https://issues.apache.org/jira/browse/IGNITE-21907
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Right now, we use the stripe count to configure the number of threads. The 
> default number of threads is double the number of processors.
> {code:java|title=NodeOptions.java}
> private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
> {code}
> h3. Definition of done
> The default stripe count is _Utils.cpus()_
> The default number of stripes for the RAFT log manager is _4_





[jira] [Updated] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21907:
---
Description: 
h3. Motivation

Right now, we use the stripe count to configure the number of threads. The 
default number of threads is double the number of processors.
{code:java|title=NodeOptions.java}
private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
{code}
h3. Definition of done

The default stripe count is _Utils.cpus()_

The default number of stripes for the RAFT log manager is _4_

  was:
h3. Motivation

Right now, we use a stripe count to configure count of thread. The default 
number of threads is the same as the number of processors.
{code:java|title=NodeOptions.java}
private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
{code}
 


> Change thread count for RAFT disruptors to improve performance
> --
>
> Key: IGNITE-21907
> URL: https://issues.apache.org/jira/browse/IGNITE-21907
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Right now, we use the stripe count to configure the number of threads. The 
> default number of threads is double the number of processors.
> {code:java|title=NodeOptions.java}
> private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
> {code}
> h3. Definition of done
> The default stripe count is _Utils.cpus()_
> The default number of stripes for the RAFT log manager is _4_





[jira] [Updated] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21907:
---
Description: 
h3. Motivation

Right now, we use the stripe count to configure the number of threads. The 
default number of threads is the same as the number of processors.
{code:java|title=NodeOptions.java}
private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
{code}
 

> Change thread count for RAFT disruptors to improve performance
> --
>
> Key: IGNITE-21907
> URL: https://issues.apache.org/jira/browse/IGNITE-21907
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Right now, we use the stripe count to configure the number of threads. The 
> default number of threads is the same as the number of processors.
> {code:java|title=NodeOptions.java}
> private static final int DEFAULT_STRIPES = Utils.cpus() * 2;
> {code}
>  





[jira] [Created] (IGNITE-21907) Change thread count for RAFT disruptors to improve performance

2024-04-02 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21907:
--

 Summary: Change thread count for RAFT disruptors to improve 
performance
 Key: IGNITE-21907
 URL: https://issues.apache.org/jira/browse/IGNITE-21907
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov








[jira] [Commented] (IGNITE-21799) A transaction rollback fails with assertion error

2024-04-01 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832764#comment-17832764
 ] 

Vladislav Pyatkov commented on IGNITE-21799:


This issue is not reproducible.

[~amashenkov] Probably, it is already fixed. Could you please take a look?

> A transaction rollback fails with assertion error
> -
>
> Key: IGNITE-21799
> URL: https://issues.apache.org/jira/browse/IGNITE-21799
> Project: Ignite
>  Issue Type: Bug
>Reporter: Andrey Mashenkov
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>
> A transaction rollback fails with 
> {noformat}
> java.lang.AssertionError: Thread does not have allowed operations
> {noformat}
> A simple reproducer is below.
> You can use `ItCommonTest` to reproduce it.
> {code:java}
> @Test
> public void rollbackAsync() {
>     Ignite node = CLUSTER.aliveNode();
>     sql("CREATE TABLE IF NOT EXISTS TEST(ID INT PRIMARY KEY, VAL0 INT)");
>     KeyValueView<Tuple, Tuple> view1 = node.tables().table("TEST").keyValueView();
>     AtomicReference<CompletableFuture<Void>> rollbackFut = new AtomicReference<>();
>
>     Transaction tx = node.transactions().begin();
>     view1.getAsync(tx, makeKey(101))
>             .handle((unused, err) -> {
>                 rollbackFut.set(tx.rollbackAsync()); // unconditional roll back
>                 return null;
>             }).join();
>     rollbackFut.get().join(); // <- FAILS
> }
> {code}





[jira] [Closed] (IGNITE-21775) Lease grant message does not handle

2024-03-29 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov closed IGNITE-21775.
--

> Lease grant message does not handle
> ---
>
> Key: IGNITE-21775
> URL: https://issues.apache.org/jira/browse/IGNITE-21775
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_23864.log.zip
>
>
> h3. Motivation
> The lease grant message is sent to all primary replicas. After the message is 
> applied, the replica prints (the example is for replication group 8_part_9):
> {noformat}
> [2024-03-17T00:13:16,472][INFO 
> ][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=8_part_9, force=false
> [2024-03-17T00:13:16,472][INFO 
> ][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Waiting for actual 
> storage state, group=8_part_9
> [2024-03-17T00:13:16,473][INFO 
> ][%isdst_n_1%JRaft-AppendEntries-Processor-8][ReplicaManager] Lease accepted 
> [group=8_part_9, leaseStartTime=HybridTimestamp [physical=2024-03-17 
> 00:13:16:471 +, logical=2, composite=112108135807123458]].
> {noformat}
> But the message is not present for replication group 8_part_5:
> {noformat}
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
> IGN-PLACEMENTDRIVER-1 TraceId:bd7944eb-5de7-401c-b721-4f6373de2b7d Failed to 
> get the primary replica [tablePartitionId=8_part_5]
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
> at 
> app//org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:410)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
> at 
> java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> h3. Definition of done
> The lease grant message is handled for all replication groups.
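
The check described in the Motivation (a "Lease accepted" line exists for 8_part_9 but not for 8_part_5) could be automated roughly as follows. This is only a sketch: the class and method names are hypothetical and not part of Ignite, and it assumes each log entry sits on a single line rather than being wrapped as in this email.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not Ignite code): reports groups that never logged "Lease accepted". */
final class LeaseLogCheck {
    // Mirrors the ReplicaManager line quoted above: "Lease accepted [group=8_part_9, ...".
    private static final Pattern ACCEPTED =
            Pattern.compile("Lease accepted \\[group=([^,\\]\\s]+)");

    /** Returns the expected groups for which no "Lease accepted" line was found, in input order. */
    static Set<String> groupsWithoutAcceptedLease(List<String> logLines, Collection<String> expectedGroups) {
        Set<String> accepted = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher m = ACCEPTED.matcher(line);
            if (m.find()) {
                accepted.add(m.group(1));
            }
        }
        Set<String> missing = new LinkedHashSet<>(expectedGroups);
        missing.removeAll(accepted);
        return missing;
    }
}
```

Passing the list of replication groups explicitly means a group that is entirely absent from the log is still reported, which matches the 8_part_5 case in this issue.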





[jira] [Resolved] (IGNITE-21775) Lease grant message does not handle

2024-03-29 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov resolved IGNITE-21775.

Resolution: Duplicate



[jira] [Commented] (IGNITE-21775) Lease grant message does not handle

2024-03-29 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832211#comment-17832211
 ] 

Vladislav Pyatkov commented on IGNITE-21775:


The issue duplicates IGNITE-21566



[jira] [Commented] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-20 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828903#comment-17828903
 ] 

Vladislav Pyatkov commented on IGNITE-21798:


Merged e9df1de17764aae042551920ae2644cbdb33f293

> Add properties to configure number of RAFT threads
> --
>
> Key: IGNITE-21798
> URL: https://issues.apache.org/jira/browse/IGNITE-21798
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> We always have to work to reduce the number of threads in the product. In 
> this case, we create twice as many threads as there are CPUs for each 
> disruptor that is used in the replication layer. This is a lot of threads, 
> and it is redundant in most cases. We have to create a property to regulate 
> the number in specific cases.
> h3. Implementation notes
>  * The default number of threads should stay the same until we have a 
> test that shows a more stable value.
>  * *raft.stripes* is a startup configuration property to configure the number 
> of stripes.
>  * *raft.logStripesCount* is a startup configuration property to configure 
> the number of threads that are used to write WAL.
>  * *raft.logYieldStrategy* is a startup configuration property to use the 
> Yield strategy for WAL writer threads.
> h3. Definition of done
> Prepare all the described system properties
> The task also assumes a throughput test scenario description
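
As an illustration of the implementation notes, the three properties could be resolved along these lines. This is a sketch under assumptions, not the merged code: the class, its fields and the plain `Map` source are hypothetical, the default of twice the CPU count is assumed to apply to both the stripes and the log stripes, and the Yield strategy is assumed to be off unless requested.

```java
import java.util.Map;

/** Hypothetical sketch; names are illustrative and do not come from the Ignite code base. */
final class RaftThreadSettings {
    final int stripes;
    final int logStripesCount;
    final boolean logYieldStrategy;

    private RaftThreadSettings(int stripes, int logStripesCount, boolean logYieldStrategy) {
        this.stripes = stripes;
        this.logStripesCount = logStripesCount;
        this.logYieldStrategy = logYieldStrategy;
    }

    /** Reads the three properties, falling back to the assumed defaults when a key is not set. */
    static RaftThreadSettings resolve(Map<String, String> props, int cpus) {
        int defaultThreads = cpus * 2; // assumed current default: twice the CPU count
        int stripes = positiveInt(props, "raft.stripes", defaultThreads);
        int logStripes = positiveInt(props, "raft.logStripesCount", defaultThreads);
        // Assumption: the Yield strategy is disabled unless explicitly set to "true".
        boolean useYield = Boolean.parseBoolean(props.getOrDefault("raft.logYieldStrategy", "false"));
        return new RaftThreadSettings(stripes, logStripes, useYield);
    }

    private static int positiveInt(Map<String, String> props, String key, int fallback) {
        String raw = props.get(key);
        if (raw == null) {
            return fallback; // "Not set" keeps the default
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(key + " must be positive: " + raw);
        }
        return value;
    }
}
```

Keeping the fallback in one place makes it easy to change the default later, once a test shows a more stable value.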



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-20 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-21798:
--

Assignee: Vladislav Pyatkov



[jira] [Updated] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-20 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21798:
---
Reviewer: Denis Chudov



[jira] [Updated] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-20 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21798:
---
Description: 
h3. Motivation

We always have to work to reduce the number of threads in the product. In this 
case, we create twice as many threads as there are CPUs for each disruptor that 
is used in the replication layer. This is a lot of threads, and it is redundant 
in most cases. We have to create a property to regulate the number in specific 
cases.
h3. Implementation notes
 * The default number of threads should stay the same until we have a test 
that shows a more stable value.
 * *raft.stripes* is a startup configuration property to configure the number 
of stripes.
 * *raft.logStripesCount* is a startup configuration property to configure the 
number of threads that are used to write WAL.
 * *raft.logYieldStrategy* is a startup configuration property to use the Yield 
strategy for WAL writer threads.

h3. Definition of done

Prepare all the described system properties
The task also assumes a throughput test scenario description

  was:
h3. Motivation

We always have to work to reduce the number of threads in the product. In this 
case, we create twice as many threads as there are CPUs for each disruptor that 
is used in the replication layer. This is a lot of threads, and it is redundant 
in most cases. We have to create a property to regulate the number in specific 
cases.
h3. Implementation notes
 * The default number of threads should stay the same until we have a test 
that shows a more stable value.
 * IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
 * IGNITE_RAFT_LOG_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
 * IGNITE_RAFT_LOG_YIELD_STRATEGY is a system property to use the Yield 
strategy for WAL writer threads.

h3. Definition of done

Prepare all the described system properties
The task also assumes a throughput test scenario description




[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-20 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/20/24 7:31 AM:
-

Appropriate values to use in tests:
raft.stripes =
 * Not set;
 * CPUs;
 * CPUs / 2.

raft.logStripesCount =
 * Not set;
 * 1;
 * 2;
 * 4.

raft.logYieldStrategy =
 * Not set;
 * true.
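
The value lists above multiply out to 3 x 4 x 2 = 24 configurations. A minimal sketch of enumerating them is shown here; the class name is hypothetical, "Not set" is modelled by omitting the key, and clamping CPUs / 2 to at least 1 is an added assumption for single-CPU hosts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Ignite code) that enumerates the test configurations listed above. */
final class RaftTestMatrix {
    /** Builds every combination; a null candidate means "Not set", so the key is left out. */
    static List<Map<String, String>> combinations(int cpus) {
        // Assumption: CPUs / 2 is clamped to 1 so that the value stays valid on one CPU.
        String[] stripes = {null, String.valueOf(cpus), String.valueOf(Math.max(1, cpus / 2))};
        String[] logStripes = {null, "1", "2", "4"};
        String[] logYield = {null, "true"};

        List<Map<String, String>> result = new ArrayList<>();
        for (String s : stripes) {
            for (String l : logStripes) {
                for (String y : logYield) {
                    Map<String, String> cfg = new LinkedHashMap<>();
                    if (s != null) cfg.put("raft.stripes", s);
                    if (l != null) cfg.put("raft.logStripesCount", l);
                    if (y != null) cfg.put("raft.logYieldStrategy", y);
                    result.add(cfg);
                }
            }
        }
        return result;
    }
}
```

The first entry is the empty map, that is, the run with all defaults, which gives the baseline for comparing throughput.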


was (Author: v.pyatkov):
Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

IGNITE_RAFT_LOG_STRIPES =
 * Not set;
 * 1;
 * 2;
 * 4.

IGNITE_RAFT_LOG_YIELD_STRATEGY =
 * Not set;
 * true.



[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/19/24 2:14 PM:
-

Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

IGNITE_RAFT_LOG_STRIPES =
 * Not set;
 * 1;
 * 2;
 * 4.

IGNITE_RAFT_LOG_YIELD_STRATEGY =
 * Not set;
 * true.


was (Author: v.pyatkov):
Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * 1;
 * 2;
 * 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ =
 * Not set;
 * true.



[jira] [Updated] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21798:
---
Description: 
h3. Motivation

We always have to work to reduce the number of threads in the product. In this 
case, we create twice as many threads as there are CPUs for each disruptor that 
is used in the replication layer. This is a lot of threads, and it is redundant 
in most cases. We have to create a property to regulate the number in specific 
cases.
h3. Implementation notes
 * The default number of threads should stay the same until we have a test 
that shows a more stable value.
 * IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
 * IGNITE_RAFT_LOG_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
 * IGNITE_RAFT_LOG_YIELD_STRATEGY is a system property to use the Yield 
strategy for WAL writer threads.

h3. Definition of done

Prepare all the described system properties
The task also assumes a throughput test scenario description

  was:
h3. Motivation
We always have to work to reduce the number of threads in the product. In this 
case, we create twice as many threads as there are CPUs for each disruptor that 
is used in the replication layer. This is a lot of threads, and it is redundant 
in most cases. We have to create a property to regulate the number in specific 
cases.

h3. Implementation notes
* The default number of threads should stay the same until we have a test 
that shows a more stable value.
* IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
* IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
* IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield strategy 
for WAL writer threads.

h3. Definition of done
Prepare all the described system properties
The task also assumes a throughput test scenario description




[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/19/24 10:31 AM:
--

Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * 1;
 * 2;
 * 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ =
 * Not set;
 * true.


was (Author: v.pyatkov):
Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * 1;
 * 2;
 * 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ =
 * Not set;
 * true



[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/19/24 10:29 AM:
--

Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ = 
* CPUs * 2; 
* CPUs; 
* CPUs / 2.

_IGNITE_RAFT_STRIPES_ = 
* Not set; 
* 1; 
* 2; 
* 4.
_IGNITE_RAFT_WAL_YIELD_STRATEGY_ = 
* Not set; 
* true


was (Author: v.pyatkov):
Appropriate values to use in tests:
IGNITE_RAFT_STRIPES = CPUs * 2; CPUs; CPUs / 2
IGNITE_RAFT_STRIPES = Not set; 1; 2; 4
IGNITE_RAFT_WAL_YIELD_STRATEGY = Not set; true



[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/19/24 10:29 AM:
--

Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * CPUs;
 * CPUs / 2.

_IGNITE_RAFT_STRIPES_ =
 * Not set;
 * 1;
 * 2;
 * 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ =
 * Not set;
 * true


was (Author: v.pyatkov):
Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ = 
* CPUs * 2; 
* CPUs; 
* CPUs / 2.

_IGNITE_RAFT_STRIPES_ = 
* Not set; 
* 1; 
* 2; 
* 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ = 
* Not set; 
* true



[jira] [Comment Edited] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov edited comment on IGNITE-21798 at 3/19/24 10:29 AM:
--

Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ = 
* CPUs * 2; 
* CPUs; 
* CPUs / 2.

_IGNITE_RAFT_STRIPES_ = 
* Not set; 
* 1; 
* 2; 
* 4.

_IGNITE_RAFT_WAL_YIELD_STRATEGY_ = 
* Not set; 
* true


was (Author: v.pyatkov):
Appropriate values to use in tests:
_IGNITE_RAFT_STRIPES_ = 
* CPUs * 2; 
* CPUs; 
* CPUs / 2.

_IGNITE_RAFT_STRIPES_ = 
* Not set; 
* 1; 
* 2; 
* 4.
_IGNITE_RAFT_WAL_YIELD_STRATEGY_ = 
* Not set; 
* true

> Add properties to configure number of RAFT threads
> --
>
> Key: IGNITE-21798
> URL: https://issues.apache.org/jira/browse/IGNITE-21798
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> We should keep reducing the number of threads in the product. Currently, 
> each disruptor used in the replication layer gets twice as many threads as 
> there are CPUs. This is a lot of threads, and they are redundant in most 
> cases. We need properties to tune these numbers for a specific case.
> h3. Implementation notes
> * The default number of threads should stay the same until we have a 
> test that shows a more stable value.
> * IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
> * IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
> threads that are used to write WAL.
> * IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield 
> strategy for WAL writer threads.
> h3. Definition of done
> Prepare all the described system properties
> The task also assumes a throughput test scenario description





[jira] [Commented] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828261#comment-17828261
 ] 

Vladislav Pyatkov commented on IGNITE-21798:


Appropriate values to use in tests:
IGNITE_RAFT_STRIPES = CPUs * 2; CPUs; CPUs / 2
IGNITE_RAFT_WAL_STRIPES = Not set; 1; 2; 4
IGNITE_RAFT_WAL_YIELD_STRATEGY = Not set; true

> Add properties to configure number of RAFT threads
> --
>
> Key: IGNITE-21798
> URL: https://issues.apache.org/jira/browse/IGNITE-21798
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> We should keep reducing the number of threads in the product. Currently, 
> each disruptor used in the replication layer gets twice as many threads as 
> there are CPUs. This is a lot of threads, and they are redundant in most 
> cases. We need properties to tune these numbers for a specific case.
> h3. Implementation notes
> * The default number of threads should stay the same until we have a 
> test that shows a more stable value.
> * IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
> * IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
> threads that are used to write WAL.
> * IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield 
> strategy for WAL writer threads.
> h3. Definition of done
> Prepare all the described system properties
> The task also assumes a throughput test scenario description





[jira] [Updated] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21798:
---
Description: 
h3. Motivation
We should keep reducing the number of threads in the product. Currently, each 
disruptor used in the replication layer gets twice as many threads as there 
are CPUs. This is a lot of threads, and they are redundant in most cases. We 
need properties to tune these numbers for a specific case.

h3. Implementation notes
* The default number of threads should stay the same until we have a test 
that shows a more stable value.
* IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
* IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
* IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield strategy 
for WAL writer threads.

h3. Definition of done
Prepare all the described system properties
The task also assumes a throughput test scenario description

  was:
h3. Motivation
We always have to work to reduce the number of threads in the product. In this 
case, we create a double CPU for each disruptor that is used in the replication 
layer. This is a lot of threads, and it is redundant in most cases. We have to 
create a property to regulate the number in a specific case.

h3. Implementation notes
* The default number of threads should stay the same until we don't have a test 
that shows a more stable value.
* IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
* IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
* IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield strategy 
for WAL writer threads.



> Add properties to configure number of RAFT threads
> --
>
> Key: IGNITE-21798
> URL: https://issues.apache.org/jira/browse/IGNITE-21798
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> We should keep reducing the number of threads in the product. Currently, 
> each disruptor used in the replication layer gets twice as many threads as 
> there are CPUs. This is a lot of threads, and they are redundant in most 
> cases. We need properties to tune these numbers for a specific case.
> h3. Implementation notes
> * The default number of threads should stay the same until we have a 
> test that shows a more stable value.
> * IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
> * IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
> threads that are used to write WAL.
> * IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield 
> strategy for WAL writer threads.
> h3. Definition of done
> Prepare all the described system properties
> The task also assumes a throughput test scenario description





[jira] [Created] (IGNITE-21798) Add properties to configure number of RAFT threads

2024-03-19 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21798:
--

 Summary: Add properties to configure number of RAFT threads
 Key: IGNITE-21798
 URL: https://issues.apache.org/jira/browse/IGNITE-21798
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
We should keep reducing the number of threads in the product. Currently, each 
disruptor used in the replication layer gets twice as many threads as there 
are CPUs. This is a lot of threads, and they are redundant in most cases. We 
need properties to tune these numbers for a specific case.

h3. Implementation notes
* The default number of threads should stay the same until we have a test 
that shows a more stable value.
* IGNITE_RAFT_STRIPES is a system property to configure the number of stripes.
* IGNITE_RAFT_WAL_STRIPES is a system property to configure the number of 
threads that are used to write WAL.
* IGNITE_RAFT_WAL_YIELD_STRATEGY is a system property to use the Yield strategy 
for WAL writer threads.
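
The properties listed above could be read along the following lines. This is only a minimal sketch: the class {{RaftThreadProperties}} and its method names are hypothetical and not taken from the Ignite code base, the stripes default of CPUs * 2 is an assumption based on the motivation above, and the default for WAL stripes is passed in by the caller because the issue does not state it.

```java
/** Hypothetical helper for illustration; not part of the Ignite code base. */
final class RaftThreadProperties {
    static final String STRIPES = "IGNITE_RAFT_STRIPES";
    static final String WAL_STRIPES = "IGNITE_RAFT_WAL_STRIPES";
    static final String WAL_YIELD_STRATEGY = "IGNITE_RAFT_WAL_YIELD_STRATEGY";

    private RaftThreadProperties() {
    }

    /** Returns the positive integer stored in the property, or the default when it is unset or invalid. */
    static int intProperty(String name, int defaultValue) {
        String raw = System.getProperty(name);
        if (raw == null) {
            return defaultValue;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : defaultValue;
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    /** Number of stripes; the default (assumed here) keeps the current behaviour of CPUs * 2. */
    static int stripes() {
        return intProperty(STRIPES, Runtime.getRuntime().availableProcessors() * 2);
    }

    /** Number of WAL writer threads; the caller supplies whatever the current default is. */
    static int walStripes(int currentDefault) {
        return intProperty(WAL_STRIPES, currentDefault);
    }

    /** True only when the property is set to "true" (case-insensitive). */
    static boolean walYieldStrategy() {
        return Boolean.getBoolean(WAL_YIELD_STRATEGY);
    }
}
```

Falling back to the default on an invalid value keeps a mistyped property from changing the thread count, which matches the requirement that defaults stay the same.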






[jira] [Updated] (IGNITE-21775) Lease grant message does not handle

2024-03-18 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21775:
---
Description: 
h3. Motivation
The lease grant message is sent to all primary replicas. After the message is 
applied, the replica prints the following (example for replication group 8_part_9):
{noformat}
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Received 
LeaseGrantedMessage for replica belonging to group=8_part_9, force=false
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Waiting for actual 
storage state, group=8_part_9
[2024-03-17T00:13:16,473][INFO 
][%isdst_n_1%JRaft-AppendEntries-Processor-8][ReplicaManager] Lease accepted 
[group=8_part_9, leaseStartTime=HybridTimestamp [physical=2024-03-17 
00:13:16:471 +, logical=2, composite=112108135807123458]].
{noformat}
But the message is not present for replication group 8_part_5:
{noformat}
Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
IGN-PLACEMENTDRIVER-1 TraceId:bd7944eb-5de7-401c-b721-4f6373de2b7d Failed to 
get the primary replica [tablePartitionId=8_part_5]
at 
app//org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:410)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
at 
java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at 
java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base@11.0.17/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
... 3 more
{noformat}

h3. Definition of done
The lease grant message is handled for all replication groups.

  was:
h3. Motivation
The lease grant message is sent to all primary replicas. After the message is 
applied the replica prints (the example for repluication grout 8_part_9):
{noformat}
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Received 
LeaseGrantedMessage for replica belonging to group=8_part_9, force=false
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Waiting for actual 
storage state, group=8_part_9
[2024-03-17T00:13:16,473][INFO 
][%isdst_n_1%JRaft-AppendEntries-Processor-8][ReplicaManager] Lease accepted 
[group=8_part_9, leaseStartTime=HybridTimestamp [physical=2024-03-17 
00:13:16:471 +, logical=2, composite=112108135807123458]].
{noformat}
But the messages is not prented for repluication group 8_part_5.
{noformat}
Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
IGN-PLACEMENTDRIVER-1 TraceId:bd7944eb-5de7-401c-b721-4f6373de2b7d Failed to 
get the primary replica [tablePartitionId=8_part_5]
at 
app//org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:410)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
at 
java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at 
java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 

[jira] [Updated] (IGNITE-21775) Lease grant message does not handle

2024-03-18 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21775:
---
Attachment: _Integration_Tests_Module_Runner_23864.log.zip

> Lease grant message does not handle
> ---
>
> Key: IGNITE-21775
> URL: https://issues.apache.org/jira/browse/IGNITE-21775
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Runner_23864.log.zip
>
>
> h3. Motivation
> The lease grant message is sent to all primary replicas. After the message is 
> applied, the replica prints the following (example for replication group 8_part_9):
> {noformat}
> [2024-03-17T00:13:16,472][INFO 
> ][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Received 
> LeaseGrantedMessage for replica belonging to group=8_part_9, force=false
> [2024-03-17T00:13:16,472][INFO 
> ][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Waiting for actual 
> storage state, group=8_part_9
> [2024-03-17T00:13:16,473][INFO 
> ][%isdst_n_1%JRaft-AppendEntries-Processor-8][ReplicaManager] Lease accepted 
> [group=8_part_9, leaseStartTime=HybridTimestamp [physical=2024-03-17 
> 00:13:16:471 +, logical=2, composite=112108135807123458]].
> {noformat}
> But the message is not present for replication group 8_part_5:
> {noformat}
> Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
> IGN-PLACEMENTDRIVER-1 TraceId:bd7944eb-5de7-401c-b721-4f6373de2b7d Failed to 
> get the primary replica [tablePartitionId=8_part_5]
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
> at 
> app//org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
> at 
> app//org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:410)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
> at 
> java.base@11.0.17/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
> at 
> java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at 
> java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base@11.0.17/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> h3. Definition of done
> The lease grant message is handled for all replication groups.





[jira] [Created] (IGNITE-21775) Lease grant message does not handle

2024-03-18 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21775:
--

 Summary: Lease grant message does not handle
 Key: IGNITE-21775
 URL: https://issues.apache.org/jira/browse/IGNITE-21775
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3. Motivation
The lease grant message is sent to all primary replicas. After the message is 
applied, the replica prints the following (example for replication group 8_part_9):
{noformat}
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Received 
LeaseGrantedMessage for replica belonging to group=8_part_9, force=false
[2024-03-17T00:13:16,472][INFO 
][%isdst_n_1%MessagingService-inbound-0-0][ReplicaManager] Waiting for actual 
storage state, group=8_part_9
[2024-03-17T00:13:16,473][INFO 
][%isdst_n_1%JRaft-AppendEntries-Processor-8][ReplicaManager] Lease accepted 
[group=8_part_9, leaseStartTime=HybridTimestamp [physical=2024-03-17 
00:13:16:471 +, logical=2, composite=112108135807123458]].
{noformat}
But the message is not printed for replication group 8_part_5:
{noformat}
Caused by: org.apache.ignite.internal.lang.IgniteInternalException: 
IGN-PLACEMENTDRIVER-1 TraceId:bd7944eb-5de7-401c-b721-4f6373de2b7d Failed to 
get the primary replica [tablePartitionId=8_part_5]
at 
app//org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
at 
app//org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
at 
app//org.apache.ignite.internal.sql.engine.SqlQueryProcessor.lambda$primaryReplicas$2(SqlQueryProcessor.java:410)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at 
java.base@11.0.17/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
at 
java.base@11.0.17/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at 
java.base@11.0.17/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base@11.0.17/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
... 3 more
{noformat}





[jira] [Commented] (IGNITE-21634) NPE in HeapLockManager

2024-03-15 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827493#comment-17827493
 ] 

Vladislav Pyatkov commented on IGNITE-21634:


Merged 52af92491327c9596d705b224fb99b18f20d2f2d

> NPE in HeapLockManager
> --
>
> Key: IGNITE-21634
> URL: https://issues.apache.org/jira/browse/IGNITE-21634
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Chudov
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> Caused by: java.lang.NullPointerException
>     at org.apache.ignite.internal.tx.impl.HeapLockManager.lambda$lockState$4(HeapLockManager.java:297) ~[main/:?]
>     at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) ~[?:?]
>     at org.apache.ignite.internal.tx.impl.HeapLockManager.lockState(HeapLockManager.java:291) ~[main/:?]
>     at org.apache.ignite.internal.tx.impl.HeapLockManager.acquire(HeapLockManager.java:172) ~[main/:?]
>     at org.apache.ignite.internal.table.distributed.SortedIndexLocker.lambda$locksForInsert$4(SortedIndexLocker.java:169) ~[main/:?]
>     at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106) ~[?:?]
>     ... 29 more
> {code}
> on the line {{v.markedForRemove = false;}}
> {code:java}
> private LockState lockState(LockKey key) {
>     int h = spread(key.hashCode());
>     int index = h & (slots.length - 1);
>     LockState[] res = new LockState[1];
>     locks.compute(key, (k, v) -> {
>         if (v == null) {
>             if (empty.isEmpty()) {
>                 res[0] = slots[index];
>             } else {
>                 v = empty.poll();
>                 v.markedForRemove = false;
>                 v.key = k;
>                 res[0] = v;
>             }
>         } else {
>             res[0] = v;
>         }
>         return v;
>     });
>     return res[0];
> }
> {code}
> The problem can be reproduced on main (71b4fb34) with the following test 
> (fsync should probably be turned off):
> {code}
> @Test
> void test() {
>     sql("CREATE TABLE test("
>             + "c1 INT PRIMARY KEY, c2 INT, c3 INT, c4 INT, c5 INT,"
>             + "c6 INT, c7 INT, c8 INT, c9 INT, c10 INT)"
>     );
>     for (int i = 2; i <= 10; i++) {
>         sql(format("CREATE INDEX c{}_idx ON test (c{})", i, i));
>     }
>     sql("INSERT INTO test"
>             + " SELECT x as c1, x as c2, x as c3, x as c4, x as c5, "
>             + "x as c6, x as c7, x as c8, x as c9, x as c10"
>             + "   FROM TABLE (system_range(1, 10))");
> }
> {code}
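
The quoted {{lockState}} method calls {{empty.isEmpty()}} and then {{empty.poll()}}. If {{empty}} is a queue shared between threads (an assumption; its declaration is not shown in the issue), another thread can drain it between the two calls, and {{poll()}} then returns {{null}}. The sketch below shows one way to avoid this check-then-act pattern by testing the result of {{poll()}} directly. The class and method names are hypothetical, and this is not necessarily the fix that was merged.

```java
import java.util.Queue;

/** Hypothetical illustration; the names do not come from HeapLockManager. */
final class PollWithoutCheck {
    /** Simplified stand-in for the LockState class mentioned in the issue. */
    static final class State {
        boolean markedForRemove = true;
        Object key;
    }

    private PollWithoutCheck() {
    }

    /**
     * Takes a pooled state if one is available, otherwise falls back to the shared slot.
     * The result of poll() is tested directly, so no separate isEmpty() check is needed.
     */
    static State acquire(Queue<State> empty, State fallbackSlot, Object key) {
        State v = empty.poll();
        if (v == null) {
            // The pool was empty (possibly drained by another thread): use the slot.
            return fallbackSlot;
        }
        v.markedForRemove = false;
        v.key = key;
        return v;
    }
}
```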





[jira] [Commented] (IGNITE-21348) Trigger the lease negotiation retry in case when the lease candidate is no more contained in assignments

2024-03-15 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827492#comment-17827492
 ] 

Vladislav Pyatkov commented on IGNITE-21348:


Merged a2cdcda807d509739ae97a1076396c99023aa7a7

> Trigger the lease negotiation retry in case when the lease candidate is no 
> more contained in assignments
> 
>
> Key: IGNITE-21348
> URL: https://issues.apache.org/jira/browse/IGNITE-21348
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Chudov
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> On receiving the "lease granted" message, the candidate replica tries to 
> catch up with the actual storage state; to do that, it makes a read index 
> request. But when this candidate is no longer a member of the assignments 
> (and the replication group), this request fails and is retried until the 
> lease negotiation interval expires. This makes no sense because such retries 
> will not succeed, and the current candidate is no longer a good candidate: 
> although the leaseholder may be outside the replication group, preferably 
> it should be a member of it, and should be its leader.
> Assignment changes in which some of the current candidates and leaseholders 
> are no longer included in the new assignment set should be detected on the 
> placement driver active actor, and the current lease should be revoked (if 
> negotiation is in progress) or not prolonged. The new negotiation will be 
> triggered automatically by the lease updater.
> *Implementation notes*
> This detection of assignment changes should be done on the placement driver 
> side, because assignment change events can be processed on different nodes 
> at different times, and there is already an assignments tracker as a part 
> of the placement driver.
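
The decision described above can be sketched roughly as follows. All types and names here ({{LeaseAssignmentCheck}}, {{Lease}}, {{Action}}) are hypothetical simplifications rather than the placement driver API; nodes are identified by plain strings for brevity.

```java
import java.util.Set;

/** Hypothetical illustration of the rule; not the placement driver implementation. */
final class LeaseAssignmentCheck {
    enum Action { KEEP, REVOKE, DO_NOT_PROLONG }

    /** Minimal lease view: who holds (or negotiates) the lease and whether it was accepted. */
    static final class Lease {
        final String leaseholder;
        final boolean accepted;

        Lease(String leaseholder, boolean accepted) {
            this.leaseholder = leaseholder;
            this.accepted = accepted;
        }
    }

    private LeaseAssignmentCheck() {
    }

    /** Decides what to do with a lease once a new assignment set is known. */
    static Action onAssignmentsChanged(Lease lease, Set<String> newAssignments) {
        if (newAssignments.contains(lease.leaseholder)) {
            return Action.KEEP;
        }
        // Not accepted yet means negotiation is still in progress: revoke right away.
        // An accepted lease is simply left to expire without prolongation.
        return lease.accepted ? Action.DO_NOT_PROLONG : Action.REVOKE;
    }
}
```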
>  





[jira] [Commented] (IGNITE-21641) OOM in PartitionReplicaListenerTest

2024-03-12 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825747#comment-17825747
 ] 

Vladislav Pyatkov commented on IGNITE-21641:


Merged 02f5682181e82d87c1fddc157edbb6475ebf818b

> OOM in PartitionReplicaListenerTest
> ---
>
> Key: IGNITE-21641
> URL: https://issues.apache.org/jira/browse/IGNITE-21641
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Denis Chudov
>Priority: Major
>  Labels: ignite-3
> Attachments: image-2024-03-01-12-22-32-053.png, 
> image-2024-03-01-20-36-08-577.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The TC run failed with an OOM.
> The problem occurred after the 
> PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite run:
> {noformat}
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true
> [2024-03-01T05:12:50,629][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469368142283
> [2024-03-01T05:12:50,638][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Stopping test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [14] true, true, false, true, cost: 8ms.
> [05:12:50] :   [testReadOnlyGetAfterRowRewrite(boolean, 
> boolean, boolean, boolean)] 
> org.apache.ignite.internal.table.distributed.replication.PartitionReplicaListenerTest.testReadOnlyGetAfterRowRewrite([15]
>  true, true, true, false) (10m:22s)
> [05:12:50] :   [:ignite-table:test] PartitionReplicaListenerTest > 
> testReadOnlyGetAfterRowRewrite(boolean, boolean, boolean, boolean) > [15] 
> true, true, true, false STANDARD_OUT
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> >>> Starting test: 
> PartitionReplicaListenerTest#testReadOnlyGetAfterRowRewrite, displayName: 
> [15] true, true, true, false
> [05:12:50] :   [:ignite-table:test] 
> [2024-03-01T05:12:50,648][INFO ][Test worker][PartitionReplicaListenerTest] 
> workDir: 
> build/work/PartitionReplicaListenerTest/testReadOnlyGetAfterRowRewrite_33496469386328241
> [05:18:42] :   [:ignite-table:test] java.lang.OutOfMemoryError: Java 
> heap space
> [05:18:42] :   [:ignite-table:test] Dumping heap to 
> java_pid2349600.hprof ...
> [05:19:06] :   [:ignite-table:test] Heap dump file created 
> [3645526743 bytes in 24.038 secs]
> {noformat}
> https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7898564?hideTestsFromDependencies=false=false+Inspection=true=true=true=false
> After analysing the heap dump, it appears that the cause of the OOM is a 
> problem with Mockito.
>  !image-2024-03-01-12-22-32-053.png! 
> We need to investigate the cause of the problem with Mockito.





[jira] [Assigned] (IGNITE-21381) ActiveActorTest#testChangeLeaderForce has problems with resource cleanup

2024-03-11 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reassigned IGNITE-21381:
--

Assignee: Vladislav Pyatkov

> ActiveActorTest#testChangeLeaderForce has problems with resource cleanup
> 
>
> Key: IGNITE-21381
> URL: https://issues.apache.org/jira/browse/IGNITE-21381
> Project: Ignite
>  Issue Type: Bug
>Reporter: Mirza Aliev
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{ActiveActorTest#testChangeLeaderForce}} has become flaky on TC with:
> {noformat}
> [05:19:12]F:   
> [org.apache.ignite.internal.placementdriver.ActiveActorTest.testChangeLeaderForce(TestInfo)]
>  org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>   at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
>   at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:180)
>   at 
> app//org.apache.ignite.internal.placementdriver.ActiveActorTest.testChangeLeaderForce(ActiveActorTest.java:370)
> {noformat}
> From the log we can see that the leadership transfer, which was supposed to 
> succeed, does not happen. The behaviour is the following:
> 1) The current leader is {{Leader: ClusterNodeImpl 
> [id=e99210fb-f872-4e08-a99c-53f9512da20e, name=aat_tclf_1235}}
> 2) We want to transfer leadership to {{Peer to transfer leader: Peer 
> [consistentId=aat_tclf_1234, idx=0]}}
> 3) The transfer process is started
> 4) We receive a warning about an error during {{GetLeaderRequestImpl}}:
> {noformat}
> [2024-01-29T05:19:08,855][WARN 
> ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error 
> during the request occurred (will be retried on the randomly selected node) 
> [request=GetLeaderRequestImpl [groupId=TestReplicationGroup, 
> peerId=aat_tclf_1235], peer=Peer [consistentId=aat_tclf_1235, idx=0], 
> newPeer=Peer [consistentId=aat_tclf_1234, idx=0]].
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException
>   at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019)
>  ~[?:?]
>   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>  [?:?]
>   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>  [?:?]
>   at 
> java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
>  [?:?]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
>   ... 7 more
> {noformat}
> 5) After that we see that node {{aat_tclf_1236}} sends an invalid 
> {{RequestVoteResponse}} because it thinks that it is the leader:
> {noformat}
> [2024-01-29T05:19:11,370][WARN 
> ][%aat_tclf_1234%JRaft-Response-Processor-15][NodeImpl] Node 
>  received invalid RequestVoteResponse 
> from aat_tclf_1236, state not in STATE_CANDIDATE but STATE_LEADER.
> {noformat}
>  
> Tests {{ActiveActorTest#testChangeLeaderForce}} and 
> {{TopologyAwareRaftGroupServiceTest#testChangeLeaderForce}} were muted.
> There are also other problems with these tests: they clean up resources 
> incorrectly in case of failure. The cluster is stopped in the test itself, 
> meaning that if an assertion fails, the rest of the test is not executed, 
> hence the cluster is not stopped.
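
One way to make the cleanup independent of the assertions is to stop the cluster in a block that always runs. The following is a minimal sketch with a hypothetical {{TestCluster}} stand-in; the real tests use different classes.

```java
/** Hypothetical illustration; TestCluster stands in for the real test cluster. */
final class CleanupOnFailure {
    /** Minimal closeable cluster that only records whether it was stopped. */
    static final class TestCluster implements AutoCloseable {
        boolean stopped;

        @Override
        public void close() {
            stopped = true;
        }
    }

    private CleanupOnFailure() {
    }

    /** Runs the test body and stops the cluster even if the body throws. */
    static void runWithCluster(TestCluster cluster, Runnable testBody) {
        try (TestCluster ignored = cluster) {
            testBody.run();
        }
    }
}
```

In a JUnit 5 test, the same effect is usually achieved by stopping the cluster in an {{@AfterEach}} method instead of at the end of the test body.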
> The next problem is that if we run this test several times, even if the 
> runs pass successfully, at some point a new test cannot be run 
> because of 
> {noformat}
>  java.lang.OutOfMemoryError: unable 

[jira] [Commented] (IGNITE-21291) Scan cursors do not close when an RO transaction is finalized

2024-03-05 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823639#comment-17823639
 ] 

Vladislav Pyatkov commented on IGNITE-21291:


Merged 7bb9ecf6c53cf94b6ea105872831538b5912e455

> Scan cursors do not close when an RO transaction is finalized
> -
>
> Key: IGNITE-21291
> URL: https://issues.apache.org/jira/browse/IGNITE-21291
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> h3. Motivation
> The cursors are opened on the server side and take up extra memory. When an 
> RW transaction is committed, we send the transaction cleanup messages to all 
> transaction participants. But the state of an RO transaction is stored 
> locally, so we do not send any messages to the transaction participants 
> (where the cursors were opened).
> h3. Definition of done
> Cursors that were created by an RO transaction should be closed when the 
> transaction is no longer in use.
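
On the participant side, closing the cursors could look roughly like the sketch below, assuming the node is somehow told that the RO transaction has finished (how that signal is delivered is outside this sketch). {{TxCursorRegistry}} and its methods are hypothetical names, not Ignite classes.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical illustration; tracks open cursors per transaction id. */
final class TxCursorRegistry {
    private final Map<UUID, List<AutoCloseable>> cursorsByTx = new ConcurrentHashMap<>();

    /** Remembers a cursor opened on behalf of the given transaction. */
    void register(UUID txId, AutoCloseable cursor) {
        cursorsByTx.computeIfAbsent(txId, id -> new CopyOnWriteArrayList<>()).add(cursor);
    }

    /** Closes every cursor of the transaction and returns how many were closed successfully. */
    int closeAll(UUID txId) {
        List<AutoCloseable> cursors = cursorsByTx.remove(txId);
        if (cursors == null) {
            return 0;
        }
        int closed = 0;
        for (AutoCloseable cursor : cursors) {
            try {
                cursor.close();
                closed++;
            } catch (Exception e) {
                // Keep closing the remaining cursors; a real implementation would log this.
            }
        }
        return closed;
    }
}
```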





[jira] [Commented] (IGNITE-20296) Make a network message handler that would be able to accept ClusterNode as a sender

2024-03-01 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822654#comment-17822654
 ] 

Vladislav Pyatkov commented on IGNITE-20296:


Merged 40d14534bc849edbcaa6dc84cf588684226c17e8

> Make a network message handler that would be able to accept ClusterNode as a 
> sender
> ---
>
> Key: IGNITE-20296
> URL: https://issues.apache.org/jira/browse/IGNITE-20296
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3
> Fix For: 3.0.0-beta2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *Motivation*
> With {{MessagingService}} we can send a message using 
> {{ClusterNode}} to specify the recipient. But in {{NetworkMessageHandler}} we 
> can only receive the consistent id of the sender. However, in some 
> cases we need the node id (for example, knowing that there is no node with the 
> given id in the cluster allows us to not send a message there at all, because 
> this node has left the topology at least once since it sent this message 
> and therefore has lost its volatile state).
> Also, the node id should go along with the consistent id, so 
> {{ClusterNode}} as the argument type is preferred.
> *Definition of done*
> The following interface is available:
> {code:java}
> public interface NetworkMessageHandler {
>     /**
>      * Method that gets invoked when a network message is received.
>      *
>      * @param message Message, which was received from the cluster.
>      * @param sender Sender node.
>      * @param correlationId Correlation id. Used to track correspondence between requests and responses.
>      *     Can be {@code null} if the received message is not a request from a
>      *     {@link MessagingService#invoke} method from another node.
>      */
>     void onReceived(NetworkMessage message, ClusterNode sender, @Nullable Long correlationId);
> }
> {code}
> Also, the {{MessagingService}} is able to accept an object implementing this 
> interface as a message handler.
> *Implementation notes*
> The ClusterNode passed as a sender should have a node id which the sender 
> node had at the moment the message was sent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21253) Implement a counter for number of rebalancing tables inside the zone

2024-02-29 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822120#comment-17822120
 ] 

Vladislav Pyatkov commented on IGNITE-21253:


Merged e701d0709f42a3219fdce5e12fd425d0981123e2

> Implement a counter for number of rebalancing tables inside the zone 
> -
>
> Key: IGNITE-21253
> URL: https://issues.apache.org/jira/browse/IGNITE-21253
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Gusakov
>Assignee: Mirza Aliev
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *Motivation*
> According to the comment, we need to switch zone assignments only when all zone 
> tables finish their rebalances.
> To implement this behaviour, we need a metastorage counter of 
> tables, which is decreased on every successful table rebalance.
> *Definition of done*
>  - A counter of zone tables is created on the rebalance start and decreased with 
> every successful table rebalance.
> *Implementation notes*
> - On a pending update, we set the counter in the metastorage (if it does not exist). 
> Other tables can't be added because of the "avoid table creation during the 
> rebalance" ticket.
> - On rebalance done, instead of the current logic, we decrease the counter 
> (with compareAndSet).
> - A separate listener on every node listens to the counter and, when it 
> decreases to 0, does the usual logic with planned->pending, pending->stable, 
> etc.
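
The counter logic from the implementation notes can be sketched as follows. This is only an illustration under stated assumptions: an in-memory {{AtomicInteger}} stands in for the metastorage key, and the names {{ZoneRebalanceCounter}}, {{onTableRebalanceDone}} and {{onAllTablesRebalanced}} are hypothetical, not taken from the Ignite code base.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for the metastorage counter of tables still rebalancing in a zone. */
final class ZoneRebalanceCounter {
    private final AtomicInteger remaining;
    private final Runnable onAllTablesRebalanced;

    ZoneRebalanceCounter(int tablesInZone, Runnable onAllTablesRebalanced) {
        this.remaining = new AtomicInteger(tablesInZone);
        this.onAllTablesRebalanced = onAllTablesRebalanced;
    }

    /**
     * Decrements the counter with a compare-and-set loop and returns the remaining count.
     * The callback runs exactly once, on the transition from 1 to 0.
     */
    int onTableRebalanceDone() {
        while (true) {
            int current = remaining.get();
            if (current == 0) {
                return 0; // Duplicate notification: nothing left to decrement.
            }
            if (remaining.compareAndSet(current, current - 1)) {
                if (current == 1) {
                    // Here the real listener would do planned->pending, pending->stable, etc.
                    onAllTablesRebalanced.run();
                }
                return current - 1;
            }
        }
    }
}
```

The compare-and-set loop matters because several tables may finish their rebalance concurrently; only the caller that moves the counter from 1 to 0 triggers the assignment switch.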
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21623) Test fails ItTxDistributedTestThreeNodesThreeReplicas.testBatchSinglePartitionGet

2024-02-29 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822006#comment-17822006
 ] 

Vladislav Pyatkov commented on IGNITE-21623:


Here is the test history
https://ci.ignite.apache.org/test/530545554222388187?currentProjectId=ApacheIgnite3xGradle_Test_IntegrationTests=true

> Test fails 
> ItTxDistributedTestThreeNodesThreeReplicas.testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Table_20271.log (1).zip
>
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (IGNITE-21541) Avoid partition-operations pool when it not lead to starvation

2024-02-29 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov resolved IGNITE-21541.

Resolution: Duplicate

> Avoid partition-operations pool when it not lead to starvation
> --
>
> Key: IGNITE-21541
> URL: https://issues.apache.org/jira/browse/IGNITE-21541
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Changing pools and the related parking/unparking lead to an increase in latency. 
> Sometimes we can avoid an extra pool change, for example, by doing it for 
> embedded operations.
> {code:title=ReplicaManager#onReplicaMessageReceived}
> ExecutorService stripeExecutor = 
> ReplicationGroupStripes.stripeFor(request.groupId(), requestsExecutor);
> stripeExecutor.execute(() -> handleReplicaRequest(request, 
> senderConsistentId, correlationId));
> {code}
> This code changes a thread, even if it is not necessary.
> h3. Definition of done
> ReplicaManager should not switch threads if it does not lead to starvation 
> (in my opinion, the splitting is needed only in the case of the network 
> thread).
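
One possible shape of the "switch threads only when needed" idea is sketched below. It is an assumption-laden illustration, not the ReplicaManager implementation: the class {{InlineWhenSafeExecutor}} and the {{mustSwitch}} predicate (which would answer "is this the network thread?") are hypothetical.

```java
import java.util.concurrent.Executor;
import java.util.function.Predicate;

/** Hypothetical helper: runs a task inline unless the calling thread must not be blocked. */
final class InlineWhenSafeExecutor implements Executor {
    private final Executor stripeExecutor;
    private final Predicate<Thread> mustSwitch;

    InlineWhenSafeExecutor(Executor stripeExecutor, Predicate<Thread> mustSwitch) {
        this.stripeExecutor = stripeExecutor;
        this.mustSwitch = mustSwitch;
    }

    @Override
    public void execute(Runnable task) {
        if (mustSwitch.test(Thread.currentThread())) {
            // For example, a network thread: hand the work off to avoid starving IO.
            stripeExecutor.execute(task);
        } else {
            // Already on a thread that may do the work: skip the extra queue and park/unpark.
            task.run();
        }
    }
}
```

Note that running a task inline bypasses the stripe queue, so it gives up any ordering relative to tasks already queued on that stripe; whether that is acceptable depends on the operation.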



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21541) Avoid partition-operations pool when it not lead to starvation

2024-02-29 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821989#comment-17821989
 ] 

Vladislav Pyatkov commented on IGNITE-21541:


The issue with thread switching in the replica manager is also fixed in 
IGNITE-20373.

> Avoid partition-operations pool when it not lead to starvation
> --
>
> Key: IGNITE-21541
> URL: https://issues.apache.org/jira/browse/IGNITE-21541
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Changing pools and the related parking/unparking lead to an increase in latency. 
> Sometimes we can avoid an extra pool change, for example, by doing it for 
> embedded operations.
> {code:title=ReplicaManager#onReplicaMessageReceived}
> ExecutorService stripeExecutor = 
> ReplicationGroupStripes.stripeFor(request.groupId(), requestsExecutor);
> stripeExecutor.execute(() -> handleReplicaRequest(request, 
> senderConsistentId, correlationId));
> {code}
> This code changes a thread, even if it is not necessary.
> h3. Definition of done
> ReplicaManager should not switch threads if it does not lead to starvation 
> (in my opinion, the splitting is needed only in the case of the network 
> thread).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21623) Test fails ItTxDistributedTestThreeNodesThreeReplicas.testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21623:
---
Summary: Test fails 
ItTxDistributedTestThreeNodesThreeReplicas.testBatchSinglePartitionGet  (was: 
Test fails TxAbstractTest#testBatchSinglePartitionGet)

> Test fails 
> ItTxDistributedTestThreeNodesThreeReplicas.testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Table_20271.log (1).zip
>
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-20373) Fix IO threading model

2024-02-28 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-20373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821976#comment-17821976
 ] 

Vladislav Pyatkov commented on IGNITE-20373:


[~rpuch] The patch looks good. Thanks for your contribution.
Merged 393aa52cdf1dc9234efe24dcef9beb45d3b0bfab


> Fix IO threading model
> --
>
> Key: IGNITE-20373
> URL: https://issues.apache.org/jira/browse/IGNITE-20373
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 3.0
>Reporter: Alexey Scherbakov
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3, ignite3_performance, threading
> Fix For: 3.0.0-beta2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, IO is resubmitted to inboundExecutor for further processing (although 
> there are corner cases when a message handler is called in an IO thread).
> This makes message processing essentially single-threaded and introduces 
> additional lag due to the message transition to an additional queue.
> addMessageHandler should be extended with a third argument: a pool for 
> submitting a callback for execution, or an executorSelector like in jraft.
> inboundExecutor should be changed to striped, use more than one thread, and 
> serve messages without an explicit executor. Delivery guarantees should be 
> preserved: if a message A is sent before B, B can't be processed on a 
> receiver before A. A stripe is defined by a sender-receiver pair (or can be 
> user defined - TBD).
> outboundExecutor also looks like a contention point and needs to be addressed as 
> well.
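
A minimal sketch of a striped executor that preserves per-stripe ordering is given below. It assumes one single-threaded executor per stripe and a stripe key derived from the sender-receiver pair; the class {{StripedInboundExecutor}} and its methods are hypothetical names, not the actual Ignite or jraft API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical striped executor: one single-threaded executor per stripe. */
final class StripedInboundExecutor {
    private final ExecutorService[] stripes;

    StripedInboundExecutor(int stripeCount) {
        stripes = new ExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Maps a stripe key (for example, a sender-receiver pair) to a stripe index. */
    int stripeIndex(Object stripeKey) {
        // floorMod never returns a negative index, even for negative hash codes.
        return Math.floorMod(stripeKey.hashCode(), stripes.length);
    }

    /** Tasks submitted with the same key run on the same thread, in submission order. */
    void execute(Object stripeKey, Runnable task) {
        stripes[stripeIndex(stripeKey)].execute(task);
    }

    /** Stops accepting tasks and waits for queued ones; returns true if every stripe terminated in time. */
    boolean shutdownAndAwait(long timeoutMillis) {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
        boolean terminated = true;
        try {
            for (ExecutorService stripe : stripes) {
                terminated &= stripe.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return terminated;
    }
}
```

Because each stripe has exactly one thread, a message A submitted before B with the same key cannot be processed after B, while messages with different keys can be processed in parallel.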



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21623) Test fails TxAbstractTest#testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21623:
---
Labels: ignite-3  (was: )

> Test fails TxAbstractTest#testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
> Attachments: _Integration_Tests_Module_Table_20271.log (1).zip
>
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21623) Test fails TxAbstractTest#testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21623:
---
Attachment: _Integration_Tests_Module_Table_20271.log (1).zip

> Test fails TxAbstractTest#testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
> Attachments: _Integration_Tests_Module_Table_20271.log (1).zip
>
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21623) Test fails TxAbstractTest#testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821616#comment-17821616
 ] 

Vladislav Pyatkov commented on IGNITE-21623:


This issue cannot be connected with broken full-state transactions because 
a test version of the placement driver is used here.
The test placement driver never changes a primary replica.

> Test fails TxAbstractTest#testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21623) Test fails TxAbstractTest#testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21623:
---
Description: 
The test fails very rarely, but this is a violation of transaction guarantees. 
This test fails to get committed entries:
{code}
for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
        makeKey(k)).collect(toList()))) {
    assertEquals(100., tuple.doubleValue("balance"));
}
{code}
Instead, NPE is received:
{noformat}
java.lang.NullPointerException
  at 
org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
{noformat}

  was:
h3 Motivation
The test fails very rarely, but this is a violation of transaction guarantees. 
This test fails to get committed entries:
{code}
for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
        makeKey(k)).collect(toList()))) {
    assertEquals(100., tuple.doubleValue("balance"));
}
{code}
Instead, NPE is received:
{noformat}
java.lang.NullPointerException
  at 
org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
{noformat}


> Test fails TxAbstractTest#testBatchSinglePartitionGet
> -
>
> Key: IGNITE-21623
> URL: https://issues.apache.org/jira/browse/IGNITE-21623
> Project: Ignite
>  Issue Type: Bug
>Reporter: Vladislav Pyatkov
>Priority: Major
>
> The test fails very rarely, but this is a violation of transaction 
> guarantees. This test fails to get committed entries:
> {code}
> for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
>         makeKey(k)).collect(toList()))) {
>     assertEquals(100., tuple.doubleValue("balance"));
> }
> {code}
> Instead, NPE is received:
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (IGNITE-21623) Test fails TxAbstractTest#testBatchSinglePartitionGet

2024-02-28 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21623:
--

 Summary: Test fails TxAbstractTest#testBatchSinglePartitionGet
 Key: IGNITE-21623
 URL: https://issues.apache.org/jira/browse/IGNITE-21623
 Project: Ignite
  Issue Type: Bug
Reporter: Vladislav Pyatkov


h3 Motivation
The test fails very rarely, but this is a violation of transaction guarantees. 
This test fails to get committed entries:
{code}
for (Tuple tuple : accountRecordsView.getAll(null, keys.stream().map(k -> 
        makeKey(k)).collect(toList()))) {
    assertEquals(100., tuple.doubleValue("balance"));
}
{code}
Instead, NPE is received:
{noformat}
java.lang.NullPointerException
  at 
org.apache.ignite.internal.table.TxAbstractTest.testBatchSinglePartitionGet(TxAbstractTest.java:2274)
  at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
  at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21578) ItDurableFinishTest#testWaitForCleanup failed with NPE

2024-02-27 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821438#comment-17821438
 ] 

Vladislav Pyatkov edited comment on IGNITE-21578 at 2/27/24 10:26 PM:
--

Likely, we do not have to look at the error because the exception occurs in a 
thread pool (ForkJoinPool.commonPool) whose lifecycle is different 
from the cluster's. Currently, we use another pool.
Moreover, the test itself does not perform a full operation, so this 
invocation is probably a leftover from a previous test.
We will worry only if this reproduces again on the new base (in the cluster 
pool).


was (Author: v.pyatkov):
Likely, we do not have to look at the error because the exception occurs in a 
thread pool (ForkJoinPool.commonPool) whose lifecycle is different 
from the cluster's. Currently, we use another pool.
Moreover, the test itself does not perform a full operation, so this 
invocation is probably a leftover from a previous test.

>  ItDurableFinishTest#testWaitForCleanup failed with NPE
> ---
>
> Key: IGNITE-21578
> URL: https://issues.apache.org/jira/browse/IGNITE-21578
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7870395?expandBuildDeploymentsSection=false=false=false=true+Inspection=true=true]
> {code:java}
>   Caused by: java.lang.NullPointerException
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.lambda$finishFull$3(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.lambda$updateMeta$0(VolatileTxStateMetaStorage.java:73)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) 
> ~[?:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.updateMeta(VolatileTxStateMetaStorage.java:72)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.updateTxMeta(TxManagerImpl.java:455)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.finishFull(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.table.distributed.storage.InternalTableImpl.lambda$postEnlist$13(InternalTableImpl.java:593)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?] {code}
> Seems that the reason is that old meta may be null in case of exception
> {code:java}
>     public void finishFull(HybridTimestampTracker timestampTracker, UUID 
> txId, boolean commit) {
>         ...
>         updateTxMeta(txId, old -> new TxStateMeta(finalState, 
> old.txCoordinatorId(), old.commitPartitionId(), old.commitTimestamp()));
>         ...
>     }
> {code}
> {code:java}
>         return fut.handle((BiFunction>) 
> (r, e) -> {
>             if (full) { // Full txn is already finished remotely. Just update 
> local state.
>                 txManager.finishFull(observableTimestampTracker, tx0.id(), e 
> == null);{code}
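
The suspected cause (the {{old}} meta being null inside the updater) can be illustrated with a null-tolerant updater. This is a simplified sketch, not the TxManagerImpl code: {{TxMetaSketch}}, its nested {{Meta}} class with two string fields, and the state names are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical, simplified model of a volatile transaction meta storage. */
final class TxMetaSketch {
    /** Simplified stand-in for TxStateMeta; the field set is an assumption. */
    static final class Meta {
        final String state;
        final String coordinatorId;

        Meta(String state, String coordinatorId) {
            this.state = state;
            this.coordinatorId = coordinatorId;
        }
    }

    private final Map<UUID, Meta> metas = new ConcurrentHashMap<>();

    /** The updater receives null when no meta has been recorded for the transaction yet. */
    Meta updateMeta(UUID txId, Function<Meta, Meta> updater) {
        return metas.compute(txId, (id, old) -> updater.apply(old));
    }

    /** Null-tolerant variant of the update: does not dereference a missing old meta. */
    Meta finishFull(UUID txId, boolean commit) {
        String finalState = commit ? "COMMITTED" : "ABORTED";
        return updateMeta(txId, old -> new Meta(finalState, old == null ? null : old.coordinatorId));
    }
}
```

{{ConcurrentHashMap.compute}} passes null as the old value when the key is absent, so an updater that calls methods on {{old}} unconditionally throws an NPE in exactly that case.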
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21578) ItDurableFinishTest#testWaitForCleanup failed with NPE

2024-02-27 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821438#comment-17821438
 ] 

Vladislav Pyatkov commented on IGNITE-21578:


Likely, we do not have to look at the error because the exception occurs in a 
thread pool (ForkJoinPool.commonPool) whose lifecycle is different 
from the cluster's. Currently, we use another pool.
Moreover, the test itself does not perform a full operation, so this 
invocation is probably a leftover from a previous test.

>  ItDurableFinishTest#testWaitForCleanup failed with NPE
> ---
>
> Key: IGNITE-21578
> URL: https://issues.apache.org/jira/browse/IGNITE-21578
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7870395?expandBuildDeploymentsSection=false=false=false=true+Inspection=true=true]
> {code:java}
>   Caused by: java.lang.NullPointerException
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.lambda$finishFull$3(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.lambda$updateMeta$0(VolatileTxStateMetaStorage.java:73)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) 
> ~[?:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.updateMeta(VolatileTxStateMetaStorage.java:72)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.updateTxMeta(TxManagerImpl.java:455)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.finishFull(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.table.distributed.storage.InternalTableImpl.lambda$postEnlist$13(InternalTableImpl.java:593)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?] {code}
> Seems that the reason is that old meta may be null in case of exception
> {code:java}
>     public void finishFull(HybridTimestampTracker timestampTracker, UUID 
> txId, boolean commit) {
>         ...
>         updateTxMeta(txId, old -> new TxStateMeta(finalState, 
> old.txCoordinatorId(), old.commitPartitionId(), old.commitTimestamp()));
>         ...
>     }
> {code}
> {code:java}
>         return fut.handle((BiFunction>) 
> (r, e) -> {
>             if (full) { // Full txn is already finished remotely. Just update 
> local state.
>                 txManager.finishFull(observableTimestampTracker, tx0.id(), e 
> == null);{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21578) ItDurableFinishTest#testWaitForCleanup failed with NPE

2024-02-27 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21578:
---
Summary:  ItDurableFinishTest#testWaitForCleanup failed with NPE  (was: 
ItDurableFinishTest#testCoordinatorMissedResponse failed with NPE)

>  ItDurableFinishTest#testWaitForCleanup failed with NPE
> ---
>
> Key: IGNITE-21578
> URL: https://issues.apache.org/jira/browse/IGNITE-21578
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexander Lapin
>Assignee: Alexander Lapin
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7870395?expandBuildDeploymentsSection=false=false=false=true+Inspection=true=true]
> {code:java}
>   Caused by: java.lang.NullPointerException
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.lambda$finishFull$3(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.lambda$updateMeta$0(VolatileTxStateMetaStorage.java:73)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) 
> ~[?:?]
>     at 
> org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.updateMeta(VolatileTxStateMetaStorage.java:72)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.updateTxMeta(TxManagerImpl.java:455)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.tx.impl.TxManagerImpl.finishFull(TxManagerImpl.java:472)
>  ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?]
>     at 
> org.apache.ignite.internal.table.distributed.storage.InternalTableImpl.lambda$postEnlist$13(InternalTableImpl.java:593)
>  ~[ignite-table-3.0.0-SNAPSHOT.jar:?] {code}
> Seems that the reason is that old meta may be null in case of exception
> {code:java}
>     public void finishFull(HybridTimestampTracker timestampTracker, UUID 
> txId, boolean commit) {
>         ...
>         updateTxMeta(txId, old -> new TxStateMeta(finalState, 
> old.txCoordinatorId(), old.commitPartitionId(), old.commitTimestamp()));
>         ...
>     }
> {code}
> {code:java}
>         return fut.handle((BiFunction>) 
> (r, e) -> {
>             if (full) { // Full txn is already finished remotely. Just update 
> local state.
>                 txManager.finishFull(observableTimestampTracker, tx0.id(), e 
> == null);{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21545) Introduce a cursor manager

2024-02-26 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820694#comment-17820694
 ] 

Vladislav Pyatkov commented on IGNITE-21545:


Merged 4915e292808560e1c3eb995d6a03466bbdf3

> Introduce a cursor manager
> --
>
> Key: IGNITE-21545
> URL: https://issues.apache.org/jira/browse/IGNITE-21545
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Assignee:  Kirill Sizov
>Priority: Major
>  Labels: ignite-3
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Introduce a cursor manager that would maintain all cursors created on a node, 
> instead of maintaining them in partition replica listeners.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21541) Avoid partition-operations pool when it not lead to starvation

2024-02-23 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820021#comment-17820021
 ] 

Vladislav Pyatkov commented on IGNITE-21541:


This issue may not be needed after the thread model is corrected.

> Avoid partition-operations pool when it not lead to starvation
> --
>
> Key: IGNITE-21541
> URL: https://issues.apache.org/jira/browse/IGNITE-21541
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Changing pools and the related parking/unparking lead to an increase in latency. 
> Sometimes we can avoid an extra pool change, for example, by doing it for 
> embedded operations.
> {code:title=ReplicaManager#onReplicaMessageReceived}
> ExecutorService stripeExecutor = 
> ReplicationGroupStripes.stripeFor(request.groupId(), requestsExecutor);
> stripeExecutor.execute(() -> handleReplicaRequest(request, 
> senderConsistentId, correlationId));
> {code}
> This code changes a thread, even if it is not necessary.
> h3. Definition of done
> ReplicaManager should not switch threads if it does not lead to starvation 
> (in my opinion, the splitting is needed only in the case of the network 
> thread).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21540) Handle lock exception for transaction operations

2024-02-20 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21540:
---
Description: 
h3. Motivation
Deadlock prevention can throw a lock exception, but it depends on the 
situation. After several retries, the exception might pass because another 
transaction has already released its locks.
h3. Implementation notes

We need to consider all kinds of implicit operations that lead to the creation 
of RW transactions.
h3. Definition of done

Implicit operations never throw the lock exception.

  was:
h3. Motivation

Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. This type of transaction can be 
automatically retried because it does not contain user logic.
h3. Implementation notes

We need to consider all kinds of implicit operations that lead to the creation 
of RW transactions.
h3. Definition of done

Implicit operations never throw the lock exception.


> Handle lock exception for transaction operations
> 
>
> Key: IGNITE-21540
> URL: https://issues.apache.org/jira/browse/IGNITE-21540
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Deadlock prevention can throw a lock exception, depending on the situation. 
> After several retries, the operation might succeed because the other 
> transaction has already released its locks.
> h3. Implementation notes
> We need to consider all kinds of implicit operations that lead to the 
> creation of RW transactions.
> h3. Definition of done
> Implicit operations never throw the lock exception.





[jira] [Updated] (IGNITE-21540) Handle lock exception for transaction operations

2024-02-20 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21540:
---
Summary: Handle lock exception for transaction operations  (was: Unexpected 
lock exception in implicit operation)

> Handle lock exception for transaction operations
> 
>
> Key: IGNITE-21540
> URL: https://issues.apache.org/jira/browse/IGNITE-21540
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Implicit operations create a transaction under the hood, and the transaction 
> may be in conflict with another one. This type of transaction can be 
> automatically retried because it does not contain user logic.
> h3. Implementation notes
> We need to consider all kinds of implicit operations that lead to the 
> creation of RW transactions.
> h3. Definition of done
> Implicit operations never throw the lock exception.





[jira] [Commented] (IGNITE-21540) Unexpected lock exception in implicit operation

2024-02-20 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818750#comment-17818750
 ] 

Vladislav Pyatkov commented on IGNITE-21540:


In this issue, we will implement a prototype that demonstrates how transactions 
behave when every lock exception is retried, for any transaction operation. The 
first prototype won't handle the exception for cursors (this is planned for later).
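
A minimal sketch of the retry idea, for illustration only. LockConflictException and ImplicitOperationRetry are hypothetical names, not Ignite classes, and the attempt limit is an assumption; each call to the supplier stands for running the operation in a fresh implicit transaction.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the lock exception thrown by deadlock prevention. */
class LockConflictException extends RuntimeException {
    LockConflictException(String message) {
        super(message);
    }
}

/** Hypothetical helper: retries an operation that failed because of a lock conflict. */
class ImplicitOperationRetry {
    /**
     * Runs the operation and retries it while it fails with a lock conflict.
     * After maxAttempts failed attempts, the last lock exception is rethrown.
     */
    static <T> T runWithRetry(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }

        LockConflictException last = null;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Assumption: every attempt starts from scratch, as a new implicit transaction would.
                return operation.get();
            } catch (LockConflictException e) {
                // The conflicting transaction may release its locks before the next attempt.
                last = e;
            }
        }

        throw last;
    }
}
```

Any exception other than the lock conflict is propagated immediately, so only the conflict case is hidden from the caller.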

> Unexpected lock exception in implicit operation
> ---
>
> Key: IGNITE-21540
> URL: https://issues.apache.org/jira/browse/IGNITE-21540
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Implicit operations create a transaction under the hood, and the transaction 
> may be in conflict with another one. This type of transaction can be 
> automatically retried because it does not contain user logic.
> h3. Implementation notes
> We need to consider all kinds of implicit operations that lead to the 
> creation of RW transactions.
> h3. Definition of done
> Implicit operations never throw the lock exception.





[jira] [Commented] (IGNITE-17824) Use named executor instead of default one in order to process replica Response

2024-02-19 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818455#comment-17818455
 ] 

Vladislav Pyatkov commented on IGNITE-17824:


[~rpuch] This patch looks good to me.
Merged fef7aab39ef3262e1105af56c625ca5578a901d4

> Use named executor instead of default one in order to process replica Response
> --
>
> Key: IGNITE-17824
> URL: https://issues.apache.org/jira/browse/IGNITE-17824
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Roman Puchkovskiy
>Priority: Major
>  Labels: ignite-3, storage-threading, threading
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Within ReplicaService.sendToReplica
> {code:java}
> private <R> CompletableFuture<R> sendToReplica(ClusterNode node, ReplicaRequest req) {
>     CompletableFuture<R> res = new CompletableFuture<>();
>     messagingService.invoke(node.address(), req, RPC_TIMEOUT).whenCompleteAsync((response, throwable) -> {
> {code}
> a named executor should be used instead of the default one.
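
For illustration, a small self-contained sketch of the difference: the two-argument overload of CompletableFuture#whenCompleteAsync runs the callback on the supplied executor instead of the default asynchronous pool. The class name, the thread name "replica-response-0" and the payload are made up for this example and are not taken from the patch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical example class, not part of Ignite. */
class NamedExecutorExample {
    /** Completes a future and returns the name of the thread that ran the callback. */
    static String callbackThreadName() {
        // A dedicated executor with a recognizable thread name (the name is an assumption).
        ExecutorService responseExecutor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "replica-response-0");
            t.setDaemon(true);
            return t;
        });

        try {
            CompletableFuture<String> response = new CompletableFuture<>();
            CompletableFuture<String> threadName = new CompletableFuture<>();

            // The second argument selects the executor that runs the callback.
            response.whenCompleteAsync(
                    (res, err) -> threadName.complete(Thread.currentThread().getName()),
                    responseExecutor);

            response.complete("pong");

            return threadName.join();
        } finally {
            responseExecutor.shutdown();
        }
    }
}
```

With a named executor, the callback shows up under a predictable thread name in thread dumps and can be sized independently of the common pool.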





[jira] [Created] (IGNITE-21541) Avoid partition-operations pool when it not lead to starvation

2024-02-15 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21541:
--

 Summary: Avoid partition-operations pool when it not lead to 
starvation
 Key: IGNITE-21541
 URL: https://issues.apache.org/jira/browse/IGNITE-21541
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
Changing pools, and the thread parking/unparking that comes with it, increases 
latency. Sometimes we can avoid the extra pool change, for example, for 
embedded operations.
{code:title=ReplicaManager#onReplicaMessageReceived}
ExecutorService stripeExecutor =
        ReplicationGroupStripes.stripeFor(request.groupId(), requestsExecutor);
stripeExecutor.execute(() -> handleReplicaRequest(request, senderConsistentId, correlationId));
{code}
This code switches threads even when it is not necessary.

h3. Definition of done
ReplicaManager should not switch threads when staying on the current thread 
does not lead to starvation (in my opinion, the switch is needed only when the 
request arrives on a network thread).
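
One possible shape of this, as a hedged sketch rather than the actual ReplicaManager change: hand the task to the stripe executor only when the calling thread must not be blocked, and run it inline otherwise. InlineOrStripedDispatcher, the "network-" thread-name prefix and the "stripe-0" thread name are assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

/** Hypothetical helper, not part of Ignite: switches threads only when the caller must be offloaded. */
class InlineOrStripedDispatcher {
    private final ExecutorService stripeExecutor;
    private final Predicate<Thread> mustOffload;

    InlineOrStripedDispatcher(ExecutorService stripeExecutor, Predicate<Thread> mustOffload) {
        this.stripeExecutor = stripeExecutor;
        this.mustOffload = mustOffload;
    }

    /** Runs the task inline unless the current thread matches the offload predicate. */
    void dispatch(Runnable task) {
        if (mustOffload.test(Thread.currentThread())) {
            stripeExecutor.execute(task);
        } else {
            task.run();
        }
    }

    /** Demo: returns the name of the thread that ran a task dispatched from a thread with the given name. */
    static String executingThreadFor(String callerName) {
        ExecutorService stripe = Executors.newSingleThreadExecutor(r -> new Thread(r, "stripe-0"));
        InlineOrStripedDispatcher dispatcher =
                new InlineOrStripedDispatcher(stripe, t -> t.getName().startsWith("network-"));
        AtomicReference<String> ranOn = new AtomicReference<>();
        Thread caller = new Thread(
                () -> dispatcher.dispatch(() -> ranOn.set(Thread.currentThread().getName())),
                callerName);

        try {
            caller.start();
            caller.join();
            stripe.shutdown(); // Tasks that were already submitted still run.
            stripe.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            stripe.shutdownNow();
        }

        return ranOn.get();
    }
}
```

In the demo, a caller named "network-1" is offloaded to the stripe thread, while any other caller runs the task on its own thread and skips the extra park/unpark.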





[jira] [Updated] (IGNITE-21540) Unexpected lock exception in implicit operation

2024-02-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21540:
---
Description: 
h3. Motivation
Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. This type of transaction can be 
automatically retried because it does not contain user logic.

h3. Implementation notes
We need to consider all kinds of implicit operations that lead to the creation 
of RW transactions.

h3. Definition of done
Implicit operations never throw the lock exception.

  was:
h3. Motivation
Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. This type of transaction can be 
automatically retried because it does not contain user logic.

h3. Definition of done
Implicit operations never throw the lock exception.


> Unexpected lock exception in implicit operation
> ---
>
> Key: IGNITE-21540
> URL: https://issues.apache.org/jira/browse/IGNITE-21540
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Implicit operations create a transaction under the hood, and the transaction 
> may be in conflict with another one. This type of transaction can be 
> automatically retried because it does not contain user logic.
> h3. Implementation notes
> We need to consider all kinds of implicit operations that lead to the 
> creation of RW transactions.
> h3. Definition of done
> Implicit operations never throw the lock exception.





[jira] [Updated] (IGNITE-21540) Unexpected lock exception in implicit operation

2024-02-15 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21540:
---
Description: 
h3. Motivation
Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. This type of transaction can be 
automatically retried because it does not contain user logic.

h3. Definition of done
Implicit operations never throw the lock exception.

  was:
h3. Motivation
Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. However, this type of transaction can be 
retried because it contains only one operation.

h3. Definition of done
Implicit operations cannot throw the lock exception.


> Unexpected lock exception in implicit operation
> ---
>
> Key: IGNITE-21540
> URL: https://issues.apache.org/jira/browse/IGNITE-21540
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3
>
> h3. Motivation
> Implicit operations create a transaction under the hood, and the transaction 
> may be in conflict with another one. This type of transaction can be 
> automatically retried because it does not contain user logic.
> h3. Definition of done
> Implicit operations never throw the lock exception.





[jira] [Created] (IGNITE-21540) Unexpected lock exception in implicit operation

2024-02-15 Thread Vladislav Pyatkov (Jira)
Vladislav Pyatkov created IGNITE-21540:
--

 Summary: Unexpected lock exception in implicit operation
 Key: IGNITE-21540
 URL: https://issues.apache.org/jira/browse/IGNITE-21540
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


h3. Motivation
Implicit operations create a transaction under the hood, and the transaction 
may be in conflict with another one. However, this type of transaction can be 
retried because it contains only one operation.

h3. Definition of done
Implicit operations cannot throw the lock exception.





[jira] [Resolved] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov resolved IGNITE-21379.

Release Note: According to the results of the investigation, no change is needed. 
  Resolution: Fixed

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Attachments: BusyLockTest.java
>
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.





[jira] [Comment Edited] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817350#comment-17817350
 ] 

Vladislav Pyatkov edited comment on IGNITE-21379 at 2/14/24 11:44 AM:
--

> Is this a single op latency in the table?
The table shows the total duration of 10M operations. One operation consists 
of acquiring two read locks and releasing them. The test ran several times; 
the reported duration was measured after warmup.

> Test scenario description is not clear to me.
I attached the test.

> That is counter test?
The last row in the table corresponds to the counter test.


was (Author: v.pyatkov):
> Is this a single op latency in the table?
The table shows the total duration of 10M operations. One operation consists 
of acquiring two read locks and releasing them.

The test ran several times; the reported duration was measured after warmup.

> Test scenario description is not clear to me.

I attached the test.

> That is counter test?

The last row in the table corresponds to the counter test.

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Attachments: BusyLockTest.java
>
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.





[jira] [Updated] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21379:
---
Attachment: BusyLockTest.java

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Attachments: BusyLockTest.java
>
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817350#comment-17817350
 ] 

Vladislav Pyatkov commented on IGNITE-21379:


> Is this a single op latency in the table?
The table shows the total duration of 10M operations. One operation consists 
of acquiring two read locks and releasing them.

The test ran several times; the reported duration was measured after warmup.

> Test scenario description is not clear to me.

I attached the test.

> That is counter test?

The last row in the table corresponds to the counter test.

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
> Attachments: BusyLockTest.java
>
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.





[jira] [Comment Edited] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817309#comment-17817309
 ] 

Vladislav Pyatkov edited comment on IGNITE-21379 at 2/14/24 10:22 AM:
--

I compared two general approaches: one based on IgniteSpinReadWriteLock and one 
based on ReentrantReadWriteLock. The comparison runs in a dedicated test:
||Test||Total duration, ms||
|RW lock test|406|
|Busy lock test|296|
|RW try lock test|358|
|Counter test|200|

This is the total duration of 10_000_000 operations in milliseconds. One 
operation consists of two read lock and two read unlock operations.
Here are the results of the same tests in a multi-threaded environment:
||Test||Total duration, ms||
|RW lock test|2 135|
|Busy lock test|2 377|
|RW try lock test|2 088|
|Counter test|1 476|

In this case, one operation acquires a lock in one thread, then takes a lock 
and releases it in another thread, and finally returns to the original thread 
and releases the first read lock.
I also implemented IgniteSpinBusyLock on top of an RW lock. I ran a load, and 
this implementation did not show a significant performance impact.
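
To make the single-threaded scenario concrete, here is a rough sketch of the measured loop. It is not the attached BusyLockTest.java and not a JMH benchmark. The class and method names are hypothetical, and the counter variant is only a guess at what a counter baseline could look like.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical timing loop; names and structure are assumptions made for illustration. */
class ReadLockLoop {
    /**
     * Runs the given number of operations. One operation takes the read lock twice
     * and releases it twice. Returns the elapsed time in nanoseconds.
     */
    static long timeRwLock(ReentrantReadWriteLock lock, int operations) {
        long start = System.nanoTime();

        for (int i = 0; i < operations; i++) {
            lock.readLock().lock();
            lock.readLock().lock();
            lock.readLock().unlock();
            lock.readLock().unlock();
        }

        return System.nanoTime() - start;
    }

    /** Assumed counter baseline: two increments and two decrements per operation. */
    static long timeCounter(AtomicLong counter, int operations) {
        long start = System.nanoTime();

        for (int i = 0; i < operations; i++) {
            counter.incrementAndGet();
            counter.incrementAndGet();
            counter.decrementAndGet();
            counter.decrementAndGet();
        }

        return System.nanoTime() - start;
    }
}
```

Calling each method once as a warmup and then timing a second run with 10_000_000 operations would mirror the procedure described above. A plain loop like this is sensitive to JIT effects, which is one reason the ticket asks for JMH.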


was (Author: v.pyatkov):
I compared two general approaches: one based on IgniteSpinReadWriteLock and one 
based on ReentrantReadWriteLock. The comparison runs in a dedicated test:
||Test||Total duration, ms||
|RW lock test|406|
|Busy lock test|296|
|RW try lock test|358|
|Counter test|200|
This is the total duration of 1000 operations in milliseconds. One operation 
consists of two read lock and two read unlock operations.
Here are the results of the same tests in a multi-threaded environment:
||Test||Total duration, ms||
|RW lock test|2 135|
|Busy lock test|2 377|
|RW try lock test|2 088|
|Counter test|1 476|
In this case, one operation acquires a lock in one thread, then takes a lock 
and releases it in another thread, and finally returns to the original thread 
and releases the first read lock.
I also implemented IgniteSpinBusyLock on top of an RW lock. I ran a load, and 
this implementation did not show a significant performance impact.

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.





[jira] [Commented] (IGNITE-21379) Investigate whether currently used busyLocks implementation is fast enough

2024-02-14 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817309#comment-17817309
 ] 

Vladislav Pyatkov commented on IGNITE-21379:


I compared two general approaches: one based on IgniteSpinReadWriteLock and one 
based on ReentrantReadWriteLock. The comparison runs in a dedicated test:
||Test||Total duration, ms||
|RW lock test|406|
|Busy lock test|296|
|RW try lock test|358|
|Counter test|200|
This is the total duration of 1000 operations in milliseconds. One operation 
consists of two read lock and two read unlock operations.
Here are the results of the same tests in a multi-threaded environment:
||Test||Total duration, ms||
|RW lock test|2 135|
|Busy lock test|2 377|
|RW try lock test|2 088|
|Counter test|1 476|
In this case, one operation acquires a lock in one thread, then takes a lock 
and releases it in another thread, and finally returns to the original thread 
and releases the first read lock.
I also implemented IgniteSpinBusyLock on top of an RW lock. I ran a load, and 
this implementation did not show a significant performance impact.

> Investigate whether currently used busyLocks implementation is fast enough
> --
>
> Key: IGNITE-21379
> URL: https://issues.apache.org/jira/browse/IGNITE-21379
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
>
> h3. Motivation
> Seems that our busyLocks (IgniteSpinBusyLock) aren't good enough from the 
> performance perspective. Let's compare current implementation with common RW 
> locks, CheckpointReadWriteLock, etc. Depending on the results it'll be 
> required either to use faster implementation or re-consider busyLock idea 
> itself because currently it brings significant performance drop. Given ticket 
> is only about initial step - busyLock performance investigation.
> h3. Definition of Done
>  * Prepare JMH benchmarks for busyLocks performance investigation.
>  * Compare IgniteSpinBusyLock, common RW lock, CheckpointReadWriteLock, etc 
> in order to understand whether IgniteSpinBusyLock is fast enough.





[jira] [Commented] (IGNITE-21378) Investigate whether it's possible to make txnState local map updates faster

2024-02-13 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817047#comment-17817047
 ] 

Vladislav Pyatkov commented on IGNITE-21378:


Merged e521574594d35405b2c0c9bfab8e5c44bcc14c4c

> Investigate whether it's possible to make txnState local map updates faster
> ---
>
> Key: IGNITE-21378
> URL: https://issues.apache.org/jira/browse/IGNITE-21378
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: PERFORMANCE, Performance, ignite-3, ignite3_performance, 
> performance, performence
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3.  Motivation
> IGNITE-21375 is about removing txnState local map updates within RO txn flow 
> because it brings us 20% performance drop. For RW transactions, it's however 
> not possible, because such updates are required by the RW protocol. Thus, it 
> seems reasonable to verify whether proposed VolatileTxStateMetaStorage is 
> itself fast enough.
> h3. Definition of Done
>  * Increase VolatileTxStateMetaStorage performance if it's possible.





[jira] [Updated] (IGNITE-21378) Investigate whether it's possible to make txnState local map updates faster

2024-02-13 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21378:
---
Reviewer: Denis Chudov

> Investigate whether it's possible to make txnState local map updates faster
> ---
>
> Key: IGNITE-21378
> URL: https://issues.apache.org/jira/browse/IGNITE-21378
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: PERFORMANCE, Performance, ignite-3, ignite3_performance, 
> performance, performence
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3.  Motivation
> IGNITE-21375 is about removing txnState local map updates within RO txn flow 
> because it brings us 20% performance drop. For RW transactions, it's however 
> not possible, because such updates are required by the RW protocol. Thus, it 
> seems reasonable to verify whether proposed VolatileTxStateMetaStorage is 
> itself fast enough.
> h3. Definition of Done
>  * Increase VolatileTxStateMetaStorage performance if it's possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (IGNITE-21378) Investigate whether it's possible to make txnState local map updates faster

2024-02-12 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816587#comment-17816587
 ] 

Vladislav Pyatkov edited comment on IGNITE-21378 at 2/12/24 12:32 PM:
--

The patch decreases the latency of the single-put operation by about 3.3% (40 
microseconds). The full operation latency on my laptop is 1.2 milliseconds.


was (Author: v.pyatkov):
The patch decreases the latency of the single-put operation by about 3.3% (40 
microseconds).

> Investigate whether it's possible to make txnState local map updates faster
> ---
>
> Key: IGNITE-21378
> URL: https://issues.apache.org/jira/browse/IGNITE-21378
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: PERFORMANCE, Performance, ignite-3, ignite3_performance, 
> performance, performence
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3.  Motivation
> IGNITE-21375 is about removing txnState local map updates within RO txn flow 
> because it brings us 20% performance drop. For RW transactions, it's however 
> not possible, because such updates are required by the RW protocol. Thus, it 
> seems reasonable to verify whether proposed VolatileTxStateMetaStorage is 
> itself fast enough.
> h3. Definition of Done
>  * Increase VolatileTxStateMetaStorage performance if it's possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (IGNITE-21375) RO transactions should not update txnState local map

2024-02-09 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815991#comment-17815991
 ] 

Vladislav Pyatkov commented on IGNITE-21375:


Merged 7e3605024b18cf2099110fbd038409bcbee02ee5

> RO transactions should not update txnState local map
> 
>
> Key: IGNITE-21375
> URL: https://issues.apache.org/jira/browse/IGNITE-21375
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h3. Motivation
> On the one hand according to [~vpyatkov] updating txnstate within RO flow 
> brings us 20% performance drop, on the other hand RO transaction updates 
> corresponding state only for testing purposes, precisely in order to check 
> whether transaction rollback was called. See 
> org.apache.ignite.internal.tx.TxManager#pending for more details.
> h3. Definition of Done
>  * Eliminate txnState local map updates within RO txn flow.
>  * Implement different mechanism for txn cleanup verification instead of 
> txnState local map based one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-21375) RO transactions should not update txnState local map

2024-02-08 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov updated IGNITE-21375:
---
Reviewer: Kirill Sizov

> RO transactions should not update txnState local map
> 
>
> Key: IGNITE-21375
> URL: https://issues.apache.org/jira/browse/IGNITE-21375
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexander Lapin
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Labels: ignite-3, performance
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Motivation
> On the one hand according to [~vpyatkov] updating txnstate within RO flow 
> brings us 20% performance drop, on the other hand RO transaction updates 
> corresponding state only for testing purposes, precisely in order to check 
> whether transaction rollback was called. See 
> org.apache.ignite.internal.tx.TxManager#pending for more details.
> h3. Definition of Done
>  * Eliminate txnState local map updates within RO txn flow.
>  * Implement different mechanism for txn cleanup verification instead of 
> txnState local map based one.




