[jira] [Commented] (IGNITE-17524) Shutdown of large numbers of servers slowed by linear lookup in IgniteServiceProcessor
[ https://issues.apache.org/jira/browse/IGNITE-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579066#comment-17579066 ] Arthur Naseef commented on IGNITE-17524: Created a Pull Request: [https://github.com/apache/ignite/pull/10196] The changes add another map that tracks service-name to service-info, and uses that map, in place of the existing linear search, for lookup. > Shutdown of large numbers of servers slowed by linear lookup in > IgniteServiceProcessor > -- > > Key: IGNITE-17524 > URL: https://issues.apache.org/jira/browse/IGNITE-17524 > Project: Ignite > Issue Type: Improvement > Components: managed services >Affects Versions: 2.13 >Reporter: Arthur Naseef >Assignee: Arthur Naseef >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Cloned from IGNITE-17274. That ticket addresses startup timing, and this one > addresses shutdown timing. > Shutting down large numbers of services is slowed down by a linear lookup > that can be addressed by adding a map to perform the lookups. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17524) Shutdown of large numbers of servers slowed by linear lookup in IgniteServiceProcessor
[ https://issues.apache.org/jira/browse/IGNITE-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arthur Naseef updated IGNITE-17524: --- Fix Version/s: (was: 2.14) Description: Cloned from IGNITE-17274. That ticket addresses startup timing, and this one addresses shutdown timing. Shutting down large numbers of services is slowed down by a linear lookup that can be addressed by adding a map to perform the lookups. was: Using a small POC, spinning up many servers is slow. In addition, the startup time appears to be exponential. Using timing measurements, found a linear lookup inside the IgniteServiceProcessor that is taking most of the time. Replacing that linear lookup with a Map lookup, and maintaining the map, significantly speeds up the process, and startup time is now linear with the number of services started. Note this was tested with 20K and 50K services on a 1-node ignite cluster. Timings against the stock 2.13.0 code come in at 30s for 20K and 250s for 50K services. Modifying the linear lookup to use a Map, the timing come in at 8s for 20K and 14s for 50K services. > Shutdown of large numbers of servers slowed by linear lookup in > IgniteServiceProcessor > -- > > Key: IGNITE-17524 > URL: https://issues.apache.org/jira/browse/IGNITE-17524 > Project: Ignite > Issue Type: Improvement > Components: managed services >Affects Versions: 2.13 >Reporter: Arthur Naseef >Assignee: Arthur Naseef >Priority: Major > > Cloned from IGNITE-17274. That ticket addresses startup timing, and this one > addresses shutdown timing. > Shutting down large numbers of services is slowed down by a linear lookup > that can be addressed by adding a map to perform the lookups. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17524) Shutdown of large numbers of servers slowed by linear lookup in IgniteServiceProcessor
Arthur Naseef created IGNITE-17524:
--
             Summary: Shutdown of large numbers of servers slowed by linear lookup in IgniteServiceProcessor
                 Key: IGNITE-17524
                 URL: https://issues.apache.org/jira/browse/IGNITE-17524
             Project: Ignite
          Issue Type: Improvement
          Components: managed services
    Affects Versions: 2.13
            Reporter: Arthur Naseef
            Assignee: Arthur Naseef
             Fix For: 2.14

Using a small POC, spinning up many servers is slow. In addition, the startup time appears to be exponential.

Timing measurements pointed to a linear lookup inside the IgniteServiceProcessor that takes most of the time. Replacing that linear lookup with a Map lookup, and maintaining the map, significantly speeds up the process, and startup time is now linear in the number of services started.

Note that this was tested with 20K and 50K services on a 1-node Ignite cluster. Timings against the stock 2.13.0 code come in at 30s for 20K and 250s for 50K services. With the linear lookup modified to use a Map, the timings come in at 8s for 20K and 14s for 50K services.
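The change discussed in this thread, replacing a per-service linear scan with a map keyed by service name, can be illustrated with a minimal sketch. Everything below is hypothetical: `ServiceRegistrySketch`, its nested `Info` class, `lookupLinear` and `lookupByName` are illustration names and are not taken from IgniteServiceProcessor or from the pull request. The sketch only shows why one linear lookup per service makes the total work grow roughly quadratically with the number of services, while a maintained name index keeps each lookup close to constant time.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Illustrative registry only; not the structure used inside IgniteServiceProcessor. */
final class ServiceRegistrySketch {
    /** Hypothetical stand-in for a registered service descriptor. */
    static final class Info {
        final UUID id;
        final String name;

        Info(UUID id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Primary storage keyed by service id (assumed layout). */
    private final Map<UUID, Info> byId = new LinkedHashMap<>();

    /** Additional index keyed by service name; must be updated together with byId. */
    private final Map<String, Info> byName = new HashMap<>();

    void register(Info info) {
        byId.put(info.id, info);
        byName.put(info.name, info);
    }

    void unregister(String name) {
        Info info = byName.remove(name);

        if (info != null)
            byId.remove(info.id);
    }

    /** O(n) per call: scans every registered service until the name matches. */
    Info lookupLinear(String name) {
        for (Info info : byId.values()) {
            if (info.name.equals(name))
                return info;
        }

        return null;
    }

    /** Expected O(1) per call: uses the name index. */
    Info lookupByName(String name) {
        return byName.get(name);
    }

    int size() {
        return byId.size();
    }

    public static void main(String[] args) {
        ServiceRegistrySketch reg = new ServiceRegistrySketch();

        for (int i = 0; i < 20_000; i++)
            reg.register(new Info(UUID.randomUUID(), "svc-" + i));

        // Shutdown-style loop: one lookup per service, followed by removal.
        for (int i = 0; i < 20_000; i++) {
            String name = "svc-" + i;

            if (reg.lookupByName(name) != null)
                reg.unregister(name);
        }

        System.out.println("Remaining services: " + reg.size());
    }
}
```

The sketch is single-threaded; a real implementation would also have to keep both maps consistent under concurrent deployment and undeployment.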
[jira] [Comment Edited] (IGNITE-17349) Ignite3 CLI output formatting
[ https://issues.apache.org/jira/browse/IGNITE-17349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579050#comment-17579050 ] Vyacheslav Koptilin edited comment on IGNITE-17349 at 8/12/22 4:48 PM: --- Hello [~aleksandr.pakhomov], I will take a look at your pull-request. was (Author: slava.koptilin): Hello [~aleksandr.pakhomov], I will take at your pull-request. > Ignite3 CLI output formatting > - > > Key: IGNITE-17349 > URL: https://issues.apache.org/jira/browse/IGNITE-17349 > Project: Ignite > Issue Type: Task > Components: cli >Reporter: Aleksandr >Assignee: Aleksandr >Priority: Major > Labels: ignite-3, ignite-3-cli-tool > Time Spent: 1.5h > Remaining Estimate: 0h > > Ignite3 CLI now is not consistent from formatting/styles perspective. > Messages about what went wrong differ from each other. Somewhere 'Done!' is a > marker of successful operation ({{ignite bootstrap}}), somewhere it is just a > sentence notifying that something is done ({{ignite connect}}). Tables are > rendered with different borders for {{ignite bootstrap}}, {{ignite node > list}} and for {{ignite topology}} commands. > The goal of this ticket is to develop user-facing interface components and > use them in the CLI code. The list of the components is also a part of this > ticket but here are some of them: > - problem json render > - table render > - success action render > - suggestion render. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17349) Ignite3 CLI output formatting
[ https://issues.apache.org/jira/browse/IGNITE-17349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579050#comment-17579050 ] Vyacheslav Koptilin commented on IGNITE-17349: -- Hello [~aleksandr.pakhomov], I will take at your pull-request. > Ignite3 CLI output formatting > - > > Key: IGNITE-17349 > URL: https://issues.apache.org/jira/browse/IGNITE-17349 > Project: Ignite > Issue Type: Task > Components: cli >Reporter: Aleksandr >Assignee: Aleksandr >Priority: Major > Labels: ignite-3, ignite-3-cli-tool > Time Spent: 1.5h > Remaining Estimate: 0h > > Ignite3 CLI now is not consistent from formatting/styles perspective. > Messages about what went wrong differ from each other. Somewhere 'Done!' is a > marker of successful operation ({{ignite bootstrap}}), somewhere it is just a > sentence notifying that something is done ({{ignite connect}}). Tables are > rendered with different borders for {{ignite bootstrap}}, {{ignite node > list}} and for {{ignite topology}} commands. > The goal of this ticket is to develop user-facing interface components and > use them in the CLI code. The list of the components is also a part of this > ticket but here are some of them: > - problem json render > - table render > - success action render > - suggestion render. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17349) Ignite3 CLI output formatting
[ https://issues.apache.org/jira/browse/IGNITE-17349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17349: - Reviewer: Vyacheslav Koptilin > Ignite3 CLI output formatting > - > > Key: IGNITE-17349 > URL: https://issues.apache.org/jira/browse/IGNITE-17349 > Project: Ignite > Issue Type: Task > Components: cli >Reporter: Aleksandr >Assignee: Aleksandr >Priority: Major > Labels: ignite-3, ignite-3-cli-tool > Time Spent: 1.5h > Remaining Estimate: 0h > > Ignite3 CLI now is not consistent from formatting/styles perspective. > Messages about what went wrong differ from each other. Somewhere 'Done!' is a > marker of successful operation ({{ignite bootstrap}}), somewhere it is just a > sentence notifying that something is done ({{ignite connect}}). Tables are > rendered with different borders for {{ignite bootstrap}}, {{ignite node > list}} and for {{ignite topology}} commands. > The goal of this ticket is to develop user-facing interface components and > use them in the CLI code. The list of the components is also a part of this > ticket but here are some of them: > - problem json render > - table render > - success action render > - suggestion render. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17511) Support IndexQuery for Java ThinClient
[ https://issues.apache.org/jira/browse/IGNITE-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579042#comment-17579042 ] Maksim Timonin commented on IGNITE-17511: - [~ivandasch] [~ptupitsyn] thanks for your comments. Merged to master. > Support IndexQuery for Java ThinClient > -- > > Key: IGNITE-17511 > URL: https://issues.apache.org/jira/browse/IGNITE-17511 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > Labels: IEP-71, ise > Fix For: 2.14 > > Time Spent: 20m > Remaining Estimate: 0h > > ThinClient doesn't support IndexQuery. Let's fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
[ https://issues.apache.org/jira/browse/IGNITE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladislav Pyatkov updated IGNITE-17508:
---
    Description:
In the replication listener ({_}PartitionReplicaListener{_}), where we have the pattern:
{code:java}
raftFut.thenApply(ignored -> result);{code}
we should handle RAFT exceptions, including inspecting how raftFut completes.

We can distinguish the following exception types for RAFT:
 * RAFT cannot replicate a command within the timeout ({_}TimeoutException{_}). This exception should be translated into the replication timeout exception ({_}ReplicationTimeoutException{_}).
 * RAFT throws an internal exception ({_}RaftException{_}). This exception should be wrapped in the common replication exception ({_}ReplicationException{_}).
 * Finally, RAFT throws standard Java exceptions (NullPointerException, IndexOutOfBoundsException, etc.). Those exceptions should not be touched; they are rethrown as is.

  was:
In the replication listener ({_}PartitionReplicaListener{_}), where we have the pattern:
{code:java}
raftFut.thenApply(ignored -> result);{code}
we should handle RAFT exceptions, including inspecting how raftFut completes.

We should distinguish the following exception types:
 * The RAFT node has not been started yet.
 * The RAFT node is started, but it is not a leader.
 * RAFT cannot replicate a command within the timeout.

> Exception handling in the partition replication listener for RAFT futures
>
>                 Key: IGNITE-17508
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17508
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Vladislav Pyatkov
>            Priority: Major
>              Labels: ignite-3
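The three cases listed in this description can be sketched with `CompletableFuture.handle`. This is an illustration under stated assumptions, not the PartitionReplicaListener code: the exception classes below are local placeholders that only reuse the names from the ticket, and `translate` and `unwrap` are hypothetical helpers. `java.util.concurrent.TimeoutException` is assumed to be the timeout type meant by the ticket.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

/** Illustrative only; the real Ignite 3 exception hierarchy may differ. */
final class RaftFutureHandlingSketch {
    /** Placeholder for the RAFT-internal exception named in the ticket. */
    static class RaftException extends RuntimeException {
        RaftException(String msg) {
            super(msg);
        }
    }

    /** Placeholder for the common replication exception named in the ticket. */
    static class ReplicationException extends RuntimeException {
        ReplicationException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /** Placeholder for the replication timeout exception named in the ticket. */
    static class ReplicationTimeoutException extends ReplicationException {
        ReplicationTimeoutException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /** Strips the wrappers that CompletableFuture adds around the original failure. */
    static Throwable unwrap(Throwable t) {
        while ((t instanceof CompletionException || t instanceof ExecutionException) && t.getCause() != null)
            t = t.getCause();

        return t;
    }

    /** Replaces raftFut.thenApply(ignored -> result) with explicit error translation. */
    static <R, T> CompletableFuture<T> translate(CompletableFuture<R> raftFut, T result) {
        return raftFut.handle((ignored, err) -> {
            if (err == null)
                return result;

            Throwable cause = unwrap(err);

            // Case 1: the command was not replicated within the timeout.
            if (cause instanceof TimeoutException)
                throw new ReplicationTimeoutException("RAFT command timed out", cause);

            // Case 2: RAFT-internal failure, wrapped in the common replication exception.
            if (cause instanceof RaftException)
                throw new ReplicationException("RAFT command failed", cause);

            // Case 3: anything else is propagated untouched.
            if (cause instanceof RuntimeException)
                throw (RuntimeException) cause;

            if (cause instanceof Error)
                throw (Error) cause;

            // Other checked exceptions cannot be thrown from a lambda directly.
            throw new CompletionException(cause);
        });
    }

    public static void main(String[] args) {
        CompletableFuture<Void> raftFut = new CompletableFuture<>();

        raftFut.completeExceptionally(new TimeoutException("no quorum"));

        try {
            translate(raftFut, "result").join();
        }
        catch (CompletionException e) {
            System.out.println("Failed with: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Callers that use `join()` or `get()` still see the translated exception as the cause of a `CompletionException` or `ExecutionException`, because that wrapping is added by `CompletableFuture` itself.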
[jira] [Updated] (IGNITE-17523) Stabilize and cleanup ignite3_tx branch with rw logic implemented
[ https://issues.apache.org/jira/browse/IGNITE-17523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-17523: - Epic Link: IGNITE-15081 > Stabilize and cleanup ignite3_tx branch with rw logic implemented > - > > Key: IGNITE-17523 > URL: https://issues.apache.org/jira/browse/IGNITE-17523 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3, transaction3_rw > > It's required to cleanup all obsolete stuff > * TxManager > * VersionedRowStore > * todos without tickets > * etc > and stabilize failing tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17523) Stabilize and cleanup ignite3_tx branch with rw logic implemented
[ https://issues.apache.org/jira/browse/IGNITE-17523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-17523: - Summary: Stabilize and cleanup ignite3_tx branch with rw logic implemented (was: Stabilize and cleanup ignite3_rw transactions ) > Stabilize and cleanup ignite3_tx branch with rw logic implemented > - > > Key: IGNITE-17523 > URL: https://issues.apache.org/jira/browse/IGNITE-17523 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3, transaction3_rw > > It's required to cleanup all obsolete stuff > - TxManager > - VersionedRowStore > - todos without tickets > - etc > and stabilize failing tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is a primary before processing request at Replica
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17378: --- Summary: Check the replica is a primary before processing request at Replica (was: Check the replica is a primapry before processing request at Replica) > Check the replica is a primary before processing request at Replica > --- > > Key: IGNITE-17378 > URL: https://issues.apache.org/jira/browse/IGNITE-17378 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Need to check that a current leader and a leader from request are equals on > Replica#processRequest. For this purpose we need to use > RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, > send Leader and Term with request and compare it in Replica#processRequest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is a primapry before processing request at Replica
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17378: --- Summary: Check the replica is a primapry before processing request at Replica (was: Check the replica is alive before processing request at Replica) > Check the replica is a primapry before processing request at Replica > > > Key: IGNITE-17378 > URL: https://issues.apache.org/jira/browse/IGNITE-17378 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Need to check that a current leader and a leader from request are equals on > Replica#processRequest. For this purpose we need to use > RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, > send Leader and Term with request and compare it in Replica#processRequest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17523) Stabilize and cleanup ignite3_tx branch with rw logic implemented
[ https://issues.apache.org/jira/browse/IGNITE-17523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-17523: - Description: It's required to cleanup all obsolete stuff * TxManager * VersionedRowStore * todos without tickets * etc and stabilize failing tests. was: It's required to cleanup all obsolete stuff - TxManager - VersionedRowStore - todos without tickets - etc and stabilize failing tests > Stabilize and cleanup ignite3_tx branch with rw logic implemented > - > > Key: IGNITE-17523 > URL: https://issues.apache.org/jira/browse/IGNITE-17523 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3, transaction3_rw > > It's required to cleanup all obsolete stuff > * TxManager > * VersionedRowStore > * todos without tickets > * etc > and stabilize failing tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is a primary before processing request at Replica
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Uttsel updated IGNITE-17378:
---
    Description: We need to check in Replica#processRequest that the current leader and the leader from the request are equal. For this purpose, we need to use RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, send the leader and term with the request, and compare them in Replica#processRequest. If they are not equal, we need to throw PrimaryReplicaMissException.  (was: Need to check that a current leader and a leader from request are equals on Replica#processRequest. For this purpose we need to use RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, send Leader and Term with request and compare it in Replica#processRequest.)

> Check the replica is a primary before processing request at Replica
>
>                 Key: IGNITE-17378
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17378
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vladislav Pyatkov
>            Priority: Major
>              Labels: ignite-3, transaction3_rw
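The check described in this update can be sketched as follows. This is a hedged illustration, not the Ignite 3 code: `PrimaryReplicaCheckSketch`, `LeaderWithTerm` and the string-based request are hypothetical, and `PrimaryReplicaMissException` is a local placeholder that reuses only the name from the ticket. The sketch assumes the sender obtained its leader and term earlier (the ticket names RaftGroupService#refreshAndGetLeaderWithTerm for that step, which is not called here).

```java
/** Illustrative only; shows the leader-and-term comparison described in the ticket. */
final class PrimaryReplicaCheckSketch {
    /** Placeholder exception; the real class and its constructor may differ. */
    static final class PrimaryReplicaMissException extends RuntimeException {
        PrimaryReplicaMissException(LeaderWithTerm expected, LeaderWithTerm actual) {
            super("Primary replica mismatch [expected=" + expected + ", actual=" + actual + ']');
        }
    }

    /** Hypothetical value object pairing a leader id with its RAFT term. */
    static final class LeaderWithTerm {
        final String leaderId;
        final long term;

        LeaderWithTerm(String leaderId, long term) {
            this.leaderId = leaderId;
            this.term = term;
        }

        @Override public String toString() {
            return leaderId + '@' + term;
        }
    }

    /** What this replica currently believes about leadership. */
    private volatile LeaderWithTerm current;

    PrimaryReplicaCheckSketch(LeaderWithTerm current) {
        this.current = current;
    }

    /** Simulates a leader change observed by the replica. */
    void onLeaderElected(LeaderWithTerm newLeader) {
        current = newLeader;
    }

    /** Rejects the request if the sender's view of the leader or term is stale. */
    String processRequest(String payload, LeaderWithTerm fromRequest) {
        LeaderWithTerm actual = current;

        if (!actual.leaderId.equals(fromRequest.leaderId) || actual.term != fromRequest.term)
            throw new PrimaryReplicaMissException(fromRequest, actual);

        return "processed:" + payload;
    }

    public static void main(String[] args) {
        LeaderWithTerm seenBySender = new LeaderWithTerm("node-1", 3);
        PrimaryReplicaCheckSketch replica = new PrimaryReplicaCheckSketch(seenBySender);

        System.out.println(replica.processRequest("put", seenBySender));

        // Leadership changes after the sender cached its view.
        replica.onLeaderElected(new LeaderWithTerm("node-2", 4));

        try {
            replica.processRequest("put", seenBySender);
        }
        catch (PrimaryReplicaMissException e) {
            System.out.println(e.getMessage());
        }
    }
}
```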
[jira] [Created] (IGNITE-17523) Stabilize and cleanup ignite3_rw transactions
Alexander Lapin created IGNITE-17523: Summary: Stabilize and cleanup ignite3_rw transactions Key: IGNITE-17523 URL: https://issues.apache.org/jira/browse/IGNITE-17523 Project: Ignite Issue Type: Improvement Reporter: Alexander Lapin It's required to cleanup all obsolete stuff - TxManager - VersionedRowStore - todos without tickets - etc and stabilize failing tests -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is alive before processing request at Replica
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17378: --- Summary: Check the replica is alive before processing request at Replica (was: Check the replica is alive during before that a method will be invoked) > Check the replica is alive before processing request at Replica > --- > > Key: IGNITE-17378 > URL: https://issues.apache.org/jira/browse/IGNITE-17378 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Need to check that a current leader and a leader from request are equals on > Replica#processRequest. For this purpose we need to use > RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, > send Leader and Term with request and compare it in Replica#processRequest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is alive during before that a method will be invoked
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17378: --- Description: Need to check that a current leader and a leader from request are equals on Replica#processRequest. For this purpose we need to use RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, send Leader and Term with request and compare it in Replica#processRequest. (was: Need to check that a current leader and a leader from request are equals on Replica#processRequest. For this purpose we need to use RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist) > Check the replica is alive during before that a method will be invoked > -- > > Key: IGNITE-17378 > URL: https://issues.apache.org/jira/browse/IGNITE-17378 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Need to check that a current leader and a leader from request are equals on > Replica#processRequest. For this purpose we need to use > RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist, > send Leader and Term with request and compare it in Replica#processRequest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17378) Check the replica is alive during before that a method will be invoked
[ https://issues.apache.org/jira/browse/IGNITE-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17378: --- Description: Need to check that a current leader and a leader from request are equals on Replica#processRequest. For this purpose we need to use RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist > Check the replica is alive during before that a method will be invoked > -- > > Key: IGNITE-17378 > URL: https://issues.apache.org/jira/browse/IGNITE-17378 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Need to check that a current leader and a leader from request are equals on > Replica#processRequest. For this purpose we need to use > RaftGroupService#refreshAndGetLeaderWithTerm in InternalTableImpl#enlist -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17498) Update HeapLockManager in order to support Intention locks
[ https://issues.apache.org/jira/browse/IGNITE-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Chudov reassigned IGNITE-17498: - Assignee: Denis Chudov > Update HeapLockManager in order to support Intention locks > -- > > Key: IGNITE-17498 > URL: https://issues.apache.org/jira/browse/IGNITE-17498 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Assignee: Denis Chudov >Priority: Major > Labels: ignite-3, transaction3_rw > > It's required to implement new lock upgrade logic that will consider not only > S and X locks but also IS, IX and SIX. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
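For readers unfamiliar with the five modes mentioned in this ticket, the textbook multi-granularity locking rules can be sketched as below. This is an assumption-laden illustration of the classic scheme (compatibility matrix plus upgrade to the weakest mode that covers both the held and the requested mode); the ticket does not describe how HeapLockManager implements it, and `IntentionLockSketch`, `Mode`, `compatible` and `upgrade` are hypothetical names.

```java
/** Illustrative only; textbook intention-lock rules, not the HeapLockManager code. */
final class IntentionLockSketch {
    /** Each mode is encoded as the set of primitive rights it implies. */
    enum Mode {
        IS(0b0001), IX(0b0011), S(0b0101), SIX(0b0111), X(0b1111);

        private final int mask;

        Mode(int mask) {
            this.mask = mask;
        }
    }

    /** Classic compatibility matrix; rows and columns follow the order IS, IX, S, SIX, X. */
    private static final boolean[][] COMPAT = {
        /* IS  */ {true,  true,  true,  true,  false},
        /* IX  */ {true,  true,  false, false, false},
        /* S   */ {true,  false, true,  false, false},
        /* SIX */ {true,  false, false, false, false},
        /* X   */ {false, false, false, false, false}
    };

    /** Whether two different transactions may hold these modes on the same object. */
    static boolean compatible(Mode held, Mode requested) {
        return COMPAT[held.ordinal()][requested.ordinal()];
    }

    /** Mode a single transaction ends up with after requesting a second mode. */
    static Mode upgrade(Mode held, Mode requested) {
        int combined = held.mask | requested.mask;

        for (Mode m : Mode.values()) {
            if (m.mask == combined)
                return m;
        }

        // Unreachable with the masks above, which are closed under bitwise OR.
        throw new IllegalStateException("No mode for mask " + combined);
    }

    public static void main(String[] args) {
        System.out.println("S + IX upgrades to " + upgrade(Mode.S, Mode.IX));
        System.out.println("IS compatible with SIX: " + compatible(Mode.IS, Mode.SIX));
        System.out.println("IX compatible with S: " + compatible(Mode.IX, Mode.S));
    }
}
```

Encoding modes as bit sets makes the upgrade rule a bitwise OR, so holding S and requesting IX yields SIX without a separate lookup table.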
[jira] [Commented] (IGNITE-17497) Support inheritance of polymorphic configurations
[ https://issues.apache.org/jira/browse/IGNITE-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578997#comment-17578997 ] Roman Puchkovskiy commented on IGNITE-17497: This issue is being put on hold. We'll estimate the PR if a use-case for the feature appears. > Support inheritance of polymorphic configurations > - > > Key: IGNITE-17497 > URL: https://issues.apache.org/jira/browse/IGNITE-17497 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, polymorphic configuration schemas must have exactly one parent > class (not Object). > It is suggested to implement the following logic: > # Top config schema must be annotated with PolymorphicConfig (it already > works as described here, so nothing needs to be done) > # Leaf config schema must be annotated with PolymorphicConfigInstance (it > already works as described here, so nothing needs to be done) > # Intermediary config schema classes (extending, directly or indirectly, the > top config schema and extended, directly or indirectly by leaf config > schemas) are allowed. They do not need to be annotated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-14985) Re-work error handling in affinity component in accordance with error scopes and prefixes
[ https://issues.apache.org/jira/browse/IGNITE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin resolved IGNITE-14985. -- Resolution: Won't Fix > Re-work error handling in affinity component in accordance with error scopes > and prefixes > -- > > Key: IGNITE-14985 > URL: https://issues.apache.org/jira/browse/IGNITE-14985 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Labels: iep-84, ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-14985) Re-work error handling in affinity component in accordance with error scopes and prefixes
[ https://issues.apache.org/jira/browse/IGNITE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578992#comment-17578992 ] Vyacheslav Koptilin commented on IGNITE-14985: -- For now, affinity component is stateless and does not require any attention. > Re-work error handling in affinity component in accordance with error scopes > and prefixes > -- > > Key: IGNITE-14985 > URL: https://issues.apache.org/jira/browse/IGNITE-14985 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Labels: iep-84, ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17475) FreeList metadata is not stored on the checkpoint
[ https://issues.apache.org/jira/browse/IGNITE-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578985#comment-17578985 ] Roman Puchkovskiy commented on IGNITE-17475: The patch looks good to me > FreeList metadata is not stored on the checkpoint > - > > Key: IGNITE-17475 > URL: https://issues.apache.org/jira/browse/IGNITE-17475 > Project: Ignite > Issue Type: Bug >Reporter: Kirill Tkalenko >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > It has been discovered that we don't save freelist metadata on checkpoint as > it does in 2.0. > This needs to be fixed, see in 2.0: > *org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager#saveStoreMetadata* > Some notes: > * Saving metadata should be in two phases before the checkpoint write lock > and under the write lock to reduce write lock time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
[ https://issues.apache.org/jira/browse/IGNITE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladislav Pyatkov updated IGNITE-17508: --- Description: In the replication listener ({_}PartitionReplicaListener{_}) where we have the pattern: {code:java} raftFut.thenApply(ignored -> result);{code} we should worry about handling RAFT exceptions, including analyzing whether raftFut result. Should distinguish following exception types: * The RAFT node has not been started yet. * The RAFT node is started, but it is not a leader. * RAFT cannot replicate a command for the timeout. was: In the replication listener ({_}PartitionReplicaListener{_}) where we have the pattern: {code:java} raftFut.thenApply(ignored -> result);{code} we should worry about handling RAFT exceptions, including analyzing whether raftFut result. > Exception handling in the partition replication listener for RAFT futures > - > > Key: IGNITE-17508 > URL: https://issues.apache.org/jira/browse/IGNITE-17508 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3 > > In the replication listener ({_}PartitionReplicaListener{_}) where we have > the pattern: > {code:java} > raftFut.thenApply(ignored -> result);{code} > we should worry about handling RAFT exceptions, including analyzing whether > raftFut result. > Should distinguish following exception types: > * The RAFT node has not been started yet. > * The RAFT node is started, but it is not a leader. > * RAFT cannot replicate a command for the timeout. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17522) Add documentation for the index rebuild operation in the maintenance mode
[ https://issues.apache.org/jira/browse/IGNITE-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Semyon Danilov updated IGNITE-17522:
---
    Description:
The new command's syntax is as follows:
{noformat}
--cache schedule_indexes_rebuild --node-id nodeId --cache-names cacheName[index1,...indexN],cacheName2,cacheName3[index1] --group-names groupName1,groupName2,...groupNameN
    Schedules rebuild of the indexes for specified caches via the Maintenance Mode. Schedules rebuild of specified caches and cache-groups

    Parameters:
      --node-id      - (Optional) Specify node for indexes rebuild. If not specified, schedules rebuild on all nodes.
      --cache-names  - Comma-separated list of cache names with optionally specified indexes. If indexes are not specified then all indexes of the cache will be scheduled for the rebuild operation. Can be used simultaneously with cache group names.
      --group-names  - Comma-separated list of cache group names for which indexes should be scheduled for the rebuild. Can be used simultaneously with cache names.
{noformat}

> Add documentation for the index rebuild operation in the maintenance mode
>
>                 Key: IGNITE-17522
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17522
>             Project: Ignite
>          Issue Type: Task
>          Components: documentation
>            Reporter: Semyon Danilov
>            Priority: Major
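The shape of the --cache-names value (cache names separated by commas, each optionally followed by a bracketed index list) can be made concrete with a small parsing sketch. This is not the parser used by the Ignite control script; `CacheNamesArgSketch` and `parse` are hypothetical, and the convention that an empty set means "all indexes of the cache" is this sketch's own way of modelling the rule stated in the help text.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only; models the documented value format, not the real argument parser. */
final class CacheNamesArgSketch {
    /** Returns cache name -> requested indexes; an empty set stands for all indexes. */
    static Map<String, Set<String>> parse(String arg) {
        Map<String, Set<String>> res = new LinkedHashMap<>();
        int i = 0;
        int n = arg.length();

        while (i < n) {
            int open = arg.indexOf('[', i);
            int comma = arg.indexOf(',', i);

            if (open >= 0 && (comma < 0 || open < comma)) {
                // Entry of the form cacheName[index1,index2].
                int close = arg.indexOf(']', open);

                if (close < 0)
                    throw new IllegalArgumentException("Missing ']' in: " + arg);

                String cache = arg.substring(i, open).trim();

                if (cache.isEmpty())
                    throw new IllegalArgumentException("Missing cache name in: " + arg);

                Set<String> indexes = new LinkedHashSet<>();

                for (String idx : arg.substring(open + 1, close).split(",")) {
                    if (!idx.trim().isEmpty())
                        indexes.add(idx.trim());
                }

                res.put(cache, indexes);
                i = close + 1;

                if (i < n) {
                    if (arg.charAt(i) != ',')
                        throw new IllegalArgumentException("Expected ',' after ']' in: " + arg);

                    i++;
                }
            }
            else {
                // Entry without brackets: all indexes of the cache.
                int end = comma < 0 ? n : comma;
                String cache = arg.substring(i, end).trim();

                if (!cache.isEmpty())
                    res.put(cache, new LinkedHashSet<>());

                i = end + 1;
            }
        }

        return res;
    }

    public static void main(String[] args) {
        // Prints {c1=[i1, i2], c2=[], c3=[i3]}
        System.out.println(parse("c1[i1,i2],c2,c3[i3]"));
    }
}
```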
[jira] [Created] (IGNITE-17522) Add documentation for the index rebuild operation in the maintenance mode
Semyon Danilov created IGNITE-17522: --- Summary: Add documentation for the index rebuild operation in the maintenance mode Key: IGNITE-17522 URL: https://issues.apache.org/jira/browse/IGNITE-17522 Project: Ignite Issue Type: Task Components: documentation Reporter: Semyon Danilov -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17507: - Release Note: Fixed an issue that could lead to unexpected partition map exchange on client nodes. > Failed to wait for partition map exchange on some clients > - > > Key: IGNITE-17507 > URL: https://issues.apache.org/jira/browse/IGNITE-17507 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Fix For: 2.14 > > Time Spent: 0.5h > Remaining Estimate: 0h > > We have scenario with several client and server nodes, which can stuck on PME > after start: > * Start some server nodes > * Trigger rebalance > * Start some client and server nodes > * Some of the client nodes stuck with _Failed to wait for partition map > exchange [topVer=AffinityTopologyVersion…_ > Deep investigation of the logs showed, that the root cause of the stuck PME > on client is the race between joining new client node and receiving stale > _CacheAffinityChangeMessage_ on a client, which causes PME, but when other > old nodes receive this _CacheAffinityChangeMessage_, they skip it because of > some optimization. > Optimization can be found in the method > _CacheAffinitySharedManager#onDiscoveryEvent_, we save _lastAffVer = topVer_ > for old nodes, but because of some race _lastAffVer_ for the problem client > node is null when we reach _CacheAffinitySharedManager#onCustomEvent_ and we > schedule invalid PME in _msg.exchangeNeeded(exchangeNeeded)_, but other > nodes skip this PME > The possible fix is that we can try to make the _CacheAffinityChangeMessage_ > mutable (mutable discovery custom message). It allows to modify the message > before sending it across the ring. This approach does not require to make a > decision to apply or skip the message on client nodes, the required flag will > be transferred from a server node. 
In case of using Zookeeper Discovery, > there is no ability to mutate discovery messages. However, it is possible to > mutate the message on the coordinator node (this requires adding the > _stopProcess_ flag in _DiscoveryCustomMessage_, which was removed by > IGNITE-12400). This is quite enough for our case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17507) Failed to wait for partition map exchange on some clients
[ https://issues.apache.org/jira/browse/IGNITE-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17507: - Ignite Flags: Release Notes Required > Failed to wait for partition map exchange on some clients > - > > Key: IGNITE-17507 > URL: https://issues.apache.org/jira/browse/IGNITE-17507 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Fix For: 2.14 > > Time Spent: 0.5h > Remaining Estimate: 0h > > We have a scenario with several client and server nodes that can get stuck on PME > after start: > * Start some server nodes > * Trigger rebalance > * Start some client and server nodes > * Some of the client nodes get stuck with _Failed to wait for partition map > exchange [topVer=AffinityTopologyVersion…_ > Deep investigation of the logs showed that the root cause of the stuck PME > on the client is a race between a new client node joining and a stale > _CacheAffinityChangeMessage_ arriving on that client. The message triggers PME on the client, but when other > old nodes receive this _CacheAffinityChangeMessage_, they skip it because of > an optimization. > The optimization is in the method > _CacheAffinitySharedManager#onDiscoveryEvent_: we save _lastAffVer = topVer_ > for old nodes, but because of the race _lastAffVer_ for the problem client > node is null when we reach _CacheAffinitySharedManager#onCustomEvent_, so we > schedule an invalid PME in _msg.exchangeNeeded(exchangeNeeded)_, while other > nodes skip this PME. > A possible fix is to make the _CacheAffinityChangeMessage_ > mutable (a mutable discovery custom message). This allows the message to be modified > before it is sent across the ring. With this approach, client nodes do not have to > decide whether to apply or skip the message; the required flag will > be transferred from a server node. In case of using Zookeeper Discovery, > there is no ability to mutate discovery messages. 
However, it is possible to > mutate the message on the coordinator node (this requires adding the > _stopProcess_ flag in _DiscoveryCustomMessage_, which was removed by > IGNITE-12400). This is quite enough for our case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-17521) Need to retry enlisting new partition in a transaction if first ReplicaService#invoke() return PrimaryReplicaMissException
Sergey Uttsel created IGNITE-17521: -- Summary: Need to retry enlisting new partition in a transaction if first ReplicaService#invoke() return PrimaryReplicaMissException Key: IGNITE-17521 URL: https://issues.apache.org/jira/browse/IGNITE-17521 Project: Ignite Issue Type: Improvement Reporter: Sergey Uttsel If a transaction enlists a new partition and ReplicaService#invoke() returns PrimaryReplicaMissException, then enlisting needs to be retried several times. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17521) Need to retry enlisting new partition in a transaction if first ReplicaService#invoke() return PrimaryReplicaMissException
[ https://issues.apache.org/jira/browse/IGNITE-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-17521: --- Epic Link: IGNITE-15081 > Need to retry enlisting new partition in a transaction if first > ReplicaService#invoke() return PrimaryReplicaMissException > -- > > Key: IGNITE-17521 > URL: https://issues.apache.org/jira/browse/IGNITE-17521 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3, transaction3_rw > > If a transaction enlists a new partition and ReplicaService#invoke() returns > PrimaryReplicaMissException, then enlisting needs to be retried several times. -- This message was sent by Atlassian Jira (v8.20.10#820010)
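The retry requested in IGNITE-17521 could be sketched roughly as follows. This is only an illustration under stated assumptions: `PrimaryReplicaMissException` below is a local stand-in class rather than the real Ignite 3 exception, and `invokeWithRetry`, the `Supplier`-based enlist action and the attempt limit are hypothetical names that the ticket does not define.

```java
import java.util.function.Supplier;

/** Illustrative sketch only: retries an enlist attempt when the primary replica was missed. */
final class EnlistRetrySketch {
    /** Local stand-in for the exception named in the ticket; not the real Ignite class. */
    static final class PrimaryReplicaMissException extends RuntimeException {
        PrimaryReplicaMissException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs the (hypothetical) enlist action and retries it while it fails with
     * PrimaryReplicaMissException, up to maxAttempts attempts in total.
     */
    static <T> T invokeWithRetry(Supplier<T> enlist, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }

        PrimaryReplicaMissException last = null;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return enlist.get();
            } catch (PrimaryReplicaMissException e) {
                last = e; // The primary may have moved; try again.
            }
        }

        throw last; // All attempts missed the primary replica.
    }
}
```

Any other exception is propagated immediately, so only the "primary replica missed" case is retried. A real implementation would probably also refresh the primary replica location and back off between attempts.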
[jira] [Commented] (IGNITE-17354) Metrics framework
[ https://issues.apache.org/jira/browse/IGNITE-17354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578957#comment-17578957 ] Denis Chudov commented on IGNITE-17354: --- [~agura] I made fixes and left some comments, could you take a look? > Metrics framework > -- > > Key: IGNITE-17354 > URL: https://issues.apache.org/jira/browse/IGNITE-17354 > Project: Ignite > Issue Type: New Feature >Reporter: Denis Chudov >Assignee: Denis Chudov >Priority: Major > Labels: ignite-3 > > *Metrics types* > The metrics framework should provide the following metric types: > - Gauge - an instantaneous measurement of a value provided by some > existing component. Gauge should support primitive types: int, long, double. > - Metric - just a wrapper around a numeric value which can be increased or > decreased to some value. Metric should support primitive types: int, long, > double. > - Hit Rate - accumulates approximate hit rate statistics based on hits in the > last time interval. > - Distribution - distributes values by buckets where each bucket represents > some numeric interval (Histogram in AI 2). Internal type - primitive long > (should be enough). > *Concurrency characteristics* > For scalar numeric metrics it is enough to have an atomic number (e.g. > AtomicInteger) and a striped number (e.g. LongAdder). These approaches affect > memory footprint and performance differently. > *Design* > Metrics should have the same life cycle as the component that produces > them. So metrics related to a particular component should be tied > together in a MetricsSet. The only purpose of a metrics set is to provide access to > metric values from exporters. Metric instances themselves are placed in a > MetricsSource - an entity which keeps instances of metrics and provides > access to the metrics through an interface that is specific to each metrics > source. 
A component that produces metrics must control the metrics source life > cycle (create it and register it in the metrics registry, see below). > All metrics sources (regardless of whether metrics are enabled or disabled for a > particular metrics source) must be registered in the metrics registry on > component start and removed on component stop. > MetricsSource itself produces an instance of MetricsSet, which should be > registered in the metrics registry on the "metrics enabled" event and unregistered on > the "metrics disabled" event. > The metrics registry provides access to all metrics sets for exporters. > It is possible that the metrics registry is overloaded with functionality (managing > metrics sources and metrics sets), so a special component is probably needed > for these purposes (e.g. a metrics manager). > Each metric instance has a name (local to its metric set) and a > description. So the full metric name is a concatenation of the metrics source > name and the metric name, separated by a dot. > For composite metrics like distribution, we should treat each inner metric > (e.g. each range) as a separate metric. So the full name of each internal > metric will be metrics source + dot + metric instance name + dot + range as a > string (e.g. 0_100). > A metrics set must be immutable in order to meet the requirements described in > the epic. > The data structure (likely a map) that keeps enabled metrics > sets should be modified using copy-on-write semantics in order to avoid data > races between the main functionality (enabling/disabling metrics) and exporters. -- This message was sent by Atlassian Jira (v8.20.10#820010)
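The naming rule and the copy-on-write requirement from the IGNITE-17354 description can be illustrated with a minimal sketch. All names here (`MetricRegistrySketch`, `fullName`, `bucketName`, `enable`, `disable`, `snapshot`) are hypothetical, and a metric set is reduced to an immutable map of metric name to value; the real Ignite 3 classes are not shown in the ticket and may look quite different.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative sketch only: copy-on-write storage of enabled metric sets. */
final class MetricRegistrySketch {
    /** Enabled metric sets keyed by source name. The map is replaced as a whole, never mutated in place. */
    private final AtomicReference<Map<String, Map<String, Long>>> enabled =
        new AtomicReference<>(Collections.<String, Map<String, Long>>emptyMap());

    /** Full metric name: metrics source name + dot + metric name. */
    static String fullName(String sourceName, String metricName) {
        return sourceName + '.' + metricName;
    }

    /** Full name of one range of a composite metric, for example "src.latency.0_100". */
    static String bucketName(String sourceName, String metricName, long from, long to) {
        return fullName(sourceName, metricName) + '.' + from + '_' + to;
    }

    /** "Metrics enabled" event: publishes an immutable metric set for the source. */
    void enable(String sourceName, Map<String, Long> metricSet) {
        Map<String, Long> immutableSet = Collections.unmodifiableMap(new HashMap<>(metricSet));

        enabled.updateAndGet(cur -> {
            Map<String, Map<String, Long>> copy = new HashMap<>(cur);
            copy.put(sourceName, immutableSet);
            return Collections.unmodifiableMap(copy);
        });
    }

    /** "Metrics disabled" event: removes the metric set of the source. */
    void disable(String sourceName) {
        enabled.updateAndGet(cur -> {
            Map<String, Map<String, Long>> copy = new HashMap<>(cur);
            copy.remove(sourceName);
            return Collections.unmodifiableMap(copy);
        });
    }

    /** Exporters read a stable snapshot that later enable/disable calls never modify. */
    Map<String, Map<String, Long>> snapshot() {
        return enabled.get();
    }
}
```

Because every change swaps in a new unmodifiable map, an exporter that is iterating over an older snapshot is not affected by concurrent enabling or disabling of metrics.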
[jira] [Commented] (IGNITE-17496) LWM may be after HWM (reserved) on primary after the node restart
[ https://issues.apache.org/jira/browse/IGNITE-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578954#comment-17578954 ] Ignite TC Bot commented on IGNITE-17496: {panel:title=Branch: [pull/10185/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/10185/head] Base: [master] : New Tests (40)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1} {color:#8b}Control Utility{color} [[tests 40|https://ci2.ignite.apache.org/viewLog.html?buildId=6556545]] * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=false, historical=false, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=true, historical=false, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=false, historical=true, atomicity=TRANSACTIONAL] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=false, historical=true, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=false, historical=false, atomicity=TRANSACTIONAL] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=PRIMARY, reuse=false, historical=false, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=true, historical=true, atomicity=TRANSACTIONAL] - PASSED{color} * 
{color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=true, historical=true, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=LWW, reuse=true, historical=false, atomicity=TRANSACTIONAL] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=PRIMARY, reuse=true, historical=false, atomicity=ATOMIC] - PASSED{color} * {color:#013220}IgniteControlUtilityTestSuite: GridCommandHandlerConsistencyCountersTest.testCountersOnCrachRecovery[strategy=PRIMARY, reuse=false, historical=true, atomicity=TRANSACTIONAL] - PASSED{color} ... and 29 new tests {panel} [TeamCity *-- Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=6556537buildTypeId=IgniteTests24Java8_RunAll] > LWM may be after HWM (reserved) on primary after the node restart > - > > Key: IGNITE-17496 > URL: https://issues.apache.org/jira/browse/IGNITE-17496 > Project: Ignite > Issue Type: Sub-task >Reporter: Anton Vinogradov >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-31 > Fix For: 2.14 > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > java.lang.AssertionError: LWM after HWM: lwm=10010, hwm=10003, cntr=Counter > [lwm=10010, missed=[10011 - 10012, 10021, 10031 - 10032, 10043 - 10044], > maxApplied=10047, hwm=10004] > at > org.apache.ignite.internal.processors.cache.PartitionUpdateCounterTrackingImpl.reserve(PartitionUpdateCounterTrackingImpl.java:265) > at > org.apache.ignite.internal.processors.cache.PartitionUpdateCounterErrorWrapper.reserve(PartitionUpdateCounterErrorWrapper.java:58) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.getAndIncrementUpdateCounter(IgniteCacheOffheapManagerImpl.java:1620) > at > 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.getAndIncrementUpdateCounter(GridCacheOffheapManager.java:2538) > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtLocalPartition.getAndIncrementUpdateCounter(GridDhtLocalPartition.java:942) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.calculatePartitionUpdateCounters(IgniteTxLocalAdapter.java:510) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.prepare0(GridDhtTxPrepareFuture.java:1360) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.mapIfLocked(GridDhtTxPrepareFuture.java:730) > at >
[jira] [Assigned] (IGNITE-17224) Support eviction for volatile (in-memory) data region
[ https://issues.apache.org/jira/browse/IGNITE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-17224: -- Assignee: (was: Ivan Bessonov) > Support eviction for volatile (in-memory) data region > - > > Key: IGNITE-17224 > URL: https://issues.apache.org/jira/browse/IGNITE-17224 > Project: Ignite > Issue Type: Bug >Reporter: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > I found that the volatile (in-memory) data region contains a configuration > for eviction but does not implement it. Eviction needs to be implemented by analogy > with 2.0, and tests need to be written for it. We also need to consider validation of the > eviction configuration. > See in 3.0: > * > *org.apache.ignite.internal.pagememory.configuration.schema.VolatilePageMemoryDataRegionConfigurationSchema* > * *org.apache.ignite.internal.storage.pagememory.VolatilePageMemoryDataRegion* > * *org.apache.ignite.internal.pagememory.inmemory.VolatilePageMemory* > See in 2.0: > * > *org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager#ensureFreeSpace* > * > *org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager#checkRegionEvictionProperties* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17224) Support eviction for volatile (in-memory) data region
[ https://issues.apache.org/jira/browse/IGNITE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-17224: -- Assignee: Ivan Bessonov > Support eviction for volatile (in-memory) data region > - > > Key: IGNITE-17224 > URL: https://issues.apache.org/jira/browse/IGNITE-17224 > Project: Ignite > Issue Type: Bug >Reporter: Kirill Tkalenko >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > > I found that the volatile (in-memory) data region contains a configuration > for eviction but does not implement it. Eviction needs to be implemented by analogy > with 2.0, and tests need to be written for it. We also need to consider validation of the > eviction configuration. > See in 3.0: > * > *org.apache.ignite.internal.pagememory.configuration.schema.VolatilePageMemoryDataRegionConfigurationSchema* > * *org.apache.ignite.internal.storage.pagememory.VolatilePageMemoryDataRegion* > * *org.apache.ignite.internal.pagememory.inmemory.VolatilePageMemory* > See in 2.0: > * > *org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager#ensureFreeSpace* > * > *org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager#checkRegionEvictionProperties* -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17502) Tasks to sent the snapshot files are not ordered
[ https://issues.apache.org/jira/browse/IGNITE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578944#comment-17578944 ] Pavel Pereslegin commented on IGNITE-17502: --- [~NSAmelchev], looks good to me, thanks for the fix. > Tasks to sent the snapshot files are not ordered > > > Key: IGNITE-17502 > URL: https://issues.apache.org/jira/browse/IGNITE-17502 > Project: Ignite > Issue Type: Bug >Reporter: Amelchev Nikita >Assignee: Amelchev Nikita >Priority: Major > Labels: ise > Fix For: 2.14 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Tasks that send the snapshot files are not ordered. This leads to a socket > timeout in a file sender while the thread is busy sending to another node: > {noformat} > sender.send(part1); > ... > otherSender.send(part3); > ... > // `sender` throws socket timeout exception. > sender.send(part2); > {noformat} > {noformat} > java.io.EOFException: null > at > java.io.ObjectInputStream$BlockDataInputStream.readBoolean(ObjectInputStream.java:3120) > ~[?:1.8.0_201] > at java.io.ObjectInputStream.readBoolean(ObjectInputStream.java:966) > ~[?:1.8.0_201] > at > org.apache.ignite.internal.managers.communication.GridIoManager.receiveFromChannel(GridIoManager.java:2935) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager.processOpenedChannel(GridIoManager.java:2895) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:244) > [classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:1237) > [classes/:?] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_201] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_201] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] > ... > Caused by: org.apache.ignite.IgniteCheckedException: Requested topic is busy > by another transmission. 
It's not allowed to process different sessions over > the same topic simultaneously. Channel will be closed > [initMsg=SessionChannelMessage > [sesId=9c855b38281-d8dcd34f-916f-49d0-a453-cd1866acfce1], > channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:47102 > remote=/127.0.0.1:55621], nodeId=5ace7280-b08a-4cf9-b428-7f70ef70] > at > org.apache.ignite.internal.managers.communication.GridIoManager.processOpenedChannel(GridIoManager.java:2867) > ~[classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4900(GridIoManager.java:244) > ~[classes/:?] > at > org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:1237) > ~[classes/:?] > ... 3 more > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
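One way to avoid the interleaving shown in the IGNITE-17502 description is to order the send tasks so that all parts for one receiver are sent back to back. The sketch below is an assumption about a possible ordering, not the fix that was actually merged; `SendTask`, `receiverId` and `groupByReceiver` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: keeps snapshot part tasks for the same receiver contiguous. */
final class SnapshotSendOrderSketch {
    /** Hypothetical description of one "send this part to that node" task. */
    static final class SendTask {
        final String receiverId;
        final String part;

        SendTask(String receiverId, String part) {
            this.receiverId = receiverId;
            this.part = part;
        }
    }

    /**
     * Reorders tasks so that every receiver gets all of its parts in one run.
     * Receivers keep their first-seen order, and parts keep their relative order.
     */
    static List<SendTask> groupByReceiver(List<SendTask> tasks) {
        Map<String, List<SendTask>> byReceiver = new LinkedHashMap<>();

        for (SendTask task : tasks) {
            byReceiver.computeIfAbsent(task.receiverId, id -> new ArrayList<>()).add(task);
        }

        List<SendTask> ordered = new ArrayList<>(tasks.size());

        for (List<SendTask> group : byReceiver.values()) {
            ordered.addAll(group);
        }

        return ordered;
    }
}
```

With the input from the description (`part1` for one receiver, `part3` for another, then `part2` for the first), the reordered list sends `part1` and `part2` before `part3`, so the first sender is not left idle while the thread serves another node.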
[jira] [Updated] (IGNITE-17455) IndexQuery should support setPartition
[ https://issues.apache.org/jira/browse/IGNITE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17455: Labels: IEP-71 ise (was: IEP-71) > IndexQuery should support setPartition > -- > > Key: IGNITE-17455 > URL: https://issues.apache.org/jira/browse/IGNITE-17455 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > Labels: IEP-71, ise > Time Spent: 50m > Remaining Estimate: 0h > > Currently IndexQuery doesn't support querying a specified partition, but other > types of queries provide this option - ScanQuery, SqlFieldsQuery. > It's a useful option for working with affinity requests, where IndexQuery should > work over a single partition. > To make it possible to migrate to IndexQuery from other queries, let's add > this option. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-14914) Support in() clause in IndexQuery.
[ https://issues.apache.org/jira/browse/IGNITE-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-14914: Labels: IEP-71 ise (was: IEP-71) > Support in() clause in IndexQuery. > -- > > Key: IGNITE-14914 > URL: https://issues.apache.org/jira/browse/IGNITE-14914 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > Labels: IEP-71, ise > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17511) Support IndexQuery for Java ThinClient
[ https://issues.apache.org/jira/browse/IGNITE-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Timonin updated IGNITE-17511: Labels: IEP-71 ise (was: IEP-71) > Support IndexQuery for Java ThinClient > -- > > Key: IGNITE-17511 > URL: https://issues.apache.org/jira/browse/IGNITE-17511 > Project: Ignite > Issue Type: New Feature >Reporter: Maksim Timonin >Assignee: Maksim Timonin >Priority: Major > Labels: IEP-71, ise > Fix For: 2.14 > > Time Spent: 10m > Remaining Estimate: 0h > > ThinClient doesn't support IndexQuery. Let's fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17286) Race between completing table creation and stopping TableManager
[ https://issues.apache.org/jira/browse/IGNITE-17286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578926#comment-17578926 ] Denis Chudov commented on IGNITE-17286: --- [~maliev] LGTM. > Race between completing table creation and stopping TableManager > > > Key: IGNITE-17286 > URL: https://issues.apache.org/jira/browse/IGNITE-17286 > Project: Ignite > Issue Type: Bug >Reporter: Roman Puchkovskiy >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > Time Spent: 20m > Remaining Estimate: 0h > > As IGNITE-17048 demonstrates, our tests sometimes fail with message like the > following: > java.lang.AssertionError: Raft groups are still running > The leftover Raft groups always relate to table partitions (and NOT > metastorage/cmg). > It looks like this can happen due to TableManager.stop() being called before > some table creation is completed (on some Ignite node). As a result, > TableManager.stop() does not see this table, so the table does not get > stopped, and its Raft groups are left forever. > Adding a delay to table creation completion > public void onSqlSchemaReady(long causalityToken) { > if (Math.random() < 0.33) { > try > { Thread.sleep(1000); } > catch (InterruptedException e) > { // ignore } > } > LOG.info("SCHEMA READY FOR " + causalityToken); > tablesByIdVv.complete(causalityToken); > } > makes the failure manifest itself easily. > The reproducer is in > [https://github.com/gridgain/apache-ignite-3/tree/ignite-17286-repr] > To run the reproducer, just run > ItComputeTest.executesColocatedByClassNameWithTupleKey() > It usually takes less than 10 iterations to bump into the assertion. -- This message was sent by Atlassian Jira (v8.20.10#820010)
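The race described in IGNITE-17286 (a table whose creation completes after `stop()` has already looked at the table list) is commonly closed with a "busy lock": completions run under a shared lock and `stop()` takes the exclusive lock. The sketch below only illustrates that general pattern under stated assumptions. `StopGuardSketch` and its methods are hypothetical, tables are reduced to names, and this is not claimed to be the fix applied in the ticket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch only: stop() waits for in-flight completions and rejects later ones. */
final class StopGuardSketch {
    private final ReentrantReadWriteLock busyLock = new ReentrantReadWriteLock();

    /** Tables whose creation has completed and which therefore must be stopped later. */
    private final List<String> startedTables = Collections.synchronizedList(new ArrayList<String>());

    /** Written under the write lock, read under the read lock. */
    private boolean stopped;

    /**
     * Called when table creation completes.
     * Returns false if the manager is already stopped, so the caller must clean up itself.
     */
    boolean completeTableCreation(String tableName) {
        busyLock.readLock().lock();

        try {
            if (stopped) {
                return false;
            }

            startedTables.add(tableName);

            return true;
        } finally {
            busyLock.readLock().unlock();
        }
    }

    /** Blocks until in-flight completions finish, then returns every table that must be stopped. */
    List<String> stop() {
        busyLock.writeLock().lock();

        try {
            stopped = true;

            List<String> toStop = new ArrayList<>(startedTables);
            startedTables.clear();

            return toStop;
        } finally {
            busyLock.writeLock().unlock();
        }
    }
}
```

Either the completion gets the read lock first and `stop()` sees the table, or `stop()` wins and the late completion is rejected, so no table (and no Raft group behind it) is silently left running.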
[jira] [Updated] (IGNITE-17497) Support inheritance of polymorphic configurations
[ https://issues.apache.org/jira/browse/IGNITE-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Puchkovskiy updated IGNITE-17497: --- Reviewer: Aleksandr Polovtcev [~apolovtcev] could you please take a look at the attached PR? > Support inheritance of polymorphic configurations > - > > Key: IGNITE-17497 > URL: https://issues.apache.org/jira/browse/IGNITE-17497 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-alpha6 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, polymorphic configuration schemas must have exactly one parent > class (not Object). > It is suggested to implement the following logic: > # Top config schema must be annotated with PolymorphicConfig (it already > works as described here, so nothing needs to be done) > # Leaf config schema must be annotated with PolymorphicConfigInstance (it > already works as described here, so nothing needs to be done) > # Intermediary config schema classes (extending, directly or indirectly, the > top config schema and extended, directly or indirectly by leaf config > schemas) are allowed. They do not need to be annotated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-17190) Calcite engine. Unbind statistics from H2
[ https://issues.apache.org/jira/browse/IGNITE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578911#comment-17578911 ] Ignite TC Bot commented on IGNITE-17190: {panel:title=Branch: [pull/10175/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/10175/head] Base: [master] : New Tests (12)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1} {color:#8b}Queries 3 (lazy=true){color} [[tests 6|https://ci.ignite.apache.org/viewLog.html?buildId=6726406]] * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.statisticsLexemaTest - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.testRefreshNotExistStatistics - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.testAnalyze - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.testDropNotExistStatistics - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.testDropStatistics - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryLazyTestSuite3: SqlStatisticsCommandTests.testRefreshStatistics - PASSED{color} {color:#8b}Queries 3{color} [[tests 6|https://ci.ignite.apache.org/viewLog.html?buildId=6726405]] * {color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.statisticsLexemaTest - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.testRefreshNotExistStatistics - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.testAnalyze - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.testDropNotExistStatistics - PASSED{color} * {color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.testDropStatistics - PASSED{color} * 
{color:#013220}IgniteBinaryCacheQueryTestSuite3: SqlStatisticsCommandTests.testRefreshStatistics - PASSED{color} {panel} [TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=6726431buildTypeId=IgniteTests24Java8_RunAll] > Calcite engine. Unbind statistics from H2 > - > > Key: IGNITE-17190 > URL: https://issues.apache.org/jira/browse/IGNITE-17190 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksey Plekhanov >Assignee: Ivan Daschinsky >Priority: Major > Labels: calcite, calcite2-required, ignite-osgi > Time Spent: 10m > Remaining Estimate: 0h > > Currently, table statistics in Ignite uses some H2 classes (Value for > example). We should unbind statistics from H2 and move statistics to the core > module to be able to use it in calcite module without dependency to H2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-17359) Sql. Implement session auto expiration
[ https://issues.apache.org/jira/browse/IGNITE-17359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Stanilovsky reassigned IGNITE-17359: --- Assignee: Evgeny Stanilovsky > Sql. Implement session auto expiration > --- > > Key: IGNITE-17359 > URL: https://issues.apache.org/jira/browse/IGNITE-17359 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Konstantin Orlov >Assignee: Evgeny Stanilovsky >Priority: Major > Labels: ignite-3 > > Currently server-side sessions are created but never deleted. Session objects > clog the memory and lead to OOME. > We need to introduce a background task that checks each session from time to time and > cleans up those that have expired, or use Caffeine. > Starting point - > org.apache.ignite.internal.sql.engine.session.SessionManager#activeSessions -- This message was sent by Atlassian Jira (v8.20.10#820010)
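The background clean-up option mentioned in IGNITE-17359 might look roughly like the sketch below. It assumes an idle-timeout policy and tracks only a last-touched timestamp per session id; `SessionExpirySketch`, `touch`, `expire` and `activeCount` are hypothetical names and are not part of the real `SessionManager`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only: drops sessions that were idle for at least the configured timeout. */
final class SessionExpirySketch {
    /** Session id mapped to the time of its last use, in milliseconds. */
    private final Map<String, Long> lastTouchedMillis = new ConcurrentHashMap<>();

    private final long idleTimeoutMillis;

    SessionExpirySketch(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Registers a session or refreshes its last-used time. */
    void touch(String sessionId, long nowMillis) {
        lastTouchedMillis.put(sessionId, nowMillis);
    }

    /** Removes expired sessions and returns how many were removed. */
    int expire(long nowMillis) {
        int removed = 0;

        for (Map.Entry<String, Long> e : lastTouchedMillis.entrySet()) {
            boolean idleTooLong = nowMillis - e.getValue() >= idleTimeoutMillis;

            // remove(key, value) succeeds only if the session was not touched in the meantime.
            if (idleTooLong && lastTouchedMillis.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }

        return removed;
    }

    int activeCount() {
        return lastTouchedMillis.size();
    }
}
```

The periodic check could then be driven by a `java.util.concurrent.ScheduledExecutorService`, for example `scheduleWithFixedDelay(() -> sessions.expire(System.currentTimeMillis()), 1, 1, TimeUnit.MINUTES)`. Passing the current time as a parameter keeps the expiry logic easy to test.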
[jira] [Updated] (IGNITE-17519) Fix error handlers in Flow
[ https://issues.apache.org/jira/browse/IGNITE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr updated IGNITE-17519: --- Description: In FlowTest there are a couple of tests that reproduce the issue. When a custom exception handler is set on a Flow, it is not used. (was: In FlowTest there is a couple of tests that reproduce the issue. In case the custom exception handerl is set to Flow it is not used in some cases.) > Fix error handlers in Flow > -- > > Key: IGNITE-17519 > URL: https://issues.apache.org/jira/browse/IGNITE-17519 > Project: Ignite > Issue Type: Task > Components: cli, ignite-3 >Reporter: Aleksandr >Priority: Major > > In FlowTest there are a couple of tests that reproduce the issue. When a > custom exception handler is set on a Flow, it is not used. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-13510) Getting status of snapshot execution via command line and jmx
[ https://issues.apache.org/jira/browse/IGNITE-13510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578889#comment-17578889 ] Amelchev Nikita commented on IGNITE-13510: -- [~RyzhovSV], Hi. Do you mind if I continue your work? > Getting status of snapshot execution via command line and jmx > - > > Key: IGNITE-13510 > URL: https://issues.apache.org/jira/browse/IGNITE-13510 > Project: Ignite > Issue Type: Task >Reporter: Sergei Ryzhov >Assignee: Sergei Ryzhov >Priority: Major > Labels: iep-43, ise, snapshot > Time Spent: 4h 20m > Remaining Estimate: 0h > > the control.sh utility immediately relinquishes control > and without using metricExporter it is impossible to understand whether the > snapshot completed or not > Restoring > {code:java} > Control utility [ver. 2.12.0-SNAPSHOT#20211004-sha1:77de60a7] > 2021 Copyright(C) Apache Software Foundation > User: sega > Time: 2021-10-07T14:18:59.523 > Command [SNAPSHOT] started > Arguments: --snapshot status --yes > > Status of SNAPSHOT operations: > gridCommandHandlerTest0 -> Restoring to snapshot with name: snapshot_02052020 > gridCommandHandlerTest1 -> Restoring to snapshot with name: snapshot_02052020 > Command [SNAPSHOT] finished with code: 0 > Control utility has completed execution at: 2021-10-07T14:18:59.546 > Execution time: 23 ms > {code} > Creating > {code:java} > Control utility [ver. 2.12.0-SNAPSHOT#20211004-sha1:77de60a7] > 2021 Copyright(C) Apache Software Foundation > User: sega > Time: 2021-10-07T14:18:55.368 > Command [SNAPSHOT] started > Arguments: --snapshot status --yes > > Status of SNAPSHOT operations: > gridCommandHandlerTest0 -> Creating the snapshot with name: snapshot_02052020 > gridCommandHandlerTest1 -> Creating the snapshot with name: snapshot_02052020 > Command [SNAPSHOT] finished with code: 0 > Control utility has completed execution at: 2021-10-07T14:18:55.391 > Execution time: 23 ms > {code} > No snapshot operation > {code:java} > Control utility [ver. 
2.12.0-SNAPSHOT#20211004-sha1:77de60a7] > 2021 Copyright(C) Apache Software Foundation > User: sega > Time: 2021-10-07T14:18:58.408 > Command [SNAPSHOT] started > Arguments: --snapshot status --yes > > Status of SNAPSHOT operations: > gridCommandHandlerTest0 -> No snapshot operation. > gridCommandHandlerTest1 -> No snapshot operation. > Command [SNAPSHOT] finished with code: 0 > Control utility has completed execution at: 2021-10-07T14:18:58.439 > Execution time: 31 ms > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-15759) [IEP-80] Removal of LOCAL caches support
[ https://issues.apache.org/jira/browse/IGNITE-15759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Muzafarov updated IGNITE-15759: - Labels: IEP-80 important (was: IEP-80) > [IEP-80] Removal of LOCAL caches support > > > Key: IGNITE-15759 > URL: https://issues.apache.org/jira/browse/IGNITE-15759 > Project: Ignite > Issue Type: Improvement >Reporter: Nikolay Izhikov >Assignee: Maxim Muzafarov >Priority: Major > Labels: IEP-80, important > Fix For: 2.14 > > Time Spent: 4h 20m > Remaining Estimate: 0h > > LOCAL caches have a huge number of known limitations and are not intended to be used > in production environments. > Should be removed in 2.13 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
[ https://issues.apache.org/jira/browse/IGNITE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17508: - Epic Link: IGNITE-15081 > Exception handling in the partition replication listener for RAFT futures > - > > Key: IGNITE-17508 > URL: https://issues.apache.org/jira/browse/IGNITE-17508 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3 > > In the replication listener ({_}PartitionReplicaListener{_}) where we have > the pattern: > {code:java} > raftFut.thenApply(ignored -> result);{code} > we should worry about handling RAFT exceptions, including analyzing whether > raftFut result. >
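For readers unfamiliar with the pattern named in the ticket, here is a minimal sketch of how a `thenApply(ignored -> result)` stage behaves when the upstream future fails, and one way the failure could be inspected instead. It uses only `java.util.concurrent.CompletableFuture`; the class name `RaftFutureHandling`, both method names, and the choice of `IllegalStateException` are hypothetical illustrations and are not taken from the Ignite code base or from any eventual fix.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Hypothetical illustration only; not the actual PartitionReplicaListener code. */
final class RaftFutureHandling {
    /** The pattern quoted in the ticket: the RAFT outcome is dropped, a failure passes through unexamined. */
    static <T> CompletableFuture<String> ignoreResult(CompletableFuture<T> raftFut, String result) {
        return raftFut.thenApply(ignored -> result);
    }

    /** One possible alternative (an assumption): look at the failure before completing the stage. */
    static <T> CompletableFuture<String> handleFailure(CompletableFuture<T> raftFut, String result) {
        return raftFut.handle((res, err) -> {
            if (err != null) {
                // Dependent stages may wrap the original error in CompletionException; unwrap it.
                Throwable cause = err instanceof CompletionException && err.getCause() != null
                    ? err.getCause()
                    : err;
                // Translate into an exception type the caller is expected to understand.
                throw new IllegalStateException("RAFT command failed: " + cause.getMessage(), cause);
            }
            // res carries the RAFT result and could be validated here as well.
            return result;
        });
    }

    public static void main(String[] args) {
        CompletableFuture<Object> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("leader changed"));

        // The thenApply stage never runs its function; it just completes exceptionally.
        System.out.println(ignoreResult(failed, "ok").isCompletedExceptionally());

        try {
            handleFailure(failed, "ok").join();
        } catch (CompletionException e) {
            System.out.println(e.getCause().getMessage());
        }
    }
}
```

With `thenApply`, the mapping function is skipped on failure, so the listener has no opportunity to classify or translate the RAFT error. `handle` (or `whenComplete` / `exceptionally`) gives the code a place to do so; which of these fits the replication listener is a design decision the ticket leaves open.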
[jira] [Updated] (IGNITE-17508) Exception handling in the partition replication listener for RAFT futures
[ https://issues.apache.org/jira/browse/IGNITE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-17508: - Ignite Flags: (was: Docs Required,Release Notes Required) > Exception handling in the partition replication listener for RAFT futures > - > > Key: IGNITE-17508 > URL: https://issues.apache.org/jira/browse/IGNITE-17508 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Priority: Major > Labels: ignite-3 > > In the replication listener ({_}PartitionReplicaListener{_}) where we have > the pattern: > {code:java} > raftFut.thenApply(ignored -> result);{code} > we should worry about handling RAFT exceptions, including analyzing whether > raftFut result. >
[jira] [Assigned] (IGNITE-17477) Redesign RAFT commands in accordance with replication layer
[ https://issues.apache.org/jira/browse/IGNITE-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-17477: Assignee: Vladislav Pyatkov > Redesign RAFT commands in accordance with replication layer > --- > > Key: IGNITE-17477 > URL: https://issues.apache.org/jira/browse/IGNITE-17477 > Project: Ignite > Issue Type: Improvement >Reporter: Vladislav Pyatkov >Assignee: Vladislav Pyatkov >Priority: Major > Labels: ignite-3, transaction3_rw > > Now that we have implemented a replication layer, some of the RAFT commands > have become useless: _GetAndDeleteCommand, UpsertCommand, GetAllCommand, > GetCommand, DeleteExactCommand_ (the list may change), and others > require modification, because every RAFT command should take a _rowId_ and > never try to match a row to its id (that is already done by the replication > layer). > We also need to extract the primary index (for now, it is a map) from the RAFT > state machine. The replication layer will use it for reads, while the > state machine will use it for modification only.