[jira] [Assigned] (IGNITE-21381) ActiveActorTest#testChangeLeaderForce has problems with resource cleanup
[ https://issues.apache.org/jira/browse/IGNITE-21381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladislav Pyatkov reassigned IGNITE-21381:

    Assignee: Vladislav Pyatkov

> ActiveActorTest#testChangeLeaderForce has problems with resource cleanup
>
> Key: IGNITE-21381
> URL: https://issues.apache.org/jira/browse/IGNITE-21381
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
> Attachments: screenshot-1.png, screenshot-2.png
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {{ActiveActorTest#testChangeLeaderForce}} has started to be flaky on TC with
> {noformat}
> [05:19:12]F: [org.apache.ignite.internal.placementdriver.ActiveActorTest.testChangeLeaderForce(TestInfo)] org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
>     at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>     at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>     at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>     at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
>     at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:180)
>     at app//org.apache.ignite.internal.placementdriver.ActiveActorTest.testChangeLeaderForce(ActiveActorTest.java:370)
> {noformat}
> From the log we can see that the leadership transfer, which was supposed to succeed, did not happen.
> The behaviour is the following:
> 1) The current leader is {{Leader: ClusterNodeImpl [id=e99210fb-f872-4e08-a99c-53f9512da20e, name=aat_tclf_1235}}
> 2) We want to transfer leadership to {{Peer to transfer leader: Peer [consistentId=aat_tclf_1234, idx=0]}}
> 3) The transfer process is started
> 4) We receive a warning about an error during {{GetLeaderRequestImpl}}:
> {noformat}
> [2024-01-29T05:19:08,855][WARN ][CompletableFutureDelayScheduler][RaftGroupServiceImpl] Recoverable error during the request occurred (will be retried on the randomly selected node) [request=GetLeaderRequestImpl [groupId=TestReplicationGroup, peerId=aat_tclf_1235], peer=Peer [consistentId=aat_tclf_1235, idx=0], newPeer=Peer [consistentId=aat_tclf_1234, idx=0]].
> java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
>     at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367) ~[?:?]
>     at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376) ~[?:?]
>     at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019) ~[?:?]
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?]
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) [?:?]
>     at java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792) [?:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.util.concurrent.TimeoutException
>     ... 7 more
> {noformat}
> 5) After that we see that node {{aat_tclf_1236}} sends an invalid {{RequestVoteResponse}} because it thinks that it is the leader:
> {noformat}
> [2024-01-29T05:19:11,370][WARN ][%aat_tclf_1234%JRaft-Response-Processor-15][NodeImpl] Node received invalid RequestVoteResponse from aat_tclf_1236, state not in STATE_CANDIDATE but STATE_LEADER.
> {noformat}
>
> Tests {{ActiveActorTest#testChangeLeaderForce}} and {{TopologyAwareRaftGroupServiceTest#testChangeLeaderForce}} were muted.
> There are also other problems with these tests: they do not clean up resources correctly on failure. The cluster is stopped in the test body itself, so if an assertion fails, the rest of the test is not executed and the cluster is never stopped.
> Another problem is that if we run this test several times, even if every run passes, at some point a new test cannot be started because of
> {noformat}
> java.lang.OutOfMemoryError: unable
[jira] [Created] (IGNITE-21733) Improve validation of SchemaDescriptor
Konstantin Orlov created IGNITE-21733:

    Summary: Improve validation of SchemaDescriptor
    Key: IGNITE-21733
    URL: https://issues.apache.org/jira/browse/IGNITE-21733
    Project: Ignite
    Issue Type: Improvement
    Reporter: Konstantin Orlov

Currently, {{SchemaDescriptor}} has no validation apart from a bunch of assertions in its constructor. Let's revise this approach and introduce proper validation to improve the integrity of the system.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
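As an illustration of the difference between constructor assertions and explicit validation, here is a minimal sketch. It is not the Ignite implementation: {{ColumnSpec}}, {{SchemaValidationSketch}} and the four rules checked (positive version, non-empty key, non-nullable key columns, unique column names) are assumptions made up for this example. The point is only that explicit checks run regardless of the {{-ea}} flag and can report every violation at once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for a schema column; not the real Ignite class. */
final class ColumnSpec {
    final String name;
    final boolean nullable;

    ColumnSpec(String name, boolean nullable) {
        this.name = name;
        this.nullable = nullable;
    }
}

/** Collects all violations instead of failing on the first assertion. */
final class SchemaValidationSketch {
    /** Returns a list of human-readable violations; an empty list means the schema is valid. */
    static List<String> validate(int version, List<ColumnSpec> keyColumns, List<ColumnSpec> valueColumns) {
        List<String> errors = new ArrayList<>();

        // Assumed rule: schema versions start from 1.
        if (version <= 0) {
            errors.add("Schema version must be positive: " + version);
        }
        // Assumed rule: a schema needs at least one key column.
        if (keyColumns.isEmpty()) {
            errors.add("At least one key column is required");
        }

        Set<String> seenNames = new HashSet<>();
        for (ColumnSpec col : keyColumns) {
            // Assumed rule: key columns cannot be nullable.
            if (col.nullable) {
                errors.add("Key column must not be nullable: " + col.name);
            }
            if (!seenNames.add(col.name)) {
                errors.add("Duplicate column name: " + col.name);
            }
        }
        for (ColumnSpec col : valueColumns) {
            if (!seenNames.add(col.name)) {
                errors.add("Duplicate column name: " + col.name);
            }
        }
        return errors;
    }

    /** Throws a single exception that lists every violation found. */
    static void validateOrThrow(int version, List<ColumnSpec> keyColumns, List<ColumnSpec> valueColumns) {
        List<String> errors = validate(version, keyColumns, valueColumns);
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException("Invalid schema: " + String.join("; ", errors));
        }
    }
}
```

A caller would invoke {{validateOrThrow}} at the point where the descriptor is built, so that invalid input fails in production builds as well as in tests.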
[jira] [Updated] (IGNITE-20384) Clean up abandoned resources for destroyed tables in catalog
[ https://issues.apache.org/jira/browse/IGNITE-20384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov updated IGNITE-20384:

    Fix Version/s: 3.0.0-beta2

> Clean up abandoned resources for destroyed tables in catalog
>
> Key: IGNITE-20384
> URL: https://issues.apache.org/jira/browse/IGNITE-20384
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Tkalenko
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> We need to clean up abandoned resources (from vault and metastore) for destroyed tables from the catalog.
> Perhaps it will be two separate tickets.
[jira] [Updated] (IGNITE-20287) Clean up abandoned resources for destroyed zones in catalog
[ https://issues.apache.org/jira/browse/IGNITE-20287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov updated IGNITE-20287:

    Fix Version/s: 3.0.0-beta2

> Clean up abandoned resources for destroyed zones in catalog
>
> Key: IGNITE-20287
> URL: https://issues.apache.org/jira/browse/IGNITE-20287
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Tkalenko
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> h3. *Motivation*
> We need to clean up resources for destroyed distribution zones from the catalog. It is possible that while a zone is being removed, some actions that must be done on zone deletion are interrupted by a restart. On recovery, we must detect such a zone deletion and clean up the resources. Currently we store some of the zone's state in Meta Storage, so these resources must be cleaned up.
> h3. *Definition of done*
> Resources for a deleted zone are removed even if the removal was interrupted by a restart.
> h3. *Implementation notes*
> For zones that are not present in the catalog but are present in the MS, just remove all keys related to data nodes. All these changes must be done using the same meta storage condition that we use when we call {{DistributionZoneManager#removeTriggerKeysAndDataNodes}} on a zone drop.
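The detection step from the implementation notes amounts to a set difference between the zones known to the Meta Storage and the zones known to the catalog. The sketch below shows only that step, on plain integer zone ids. {{OrphanZoneSketch}} and its method are hypothetical names, and the actual key removal (which the ticket says must go through the conditional meta storage update) is deliberately left out.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper; it only finds candidates for cleanup and does not touch any storage. */
final class OrphanZoneSketch {
    /** Returns ids of zones that still have keys in the Meta Storage but are absent from the catalog. */
    static Set<Integer> orphanedZones(Set<Integer> catalogZoneIds, Set<Integer> metaStorageZoneIds) {
        // Copy first so that the caller's set is not modified.
        Set<Integer> orphans = new TreeSet<>(metaStorageZoneIds);
        orphans.removeAll(catalogZoneIds);
        return orphans;
    }
}
```

On recovery, each returned id would then be passed to whatever routine removes the data nodes keys for that zone.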
[jira] [Created] (IGNITE-21732) Sql. Split TableRowConverterImpl on two different implementations
Konstantin Orlov created IGNITE-21732:

    Summary: Sql. Split TableRowConverterImpl on two different implementations
    Key: IGNITE-21732
    URL: https://issues.apache.org/jira/browse/IGNITE-21732
    Project: Ignite
    Issue Type: Improvement
    Components: sql
    Reporter: Konstantin Orlov

Currently, {{TableRowConverterImpl}} implements two conversion strategies: with and without field trimming. To make the code simpler and to remove branching from a hot path, let's split this class into two (one per strategy).
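A rough sketch of what such a split could look like, on plain {{Object[]}} rows. All names here ({{RowConverterSketch}}, {{FullRowConverter}}, {{TrimmingRowConverter}}) are hypothetical and the real converter works with binary rows, not arrays. The idea shown is that the strategy is chosen once when the converter is created, so the per-row method contains no {{if}} on the trimming mode.

```java
/** Hypothetical simplified converter contract; the real one operates on binary rows. */
interface RowConverterSketch {
    Object[] convert(Object[] storedRow);

    /** Picks the strategy once; a null projection is assumed to mean "no trimming". */
    static RowConverterSketch create(int[] requiredColumns) {
        return requiredColumns == null
                ? new FullRowConverter()
                : new TrimmingRowConverter(requiredColumns);
    }
}

/** Strategy without field trimming: every column is passed through. */
final class FullRowConverter implements RowConverterSketch {
    @Override
    public Object[] convert(Object[] storedRow) {
        return storedRow.clone();
    }
}

/** Strategy with field trimming: only the requested columns are copied, in the requested order. */
final class TrimmingRowConverter implements RowConverterSketch {
    private final int[] requiredColumns;

    TrimmingRowConverter(int[] requiredColumns) {
        this.requiredColumns = requiredColumns.clone();
    }

    @Override
    public Object[] convert(Object[] storedRow) {
        Object[] result = new Object[requiredColumns.length];
        for (int i = 0; i < requiredColumns.length; i++) {
            result[i] = storedRow[requiredColumns[i]];
        }
        return result;
    }
}
```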
[jira] [Assigned] (IGNITE-18258) Exception on cast to decimal and numeric in SQL
[ https://issues.apache.org/jira/browse/IGNITE-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maksim Zhuravkov reassigned IGNITE-18258:

    Assignee: Maksim Zhuravkov

> Exception on cast to decimal and numeric in SQL
>
> Key: IGNITE-18258
> URL: https://issues.apache.org/jira/browse/IGNITE-18258
> Project: Ignite
> Issue Type: Bug
> Components: sql
> Reporter: Pavel Tupitsyn
> Assignee: Maksim Zhuravkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> *Query*
> {code}
> select (cast(_T0.KEY as decimal) / ?), cast(_T0.KEY as numeric) from PUBLIC.TBL_INT32 as _T0
> {code}
> *Result*
> {code}
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 TraceId:9b69e26a-0d1e-4891-82bb-f164919a323c For conversion to decimal, ConverterUtils#convertToDecimal method should be used instead.
>     at org.apache.ignite.lang.IgniteException.wrap(IgniteException.java:289)
>     at org.apache.ignite.internal.sql.engine.AsyncSqlCursorImpl.lambda$requestNextAsync$0(AsyncSqlCursorImpl.java:77)
>     at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
>     at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
>     at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>     at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
>     at org.apache.ignite.internal.sql.engine.exec.rel.AsyncRootNode.lambda$closeAsync$0(AsyncRootNode.java:193)
>     at java.base/java.util.concurrent.LinkedBlockingQueue.forEachFrom(LinkedBlockingQueue.java:1010)
>     at java.base/java.util.concurrent.LinkedBlockingQueue.forEach(LinkedBlockingQueue.java:979)
>     at org.apache.ignite.internal.sql.engine.exec.rel.AsyncRootNode.closeAsync(AsyncRootNode.java:193)
>     at org.apache.ignite.internal.sql.engine.exec.rel.AsyncRootNode.onError(AsyncRootNode.java:148)
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$acknowledgeFragment$1(ExecutionServiceImpl.java:453)
>     at java.base/java.util.concurrent.CompletableFuture.uniAcceptNow(CompletableFuture.java:753)
>     at java.base/java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:731)
>     at java.base/java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:2108)
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.acknowledgeFragment(ExecutionServiceImpl.java:452)
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.onMessage(ExecutionServiceImpl.java:310)
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.lambda$start$3(ExecutionServiceImpl.java:183)
>     at org.apache.ignite.internal.sql.engine.message.MessageServiceImpl.onMessageInternal(MessageServiceImpl.java:164)
>     at org.apache.ignite.internal.sql.engine.message.MessageServiceImpl.lambda$onMessage$1(MessageServiceImpl.java:135)
>     at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:80)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.AssertionError: For conversion to decimal, ConverterUtils#convertToDecimal method should be used instead.
>     at org.apache.ignite.internal.sql.engine.exec.exp.ConverterUtils.convert(ConverterUtils.java:222)
>     at org.apache.ignite.internal.sql.engine.exec.exp.ConverterUtils.convert(ConverterUtils.java:201)
>     at org.apache.ignite.internal.sql.engine.exec.exp.RexToLixTranslator.visitDynamicParam(RexToLixTranslator.java:1249)
>     at org.apache.ignite.internal.sql.engine.exec.exp.RexToLixTranslator.visitDynamicParam(RexToLixTranslator.java:80)
>     at org.apache.calcite.rex.RexDynamicParam.accept(RexDynamicParam.java:60)
>     at org.apache.ignite.internal.sql.engine.exec.exp.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:983)
>     at org.apache.ignite.internal.sql.engine.exec.exp.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:80)
>     at org.apache.calcite.rex.RexLocalRef.accept(RexLocalRef.java:77)
>     at org.apache.ignite.internal.sql.engine.exec.exp.RexToLixTranslator.implementCallOperand(RexToLixTranslator.java:1106)
>     at
[jira] [Created] (IGNITE-21731) Sql. Split TableRowConverter#toBinaryRow on two methods
Konstantin Orlov created IGNITE-21731:

    Summary: Sql. Split TableRowConverter#toBinaryRow on two methods
    Key: IGNITE-21731
    URL: https://issues.apache.org/jira/browse/IGNITE-21731
    Project: Ignite
    Issue Type: Improvement
    Components: sql
    Reporter: Konstantin Orlov

Currently, the method {{org.apache.ignite.internal.sql.engine.exec.TableRowConverter#toBinaryRow}} accepts a boolean flag {{key}} and creates a row according to this flag. Perhaps the API will be cleaner if we split this method into two: {{toKeyRow}} and {{toFullRow}}.
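To make the proposal concrete, here is a small sketch of the two-method shape on plain {{Object[]}} rows. The class name {{TableRowConverterSketch}} is hypothetical, and the assumption that key columns occupy the first positions of the row is made only to keep the example short. The old flag-based method is kept as a thin delegate to show that the split does not change behaviour.

```java
import java.util.Arrays;

/** Hypothetical sketch; the real converter produces binary rows rather than arrays. */
final class TableRowConverterSketch {
    /** Assumption for this example: the first keyColumnCount columns form the key. */
    private final int keyColumnCount;

    TableRowConverterSketch(int keyColumnCount) {
        this.keyColumnCount = keyColumnCount;
    }

    /** Current shape: one method whose result depends on a boolean flag. */
    Object[] toBinaryRow(Object[] row, boolean key) {
        return key ? toKeyRow(row) : toFullRow(row);
    }

    /** Proposed shape, part one: the call site states that it wants a key-only row. */
    Object[] toKeyRow(Object[] row) {
        return Arrays.copyOf(row, keyColumnCount);
    }

    /** Proposed shape, part two: the call site states that it wants all columns. */
    Object[] toFullRow(Object[] row) {
        return row.clone();
    }
}
```

A call such as {{converter.toKeyRow(row)}} reads unambiguously, whereas {{converter.toBinaryRow(row, true)}} requires the reader to look up what the flag means.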
[jira] [Created] (IGNITE-21730) Start tables aside of Metastorage event
Andrey Mashenkov created IGNITE-21730:

    Summary: Start tables aside of Metastorage event
    Key: IGNITE-21730
    URL: https://issues.apache.org/jira/browse/IGNITE-21730
    Project: Ignite
    Issue Type: Improvement
    Reporter: Andrey Mashenkov

Currently, we start raft groups and storages asynchronously, but as part of the event handling flow. Thus, the metastorage watcher can't proceed to the next event until the current one has been handled (raft groups have been started and storages have been created).
This unwanted long operation affects lease updates and leads to metastorage errors.
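One possible direction, sketched below under stated assumptions, is to let the event handler only schedule the start on a separate executor and return an already completed future, while readiness of the table is tracked by a separate future. {{TableStartSketch}}, {{onTableCreated}} and {{tableReadyFuture}} are hypothetical names; the sketch does not address ordering or recovery concerns that a real change would have to handle.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

/** Hypothetical sketch of decoupling a long table start from the watch event handler. */
final class TableStartSketch {
    private final Executor startExecutor;
    private final Map<Integer, CompletableFuture<Void>> tableReady = new ConcurrentHashMap<>();

    TableStartSketch(Executor startExecutor) {
        this.startExecutor = startExecutor;
    }

    /**
     * Called from the watch thread. The heavy work is handed to startExecutor,
     * and the returned future is already complete, so the watcher can move on to the next event.
     */
    CompletableFuture<Void> onTableCreated(int tableId, Runnable startRaftGroupsAndStorages) {
        tableReady.put(tableId, CompletableFuture.runAsync(startRaftGroupsAndStorages, startExecutor));
        return CompletableFuture.completedFuture(null);
    }

    /** Components that need the table wait on this future instead of on the event handler. */
    CompletableFuture<Void> tableReadyFuture(int tableId) {
        CompletableFuture<Void> future = tableReady.get(tableId);
        return future != null
                ? future
                : CompletableFuture.failedFuture(new IllegalStateException("Unknown table: " + tableId));
    }
}
```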
[jira] [Updated] (IGNITE-21730) Start tables aside of Metastorage event
[ https://issues.apache.org/jira/browse/IGNITE-21730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov updated IGNITE-21730:

    Labels: ignite-3 perfomance (was: ignite-3)

> Start tables aside of Metastorage event
>
> Key: IGNITE-21730
> URL: https://issues.apache.org/jira/browse/IGNITE-21730
> Project: Ignite
> Issue Type: Improvement
> Reporter: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3, perfomance
> Fix For: 3.0
>
> Currently, we start raft groups and storages asynchronously, but as part of the event handling flow. Thus, the metastorage watcher can't proceed to the next event until the current one has been handled (raft groups have been started and storages have been created).
> This unwanted long operation affects lease updates and leads to metastorage errors.
[jira] [Updated] (IGNITE-21730) Start tables aside of Metastorage event
[ https://issues.apache.org/jira/browse/IGNITE-21730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov updated IGNITE-21730:

    Fix Version/s: 3.0

> Start tables aside of Metastorage event
>
> Key: IGNITE-21730
> URL: https://issues.apache.org/jira/browse/IGNITE-21730
> Project: Ignite
> Issue Type: Improvement
> Reporter: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0
>
> Currently, we start raft groups and storages asynchronously, but as part of the event handling flow. Thus, the metastorage watcher can't proceed to the next event until the current one has been handled (raft groups have been started and storages have been created).
> This unwanted long operation affects lease updates and leads to metastorage errors.
[jira] [Assigned] (IGNITE-20384) Clean up abandoned resources for destroyed tables in catalog
[ https://issues.apache.org/jira/browse/IGNITE-20384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Mashenkov reassigned IGNITE-20384:

    Assignee: Andrey Mashenkov

> Clean up abandoned resources for destroyed tables in catalog
>
> Key: IGNITE-20384
> URL: https://issues.apache.org/jira/browse/IGNITE-20384
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Tkalenko
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
>
> We need to clean up abandoned resources (from vault and metastore) for destroyed tables from the catalog.
> Perhaps it will be two separate tickets.
[jira] [Updated] (IGNITE-21729) Prevent threads from being hijacked via async cursors in KV/Record view APIs
[ https://issues.apache.org/jira/browse/IGNITE-21729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Puchkovskiy updated IGNITE-21729:

    Description: query() methods return AsyncCursors. AsyncCursor has methods returning CompletableFutures. We need to prevent thread hijacking via these futures.

> Prevent threads from being hijacked via async cursors in KV/Record view APIs
>
> Key: IGNITE-21729
> URL: https://issues.apache.org/jira/browse/IGNITE-21729
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> query() methods return AsyncCursors. AsyncCursor has methods returning CompletableFutures. We need to prevent thread hijacking via these futures.
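For context, "thread hijacking" here means that a user callback attached with a non-async method such as {{thenApply}} runs on the internal thread that completes the future. A common way to avoid this, sketched below, is to expose a separate future that is completed on a dedicated executor. {{PublicFutureSketch}} and {{preventThreadHijack}} are hypothetical names for this example and not necessarily how Ignite implements it; only standard {{CompletableFuture}} methods are used.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical helper that shields internal threads from user continuations. */
final class PublicFutureSketch {
    /**
     * Returns a future that mirrors the internal one but is completed on continuationExecutor.
     * Synchronous continuations attached by the user therefore run either on that executor
     * or on the user's own thread (if the future is already complete), never on the internal thread.
     */
    static <T> CompletableFuture<T> preventThreadHijack(
            CompletableFuture<T> internal, Executor continuationExecutor) {
        CompletableFuture<T> exposed = new CompletableFuture<>();

        internal.whenCompleteAsync((result, error) -> {
            if (error != null) {
                exposed.completeExceptionally(error);
            } else {
                exposed.complete(result);
            }
        }, continuationExecutor);

        return exposed;
    }
}
```

Each future-returning method of a cursor would wrap its internal future this way before handing it to the caller.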
[jira] [Created] (IGNITE-21729) Prevent threads from being hijacked via async cursors in KV/Record view APIs
Roman Puchkovskiy created IGNITE-21729:

    Summary: Prevent threads from being hijacked via async cursors in KV/Record view APIs
    Key: IGNITE-21729
    URL: https://issues.apache.org/jira/browse/IGNITE-21729
    Project: Ignite
    Issue Type: Improvement
    Reporter: Roman Puchkovskiy
    Assignee: Roman Puchkovskiy
    Fix For: 3.0.0-beta2
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Polovtcev updated IGNITE-21728:

    Fix Version/s: 3.0.0-beta2

> Close cursors synchronously in ExecutionServiceImplTest
>
> Key: IGNITE-21728
> URL: https://issues.apache.org/jira/browse/IGNITE-21728
> Project: Ignite
> Issue Type: Bug
> Reporter: Aleksandr Polovtcev
> Assignee: Aleksandr Polovtcev
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
> Time Spent: 20m
> Remaining Estimate: 0h
>
> {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}}, but in some tests nobody waits for the returned future to complete, which may cause race conditions.
> An example that was found in the logs during {{testErrorIsPropagatedToPrefetchCallback}} execution:
> {noformat}
> [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message
> java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?]
>     at org.apache.ignite.internal.thread.AbstractStripedThreadPoolExecutor.execute(AbstractStripedThreadPoolExecutor.java:61) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:82) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:104) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$2(ExecutionServiceImplTest.java:1088) ~[test/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1098) ~[test/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode$1.send(ExecutionServiceImplTest.java:1017) ~[test/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.handleError(ExecutionServiceImpl.java:842) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$submitFragment$11(ExecutionServiceImpl.java:829) ~[main/:?]
>     at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:1004) ~[?:?]
>     at java.base/java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2307) ~[?:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.submitFragment(ExecutionServiceImpl.java:828) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.submitFragment(ExecutionServiceImpl.java:505) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.onMessage(ExecutionServiceImpl.java:404) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.lambda$start$1(ExecutionServiceImpl.java:253) ~[main/:?]
>     at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$0(ExecutionServiceImplTest.java:1086) ~[test/:?]
>     at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:85) ~[main/:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at java.base/java.lang.Thread.run(Thread.java:834) [?:?]
> {noformat}
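The race described in this ticket goes away if the test blocks on the future returned by {{closeAsync}} before the executors are shut down. A minimal sketch of such a helper follows. {{CursorCloseSketch}}, its nested {{AsyncCloseable}} interface and {{closeAndAwait}} are hypothetical names for this example; the real {{AsyncCursor}} type and the existing Ignite test utilities are not used here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical test helper that turns an asynchronous close into a synchronous one. */
final class CursorCloseSketch {
    /** Minimal assumed shape of a cursor that closes asynchronously. */
    interface AsyncCloseable {
        CompletableFuture<Void> closeAsync();
    }

    /** Blocks until the cursor is closed, or fails the test if that does not happen in time. */
    static void closeAndAwait(AsyncCloseable cursor, long timeoutMillis) {
        try {
            cursor.closeAsync().get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before failing.
            Thread.currentThread().interrupt();
            throw new AssertionError("Interrupted while closing the cursor", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new AssertionError("Cursor was not closed cleanly", e);
        }
    }
}
```

With this in place, the thread pool is terminated only after all close-related tasks have been submitted and executed, so a {{RejectedExecutionException}} like the one in the log above should no longer be triggered by a late close.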
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21728: -- Description: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions. An example that was found in the logs during {{testErrorIsPropagatedToPrefetchCallback}} execution: {noformat} [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?] at org.apache.ignite.internal.thread.AbstractStripedThreadPoolExecutor.execute(AbstractStripedThreadPoolExecutor.java:61) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:82) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:104) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$2(ExecutionServiceImplTest.java:1088) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1098) ~[test/:?] 
at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode$1.send(ExecutionServiceImplTest.java:1017) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.handleError(ExecutionServiceImpl.java:842) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$submitFragment$11(ExecutionServiceImpl.java:829) ~[main/:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:1004) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2307) ~[?:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.submitFragment(ExecutionServiceImpl.java:828) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.submitFragment(ExecutionServiceImpl.java:505) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.onMessage(ExecutionServiceImpl.java:404) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.lambda$start$1(ExecutionServiceImpl.java:253) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$0(ExecutionServiceImplTest.java:1086) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:85) ~[main/:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?] 
{noformat} was: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions. An example that was found in the logs: {noformat} [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?] at
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21728: -- Description: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions. An example that was found in the logs: {noformat} [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?] at org.apache.ignite.internal.thread.AbstractStripedThreadPoolExecutor.execute(AbstractStripedThreadPoolExecutor.java:61) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:82) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:104) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$2(ExecutionServiceImplTest.java:1088) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1098) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode$1.send(ExecutionServiceImplTest.java:1017) ~[test/:?] 
at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.handleError(ExecutionServiceImpl.java:842) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$submitFragment$11(ExecutionServiceImpl.java:829) ~[main/:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:1004) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2307) ~[?:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.submitFragment(ExecutionServiceImpl.java:828) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.submitFragment(ExecutionServiceImpl.java:505) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.onMessage(ExecutionServiceImpl.java:404) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.lambda$start$1(ExecutionServiceImpl.java:253) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$0(ExecutionServiceImplTest.java:1086) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:85) ~[main/:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?] {noformat} was: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions (when the test is stopped). 
An example that was found in the logs: {noformat} [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?] at
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21728: -- Description: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions (when the test is stopped). An example that was found in the logs: {noformat} [2024-03-11T15:24:02,481][INFO ][%node_1%sql-execution-pool-0][ExecutionServiceImpl] Unable to send error message java.util.concurrent.RejectedExecutionException: Task org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl$$Lambda$1368/0x000800820c40@4ff7a879 rejected from java.util.concurrent.ThreadPoolExecutor@145a1f2a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?] at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?] at org.apache.ignite.internal.thread.AbstractStripedThreadPoolExecutor.execute(AbstractStripedThreadPoolExecutor.java:61) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:82) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.execute(QueryTaskExecutorImpl.java:104) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$2(ExecutionServiceImplTest.java:1088) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.onReceive(ExecutionServiceImplTest.java:1098) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode$1.send(ExecutionServiceImplTest.java:1017) ~[test/:?] 
at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.handleError(ExecutionServiceImpl.java:842) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.lambda$submitFragment$11(ExecutionServiceImpl.java:829) ~[main/:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:1004) ~[?:?] at java.base/java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2307) ~[?:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl$DistributedQueryManager.submitFragment(ExecutionServiceImpl.java:828) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.submitFragment(ExecutionServiceImpl.java:505) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.onMessage(ExecutionServiceImpl.java:404) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImpl.lambda$start$1(ExecutionServiceImpl.java:253) ~[main/:?] at org.apache.ignite.internal.sql.engine.exec.ExecutionServiceImplTest$TestCluster$TestNode.lambda$onReceive$0(ExecutionServiceImplTest.java:1086) ~[test/:?] at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:85) ~[main/:?] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.base/java.lang.Thread.run(Thread.java:834) [?:?] {noformat} was:{{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions. 
> Close cursors synchronously in ExecutionServiceImplTest > --- > > Key: IGNITE-21728 > URL: https://issues.apache.org/jira/browse/IGNITE-21728 > Project: Ignite > Issue Type: Bug >Reporter: Aleksandr Polovtcev >Assignee: Aleksandr Polovtcev >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using > {{closeAsync}} but in some tests nobody waits for the return future to > complete, which may pose race conditions (when the test is stopped). > An example that was found in the logs: > {noformat} > [2024-03-11T15:24:02,481][INFO > ][%node_1%sql-execution-pool-0][ExecutionServiceImpl]
[jira] [Updated] (IGNITE-21726) Sql. Enable metrics by default
[ https://issues.apache.org/jira/browse/IGNITE-21726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21726: -- Description: Currently by default all metrics in AI3 are disabled. Since we believe that all existing metrics in AI3 do not impact performance, let's enable them all by default. This will save time on analyzing issues that users may encounter. was: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 do not impact performance, let's enable them all by default. This will save time on analyzing issues that users may encounter. > Sql. Enable metrics by default > -- > > Key: IGNITE-21726 > URL: https://issues.apache.org/jira/browse/IGNITE-21726 > Project: Ignite > Issue Type: Task >Reporter: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > > Currently by default all metrics in AI3 are disabled. > Since we believe that all existing metrics in AI3 do not impact performance, > let's enable them all by default. This will save time on analyzing issues > that users may encounter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21726) Sql. Enable metrics by default
[ https://issues.apache.org/jira/browse/IGNITE-21726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21726: -- Description: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 do not impact performance, let's enable them all by default. This will save time on analyzing issues that users may encounter. was: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 do not impact performance, let's enable them all by default. This will save time on analyzing issues that customers may encounter. > Sql. Enable metrics by default > -- > > Key: IGNITE-21726 > URL: https://issues.apache.org/jira/browse/IGNITE-21726 > Project: Ignite > Issue Type: Task >Reporter: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > > Currently by default all metrics are disabled. > Since we believe that all existing metrics in AI3 do not impact performance, > let's enable them all by default. This will save time on analyzing issues > that users may encounter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21726) Sql. Enable metrics by default
[ https://issues.apache.org/jira/browse/IGNITE-21726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21726: -- Description: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 impact performance, let's enable them all by default. This will save time on analyzing issues that customers may encounter. was: Currently by default all metrics are disabled. It is recommended to enable all metrics that do not add significant performance overhead. At the moment we are counting. that all existing metrics in AI3 should not affect performance, so let's enable them all by default. > Sql. Enable metrics by default > -- > > Key: IGNITE-21726 > URL: https://issues.apache.org/jira/browse/IGNITE-21726 > Project: Ignite > Issue Type: Task >Reporter: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > > Currently by default all metrics are disabled. > Since we believe that all existing metrics in AI3 impact performance, let's > enable them all by default. This will save time on analyzing issues that > customers may encounter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21726) Sql. Enable metrics by default
[ https://issues.apache.org/jira/browse/IGNITE-21726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21726: -- Description: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 do not impact performance, let's enable them all by default. This will save time on analyzing issues that customers may encounter. was: Currently by default all metrics are disabled. Since we believe that all existing metrics in AI3 impact performance, let's enable them all by default. This will save time on analyzing issues that customers may encounter. > Sql. Enable metrics by default > -- > > Key: IGNITE-21726 > URL: https://issues.apache.org/jira/browse/IGNITE-21726 > Project: Ignite > Issue Type: Task >Reporter: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > > Currently by default all metrics are disabled. > Since we believe that all existing metrics in AI3 do not impact performance, > let's enable them all by default. This will save time on analyzing issues > that customers may encounter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Polovtcev updated IGNITE-21728: - Description: {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}} but in some tests nobody waits for the return future to complete, which may pose race conditions. (was: {{AsyncCursor}} in ) > Close cursors synchronously in ExecutionServiceImplTest > --- > > Key: IGNITE-21728 > URL: https://issues.apache.org/jira/browse/IGNITE-21728 > Project: Ignite > Issue Type: Bug >Reporter: Aleksandr Polovtcev >Assignee: Aleksandr Polovtcev >Priority: Major > Labels: ignite-3 > > {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using > {{closeAsync}} but in some tests nobody waits for the return future to > complete, which may pose race conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
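The race described in this ticket can be illustrated with a minimal sketch. It is not taken from the Ignite code base: `FakeCursor`, its `closeAsync` method and `closeAndStop` are hypothetical stand-ins for `AsyncCursor` and the test teardown. The only point is that the test waits for the future returned by `closeAsync` before the executor is stopped.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class CloseCursorSketch {
    /** Hypothetical stand-in for AsyncCursor: the close is completed on a pool thread. */
    static final class FakeCursor {
        private final ExecutorService pool;
        volatile boolean closed;

        FakeCursor(ExecutorService pool) {
            this.pool = pool;
        }

        CompletableFuture<Void> closeAsync() {
            return CompletableFuture.runAsync(() -> closed = true, pool);
        }
    }

    /** Closes the cursor and waits for the close to finish before stopping the pool. */
    static boolean closeAndStop() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        FakeCursor cursor = new FakeCursor(pool);
        try {
            // Waiting here means no close-related task can be submitted
            // to an executor that has already been terminated.
            cursor.closeAsync().orTimeout(10, TimeUnit.SECONDS).join();
        } finally {
            pool.shutdownNow();
        }
        return cursor.closed;
    }

    public static void main(String[] args) {
        System.out.println("closed = " + closeAndStop());
    }
}
```

Without the `join()`, the pool may be shut down while the close is still in flight, which is one way to end up with a `RejectedExecutionException` like the one quoted in later updates of this ticket.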
[jira] [Created] (IGNITE-21727) Close cursors synchronously in ExecutionServiceImplTest
Aleksandr Polovtcev created IGNITE-21727: Summary: Close cursors synchronously in ExecutionServiceImplTest Key: IGNITE-21727 URL: https://issues.apache.org/jira/browse/IGNITE-21727 Project: Ignite Issue Type: Improvement Reporter: Aleksandr Polovtcev Assignee: Aleksandr Polovtcev {{AsyncCursor}} in {{ExecutionServiceImplTest}} is closed using {{closeAsync}}, but in some tests nobody waits for the future to complete, which can pose race conditions.
[jira] [Updated] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
[ https://issues.apache.org/jira/browse/IGNITE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Polovtcev updated IGNITE-21728: - Description: {{AsyncCursor}} in > Close cursors synchronously in ExecutionServiceImplTest > --- > > Key: IGNITE-21728 > URL: https://issues.apache.org/jira/browse/IGNITE-21728 > Project: Ignite > Issue Type: Bug >Reporter: Aleksandr Polovtcev >Assignee: Aleksandr Polovtcev >Priority: Major > Labels: ignite-3 > > {{AsyncCursor}} in
[jira] [Created] (IGNITE-21728) Close cursors synchronously in ExecutionServiceImplTest
Aleksandr Polovtcev created IGNITE-21728: Summary: Close cursors synchronously in ExecutionServiceImplTest Key: IGNITE-21728 URL: https://issues.apache.org/jira/browse/IGNITE-21728 Project: Ignite Issue Type: Bug Reporter: Aleksandr Polovtcev Assignee: Aleksandr Polovtcev
[jira] [Created] (IGNITE-21726) Sql. Enable metrics by default
Pavel Pereslegin created IGNITE-21726: - Summary: Sql. Enable metrics by default Key: IGNITE-21726 URL: https://issues.apache.org/jira/browse/IGNITE-21726 Project: Ignite Issue Type: Task Reporter: Pavel Pereslegin Currently by default all metrics are disabled. It is recommended to enable all metrics that do not add significant performance overhead. At the moment we assume that all existing metrics in AI3 should not affect performance, so let's enable them all by default.
[jira] [Updated] (IGNITE-21718) Extract volatile state in AbstractPageMemoryMvPartitionStorage into a separate class
[ https://issues.apache.org/jira/browse/IGNITE-21718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Polovtcev updated IGNITE-21718: - Fix Version/s: 3.0.0-beta2 > Extract volatile state in AbstractPageMemoryMvPartitionStorage into a > separate class > > > Key: IGNITE-21718 > URL: https://issues.apache.org/jira/browse/IGNITE-21718 > Project: Ignite > Issue Type: Improvement >Reporter: Aleksandr Polovtcev >Assignee: Aleksandr Polovtcev >Priority: Minor > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 1h > Remaining Estimate: 0h > > {{AbstractPageMemoryMvPartitionStorage}} contains a bunch of volatile fields > that get replaced during a rebalance cleanup. I propose to wrap these fields > in a single class in order to make the code a little bit more maintainable. I > see the following benefits: > # It will become easy to understand what components of the storage may be > updated; > # It will be easier to add more volatile components and not forget to update > them; > # It will become easier to avoid unnecessary volatile reads, because the > whole state can be fetched using a single read. > > The only downside I can see is that the code may become a little bit more > verbose, because you will need to access the state class first.
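The proposal in this ticket can be sketched as follows. This is an illustration only, not the actual `AbstractPageMemoryMvPartitionStorage` code: the class name, the `State` holder and its three `String` fields are hypothetical placeholders for the real storage components.

```java
final class PartitionStorageSketch {
    /** Immutable snapshot of everything that is replaced together during a rebalance cleanup. */
    static final class State {
        final String versionChainTree;
        final String indexMetaTree;
        final String gcQueue;

        State(String versionChainTree, String indexMetaTree, String gcQueue) {
            this.versionChainTree = versionChainTree;
            this.indexMetaTree = indexMetaTree;
            this.gcQueue = gcQueue;
        }
    }

    /** The only volatile field: one reference instead of several independent ones. */
    private volatile State state = new State("tree-0", "index-0", "gc-0");

    /** Replaces all components at once, so readers never observe a mix of old and new ones. */
    void replaceState(State newState) {
        state = newState;
    }

    /** One volatile read, followed by plain field reads on the local snapshot. */
    String describe() {
        State s = state;
        return s.versionChainTree + "/" + s.indexMetaTree + "/" + s.gcQueue;
    }

    public static void main(String[] args) {
        PartitionStorageSketch storage = new PartitionStorageSketch();
        System.out.println(storage.describe());
        storage.replaceState(new State("tree-1", "index-1", "gc-1"));
        System.out.println(storage.describe());
    }
}
```

The extra `state` hop in `describe()` is the verbosity cost mentioned in the description; in exchange, adding a new component means adding one constructor argument, which the compiler then enforces at every replacement site.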
[jira] [Updated] (IGNITE-21646) Clean FreeList when cleaning BplusTree
[ https://issues.apache.org/jira/browse/IGNITE-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-21646: - Fix Version/s: 3.0.0-beta2 > Clean FreeList when cleaning BplusTree > -- > > Key: IGNITE-21646 > URL: https://issues.apache.org/jira/browse/IGNITE-21646 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > When implementing > org.apache.ignite.internal.pagememory.tree.BplusTree#startGradualDestruction, > they forgot to clear pages from FreeLists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21646) Clean FreeList when cleaning BplusTree
[ https://issues.apache.org/jira/browse/IGNITE-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko reassigned IGNITE-21646: Assignee: Kirill Tkalenko > Clean FreeList when cleaning BplusTree > -- > > Key: IGNITE-21646 > URL: https://issues.apache.org/jira/browse/IGNITE-21646 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > > When implementing > org.apache.ignite.internal.pagememory.tree.BplusTree#startGradualDestruction, > they forgot to clear pages from FreeLists. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-21646) Clean FreeList when cleaning BplusTree
[ https://issues.apache.org/jira/browse/IGNITE-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko resolved IGNITE-21646. -- Resolution: Invalid After checking the code, I saw that the *FreeList* pages were already being cleared, so the task is not needed. > Clean FreeList when cleaning BplusTree > -- > > Key: IGNITE-21646 > URL: https://issues.apache.org/jira/browse/IGNITE-21646 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > > When implementing > org.apache.ignite.internal.pagememory.tree.BplusTree#startGradualDestruction, > they forgot to clear pages from FreeLists.
[jira] [Updated] (IGNITE-21703) IgniteSqlFunctions.octetLength relies on default encoding
[ https://issues.apache.org/jira/browse/IGNITE-21703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viacheslav Blinov updated IGNITE-21703: --- Component/s: sql > IgniteSqlFunctions.octetLength relies on default encoding > - > > Key: IGNITE-21703 > URL: https://issues.apache.org/jira/browse/IGNITE-21703 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Viacheslav Blinov >Priority: Minor > Labels: ignite3 > > Issue detected by SpotBugs. Specifically the warning reported is: > {noformat} > H I DM_DEFAULT_ENCODING Dm: Found reliance on default encoding in > org.apache.ignite.internal.sql.engine.exec.exp.IgniteSqlFunctions.octetLength(String): > String.getBytes() At IgniteSqlFunctions.java:[line 133] > {noformat} > It looks like a potential bug if the system default encoding is something > exotic. > Investigate whether this is a false positive that we should suppress, or > make a proper fix. > As a result of the investigation, the corresponding TODO should be removed from > spotbugs-excludes.xml
[jira] [Updated] (IGNITE-21703) Sql. IgniteSqlFunctions.octetLength relies on default encoding
[ https://issues.apache.org/jira/browse/IGNITE-21703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viacheslav Blinov updated IGNITE-21703: --- Summary: Sql. IgniteSqlFunctions.octetLength relies on default encoding (was: IgniteSqlFunctions.octetLength relies on default encoding) > Sql. IgniteSqlFunctions.octetLength relies on default encoding > -- > > Key: IGNITE-21703 > URL: https://issues.apache.org/jira/browse/IGNITE-21703 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Viacheslav Blinov >Priority: Minor > Labels: ignite3 > > Issue detected by SpotBugs. Specifically the warning reported is: > {noformat} > H I DM_DEFAULT_ENCODING Dm: Found reliance on default encoding in > org.apache.ignite.internal.sql.engine.exec.exp.IgniteSqlFunctions.octetLength(String): > String.getBytes() At IgniteSqlFunctions.java:[line 133] > {noformat} > It looks like a potential bug if the system default encoding is something > exotic. > Investigate whether this is a false positive that we should suppress, or > make a proper fix. > As a result of the investigation, the corresponding TODO should be removed from > spotbugs-excludes.xml
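For reference, one possible shape of a "proper fix" for the `DM_DEFAULT_ENCODING` warning is to pass an explicit charset to `String.getBytes`. The sketch below assumes UTF-8 is the intended encoding, which the ticket does not state; the class name is hypothetical and this is not the actual `IgniteSqlFunctions` code.

```java
import java.nio.charset.StandardCharsets;

final class OctetLengthSketch {
    /**
     * Returns the number of bytes in the UTF-8 encoding of the string.
     * UTF-8 is an assumption here; the result no longer depends on the JVM default charset.
     */
    static int octetLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(octetLength("abc"));    // ASCII only: 3 bytes
        System.out.println(octetLength("\u00e9")); // U+00E9 takes 2 bytes in UTF-8
    }
}
```

With the no-argument `String.getBytes()`, the second call could return a different value on a JVM whose default charset is, for example, a single-byte encoding.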
[jira] [Updated] (IGNITE-21709) Revise TimestampAware messages processing
[ https://issues.apache.org/jira/browse/IGNITE-21709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Pereslegin updated IGNITE-21709: -- Description: {{TimestampAware}} messages contain hybrid timestamp to adjust a hybrid logical clock. Currently, ReplicaManager updates the local clock when it receives a {{ReplicaRequest}} with a timestamp. It may be worth revising the design and adding general processing of such messages (probably at the {{MessagingService}} level). For example, it is also necessary to adjust the clock when receiving a {{QueryBatchMessage}} (sql-engine) and currently each component must duplicate the clock adjusting logic. was: {{TimestampAware}} messages contain timestamp to adjust a hybrid logical clock. Currently, ReplicaManager updates the local clock when it receives a {{ReplicaRequest}} with a timestamp. It may be worth revising the design and adding general processing of such messages (probably at the {{MessagingService}} level). For example, it is also necessary to adjust the clock when receiving a {{QueryBatchMessage}} (sql-engine) and currently each component must duplicate the clock adjusting logic. > Revise TimestampAware messages processing > - > > Key: IGNITE-21709 > URL: https://issues.apache.org/jira/browse/IGNITE-21709 > Project: Ignite > Issue Type: Improvement >Reporter: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > > {{TimestampAware}} messages contain hybrid timestamp to adjust a hybrid > logical clock. > Currently, ReplicaManager updates the local clock when it receives a > {{ReplicaRequest}} with a timestamp. > It may be worth revising the design and adding general processing of such > messages (probably at the {{MessagingService}} level). > For example, it is also necessary to adjust the clock when receiving a > {{QueryBatchMessage}} (sql-engine) and currently each component must > duplicate the clock adjusting logic. -- This message was sent by Atlassian Jira (v8.20.10#820010)
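The idea of handling such messages in one place can be sketched as a wrapper around any message handler. Everything below is hypothetical and heavily simplified: the `TimestampAware` interface here exposes a plain `long`, the `Clock` only tracks the largest timestamp seen instead of implementing a full hybrid logical clock, and `withClockAdjustment` is not an existing `MessagingService` API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

final class TimestampAwareSketch {
    /** Hypothetical, simplified counterpart of a TimestampAware message. */
    interface TimestampAware {
        long timestamp();
    }

    /** Simplified clock: it only guarantees that the stored value never moves backwards. */
    static final class Clock {
        private final AtomicLong latest = new AtomicLong();

        long update(long remote) {
            return latest.accumulateAndGet(remote, Math::max);
        }

        long now() {
            return latest.get();
        }
    }

    /** Wraps a handler so that the clock is adjusted once, before any component sees the message. */
    static Consumer<Object> withClockAdjustment(Clock clock, Consumer<Object> handler) {
        return message -> {
            if (message instanceof TimestampAware) {
                clock.update(((TimestampAware) message).timestamp());
            }
            handler.accept(message);
        };
    }

    public static void main(String[] args) {
        Clock clock = new Clock();
        Consumer<Object> handler = withClockAdjustment(clock, m -> { });
        handler.accept((TimestampAware) () -> 42L);
        System.out.println(clock.now());
    }
}
```

With a wrapper of this kind, the handlers for `ReplicaRequest` and `QueryBatchMessage` would not need their own copies of the clock adjusting logic.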
[jira] [Updated] (IGNITE-21725) The exception "Primary replica has expired" on creation of 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor updated IGNITE-21725: -- Summary: The exception "Primary replica has expired" on creation of 1000 tables (was: The exception "Primary replica has expired" on a lot creation of 1000 tables) > The exception "Primary replica has expired" on creation of 1000 tables > -- > > Key: IGNITE-21725 > URL: https://issues.apache.org/jira/browse/IGNITE-21725 > Project: Ignite > Issue Type: Bug > Components: general, persistence >Affects Versions: 3.0.0-beta1 >Reporter: Igor >Priority: Major > Labels: ignite-3 > > *Steps to reproduce:* > 1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m" > 2. Create 1000 tables with 200 varchar columns each and insert 1 row into > each. One by one. > *Expected result:* > Tables are created. > *Actual result:* > On table 949 the exception is thrown: > {code:java} > java.sql.SQLException: Primary replica has expired, transaction will be > rolled back: [groupId = 1850_part_21, expected enlistment consistency token = > 112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 > 03:13:16:057 +, logical=396, composite=112069207395991948], current > primary replica = null] > at > org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57) > at > org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154) > at > org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765) > at > org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173) > at > org.gridgain.ai3tests.tests.TablesAmountCapacityTest.lambda$insertRowAndAssertTimeout$1(TablesAmountCapacityTest.java:166) > at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) {code} > In server logs there is an exception: > {code:java} > 2024-03-10 03:13:24:222 + > [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-8][TxManagerImpl] > Failed to finish Tx. The operation will be retried > [txId=018e2659-b09f-009c-23c0-6ab50001]. > java.util.concurrent.CompletionException: > org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: > IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed > out [replicaGrpId=1850_part_21] > at > java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) > at > java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) > at > java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplica$0(ReplicaService.java:110) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: > org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: > IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed > out [replicaGrpId=1850_part_21] > ... 
4 more > 2024-03-10 03:13:24:290 + > [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-22][TrackableNetworkMessageHandler] > Message handling has been too long [duration=67ms, message=[class > org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] > 2024-03-10 03:13:24:290 + > [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-11][TrackableNetworkMessageHandler] > Message handling has been too long [duration=67ms, message=[class > org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] > 2024-03-10 03:13:24:290 + > [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-19][TrackableNetworkMessageHandler] > Message handling has been too long [duration=67ms, message=[class > org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] > 2024-03-10 03:13:24:290 + >
[jira] [Assigned] (IGNITE-21578) ItDurableFinishTest#testWaitForCleanup failed with NPE
[ https://issues.apache.org/jira/browse/IGNITE-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin reassigned IGNITE-21578: Assignee: Kirill Sizov (was: Alexander Lapin) > ItDurableFinishTest#testWaitForCleanup failed with NPE > --- > > Key: IGNITE-21578 > URL: https://issues.apache.org/jira/browse/IGNITE-21578 > Project: Ignite > Issue Type: Bug >Reporter: Alexander Lapin >Assignee: Kirill Sizov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7870395?expandBuildDeploymentsSection=false=false=false=true+Inspection=true=true] > {code:java} > Caused by: java.lang.NullPointerException > at > org.apache.ignite.internal.tx.impl.TxManagerImpl.lambda$finishFull$3(TxManagerImpl.java:472) > ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.lambda$updateMeta$0(VolatileTxStateMetaStorage.java:73) > ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?] > at > java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908) > ~[?:?] > at > org.apache.ignite.internal.tx.impl.VolatileTxStateMetaStorage.updateMeta(VolatileTxStateMetaStorage.java:72) > ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.tx.impl.TxManagerImpl.updateTxMeta(TxManagerImpl.java:455) > ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.tx.impl.TxManagerImpl.finishFull(TxManagerImpl.java:472) > ~[ignite-transactions-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.table.distributed.storage.InternalTableImpl.lambda$postEnlist$13(InternalTableImpl.java:593) > ~[ignite-table-3.0.0-SNAPSHOT.jar:?] {code} > Seems that the reason is that old meta may be null in case of exception > {code:java} > public void finishFull(HybridTimestampTracker timestampTracker, UUID > txId, boolean commit) { > ... 
> updateTxMeta(txId, old -> new TxStateMeta(finalState, > old.txCoordinatorId(), old.commitPartitionId(), old.commitTimestamp())); > ... > } > {code} > {code:java} > return fut.handle((BiFunction>) > (r, e) -> { > if (full) { // Full txn is already finished remotely. Just update > local state. > txManager.finishFull(observableTimestampTracker, tx0.id(), e > == null);{code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
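The suspected cause can be reproduced in isolation: the remapping function passed to `ConcurrentHashMap.compute` receives `null` when the key is absent. The sketch below uses hypothetical, simplified types (`Meta`, `TxMetaSketch`, string states) rather than the real `TxStateMeta` and `TxManagerImpl`, and the null fallback is only one possible way to handle the case, not necessarily the fix chosen for this ticket.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class TxMetaSketch {
    /** Hypothetical, reduced version of a transaction state record. */
    static final class Meta {
        final String state;
        final String coordinatorId;

        Meta(String state, String coordinatorId) {
            this.state = state;
            this.coordinatorId = coordinatorId;
        }
    }

    private final ConcurrentHashMap<String, Meta> metas = new ConcurrentHashMap<>();

    Meta updateMeta(String txId, Function<Meta, Meta> updater) {
        // The old value is null if nothing has been stored for txId yet.
        return metas.compute(txId, (id, old) -> updater.apply(old));
    }

    Meta finishFull(String txId, boolean commit) {
        String finalState = commit ? "COMMITTED" : "ABORTED";
        // Guard against a missing old meta instead of dereferencing it unconditionally.
        return updateMeta(txId, old -> new Meta(finalState, old == null ? null : old.coordinatorId));
    }

    public static void main(String[] args) {
        Meta meta = new TxMetaSketch().finishFull("tx-1", false);
        System.out.println(meta.state + ", coordinator = " + meta.coordinatorId);
    }
}
```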
[jira] [Created] (IGNITE-21725) The exception "Primary replica has expired" on a lot creation of 1000 tables
Igor created IGNITE-21725: - Summary: The exception "Primary replica has expired" on a lot creation of 1000 tables Key: IGNITE-21725 URL: https://issues.apache.org/jira/browse/IGNITE-21725 Project: Ignite Issue Type: Bug Components: general, persistence Affects Versions: 3.0.0-beta1 Reporter: Igor *Steps to reproduce:* 1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m" 2. Create 1000 tables with 200 varchar columns each and insert 1 row into each. One by one. *Expected result:* Tables are created. *Actual result:* On table 949 the exception is thrown: {code:java} java.sql.SQLException: Primary replica has expired, transaction will be rolled back: [groupId = 1850_part_21, expected enlistment consistency token = 112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 03:13:16:057 +, logical=396, composite=112069207395991948], current primary replica = null] at org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57) at org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154) at org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765) at org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173) at org.gridgain.ai3tests.tests.TablesAmountCapacityTest.lambda$insertRowAndAssertTimeout$1(TablesAmountCapacityTest.java:166) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) {code} In server logs there is an exception: {code:java} 2024-03-10 03:13:24:222 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-8][TxManagerImpl] Failed to finish Tx. 
The operation will be retried [txId=018e2659-b09f-009c-23c0-6ab50001]. java.util.concurrent.CompletionException: org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out [replicaGrpId=1850_part_21] at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704) at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) at org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplica$0(ReplicaService.java:110) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out [replicaGrpId=1850_part_21] ... 
4 more 2024-03-10 03:13:24:290 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-22][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] 2024-03-10 03:13:24:290 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-11][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] 2024-03-10 03:13:24:290 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-19][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] 2024-03-10 03:13:24:290 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-17][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] 2024-03-10 03:13:24:290 + [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-23][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]] 2024-03-10 03:13:24:290 +
[jira] [Updated] (IGNITE-21724) Support "-ea" version in ItInitializedClusterRestTest
[ https://issues.apache.org/jira/browse/IGNITE-21724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pochatkin updated IGNITE-21724: --- Description: Every release we encounter the following problem: {code:java} "(?\\d+)\\.(?\\d+)\\.(?\\d+)((?-SNAPSHOT)|-(?alpha\\d+)|--(?beta\\d+))?" Apache Ignite ver. 9.0.0-ea5{code} was: Every release we encounter the following problem: {{java.lang.AssertionError: Expected: a string matching the pattern <(?\d+)\.(?\d+)\.(?\d+)((?-SNAPSHOT)|-(?alpha\d+)|--(?beta\d+))?> but: the string was "9.0.0-ea5"}} {{Apache Ignite ver. 9.0.0-ea5}} > Support "-ea" version in ItInitializedClusterRestTest > - > > Key: IGNITE-21724 > URL: https://issues.apache.org/jira/browse/IGNITE-21724 > Project: Ignite > Issue Type: Improvement >Reporter: Mikhail Pochatkin >Assignee: Mikhail Pochatkin >Priority: Major > Labels: ignite-3 > > Every release we encounter the following problem: > > {code:java} > "(?\\d+)\\.(?\\d+)\\.(?\\d+)((?-SNAPSHOT)|-(?alpha\\d+)|--(?beta\\d+))?" > > Apache Ignite ver. 9.0.0-ea5{code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21724) Support "-ea" version in ItInitializedClusterRestTest
Mikhail Pochatkin created IGNITE-21724: -- Summary: Support "-ea" version in ItInitializedClusterRestTest Key: IGNITE-21724 URL: https://issues.apache.org/jira/browse/IGNITE-21724 Project: Ignite Issue Type: Improvement Reporter: Mikhail Pochatkin Assignee: Mikhail Pochatkin Every release we encounter the following problem: {{java.lang.AssertionError: Expected: a string matching the pattern <(?\d+)\.(?\d+)\.(?\d+)((?-SNAPSHOT)|-(?alpha\d+)|--(?beta\d+))?> but: the string was "9.0.0-ea5"}} {{Apache Ignite ver. 9.0.0-ea5}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
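A minimal sketch of how the expected-version pattern could be widened to accept "-ea" builds. The group names (major, minor, patch, qualifier) are assumptions, since the archived message lost the original names, and the class name is hypothetical. The sketch also uses a single hyphen before "beta", which may differ from the pattern in the test.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not the actual pattern from ItInitializedClusterRestTest. */
final class VersionPatternSketch {
    // Assumed group names; the "-ea\\d+" alternative is the addition that lets "9.0.0-ea5" match.
    static final Pattern VERSION = Pattern.compile(
            "(?<major>\\d+)\\.(?<minor>\\d+)\\.(?<patch>\\d+)"
                    + "(?<qualifier>-SNAPSHOT|-alpha\\d+|-beta\\d+|-ea\\d+)?");

    /** Returns true when the whole string is a supported version. */
    static boolean matches(String version) {
        return VERSION.matcher(version).matches();
    }
}
```

Keeping the qualifiers in one optional alternation means a future release suffix needs only one more alternative.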
[jira] [Resolved] (IGNITE-21501) Create index storages for new partitions on rebalance
[ https://issues.apache.org/jira/browse/IGNITE-21501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko resolved IGNITE-21501. -- Resolution: Fixed > Create index storages for new partitions on rebalance > - > > Key: IGNITE-21501 > URL: https://issues.apache.org/jira/browse/IGNITE-21501 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > It appears that we only create index storages during the "table creation", > not during the "partition creation" if it's performed in isolation. > Even if we did, > {{org.apache.ignite.internal.table.distributed.index.IndexUpdateHandler#waitIndexes}} > is still badly designed, because it waits for indexes of the initial > partitions distribution and cannot provide any guarantees when assignments > are changed. > This leads to NPEs or bizarre assertions related to the aforementioned method. > What we need to do is: > * Get rid of the faulty index awaiting mechanism. > * Create index storages before starting the raft group. > * [optional] There might be naturally occurring "races" between catalog > updates (index creation) and rebalance. Right now they are resolved by the > fact that these processes are linearized in watch processing, but that's not > the best approach. If we could provide something more robust, that would be > nice. Let's think about it at least. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20133) Compute hashes for integral/decimal columns in a stable way
[ https://issues.apache.org/jira/browse/IGNITE-20133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-20133: - Epic Link: IGNITE-21450 (was: IGNITE-17767) > Compute hashes for integral/decimal columns in a stable way > --- > > Key: IGNITE-20133 > URL: https://issues.apache.org/jira/browse/IGNITE-20133 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Priority: Minor > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > The idea is to make hash computation for integral and decimal types satisfy > the following property: if a column type is changed from an integral to a > decimal type, the hashes for values that are already stored remain the same. > This will allow us to permit changing the type (integral -> decimal and decimal -> > longer decimal) of a column that is included in a HASH index. > A hash that has this property is the following function: > hash(val.toString(TRIM_TRAILING_ZEROS)). For instance, for 1 it will be > hash("1"), for 1.000 it will also be hash("1"), but for 1.23 it will give > hash("1.23"). -- This message was sent by Atlassian Jira (v8.20.10#820010)
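The property described above can be illustrated with a short sketch. It assumes that "toString(TRIM_TRAILING_ZEROS)" can be approximated with java.math.BigDecimal#stripTrailingZeros followed by toPlainString; the class name, method names, and the choice of hashing UTF-8 bytes are hypothetical and are not Ignite's actual hashing code.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper illustrating a type-stable hash; not part of Ignite. */
final class StableNumericHash {
    /** Canonical text form: trailing fractional zeros dropped, no exponent notation. */
    static String canonical(BigDecimal val) {
        if (val.signum() == 0) {
            return "0"; // all zero representations (0, 0.00, ...) collapse to one form
        }
        return val.stripTrailingZeros().toPlainString();
    }

    /** Hash of a decimal value, computed over the canonical text form. */
    static int hash(BigDecimal val) {
        return Arrays.hashCode(canonical(val).getBytes(StandardCharsets.UTF_8));
    }

    /** Hash of an integral value; routed through the same canonical form. */
    static int hash(long val) {
        return hash(BigDecimal.valueOf(val));
    }
}
```

With this form, hash(1L) and hash(new BigDecimal("1.000")) both hash the string "1", so values stored before an integral -> decimal change keep their hashes.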
[jira] [Updated] (IGNITE-20134) Only allow changing type of indexed column when indexed values representation remains the same
[ https://issues.apache.org/jira/browse/IGNITE-20134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-20134: - Epic Link: IGNITE-21450 (was: IGNITE-17767) > Only allow changing type of indexed column when indexed values representation > remains the same > -- > > Key: IGNITE-20134 > URL: https://issues.apache.org/jira/browse/IGNITE-20134 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > When an attempt to change type of a column that is included in an index is > made, this should only be permitted if the representation of the column > values in the index will remain unchanged (and, hence, index rebuild will not > be needed). > The following changes are acceptable: > * integral->integral (as integral types are represented as varints) > * integral->decimal (with enough precision) and float->double for SORTED > indices where the ordering remains the same > * integral->decimal and decimal->decimal (with enough precision) for HASH > indices (requires IGNITE-20133) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21126) Before starting to backfill an index, only wait for transactions where the index table is enlisted
[ https://issues.apache.org/jira/browse/IGNITE-21126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-21126: - Epic Link: IGNITE-21450 (was: IGNITE-17767) > Before starting to backfill an index, only wait for transactions where the > index table is enlisted > -- > > Key: IGNITE-21126 > URL: https://issues.apache.org/jira/browse/IGNITE-21126 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > > IGNITE-21115 says that we must wait for all RW transactions started on old > schema versions to be finished before initiating an index backfill. This > means that a long-running RW transaction that never touches table A will > block backfilling of any new index created on table A, which is too > restrictive. > The idea is that we modify the mechanism defined in IGNITE-21112 by also > passing tableId in RwTransactionsFinishedRequest; the request handling will > only take into account tables where any partition of the table is enlisted. > This means that an RW transaction that was started before the index > appeared will be aborted if it tries to write to the index after the index > backfill starts. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-17325) Implement a comparator for inlined BinaryTuple in sorted index
[ https://issues.apache.org/jira/browse/IGNITE-17325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-17325: - Epic Link: IGNITE-21450 (was: IGNITE-17767) > Implement a comparator for inlined BinaryTuple in sorted index > -- > > Key: IGNITE-17325 > URL: https://issues.apache.org/jira/browse/IGNITE-17325 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > We need to implement an inlined *BinaryTuple* comparator in a sorted index > for a B+tree. > You need to take into account the format of the *BinaryTuple* and the fact > that it can be truncated. > As a basis, you can take > *org.apache.ignite.internal.storage.index.BinaryTupleComparator*. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20139) RandomForestClassifierTrainer accuracy issue
[ https://issues.apache.org/jira/browse/IGNITE-20139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825164#comment-17825164 ] Igor Belyakov commented on IGNITE-20139: [~zaleslaw], could you please review the PR? > RandomForestClassifierTrainer accuracy issue > > > Key: IGNITE-20139 > URL: https://issues.apache.org/jira/browse/IGNITE-20139 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.15 >Reporter: Alexandr Shapkin >Assignee: Igor Belyakov >Priority: Major > Attachments: TreeSample2_Portfolio_Change.png, random-forest.zip > > > We tried to use machine learning capabilities, and discovered a bug in the > implementation of Random Forest. When comparing Ignite's output with a python > prototype (scikit-learn lib), we noticed that Ignite's predictions have much > lower accuracy despite using the same data set and model parameters. > Further investigation showed that Ignite generates decision trees that kinda > "loop". The tree starts checking the same condition over and over until it > reaches the maximum tree depth. > I've attached a standalone reproducer which uses a small excerpt of our data > set. > It loads data from the csv file, then performs the training of the model for > just 1 tree. Then the reproducer finds one of the looping branches and prints > it. You will see that every single node in the branch uses the same feature, > value and has the same calculated impurity. > On my machine the code reproduces this issue 100% of the time. > I've also attached an example of the tree generated by python's scikit-learn > on the same data set with the same parameters. In python the tree usually > doesn't get deeper than 20 nodes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-19712) Handle rebalance wrt indexes
[ https://issues.apache.org/jira/browse/IGNITE-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko resolved IGNITE-19712. -- Resolution: Duplicate > Handle rebalance wrt indexes > > > Key: IGNITE-19712 > URL: https://issues.apache.org/jira/browse/IGNITE-19712 > Project: Ignite > Issue Type: Bug >Reporter: Semyon Danilov >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > After IGNITE-19363, index storages are no longer lazily instantiated. Need to > listen to assignment changes and start new storages -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-19712) Handle rebalance wrt indexes
[ https://issues.apache.org/jira/browse/IGNITE-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko reassigned IGNITE-19712: Assignee: Kirill Tkalenko > Handle rebalance wrt indexes > > > Key: IGNITE-19712 > URL: https://issues.apache.org/jira/browse/IGNITE-19712 > Project: Ignite > Issue Type: Bug >Reporter: Semyon Danilov >Assignee: Kirill Tkalenko >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > After IGNITE-19363, index storages are no longer lazily instantiated. Need to > listen to assignment changes and start new storages -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19712) Handle rebalance wrt indexes
[ https://issues.apache.org/jira/browse/IGNITE-19712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Tkalenko updated IGNITE-19712: - Fix Version/s: 3.0.0-beta2 > Handle rebalance wrt indexes > > > Key: IGNITE-19712 > URL: https://issues.apache.org/jira/browse/IGNITE-19712 > Project: Ignite > Issue Type: Bug >Reporter: Semyon Danilov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > After IGNITE-19363, index storages are no longer lazily instantiated. Need to > listen to assignment changes and start new storages -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20139) RandomForestClassifierTrainer accuracy issue
[ https://issues.apache.org/jira/browse/IGNITE-20139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825163#comment-17825163 ] Igor Belyakov commented on IGNITE-20139: The issue happens when a "pure" node (with impurity* = 0) is present in the tree. We calculate impurity only for child nodes and not for the current node, and we do not check whether the node is "pure" and contains just one label. Because of that, the "bestSplit" calculation is executed for an already "pure" node and decides that all items should be moved to the left child node and none to the right (leaf) node, which gives 2 "pure" child nodes. Since we don't calculate impurity for the current (parent) node, the {{parentNode.getImpurity() - split.get().getImpurity() > minImpurityDelta}} check is always true, and we continue to split the already "pure" node until the max tree depth is reached. The following changes were made to resolve the issue: # A gain** calculation and check for the split were added. # A node impurity check was added: once the impurity becomes 0, the node is "pure" and we don't need to calculate a split for it. # The Gini impurity calculation was changed to {{(1 - sum(p^2))}} to get the correct values in the range from 0 to 0.5 as required for the Gini index. * Impurity is a value from 0 to 0.5 that shows whether the node is "pure" (impurity = 0), having just 1 label, or "impure" with impurity = 0.5, which is the worst scenario where the label ratio is 1:1. ** Gain is the difference between the parent node's impurity and the weighted impurity of its child nodes. The split that provides the maximum gain is considered the best. 
See [https://www.learndatasci.com/glossary/gini-impurity/] > RandomForestClassifierTrainer accuracy issue > > > Key: IGNITE-20139 > URL: https://issues.apache.org/jira/browse/IGNITE-20139 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.15 >Reporter: Alexandr Shapkin >Assignee: Igor Belyakov >Priority: Major > Attachments: TreeSample2_Portfolio_Change.png, random-forest.zip > > > We tried to use machine learning capabilities, and discovered a bug in the > implementation of Random Forest. When comparing Ignite's output with a python > prototype (scikit-learn lib), we noticed that Ignite's predictions have much > lower accuracy despite using the same data set and model parameters. > Further investigation showed that Ignite generates decision trees that kinda > "loop". The tree starts checking the same condition over and over until it > reaches the maximum tree depth. > I've attached a standalone reproducer which uses a small excerpt of our data > set. > It loads data from the csv file, then performs the training of the model for > just 1 tree. Then the reproducer finds one of the looping branches and prints > it. You will see that every single node in the branch uses the same feature, > value and has the same calculated impurity. > On my machine the code reproduces this issue 100% of the time. > I've also attached an example of the tree generated by python's scikit-learn > on the same data set with the same parameters. In python the tree usually > doesn't get deeper than 20 nodes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
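The Gini impurity, gain, and "pure node" check described in the comment can be sketched as follows. This is a standalone illustration under stated assumptions, not the code from the PR: the class and method names are hypothetical, per-label counts stand in for whatever statistics the trainer actually keeps, and the 0 to 0.5 range applies to the two-label case.

```java
/** Hypothetical illustration of Gini impurity and split gain; not Ignite's ML code. */
final class GiniSketch {
    private static long sum(long[] counts) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        return total;
    }

    /** Gini impurity 1 - sum(p_i^2) for the given per-label item counts. */
    static double impurity(long[] labelCounts) {
        long total = sum(labelCounts);
        if (total == 0) {
            return 0.0; // an empty node is treated as pure
        }
        double sumSq = 0.0;
        for (long c : labelCounts) {
            double p = (double) c / total;
            sumSq += p * p;
        }
        return 1.0 - sumSq;
    }

    /** Gain: parent impurity minus the size-weighted impurity of the two children. */
    static double gain(long[] parent, long[] left, long[] right) {
        long nLeft = sum(left);
        long nRight = sum(right);
        long n = nLeft + nRight;
        if (n == 0) {
            return 0.0;
        }
        double weighted = (nLeft * impurity(left) + nRight * impurity(right)) / n;
        return impurity(parent) - weighted;
    }

    /** A pure parent is never split; otherwise the gain must exceed the threshold. */
    static boolean shouldSplit(long[] parent, long[] left, long[] right, double minImpurityDelta) {
        if (impurity(parent) == 0.0) {
            return false;
        }
        return gain(parent, left, right) > minImpurityDelta;
    }
}
```

For a pure parent such as {10, 0}, any split has zero gain and shouldSplit returns false, which is what stops the repeated splitting described above.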
[jira] [Commented] (IGNITE-18879) Leaseholder candidates balancing
[ https://issues.apache.org/jira/browse/IGNITE-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825157#comment-17825157 ] yexiaowei commented on IGNITE-18879: [~Denis Chudov] I would like to ask a question about Metastorage. I noticed that Metastorage relies on a Raft group. However, in many cases when reading Metastorage data, it doesn't read from the leader or use raft ReadIndex. Would this lead to issues with reading outdated data? For example, directly reading lease information PlacementDriver#currentLease from PD. > Leaseholder candidates balancing > > > Key: IGNITE-18879 > URL: https://issues.apache.org/jira/browse/IGNITE-18879 > Project: Ignite > Issue Type: Improvement >Reporter: Denis Chudov >Priority: Major > Labels: ignite-3 > > *Motivation* > Primary replicas (leaseholders) should be evenly distributed over the cluster to > balance the transactional load between nodes. As the placement driver assigns > primary replicas, balancing the primary replicas is also its responsibility. > A naive implementation of balancing should choose a node as leaseholder > candidate in a way that keeps the lease distribution even over all nodes. In a real > cluster, it may take into account slow nodes, hot table records, etc. If a > lease candidate declines LeaseGrantMessage from the placement driver, the > balancer should decide whether to choose another candidate for the given primary > replica or enforce the previously chosen one. So the balancing algorithm should be > pluggable, so that we have the ability to improve/replace/compare it with > others. > *Definition of done* > Introduced an interface for the lease candidates balancer, and a simple > implementation sustaining even lease distribution, which is used by the placement > driver by default. No public or internal configuration is needed at this stage. 
> *Implementation notes* > Lease candidates balancer should have at least 2 methods: > - {_}get(group, ignoredNodes){_}: returns candidate for the given group, a > node from ignoredNodes set can't be chosen as a candidate > - {_}considerRedirectProposal(group, candidate, proposedCandidate){_}: > processes redirect proposal for given group provided by given candidate > (previously chosen using _get_ method), proposedCandidate is the alternative > candidate. Returns candidate that should be enforced by placement driver. -- This message was sent by Atlassian Jira (v8.20.10#820010)
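The two methods from the implementation notes can be sketched as a small interface with a naive even-distribution implementation. Everything here is hypothetical: the type names, the use of String for group and node identifiers, and the acceptance rule for redirect proposals are assumptions made for illustration, not the placement driver's actual API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical interface mirroring the two methods named in the ticket. */
interface LeaseCandidateBalancer {
    /** Returns a candidate for the group, never one from ignoredNodes; null if none is available. */
    String get(String group, Set<String> ignoredNodes);

    /** Returns the candidate the placement driver should enforce for the group. */
    String considerRedirectProposal(String group, String candidate, String proposedCandidate);
}

/** Naive balancer: picks the node that currently has the fewest assigned groups. */
final class EvenLeaseBalancer implements LeaseCandidateBalancer {
    private final List<String> nodes;
    private final Map<String, Integer> leaseCounts = new HashMap<>();
    private final Map<String, String> assigned = new HashMap<>();

    EvenLeaseBalancer(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    @Override
    public String get(String group, Set<String> ignoredNodes) {
        String best = null;
        for (String node : nodes) {
            if (ignoredNodes.contains(node)) {
                continue;
            }
            if (best == null || count(node) < count(best)) {
                best = node; // ties keep the earlier node in the list
            }
        }
        if (best != null) {
            assign(group, best);
        }
        return best;
    }

    @Override
    public String considerRedirectProposal(String group, String candidate, String proposedCandidate) {
        // Assumed rule: accept the proposal only if the proposed node holds fewer groups.
        if (nodes.contains(proposedCandidate) && count(proposedCandidate) < count(candidate)) {
            assign(group, proposedCandidate);
            return proposedCandidate;
        }
        return candidate;
    }

    private int count(String node) {
        return leaseCounts.getOrDefault(node, 0);
    }

    private void assign(String group, String node) {
        String previous = assigned.put(group, node);
        if (previous != null) {
            leaseCounts.merge(previous, -1, Integer::sum);
        }
        leaseCounts.merge(node, 1, Integer::sum);
    }
}
```

Keeping the policy behind the interface is what makes the algorithm pluggable: a balancer that accounts for slow nodes or hot records could replace EvenLeaseBalancer without changing callers.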