[jira] [Assigned] (IGNITE-21661) Test scenario where all stable nodes are lost during a partially completed rebalance
[ https://issues.apache.org/jira/browse/IGNITE-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21661: -- Assignee: Ivan Bessonov > Test scenario where all stable nodes are lost during a partially completed > rebalance > > > Key: IGNITE-21661 > URL: https://issues.apache.org/jira/browse/IGNITE-21661 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Following case is possible: > * Nodes A, B and C for a partition > * B and C go offline > * new distribution is A, D and E > * full state transfer from A to D is completed > * full state transfer from A to E is not > * A goes offline > * we perform "resetPartitions" > Ideally, we should use D as a new leader somehow, but the bare minimum should > be a partition that is functional, maybe an empty one. We should test the case > > This might be a good place to add more tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21661) Test scenario where all stable nodes are lost during a partially completed rebalance
[ https://issues.apache.org/jira/browse/IGNITE-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21661: --- Description: Following case is possible: * Nodes A, B and C for a partition * B and C go offline * new distribution is A, D and E * full state transfer from A to D is completed * full state transfer from A to E is not * A goes offline * we perform "resetPartitions" Ideally, we should use D as a new leader somehow, but the bare minimum should be a partition that is functional, maybe an empty one. We should test the case This might be a good place to add more tests. was: Following case is possible: * Nodes A, B and C for a partition * B and C go offline * new distribution is A, D and E * full state transfer from A to D is completed * full state transfer from A to E is not * A goes offline * we perform "resetPartitions" Ideally, we should use D as a new leader somehow, but the bare minimum should be a partition that is functional, maybe an empty one. We should test the case > Test scenario where all stable nodes are lost during a partially completed > rebalance > > > Key: IGNITE-21661 > URL: https://issues.apache.org/jira/browse/IGNITE-21661 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Following case is possible: > * Nodes A, B and C for a partition > * B and C go offline > * new distribution is A, D and E > * full state transfer from A to D is completed > * full state transfer from A to E is not > * A goes offline > * we perform "resetPartitions" > Ideally, we should use D as a new leader somehow, but the bare minimum should > be a partition that is functional, maybe an empty one. We should test the case > > This might be a good place to add more tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-22107) Properly encapsulate partition meta
[ https://issues.apache.org/jira/browse/IGNITE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-22107: --- Description: {{PartitionMeta}} and {{PartitionMetaIo}} leak specific implementation details, specifically - all fields except for {{{}pageCount{}}}. This breaks encapsulation and makes {{page-memory}} module code non-reusable. I propose splitting meta into 2 parts - abstract meta, that would only hold page count, and specific meta that will be located in a different module, close to the implementation. In this case, we would have to pass meta IO as parameters into methods like {{{}PartitionMetaManager#readOrCreateMeta{}}}, and create a getter for IO in {{AbstractPartitionMeta}} class itself, but that's a necessary sacrifice. Some other places will be affected as well, mostly tests. was: `PartitionMeta` and `PartitionMetaIo` leak specific implementation details, specifically - all fields except for `pageCount`. This breaks encapsulation and makes `page-memory` module code non-reusable. I propose splitting meta into 2 parts - abstract meta, that would only hold page count, and specific meta that will be located in a different module, close to the implementation. In this case, we would have to pass meta IO as parameters into methods like `PartitionMetaManager#readOrCreateMeta`, and create a getter for IO in `AbstractPartitionMeta` class itself, but that's a necessary sacrifice. > Properly encapsulate partition meta > --- > > Key: IGNITE-22107 > URL: https://issues.apache.org/jira/browse/IGNITE-22107 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > {{PartitionMeta}} and {{PartitionMetaIo}} leak specific implementation > details, specifically - all fields except for {{{}pageCount{}}}. This breaks > encapsulation and makes {{page-memory}} module code non-reusable. 
> I propose splitting meta into 2 parts - abstract meta, that would only hold > page count, and specific meta that will be located in a different module, > close to the implementation. > In this case, we would have to pass meta IO as parameters into methods like > {{{}PartitionMetaManager#readOrCreateMeta{}}}, and create a getter for IO in > {{AbstractPartitionMeta}} class itself, but that's a necessary sacrifice. > Some other places will be affected as well, mostly tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-22107) Properly encapsulate partition meta
Ivan Bessonov created IGNITE-22107: -- Summary: Properly encapsulate partition meta Key: IGNITE-22107 URL: https://issues.apache.org/jira/browse/IGNITE-22107 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Fix For: 3.0.0-beta2 `PartitionMeta` and `PartitionMetaIo` leak specific implementation details, specifically - all fields except for `pageCount`. This breaks encapsulation and makes `page-memory` module code non-reusable. I propose splitting meta into 2 parts - abstract meta, that would only hold page count, and specific meta that will be located in a different module, close to the implementation. In this case, we would have to pass meta IO as parameters into methods like `PartitionMetaManager#readOrCreateMeta`, and create a getter for IO in `AbstractPartitionMeta` class itself, but that's a necessary sacrifice. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-21434) Fail user write requests for non-available partitions
[ https://issues.apache.org/jira/browse/IGNITE-21434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-21434. Resolution: Won't Fix This insert doesn't hang indefinitely anymore, it fails with primary replica awaiting. I'm closing the issue as "Won't Fix" > Fail user write requests for non-available partitions > - > > Key: IGNITE-21434 > URL: https://issues.apache.org/jira/browse/IGNITE-21434 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Currently, {{INSERT INTO test VALUES(%d, %d);}} just hangs indefinitely, > which is not what you would expect. We should either fail the request > immediately if there's no majority, or return a replication timeout > exception, for example. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-22075) GC doesn't wait for RO transactions
Ivan Bessonov created IGNITE-22075: -- Summary: GC doesn't wait for RO transactions Key: IGNITE-22075 URL: https://issues.apache.org/jira/browse/IGNITE-22075 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov Fix For: 3.0.0-beta2 In https://issues.apache.org/jira/browse/IGNITE-21773 we started handling the LWM update concurrently by both TX manager and GC, which means that GC might start collecting garbage before transactions are finished. This doesn't even depend on listeners order, because both operations are asynchronous. We must fix it -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-22041) Secondary indexes inline size calculation is wrong
[ https://issues.apache.org/jira/browse/IGNITE-22041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-22041: --- Description: * "short" size is used as 16 bytes instead of 2 bytes * decimal header is not included in estimation > Secondary indexes inline size calculation is wrong > -- > > Key: IGNITE-22041 > URL: https://issues.apache.org/jira/browse/IGNITE-22041 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > * "short" size is used as 16 bytes instead of 2 bytes > * decimal header is not included in estimation -- This message was sent by Atlassian Jira (v8.20.10#820010)
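The issue does not say where the 16 comes from. One plausible cause, stated here purely as an assumption, is a bits-versus-bytes mix-up: in Java, `Short.SIZE` is the width in bits (16) and `Short.BYTES` is the width in bytes (2). The sketch below uses a hypothetical `InlineSizeSketch` class, not Ignite code, to show the difference in an inline size estimate.

```java
/** Hypothetical illustration; not the Ignite inline size calculator. */
final class InlineSizeSketch {
    /** Width of a short in bits. Using this as a byte count overestimates by 8x. */
    static int shortSizeInBits() {
        return Short.SIZE;
    }

    /** Width of a short in bytes. This is the value an inline size estimate needs. */
    static int shortSizeInBytes() {
        return Short.BYTES;
    }

    /** Estimated inline size in bytes for the given number of "short" columns. */
    static int inlineSize(int shortColumns) {
        return shortColumns * Short.BYTES;
    }
}
```

The decimal header from the second bullet is left out on purpose, since the issue does not state its size.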
[jira] [Created] (IGNITE-22063) aimem partition deletion doesn't delete GC queue
Ivan Bessonov created IGNITE-22063: -- Summary: aimem partition deletion doesn't delete GC queue Key: IGNITE-22063 URL: https://issues.apache.org/jira/browse/IGNITE-22063 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov {{org.apache.ignite.internal.storage.pagememory.mv.VolatilePageMemoryMvPartitionStorage#destroyStructures}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-22050) Data structures don't clear partId of reused page
[ https://issues.apache.org/jira/browse/IGNITE-22050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-22050: --- Description: In current implementation we use a single reuse list for all partitions in aimem storage engine. That works fine in Ignite 2, but here in Ignite 3 we implemented a "partitionless link" format for eliminating 2 bytes, that indicate partition number, from the data in pages. This means that if allocator provided the structure with the page from partition X, but the structure itself represents partition Y, we will lose the "X" in the process and next time will try accessing the page by the pageId that has Y encoded in it. This would lead to pageId mismatch. We have several options here. * ignore mismatched partitions * get rid of partitionless pageIds * fix the allocator, so that it would change partition Id upon allocation Ideally, we should go with the 3rd option. It requires some slight changes in internal data structure API, so that we would pass the required partitionId directly into the allocator (reuse list). This is a little bit excessive at first sight, but seems more appropriate in a long run. Ignite 2 pageIds are all messed up inside of structures, we can fix that. > Data structures don't clear partId of reused page > - > > Key: IGNITE-22050 > URL: https://issues.apache.org/jira/browse/IGNITE-22050 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > In current implementation we use a single reuse list for all partitions in > aimem storage engine. > That works fine in Ignite 2, but here in Ignite 3 we implemented a > "partitionless link" format for eliminating 2 bytes, that indicate partition > number, from the data in pages. 
This means that if allocator provided the > structure with the page from partition X, but the structure itself represents > partition Y, we will lose the "X" in the process and next time will try > accessing the page by the pageId that has Y encoded in it. This would lead to > pageId mismatch. > We have several options here. > * ignore mismatched partitions > * get rid of partitionless pageIds > * fix the allocator, so that it would change partition Id upon allocation > Ideally, we should go with the 3rd option. It requires some slight changes in > internal data structure API, so that we would pass the required partitionId > directly into the allocator (reuse list). This is a little bit excessive at > first sight, but seems more appropriate in a long run. Ignite 2 pageIds are > all messed up inside of structures, we can fix that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
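The mismatch described in IGNITE-22050, and the third option (re-stamping the partition ID on allocation), can be sketched as follows. This is a minimal illustration under stated assumptions: the class `PageIdSketch`, its method names, and the bit layout (a 16-bit partition ID above a 32-bit page index) are hypothetical and only mirror the "2 bytes that indicate partition number" from the description. It is not the actual Ignite page ID code.

```java
/** Hypothetical page ID layout: [16-bit partition ID][32-bit page index]. */
final class PageIdSketch {
    static final int PAGE_IDX_BITS = 32;
    static final long PAGE_IDX_MASK = (1L << PAGE_IDX_BITS) - 1;
    static final long PART_ID_MASK = (1L << 16) - 1;

    /** Builds a full page ID from a partition ID and a page index. */
    static long pageId(int partId, long pageIdx) {
        return ((partId & PART_ID_MASK) << PAGE_IDX_BITS) | (pageIdx & PAGE_IDX_MASK);
    }

    /** Extracts the page index, dropping the partition ID. */
    static long pageIndex(long pageId) {
        return pageId & PAGE_IDX_MASK;
    }

    /** "Partitionless link": what gets stored inside pages, without the 2 partition bytes. */
    static long partitionlessLink(long pageId) {
        return pageIndex(pageId);
    }

    /** Restores a full page ID, assuming the page belongs to the structure's own partition. */
    static long restore(long partitionlessLink, int structurePartId) {
        return pageId(structurePartId, partitionlessLink);
    }

    /** Option 3: the allocator re-stamps a reused page with the requesting partition. */
    static long changePartitionId(long reusedPageId, int newPartId) {
        return pageId(newPartId, pageIndex(reusedPageId));
    }
}
```

With a page taken from partition X = 3 and a structure in partition Y = 7, `restore(partitionlessLink(pageId(3, 42)), 7)` yields a page ID that differs from the allocated one, which is the mismatch. If the allocator returns `changePartitionId(pageId(3, 42), 7)` instead, the round trip through the partitionless link gives back the same page ID.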
[jira] [Resolved] (IGNITE-22055) Shut destruction executor down before closing volatile regions
[ https://issues.apache.org/jira/browse/IGNITE-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-22055. Reviewer: Ivan Bessonov Resolution: Fixed > Shut destruction executor down before closing volatile regions > -- > > Key: IGNITE-22055 > URL: https://issues.apache.org/jira/browse/IGNITE-22055 > Project: Ignite > Issue Type: Bug >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-22058) Use paranoid leak detection in tests
Ivan Bessonov created IGNITE-22058: -- Summary: Use paranoid leak detection in tests Key: IGNITE-22058 URL: https://issues.apache.org/jira/browse/IGNITE-22058 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Fix For: 3.0.0-beta2 We should set `io.netty.leakDetection.level=paranoid` in integration tests and network tests, in order to detect possible leaks -- This message was sent by Atlassian Jira (v8.20.10#820010)
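As a rough sketch of what IGNITE-22058 asks for, the property can be set programmatically before any Netty class is loaded. Netty reads `io.netty.leakDetection.level` when its leak detector is initialized, so setting it later may have no effect. The helper class `LeakDetectionSketch` is hypothetical. Passing `-Dio.netty.leakDetection.level=paranoid` on the test JVM command line is an equivalent alternative, and where that belongs in the Ignite build is not stated in the issue.

```java
/** Hypothetical test helper; call it before Netty classes are initialized. */
final class LeakDetectionSketch {
    static final String LEVEL_PROPERTY = "io.netty.leakDetection.level";

    /** Sets the paranoid leak detection level and returns the effective property value. */
    static String enableParanoid() {
        System.setProperty(LEVEL_PROPERTY, "paranoid");
        return System.getProperty(LEVEL_PROPERTY);
    }
}
```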
[jira] [Created] (IGNITE-22050) Data structures don't clear partId of reused page
Ivan Bessonov created IGNITE-22050: -- Summary: Data structures don't clear partId of reused page Key: IGNITE-22050 URL: https://issues.apache.org/jira/browse/IGNITE-22050 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov Assignee: Ivan Bessonov Fix For: 3.0.0-beta2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-22041) Secondary indexes inline size calculation is wrong
Ivan Bessonov created IGNITE-22041: -- Summary: Secondary indexes inline size calculation is wrong Key: IGNITE-22041 URL: https://issues.apache.org/jira/browse/IGNITE-22041 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov Assignee: Ivan Bessonov -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21999) Merge partition free-lists into one
[ https://issues.apache.org/jira/browse/IGNITE-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21999: -- Assignee: Philipp Shergalis (was: Ivan Bessonov) > Merge partition free-lists into one > --- > > Key: IGNITE-21999 > URL: https://issues.apache.org/jira/browse/IGNITE-21999 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Philipp Shergalis >Priority: Major > Labels: ignite-3 > > Current implementation has 2 free-lists: > * version chains > * index tuples > These lists have separate buckets for different types of data pages. There's > an issue with this approach: > * overhead on pages - we have to allocate more pages to store buckets > * overhead on checkpoints - we have to save twice as many free-lists on > every checkpoint > The reason, to my understanding, is the fact that FreeList class is > parameterized with the specific type of data that it stores. It makes no > sense to me, to be completely honest, because the algorithm is always the > same, and we always use the code from abstract free-list implementation. > What I propose: > * get rid of abstract implementation and only have the concrete > implementation of free lists > * same for data pages > * serialization code will be fully moved to implementations of Storeable > We're losing some guarantees if we do this change - we can no longer check > that type of the page is correct. My response to this issue is that every > Storeable could add a 1-byte header to the data, in order to validate it when > being read, that should be enough. If we could find a way to store less than > 1 byte then that's nice, I didn't look too much into the question. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21999) Merge partition free-lists into one
[ https://issues.apache.org/jira/browse/IGNITE-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21999: -- Assignee: Ivan Bessonov > Merge partition free-lists into one > --- > > Key: IGNITE-21999 > URL: https://issues.apache.org/jira/browse/IGNITE-21999 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Current implementation has 2 free-lists: > * version chains > * index tuples > These lists have separate buckets for different types of data pages. There's > an issue with this approach: > * overhead on pages - we have to allocate more pages to store buckets > * overhead on checkpoints - we have to save twice as many free-lists on > every checkpoint > The reason, to my understanding, is the fact that FreeList class is > parameterized with the specific type of data that it stores. It makes no > sense to me, to be completely honest, because the algorithm is always the > same, and we always use the code from abstract free-list implementation. > What I propose: > * get rid of abstract implementation and only have the concrete > implementation of free lists > * same for data pages > * serialization code will be fully moved to implementations of Storeable > We're losing some guarantees if we do this change - we can no longer check > that type of the page is correct. My response to this issue is that every > Storeable could add a 1-byte header to the data, in order to validate it when > being read, that should be enough. If we could find a way to store less than > 1 byte then that's nice, I didn't look too much into the question. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21999) Merge partition free-lists into one
Ivan Bessonov created IGNITE-21999: -- Summary: Merge partition free-lists into one Key: IGNITE-21999 URL: https://issues.apache.org/jira/browse/IGNITE-21999 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Current implementation has 2 free-lists: * version chains * index tuples These lists have separate buckets for different types of data pages. There's an issue with this approach: * overhead on pages - we have to allocate more pages to store buckets * overhead on checkpoints - we have to save twice as many free-lists on every checkpoint The reason, to my understanding, is the fact that FreeList class is parameterized with the specific type of data that it stores. It makes no sense to me, to be completely honest, because the algorithm is always the same, and we always use the code from abstract free-list implementation. What I propose: * get rid of abstract implementation and only have the concrete implementation of free lists * same for data pages * serialization code will be fully moved to implementations of Storeable We're losing some guarantees if we do this change - we can no longer check that type of the page is correct. My response to this issue is that every Storeable could add a 1-byte header to the data, in order to validate it when being read, that should be enough. If we could find a way to store less than 1 byte then that's nice, I didn't look too much into the question. -- This message was sent by Atlassian Jira (v8.20.10#820010)
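The 1-byte header idea from IGNITE-21999 can be sketched as follows. Everything here is hypothetical: the class `TypedPayloadSketch`, the type codes, and the buffer layout are illustration only and do not reflect the real `Storeable` interface. The point is that a single, untyped free-list could still detect a page read as the wrong kind of data.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: each stored payload carries a 1-byte type tag that is checked on read. */
final class TypedPayloadSketch {
    static final byte VERSION_CHAIN = 1;
    static final byte INDEX_TUPLE = 2;

    /** Serializes the payload with a leading type byte. */
    static ByteBuffer write(byte type, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + payload.length);
        buf.put(type).put(payload);
        buf.flip();
        return buf;
    }

    /** Reads the payload back, failing if the type byte is not the expected one. */
    static byte[] read(ByteBuffer buf, byte expectedType) {
        byte actualType = buf.get();
        if (actualType != expectedType) {
            throw new IllegalStateException(
                "Unexpected data type: expected " + expectedType + ", got " + actualType);
        }
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        return payload;
    }
}
```

The cost is one extra byte per stored item, which is the trade-off the issue already acknowledges.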
[jira] [Assigned] (IGNITE-21257) Public Java API to get global partition states
[ https://issues.apache.org/jira/browse/IGNITE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21257: -- Assignee: Ivan Bessonov > Public Java API to get global partition states > -- > > Key: IGNITE-21257 > URL: https://issues.apache.org/jira/browse/IGNITE-21257 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the > list. > We should use local partition states, implemented in IGNITE-21256, and > combine them in cluster-wide compute call, before returning to the user. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21987) Optimize RO scan in sorted indexes
[ https://issues.apache.org/jira/browse/IGNITE-21987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21987: --- Description: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new method to SortedIndexStorage, like "readOnlyScan", that returns a simple cursor * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor * reuse existing tests where possible * call new method where necessary (PartitionReplicaListener#scanSortedIndex) IMPORTANT: we should throw an exception if somebody scans an index and IndexStorage#getNextRowIdToBuild is not null. It should be a new error, like "IndexNotBuiltException" was: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new method to SortedIndexStorage, like "readOnlyScan", that returns a simple cursor * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor * reuse existing tests where possible * call new method where necessary (PartitionReplicaListener#scanSortedIndex) IMPORTANT: we should throw an exception if somebody scans an index and IndexStorage#getNextRowIdToBuild is not null. > Optimize RO scan in sorted indexes > -- > > Key: IGNITE-21987 > URL: https://issues.apache.org/jira/browse/IGNITE-21987 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > This issue applies to aimem/aipersist primarily. Optimization for rocksdb > might be done separately. 
> * add new method to SortedIndexStorage, like "readOnlyScan", that returns a > simple cursor > * in the implementation we should use alternative cursor implementation for > RO scans - it should delegate calls to B+Tree cursor > * reuse existing tests where possible > * call new method where necessary (PartitionReplicaListener#scanSortedIndex) > IMPORTANT: we should throw an exception if somebody scans an index and > IndexStorage#getNextRowIdToBuild is not null. It should be a new error, like > "IndexNotBuiltException" -- This message was sent by Atlassian Jira (v8.20.10#820010)
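The guard described in the IMPORTANT note of IGNITE-21987 might look roughly like this. It is a sketch under assumptions: `ReadOnlyScanSketch` is a hypothetical stand-in, the sorted list replaces the B+Tree, and the `nextRowIdToBuild` parameter replaces the result of `IndexStorage#getNextRowIdToBuild`. Only the names `readOnlyScan` and `IndexNotBuiltException` come from the issue text, and their real signatures are not given there.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of a read-only scan that refuses to run on an index still being built. */
final class ReadOnlyScanSketch {
    /** New error type suggested in the issue; the shape here is an assumption. */
    static final class IndexNotBuiltException extends RuntimeException {
        IndexNotBuiltException(String message) {
            super(message);
        }
    }

    /**
     * Returns a simple cursor that delegates to the underlying ordered data.
     * A non-null nextRowIdToBuild means the index build has not finished.
     */
    static <T> Iterator<T> readOnlyScan(List<T> sortedRows, Object nextRowIdToBuild) {
        if (nextRowIdToBuild != null) {
            throw new IndexNotBuiltException(
                "Index is not built yet, next row ID to build: " + nextRowIdToBuild);
        }
        return sortedRows.iterator();
    }
}
```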
[jira] [Updated] (IGNITE-21987) Optimize RO scan in sorted indexes
[ https://issues.apache.org/jira/browse/IGNITE-21987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21987: --- Description: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new method to SortedIndexStorage, like "readOnlyScan", that returns a simple cursor * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor * reuse existing tests where possible * call new method where necessary (PartitionReplicaListener#scanSortedIndex) IMPORTANT: we should throw an exception if somebody scans an index and IndexStorage#getNextRowIdToBuild is not null. was: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new method to SortedIndexStorage, like "readOnlyScan", that returns a simple cursor * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor * reuse existing tests where possible * call new method where necessary (PartitionReplicaListener#scanSortedIndex) > Optimize RO scan in sorted indexes > -- > > Key: IGNITE-21987 > URL: https://issues.apache.org/jira/browse/IGNITE-21987 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > This issue applies to aimem/aipersist primarily. Optimization for rocksdb > might be done separately. > * add new method to SortedIndexStorage, like "readOnlyScan", that returns a > simple cursor > * in the implementation we should use alternative cursor implementation for > RO scans - it should delegate calls to B+Tree cursor > * reuse existing tests where possible > * call new method where necessary (PartitionReplicaListener#scanSortedIndex) > IMPORTANT: we should throw an exception if somebody scans an index and > IndexStorage#getNextRowIdToBuild is not null. 
[jira] [Updated] (IGNITE-21987) Optimize RO scan in sorted indexes
[ https://issues.apache.org/jira/browse/IGNITE-21987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21987: --- Description: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new method to SortedIndexStorage, like "readOnlyScan", that returns a simple cursor * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor * reuse existing tests where possible * call new method where necessary (PartitionReplicaListener#scanSortedIndex) was: This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new flag RO_SCAN to SortedIndexStorage * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor, and "peek" should throw an "UnsupportedOperationException" * for "rocksdb" it shouldn't refresh the iterator all the time. "peek" should also throw exceptions * reuse existing tests * pass new RO_SCAN flag into a method where it's necessary > Optimize RO scan in sorted indexes > -- > > Key: IGNITE-21987 > URL: https://issues.apache.org/jira/browse/IGNITE-21987 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > This issue applies to aimem/aipersist primarily. Optimization for rocksdb > might be done separately. > * add new method to SortedIndexStorage, like "readOnlyScan", that returns a > simple cursor > * in the implementation we should use alternative cursor implementation for > RO scans - it should delegate calls to B+Tree cursor > * reuse existing tests where possible > * call new method where necessary (PartitionReplicaListener#scanSortedIndex) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21987) Optimize RO scan in sorted indexes
Ivan Bessonov created IGNITE-21987: -- Summary: Optimize RO scan in sorted indexes Key: IGNITE-21987 URL: https://issues.apache.org/jira/browse/IGNITE-21987 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov This issue applies to aimem/aipersist primarily. Optimization for rocksdb might be done separately. * add new flag RO_SCAN to SortedIndexStorage * in the implementation we should use alternative cursor implementation for RO scans - it should delegate calls to B+Tree cursor, and "peek" should throw an "UnsupportedOperationException" * for "rocksdb" it shouldn't refresh the iterator all the time. "peek" should also throw exceptions * reuse existing tests * pass new RO_SCAN flag into a method where it's necessary -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21906) Consider disabling inline in PK index by default
Ivan Bessonov created IGNITE-21906: -- Summary: Consider disabling inline in PK index by default Key: IGNITE-21906 URL: https://issues.apache.org/jira/browse/IGNITE-21906 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov In aipersist/aimem we attempt to inline binary tuples into pages for hash indexes by default. This, in theory, saves us from the necessity of accessing binary tuples from data pages for comparison, which is slower than comparing inlined data. But, assuming the good hash distribution, we would only have to do the real comparison for the matched tuple. At the same time, inlined data might be substantially larger than hash+link, meaning that B+Tree with inlined data has bigger height, which correlates with slower search speed. So, we have both pros and cons for inlining, and the only real way to reconcile them is to compare them with some benchmarks. This is exactly what I propose. TL;DR: force inline size to be 0 for hash indices and benchmark for put/get operations, with large enough amount of data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21902) Add an option to configure log storage path
Ivan Bessonov created IGNITE-21902: -- Summary: Add an option to configure log storage path Key: IGNITE-21902 URL: https://issues.apache.org/jira/browse/IGNITE-21902 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Fix For: 3.0.0-beta2 An option to store log and data on separate devices can substantially improve performance in the long run for many users, so we should implement it. There is such an option in Ignite 2, and people use it all the time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-21898) Remove reactive methods from AntiHijackingIgniteSql
[ https://issues.apache.org/jira/browse/IGNITE-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-21898. Reviewer: Ivan Bessonov Resolution: Fixed > Remove reactive methods from AntiHijackingIgniteSql > --- > > Key: IGNITE-21898 > URL: https://issues.apache.org/jira/browse/IGNITE-21898 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > They were removed from IgniteSql interface. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21661) Test scenario where all stable nodes are lost during a partially completed rebalance
Ivan Bessonov created IGNITE-21661: -- Summary: Test scenario where all stable nodes are lost during a partially completed rebalance Key: IGNITE-21661 URL: https://issues.apache.org/jira/browse/IGNITE-21661 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Following case is possible: * Nodes A, B and C for a partition * B and C go offline * new distribution is A, D and E * full state transfer from A to D is completed * full state transfer from A to E is not * A goes offline * we perform "resetPartitions" Ideally, we should use D as a new leader somehow, but the bare minimum should be a partition that is functional, maybe an empty one. We should test the case -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21284) Internal API for manual raft group configuration update
[ https://issues.apache.org/jira/browse/IGNITE-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21284: --- Description: We need an API (with implementation) that's analogous to "reset-lost-partitions", but with the ability to reuse living minority of nodes. This API should gather the states of partitions, identify healthy peers, and use them as a new raft group configuration (through the update of assignments). We have to make sure that node with latest log index will become a leader, so we will have to propagate desired minimum for log index in assignments and use it during the voting. h2. What's implemented "resetPartitions" operation in distributed zone manager. It identifies partitions where only a minority of nodes is online (thus they won't be able to execute "changePeersAsync"), and writes a "forced pending assignments" for them. Forced assignment excludes stable nodes, that are not present in pending assignment, from a new raft group configuration. It also performs a "resetPeers" operation on alive nodes from the stable assignment. Complete loss of all nodes from stable assignments is not yet implemented, at least one node is required to be elected as a leader. was: We need an API (with implementation) that's analogous to "reset-lost-partitions", but with the ability to reuse living minority of nodes. This API should gather the states of partitions, identify healthy peers, and use them as a new raft group configuration (through the update of assignments). We have to make sure that node with latest log index will become a leader, so we will have to propagate desired minimum for log index in assignments and use it during the voting. 
> Internal API for manual raft group configuration update > --- > > Key: IGNITE-21284 > URL: https://issues.apache.org/jira/browse/IGNITE-21284 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > We need an API (with implementation) that's analogous to > "reset-lost-partitions", but with the ability to reuse living minority of > nodes. > This API should gather the states of partitions, identify healthy peers, and > use them as a new raft group configuration (through the update of > assignments). > We have to make sure that node with latest log index will become a leader, so > we will have to propagate desired minimum for log index in assignments and > use it during the voting. > h2. What's implemented > "resetPartitions" operation in distributed zone manager. It identifies > partitions where only a minority of nodes is online (thus they won't be able > to execute "changePeersAsync"), and writes a "forced pending assignments" for > them. > Forced assignment excludes stable nodes, that are not present in pending > assignment, from a new raft group configuration. It also performs a > "resetPeers" operation on alive nodes from the stable assignment. > Complete loss of all nodes from stable assignments is not yet implemented, at > least one node is required to be elected as a leader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
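The selection step described under "What's implemented" can be sketched roughly as follows. This is a toy model, not the distributed zone manager code: `ResetPartitionsSketch`, `lostMajority` and `forcedAssignments` are hypothetical names, and node identities are plain strings. It also assumes that the forced assignment simply consists of the surviving stable nodes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of picking partitions for a forced reset; names are hypothetical. */
class ResetPartitionsSketch {
    /** True when the online part of the stable assignment cannot form a majority. */
    static boolean lostMajority(Set<String> stable, Set<String> online) {
        long alive = stable.stream().filter(online::contains).count();
        return alive <= stable.size() / 2;
    }

    /**
     * For each partition that lost its majority but still has at least one live
     * stable node, returns the live stable nodes as the forced pending assignment.
     * Partitions with no live stable node are skipped, mirroring the limitation
     * mentioned in the ticket.
     */
    static Map<Integer, Set<String>> forcedAssignments(
            Map<Integer, Set<String>> stableByPartition, Set<String> online) {
        Map<Integer, Set<String>> forced = new TreeMap<>();
        stableByPartition.forEach((partition, stable) -> {
            Set<String> alive = new TreeSet<>(stable);
            alive.retainAll(online);
            if (lostMajority(stable, online) && !alive.isEmpty()) {
                forced.put(partition, alive);
            }
        });
        return forced;
    }
}
```

For stable assignments {A, B, C}, {A, D, E} and {B, C, F} with A, D and E online, only the first partition gets a forced assignment: the second still has a majority and the third has no live stable node.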
[jira] [Updated] (IGNITE-21588) CMG commands idempotency is broken
[ https://issues.apache.org/jira/browse/IGNITE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-21588:

Description:
When handling commands like {{JoinReadyCommand}} and {{NodesLeaveCommand}} we do the following:
* Read local state with {{readLogicalTopology()}}.
* Modify state according to the command.
* *Increase version*.
* Write new state with {{saveSnapshotToStorage(snapshot)}}.

The problem lies in how the state is read and written: it's local, and the version value is not replicated.

What happens when we restart the node:
* It starts without a local storage snapshot, with appliedIndex == 0, which is a *state in the past*.
* We apply commands that were already applied before the restart.
* We apply these commands to the locally saved topology snapshot.
* This logical topology snapshot has a *state in the future* when compared to appliedIndex == 0.
* As a result, when we re-apply some commands, we *increase the version* one more time, thus breaking data consistency between nodes.

This would have been fine if we only used this version locally. But distribution zones rely on the consistency of the version between all nodes in the cluster. This might break DZ data nodes handling if any of the cluster nodes restarts.

How to fix:
* Either drop the storage if there's no storage snapshot, which will restore consistency,
* or never start the CMG group from a snapshot, but rather start it from the latest storage data.

was:
When handling commands like {{JoinReadyCommand}} and {{NodesLeaveCommand}} we do the following:
* Read local state with {{readLogicalTopology()}}.
* Modify state according to the command.
* *Increase version*.
* Write new state with {{saveSnapshotToStorage(snapshot)}}.

The problem lies in reading and writing of the state - it's local, and version value is not replicated.

What happens when we restart the node:
* It starts with local storage snapshot, which is a *state in the past*, generally speaking.
* We apply commands that were not applied in the snapshot.
* We apply these commands to locally saved topology snapshot.
* This logical topology snapshot has a *state in the future* when compared to storage snapshot.
* As a result, when we re-apply some commands, we *increase the version* one more time, thus breaking data consistency between nodes.

This would have been fine if we only used this version locally. But distribution zones rely on the consistency of the version between all nodes in cluster. This might break DZ data nodes handling if any of the cluster nodes restarts.

> CMG commands idempotency is broken
>
> Key: IGNITE-21588
> URL: https://issues.apache.org/jira/browse/IGNITE-21588
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
[jira] [Created] (IGNITE-21588) CMG commands idempotency is broken
Ivan Bessonov created IGNITE-21588:

Summary: CMG commands idempotency is broken
Key: IGNITE-21588
URL: https://issues.apache.org/jira/browse/IGNITE-21588
Project: Ignite
Issue Type: Bug
Reporter: Ivan Bessonov

When handling commands like {{JoinReadyCommand}} and {{NodesLeaveCommand}} we do the following:
* Read local state with {{readLogicalTopology()}}.
* Modify state according to the command.
* *Increase version*.
* Write new state with {{saveSnapshotToStorage(snapshot)}}.

The problem lies in how the state is read and written: it's local, and the version value is not replicated.

What happens when we restart the node:
* It starts with a local storage snapshot, which is a *state in the past*, generally speaking.
* We apply commands that were not applied in the snapshot.
* We apply these commands to the locally saved topology snapshot.
* This logical topology snapshot has a *state in the future* when compared to the storage snapshot.
* As a result, when we re-apply some commands, we *increase the version* one more time, thus breaking data consistency between nodes.

This would have been fine if we only used this version locally. But distribution zones rely on the consistency of the version between all nodes in the cluster. This might break DZ data nodes handling if any of the cluster nodes restarts.
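The double increment described above is easy to reproduce in a toy state machine. The sketch below is not the CMG code: `TopologyStateMachine`, `applyNaive`, `applyIdempotent` and `lastAppliedIndex` are hypothetical names. The guarded variant assumes the applied index is persisted together with the state, which is one possible way to make replay harmless and not necessarily the fix that should be chosen for this ticket.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a locally persisted logical topology; all names are hypothetical. */
class TopologyStateMachine {
    final Set<String> nodes = new HashSet<>();
    long version;
    long lastAppliedIndex;

    /** Naive handler: bumps the version on every call, even when the command is replayed. */
    void applyNaive(long index, String joinedNode) {
        nodes.add(joinedNode);
        version++;
    }

    /** Guarded handler: skips commands at or below the persisted applied index. */
    void applyIdempotent(long index, String joinedNode) {
        if (index <= lastAppliedIndex) {
            return; // already reflected in the saved state
        }
        nodes.add(joinedNode);
        version++;
        lastAppliedIndex = index;
    }
}

class ReplayDemo {
    /** Applies the same two-command log twice, as a restart that replays the log would. */
    static long versionAfterReplay(boolean guarded) {
        TopologyStateMachine sm = new TopologyStateMachine();
        for (int pass = 0; pass < 2; pass++) {
            for (long index = 1; index <= 2; index++) {
                String node = "node-" + index;
                if (guarded) {
                    sm.applyIdempotent(index, node);
                } else {
                    sm.applyNaive(index, node);
                }
            }
        }
        return sm.version;
    }
}
```

In this model the naive handler ends with version 4 after replaying a two-command log, while the guarded one stays at 2, which is the value a node that never restarted would have.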
[jira] [Created] (IGNITE-21548) Encapsulate Set
Ivan Bessonov created IGNITE-21548: -- Summary: Encapsulate Set Key: IGNITE-21548 URL: https://issues.apache.org/jira/browse/IGNITE-21548 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Assignments may have some associated metadata, like a "force" flag, for example. We should prepare the code for introducing such meta in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-18366) Simplify the configuration asm generator, phase 2
[ https://issues.apache.org/jira/browse/IGNITE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-18366:

Description:
After the split, it makes sense to start simplifying every individual generator. This is partially a research issue. Exactly what to do is not clear yet.

Some context: classes in package {{org.apache.ignite.internal.configuration.asm}} are pretty big and complicated. {{InnerNodeAsmGenerator}} is almost 2000 lines long.

How can we make it simpler? Better naming, more comments. Inner node generation can be split into multiple files, because it also handles polymorphic implementations.

In some cases I would change the generation itself. For example, generated methods in polymorphic instances have the same implementation as in the original inner node instead of simply delegating the execution to inner nodes. This negatively affects both performance and the code of the generators.

was:
After the split, it makes sense to start simplifying every individual generator. This is partially a research issue. Exactly what to do is not clear yet.

> Simplify the configuration asm generator, phase 2
>
> Key: IGNITE-18366
> URL: https://issues.apache.org/jira/browse/IGNITE-18366
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-55, ignite-3, technical-debt
> Fix For: 3.0.0-beta2
[jira] [Resolved] (IGNITE-21302) Prohibit automatic group reconfiguration when there's no majority
[ https://issues.apache.org/jira/browse/IGNITE-21302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-21302. Resolution: Won't Fix This fix is not required. Data loss won't happen for different reasons > Prohibit automatic group reconfiguration when there's no majority > - > > Key: IGNITE-21302 > URL: https://issues.apache.org/jira/browse/IGNITE-21302 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > scaleDown timer should not lead to a situation where user loses the data. > Default "changePeers" behavior also won't work, because there's no majority > and thus no leader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21501) Create index storages for new partitions on rebalance
[ https://issues.apache.org/jira/browse/IGNITE-21501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-21501:

Epic Link: IGNITE-20782

> Create index storages for new partitions on rebalance
>
> Key: IGNITE-21501
> URL: https://issues.apache.org/jira/browse/IGNITE-21501
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
>
> It appears that we only create index storages during the "table creation", not during the "partition creation" if it's performed in isolation.
> Even if we did, {{org.apache.ignite.internal.table.distributed.index.IndexUpdateHandler#waitIndexes}} is still badly designed, because it waits for indexes of the initial partition distribution and cannot provide any guarantees when assignments are changed.
> This leads to NPEs or bizarre assertions related to the aforementioned method.
> What we need to do is:
> * Get rid of the faulty index awaiting mechanism.
> * Create index storages before starting the raft group.
> * [optional] There might be naturally occurring "races" between catalog updates (index creation) and rebalance. Right now they are resolved by the fact that these processes are linearized in watch processing, but that's not the best approach. If we could provide something more robust, that would be nice. Let's think about it at least.
[jira] [Created] (IGNITE-21501) Create index storages for new partitions on rebalance
Ivan Bessonov created IGNITE-21501:

Summary: Create index storages for new partitions on rebalance
Key: IGNITE-21501
URL: https://issues.apache.org/jira/browse/IGNITE-21501
Project: Ignite
Issue Type: Bug
Reporter: Ivan Bessonov

It appears that we only create index storages during the "table creation", not during the "partition creation" if it's performed in isolation.

Even if we did, {{org.apache.ignite.internal.table.distributed.index.IndexUpdateHandler#waitIndexes}} is still badly designed, because it waits for indexes of the initial partition distribution and cannot provide any guarantees when assignments are changed.

This leads to NPEs or bizarre assertions related to the aforementioned method.

What we need to do is:
* Get rid of the faulty index awaiting mechanism.
* Create index storages before starting the raft group.
* [optional] There might be naturally occurring "races" between catalog updates (index creation) and rebalance. Right now they are resolved by the fact that these processes are linearized in watch processing, but that's not the best approach. If we could provide something more robust, that would be nice. Let's think about it at least.
[jira] [Resolved] (IGNITE-21488) Disable thread assertions by default
[ https://issues.apache.org/jira/browse/IGNITE-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-21488. Reviewer: Ivan Bessonov Resolution: Fixed > Disable thread assertions by default > > > Key: IGNITE-21488 > URL: https://issues.apache.org/jira/browse/IGNITE-21488 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Assignee: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21469) AssertionError in checkpoint
[ https://issues.apache.org/jira/browse/IGNITE-21469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21469: --- Epic Link: IGNITE-21444 > AssertionError in checkpoint > > > Key: IGNITE-21469 > URL: https://issues.apache.org/jira/browse/IGNITE-21469 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > > {code:java} > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870) > ~[?:?] at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > ~[?:?] at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > ~[?:?] at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > ~[?:?] at > org.apache.ignite.internal.pagememory.persistence.checkpoint.AwaitTasksCompletionExecutor.lambda$execute$1(AwaitTasksCompletionExecutor.java:63) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] ... 1 more Caused by: java.lang.AssertionError: FullPageId > [pageId=000100020378, effectivePageId=00020378, groupId=886] > at > org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:758) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:641) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:613) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.util.PageHandler.writePage(PageHandler.java:280) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] 
at > org.apache.ignite.internal.pagememory.datastructure.DataStructure.write(DataStructure.java:296) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.freelist.PagesList.flushBucketsCache(PagesList.java:387) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.freelist.PagesList.saveMetadata(PagesList.java:332) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.storage.pagememory.mv.RowVersionFreeList.saveMetadata(RowVersionFreeList.java:185) > ~[ignite-storage-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.lambda$syncMetadataOnCheckpoint$13(PersistentPageMemoryMvPartitionStorage.java:345) > ~[ignite-storage-page-memory-3.0.0-SNAPSHOT.jar:?] at > org.apache.ignite.internal.pagememory.persistence.checkpoint.AwaitTasksCompletionExecutor.lambda$execute$1(AwaitTasksCompletionExecutor.java:59) > ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] ... 1 more{code} > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820824?expandBuildDeploymentsSection=false=false=false=true=true+Inspection=true] > > The reason of the assertion is a bug/race in listeners unregistration for > partitions freelists. We should do it properly -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21469) AssertionError in checkpoint
[ https://issues.apache.org/jira/browse/IGNITE-21469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-21469:

Ignite Flags: (was: Docs Required,Release Notes Required)

> AssertionError in checkpoint
>
> Key: IGNITE-21469
> URL: https://issues.apache.org/jira/browse/IGNITE-21469
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
[jira] [Updated] (IGNITE-21469) AssertionError in checkpoint
[ https://issues.apache.org/jira/browse/IGNITE-21469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-21469:

Labels: ignite-3 (was: )

> AssertionError in checkpoint
>
> Key: IGNITE-21469
> URL: https://issues.apache.org/jira/browse/IGNITE-21469
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
[jira] [Created] (IGNITE-21469) AssertionError in checkpoint
Ivan Bessonov created IGNITE-21469: -- Summary: AssertionError in checkpoint Key: IGNITE-21469 URL: https://issues.apache.org/jira/browse/IGNITE-21469 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov {code:java} at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870) ~[?:?] at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?] at org.apache.ignite.internal.pagememory.persistence.checkpoint.AwaitTasksCompletionExecutor.lambda$execute$1(AwaitTasksCompletionExecutor.java:63) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] ... 1 more Caused by: java.lang.AssertionError: FullPageId [pageId=000100020378, effectivePageId=00020378, groupId=886] at org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:758) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:641) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.persistence.PersistentPageMemory.acquirePage(PersistentPageMemory.java:613) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.util.PageHandler.writePage(PageHandler.java:280) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.datastructure.DataStructure.write(DataStructure.java:296) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.freelist.PagesList.flushBucketsCache(PagesList.java:387) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] 
at org.apache.ignite.internal.pagememory.freelist.PagesList.saveMetadata(PagesList.java:332) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.storage.pagememory.mv.RowVersionFreeList.saveMetadata(RowVersionFreeList.java:185) ~[ignite-storage-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.lambda$syncMetadataOnCheckpoint$13(PersistentPageMemoryMvPartitionStorage.java:345) ~[ignite-storage-page-memory-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.internal.pagememory.persistence.checkpoint.AwaitTasksCompletionExecutor.lambda$execute$1(AwaitTasksCompletionExecutor.java:59) ~[ignite-page-memory-3.0.0-SNAPSHOT.jar:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] ... 1 more{code} [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7820824?expandBuildDeploymentsSection=false=false=false=true=true+Inspection=true] The reason of the assertion is a bug/race in listeners unregistration for partitions freelists. We should do it properly -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-21044) Investigate long table creation
[ https://issues.apache.org/jira/browse/IGNITE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov resolved IGNITE-21044.

Resolution: Done

> Investigate long table creation
>
> Key: IGNITE-21044
> URL: https://issues.apache.org/jira/browse/IGNITE-21044
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3
>
> If we run a test that creates a lot of tables (more than 200, for example), we soon start seeing a degradation in table creation time.
> In particular, handling of the corresponding Catalog update might literally take seconds.
> One of the reasons is described here: https://issues.apache.org/jira/browse/IGNITE-19913
> It explains why table creation might be slow, but it does not explain why it degrades when we create more tables. So there are basically two issues:
> * watch processing waits for unnecessary operations to complete
> * those operations are too slow for some reason
> We need to investigate and fix both issues.
[jira] [Assigned] (IGNITE-21466) Add metrics for partition states
[ https://issues.apache.org/jira/browse/IGNITE-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21466: -- Assignee: Ivan Bessonov > Add metrics for partition states > > > Key: IGNITE-21466 > URL: https://issues.apache.org/jira/browse/IGNITE-21466 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21466) Add metrics for partition states
Ivan Bessonov created IGNITE-21466: -- Summary: Add metrics for partition states Key: IGNITE-21466 URL: https://issues.apache.org/jira/browse/IGNITE-21466 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21465) Add system views for partition states
Ivan Bessonov created IGNITE-21465: -- Summary: Add system views for partition states Key: IGNITE-21465 URL: https://issues.apache.org/jira/browse/IGNITE-21465 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21446) Import JVM args from build.gradle for JUnit run configurations
[ https://issues.apache.org/jira/browse/IGNITE-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21446: --- Ignite Flags: (was: Docs Required,Release Notes Required) > Import JVM args from build.gradle for JUnit run configurations > -- > > Key: IGNITE-21446 > URL: https://issues.apache.org/jira/browse/IGNITE-21446 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > This should help running tests locally with IDEA runner on Java 17 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21446) Import JVM args from build.gradle for JUnit run configurations
[ https://issues.apache.org/jira/browse/IGNITE-21446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21446: --- Reviewer: Kirill Tkalenko > Import JVM args from build.gradle for JUnit run configurations > -- > > Key: IGNITE-21446 > URL: https://issues.apache.org/jira/browse/IGNITE-21446 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 10m > Remaining Estimate: 0h > > This should help running tests locally with IDEA runner on Java 17 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21446) Import JVM args from build.gradle for JUnit run configurations
Ivan Bessonov created IGNITE-21446: -- Summary: Import JVM args from build.gradle for JUnit run configurations Key: IGNITE-21446 URL: https://issues.apache.org/jira/browse/IGNITE-21446 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Fix For: 3.0.0-beta2 This should help running tests locally with IDEA runner on Java 17 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21434) Fail user write requests for non-available partitions
Ivan Bessonov created IGNITE-21434: -- Summary: Fail user write requests for non-available partitions Key: IGNITE-21434 URL: https://issues.apache.org/jira/browse/IGNITE-21434 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Currently, {{INSERT INTO test VALUES(%d, %d);}} just hangs indefinitely, which is not what you would expect. We should either fail the request immediately if there's no majority, or return a replication timeout exception, for example. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-20067) Optimize "StorageUpdateHandler#handleUpdateAll"
[ https://issues.apache.org/jira/browse/IGNITE-20067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-20067. Fix Version/s: 3.0.0-beta2 Reviewer: Ivan Bessonov Resolution: Fixed > Optimize "StorageUpdateHandler#handleUpdateAll" > --- > > Key: IGNITE-20067 > URL: https://issues.apache.org/jira/browse/IGNITE-20067 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Philipp Shergalis >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > In current implementation, the size of a single batch inside of the > "runConsistently" is unpredictable, because the collection of rows is > received from the message. > Generally speaking, it's a good idea to make the scope of single > "runConsistently" smaller - it would lead to faster work in all storage > engines: > * for rocksdb, write batches would become smaller; > * for page memory, spikes on checkpoint would become smaller. > There are two criteria that we could use: > * number of rows stored; > * cumulative number of inserted bytes. > Raft does the same approximation when batching log records, for example. This > should not affect the data consistency, because updateAll itself is > idempotent by its nature -- This message was sent by Atlassian Jira (v8.20.10#820010)
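A minimal sketch of the batching idea from IGNITE-20067, assuming rows arrive as raw byte arrays. `RowBatcher`, `split`, `maxRows` and `maxBytes` are hypothetical names chosen for illustration and are not Ignite APIs. Each resulting batch would be what a single "runConsistently" call processes; both criteria from the ticket (row count and cumulative bytes) bound a batch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Ignite: splits rows into bounded batches. */
final class RowBatcher {
    /**
     * Splits rows into consecutive batches so that each batch holds at most maxRows rows
     * and at most maxBytes bytes in total. A single row larger than maxBytes still gets
     * a batch of its own, so progress is always made.
     */
    static List<List<byte[]>> split(List<byte[]> rows, int maxRows, long maxBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;

        for (byte[] row : rows) {
            // Adding this row would violate one of the two limits.
            boolean full = current.size() >= maxRows || currentBytes + row.length > maxBytes;

            if (!current.isEmpty() && full) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }

            current.add(row);
            currentBytes += row.length;
        }

        if (!current.isEmpty()) {
            batches.add(current);
        }

        return batches;
    }
}
```

Since the ticket notes that updateAll is idempotent, replaying a batch after a failure should be safe under this scheme, though that depends on the actual handler.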
[jira] [Created] (IGNITE-21359) There are 2 RebalanceUtil classes
Ivan Bessonov created IGNITE-21359: -- Summary: There are 2 RebalanceUtil classes Key: IGNITE-21359 URL: https://issues.apache.org/jira/browse/IGNITE-21359 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Fix For: 3.0.0-beta2 and they duplicate constants and methods. The least that we could do is remove code duplication and maybe rename one of these classes -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21347) Fix license header extra whitespaces in ErrorCodeGroup annotation processor
[ https://issues.apache.org/jira/browse/IGNITE-21347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21347: --- Labels: ignite-3 (was: ) > Fix license header extra whitespaces in ErrorCodeGroup annotation processor > > > Key: IGNITE-21347 > URL: https://issues.apache.org/jira/browse/IGNITE-21347 > Project: Ignite > Issue Type: Improvement >Reporter: Dmitrii Zabotlin >Assignee: Dmitrii Zabotlin >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > There are extra whitespaces in the license headers in the generated error > codes files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21284) Internal API for manual raft group configuration update
[ https://issues.apache.org/jira/browse/IGNITE-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21284: -- Assignee: Ivan Bessonov > Internal API for manual raft group configuration update > --- > > Key: IGNITE-21284 > URL: https://issues.apache.org/jira/browse/IGNITE-21284 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > We need an API (with implementation) that's analogous to > "reset-lost-partitions", but with the ability to reuse living minority of > nodes. > This API should gather the states of partitions, identify healthy peers, and > use them as a new raft group configuration (through the update of > assignments). > We have to make sure that node with latest log index will become a leader, so > we will have to propagate desired minimum for log index in assignments and > use it during the voting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21309) DirectMessageWriter keeps holding used buffers
[ https://issues.apache.org/jira/browse/IGNITE-21309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21309: --- Reviewer: Kirill Tkalenko > DirectMessageWriter keeps holding used buffers > -- > > Key: IGNITE-21309 > URL: https://issues.apache.org/jira/browse/IGNITE-21309 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 10m > Remaining Estimate: 0h > > Thread-local optimized marshallers store links to write buffers in their > internal stacks, which could lead to occasional OOMs. We should release > buffers after writing nested messages in DirectMessageWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21309) DirectMessageWriter keeps holding used buffers
Ivan Bessonov created IGNITE-21309: -- Summary: DirectMessageWriter keeps holding used buffers Key: IGNITE-21309 URL: https://issues.apache.org/jira/browse/IGNITE-21309 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Fix For: 3.0.0-beta2 Thread-local optimized marshallers store links to write buffers in their internal stacks, which could lead to occasional OOMs. We should release buffers after writing nested messages in DirectMessageWriter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21305) Internal API for truncating log suffix
Ivan Bessonov created IGNITE-21305: -- Summary: Internal API for truncating log suffix Key: IGNITE-21305 URL: https://issues.apache.org/jira/browse/IGNITE-21305 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov An API and implementation are needed to truncate the log suffix of peers in ERROR state that cannot proceed with applying commands. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21305) Internal API for truncating log suffix
[ https://issues.apache.org/jira/browse/IGNITE-21305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21305: --- Ignite Flags: (was: Docs Required,Release Notes Required) > Internal API for truncating log suffix > -- > > Key: IGNITE-21305 > URL: https://issues.apache.org/jira/browse/IGNITE-21305 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > > An API and implementation are needed to truncate the log suffix of peers in ERROR state > that cannot proceed with applying commands. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21256) Internal API for local partition states
[ https://issues.apache.org/jira/browse/IGNITE-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21256: --- Description: Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the list. We need an API (with implementation) to access the list of local partitions and their states. The way to determine them: * comparing current assignments with replica states * check the state machine, it might be broken or installing snapshot was: Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the list. We need an API to access the list of local partitions and their states. The way to determine them: * comparing current assignments with replica states * check the state machine, it might be broken or installing snapshot > Internal API for local partition states > --- > > Key: IGNITE-21256 > URL: https://issues.apache.org/jira/browse/IGNITE-21256 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the > list. We need an API (with implementation) to access the list of local > partitions and their states. The way to determine them: > * comparing current assignments with replica states > * check the state machine, it might be broken or installing snapshot -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21284) Internal API for manual raft group configuration update
[ https://issues.apache.org/jira/browse/IGNITE-21284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21284: --- Description: We need an API (with implementation) that's analogous to "reset-lost-partitions", but with the ability to reuse living minority of nodes. This API should gather the states of partitions, identify healthy peers, and use them as a new raft group configuration (through the update of assignments). We have to make sure that node with latest log index will become a leader, so we will have to propagate desired minimum for log index in assignments and use it during the voting. was: We need an API that's analogous to "reset-lost-partitions", but with the ability to reuse living minority of nodes. This API should gather the states of partitions, identify healthy peers, and use them as a new raft group configuration (through the update of assignments). We have to make sure that node with latest log index will become a leader, so we will have to propagate desired minimum for log index in assignments and use it during the voting. > Internal API for manual raft group configuration update > --- > > Key: IGNITE-21284 > URL: https://issues.apache.org/jira/browse/IGNITE-21284 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > We need an API (with implementation) that's analogous to > "reset-lost-partitions", but with the ability to reuse living minority of > nodes. > This API should gather the states of partitions, identify healthy peers, and > use them as a new raft group configuration (through the update of > assignments). > We have to make sure that node with latest log index will become a leader, so > we will have to propagate desired minimum for log index in assignments and > use it during the voting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21304) Internal API for restarting partitions
Ivan Bessonov created IGNITE-21304: -- Summary: Internal API for restarting partitions Key: IGNITE-21304 URL: https://issues.apache.org/jira/browse/IGNITE-21304 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov API and implementation should be provided for restarting peers in raft groups. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21303) Exclude nodes in "error" state from manual group reconfiguration
Ivan Bessonov created IGNITE-21303: -- Summary: Exclude nodes in "error" state from manual group reconfiguration Key: IGNITE-21303 URL: https://issues.apache.org/jira/browse/IGNITE-21303 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Instead of simply using the existing set of nodes as a baseline for new assignments, we should either exclude peers in ERROR state from it, or force data cleanup on such nodes. A third option is to forbid such reconfiguration, forcing the user to clear ERROR peers in advance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21302) Prohibit automatic group reconfiguration when there's no majority
Ivan Bessonov created IGNITE-21302: -- Summary: Prohibit automatic group reconfiguration when there's no majority Key: IGNITE-21302 URL: https://issues.apache.org/jira/browse/IGNITE-21302 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov The scaleDown timer should not lead to a situation where the user loses data. The default "changePeers" behavior also won't work, because there's no majority and thus no leader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21301) Sync raft log before flush in all storage engines
Ivan Bessonov created IGNITE-21301: -- Summary: Sync raft log before flush in all storage engines Key: IGNITE-21301 URL: https://issues.apache.org/jira/browse/IGNITE-21301 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Checkpoints and RocksDB's flush actions should sync the log before completing writing data to disk, if "fsync" is disabled -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21300) Implement disaster recovery for secondary indexes
Ivan Bessonov created IGNITE-21300: -- Summary: Implement disaster recovery for secondary indexes Key: IGNITE-21300 URL: https://issues.apache.org/jira/browse/IGNITE-21300 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov It is possible that if we lose part of the log, some available indexes might become "locally" unavailable. We will have to finish the build process a second time in such a case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21299) Rest API for disaster recovery commands
Ivan Bessonov created IGNITE-21299: -- Summary: Rest API for disaster recovery commands Key: IGNITE-21299 URL: https://issues.apache.org/jira/browse/IGNITE-21299 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Please refer to https://issues.apache.org/jira/browse/IGNITE-21298 for a list -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21298) CLI for disaster recovery commands
Ivan Bessonov created IGNITE-21298: -- Summary: CLI for disaster recovery commands Key: IGNITE-21298 URL: https://issues.apache.org/jira/browse/IGNITE-21298 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Names might change. * ignite restart-partitions --nodes [--zones ] [--partitions ] [--purge] * ignite reset-lost-partitions [--zones ] [--partitions ] * ignite truncate-log-suffix --zone --partition --index * ignite partition-states [--local [--nodes ] | --global] [--zones ] [--partitions ] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21295) Public Java API for manual raft group configuration update
Ivan Bessonov created IGNITE-21295: -- Summary: Public Java API for manual raft group configuration update Key: IGNITE-21295 URL: https://issues.apache.org/jira/browse/IGNITE-21295 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Implement public API for IGNITE-21284 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21257) Public Java API to get global partition states
[ https://issues.apache.org/jira/browse/IGNITE-21257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21257: --- Summary: Public Java API to get global partition states (was: Public API to get global partition states) > Public Java API to get global partition states > -- > > Key: IGNITE-21257 > URL: https://issues.apache.org/jira/browse/IGNITE-21257 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the > list. > We should use local partition states, implemented in IGNITE-21256, and > combine them in cluster-wide compute call, before returning to the user. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21284) Internal API for manual raft group configuration update
Ivan Bessonov created IGNITE-21284: -- Summary: Internal API for manual raft group configuration update Key: IGNITE-21284 URL: https://issues.apache.org/jira/browse/IGNITE-21284 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov We need an API that's analogous to "reset-lost-partitions", but with the ability to reuse living minority of nodes. This API should gather the states of partitions, identify healthy peers, and use them as a new raft group configuration (through the update of assignments). We have to make sure that node with latest log index will become a leader, so we will have to propagate desired minimum for log index in assignments and use it during the voting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
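A rough sketch of the reconfiguration rule described in IGNITE-21284, assuming the last log index of every healthy peer has already been gathered. `ManualReconfiguration`, `Assignments`, `fromHealthyPeers` and `mayGrantVote` are hypothetical names, not Ignite or JRaft APIs, and the voting check is a deliberate simplification: real Raft voting also compares terms. The point shown is only that the highest observed log index is carried in the new assignments and used as a lower bound for leader candidates.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch, not an Ignite API. */
final class ManualReconfiguration {
    /** New group configuration plus the minimum log index a leader must have. */
    static final class Assignments {
        final Set<String> peers;
        final long minLeaderLogIndex;

        Assignments(Set<String> peers, long minLeaderLogIndex) {
            this.peers = peers;
            this.minLeaderLogIndex = minLeaderLogIndex;
        }
    }

    /** Builds assignments from healthy peers only; the key is a peer name, the value its last log index. */
    static Assignments fromHealthyPeers(Map<String, Long> lastLogIndexByHealthyPeer) {
        if (lastLogIndexByHealthyPeer.isEmpty()) {
            throw new IllegalStateException("No healthy peers to build a configuration from");
        }

        // The peer with the latest log index defines the bar for leadership.
        long maxIndex = Collections.max(lastLogIndexByHealthyPeer.values());

        return new Assignments(new TreeSet<>(lastLogIndexByHealthyPeer.keySet()), maxIndex);
    }

    /** Simplified voting rule: only members that are not behind the required index may win. */
    static boolean mayGrantVote(Assignments assignments, String candidate, long candidateLastLogIndex) {
        return assignments.peers.contains(candidate)
                && candidateLastLogIndex >= assignments.minLeaderLogIndex;
    }
}
```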
[jira] [Assigned] (IGNITE-21256) Internal API for local partition states
[ https://issues.apache.org/jira/browse/IGNITE-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21256: -- Assignee: Ivan Bessonov > Internal API for local partition states > --- > > Key: IGNITE-21256 > URL: https://issues.apache.org/jira/browse/IGNITE-21256 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the > list. We need an API to access the list of local partitions and their states. > The way to determine them: > * comparing current assignments with replica states > * check the state machine, it might be broken or installing snapshot -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21234) Acquired checkpoint read lock waits for schedules checkpoint write unlock sometimes
[ https://issues.apache.org/jira/browse/IGNITE-21234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21234: --- Reviewer: Kirill Tkalenko > Acquired checkpoint read lock waits for schedules checkpoint write unlock > sometimes > --- > > Key: IGNITE-21234 > URL: https://issues.apache.org/jira/browse/IGNITE-21234 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 50m > Remaining Estimate: 0h > > In a situation where we have "too many dirty pages" we trigger checkpoint and > wait until it starts. This can take seconds, because we have to flush > free-lists before acquiring checkpoint write lock. This can cause severe dips > in performance for no good reason. > I suggest introducing two modes for triggering checkpoints when we have too > many dirty pages: soft threshold and hard threshold. > * soft - trigger checkpoint, but don't wait for its start. Just continue all > operations as usual. Make it like a current threshold - 75% of any existing > memory segment must be dirty. > * hard - trigger checkpoint and wait until it starts. The way it behaves > right now. Make it higher than current threshold - 90% of any existing memory > segment must be dirty. > Maybe we should use different values for thresholds, that should be discussed > during the review -- This message was sent by Atlassian Jira (v8.20.10#820010)
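The soft/hard threshold proposal in IGNITE-21234 can be sketched as a small decision function. This is an illustration only: `DirtyPagesPolicy`, `Action` and `decide` are hypothetical names, and the 75% and 90% values are the ones suggested in the ticket, which itself says they are open to discussion. Integer arithmetic is used so the boundary cases are exact.

```java
/** Hypothetical sketch of the two-threshold policy; not Ignite code. */
final class DirtyPagesPolicy {
    enum Action {
        /** Below both thresholds: do nothing. */
        NONE,
        /** Soft threshold reached: trigger a checkpoint, keep working. */
        TRIGGER_ASYNC,
        /** Hard threshold reached: trigger a checkpoint and wait until it starts. */
        TRIGGER_AND_WAIT
    }

    static final int SOFT_PERCENT = 75;
    static final int HARD_PERCENT = 90;

    /** Decides what to do for one memory segment, given its dirty and total page counts. */
    static Action decide(long dirtyPages, long totalPages) {
        if (totalPages <= 0 || dirtyPages < 0) {
            throw new IllegalArgumentException("Invalid page counts");
        }

        if (dirtyPages * 100 >= totalPages * HARD_PERCENT) {
            return Action.TRIGGER_AND_WAIT;
        }

        if (dirtyPages * 100 >= totalPages * SOFT_PERCENT) {
            return Action.TRIGGER_ASYNC;
        }

        return Action.NONE;
    }
}
```

Because the ticket says the condition applies to any existing memory segment, a caller would presumably evaluate every segment and act on the most severe result.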
[jira] [Created] (IGNITE-21257) Public API to get global partition states
Ivan Bessonov created IGNITE-21257: -- Summary: Public API to get global partition states Key: IGNITE-21257 URL: https://issues.apache.org/jira/browse/IGNITE-21257 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the list. We should use local partition states, implemented in IGNITE-21256, and combine them in cluster-wide compute call, before returning to the user. -- This message was sent by Atlassian Jira (v8.20.10#820010)
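A minimal sketch of how local states might be combined into a global state, following the definitions in the IGNITE-21140 epic: a partition is available when a majority of peers are healthy, read-only when there is no healthy majority but at least one healthy or catching-up peer, and unavailable otherwise. `PartitionStates`, `combine` and the enum names are hypothetical, not the API these tickets will introduce, and peers that are offline are assumed to be simply absent from the reported list.

```java
import java.util.List;

/** Hypothetical sketch, not an Ignite API. */
final class PartitionStates {
    enum LocalState { HEALTHY, INITIALIZING, INSTALLING_SNAPSHOT, CATCHING_UP, BROKEN }

    enum GlobalState { AVAILABLE, READ_ONLY, UNAVAILABLE }

    /**
     * Combines the local states reported by online peers into one global state.
     *
     * @param reported  local states of peers that responded; offline peers are absent
     * @param groupSize configured number of peers in the raft group
     */
    static GlobalState combine(List<LocalState> reported, int groupSize) {
        long healthy = reported.stream().filter(s -> s == LocalState.HEALTHY).count();
        long alive = reported.stream()
                .filter(s -> s == LocalState.HEALTHY || s == LocalState.CATCHING_UP)
                .count();

        // Strict majority of the configured group must be healthy to accept writes.
        if (healthy * 2 > groupSize) {
            return GlobalState.AVAILABLE;
        }

        // No majority, but some peer can still serve historical read-only queries.
        if (alive > 0) {
            return GlobalState.READ_ONLY;
        }

        return GlobalState.UNAVAILABLE;
    }
}
```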
[jira] [Created] (IGNITE-21256) Internal API for local partition states
Ivan Bessonov created IGNITE-21256: -- Summary: Internal API for local partition states Key: IGNITE-21256 URL: https://issues.apache.org/jira/browse/IGNITE-21256 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Please refer to https://issues.apache.org/jira/browse/IGNITE-21140 for the list. We need an API to access the list of local partitions and their states. The way to determine them: * comparing current assignments with replica states * check the state machine, it might be broken or installing snapshot -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21140) Ignite 3 Disaster Recovery
[ https://issues.apache.org/jira/browse/IGNITE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21140: --- Description: This epic is related to issues that may happen with users when part of their data becomes unavailable for some reasons, like "node is lost", or "part of the storage is lost", etc. Following definitions will be used throughout: Local partition states. A local property of replica, storage, state machine, etc., associated with the partition: * _Healthy_ State machine is running, everything’s fine. * _Initializing_ Ignite node is online, but the corresponding raft group is yet to complete its initialization. * _Snapshot installation_ Full state transfer is taking place. Once it’s finished, the partition will become _healthy_ or {_}catching-up{_}. Before that, data can’t be read, and log replication is also on pause. * _Catching-up_ Node is in the process of replicating data from the leader, and its data is a little bit in the past. This state can only be observed from the leader, because only the leader has the latest committed index and the state of every peer. * _Broken_ Something’s wrong with the state machine. Some data might be unavailable for reading, log can’t be replicated, and this state won’t be changed automatically without intervention. * Global partition states. A global property of a partition, that specifies its apparent functionality from user’s point of view: * _Available partition_ Healthy partition that can process read and write requests. This means that the majority of peers are healthy at the moment. * _Read-only partition_ Partition that can process read requests, but can’t process write requests. There’s no healthy majority, but there’s at least one alive (healthy/catch-up) peer that can process historical read-only queries. * _Unavailable partition_ Partition that can’t process any requests. 
Building blocks are a set of operations that can be executed by Ignite or by the user in order to improve cluster state. Each building block must either be an automatic action with configurable timeout (if applicable), or a documented API, with mandatory diagnostics/metrics that would allow users to make decisions about these actions. # Offline Ignite node is brought back online, having all recent data. _Not a disaster recovery mechanism, but worth mentioning._ A node with usable data, that doesn’t require full state transfer, will become a peer, will participate in voting and replication, allowing partition to be _available_ if majority is healthy. This is the best case for the user, where they simply restart offline nodes and the cluster continues being operable. # Automatic group scale-down. Should happen when an Ignite node is offline for too long. Not a disaster recovery mechanism, but worth mentioning. Only happens when the majority is online, meaning that user data is safe. # Manual partition restart. Should be performed manually for broken peers. # Manual group peers/learners reconfiguration. Should be performed on a group manually, if the majority is considered permanently lost. # Freshly re-entering the group. Should happen when an Ignite node is returned back to the group, but partition data is missing. # Cleaning the partition data. If, for some reason, we know that a certain partition on a certain node is broken, we may ask Ignite to drop its data and re-enter the group empty (as stated in option 5). Having a dedicated operation for cleaning the partition is preferable, because: ## partition is stored in several storages ## not all of them have a “file per partition” storage format, not even close ## there’s also raft log that should be cleaned, most likely ## maybe raft meta as well # Partial truncation of the log’s suffix. This is a case of partial cleanup of partition data. 
This operation might be useful if we know that there’s junk in the log, but storages are not corrupted, so there’s a chance to save some data. Can be replaced with “clean partition data”. In order for the user to make decisions about manual operations, we must provide partition states for all partitions in all tables/zones. Both global and local states. Global states are more important, because they directly correlate with user experience. Some states will automatically lead to “available” partitions, if the system overall is healthy and we simply wait for some time. For example, we wait until a snapshot installation, or a rebalance is complete, and we’re happy. This is not considered a building block, because it’s a natural artifact of the architecture. Current list is not exhaustive, it consists of basic actions that we could implement that would cover a wide range of potential issues. Any other addition to the list of basic blocks would simply refine it, potentially allowing users to recover faster, or with less data being
[jira] [Created] (IGNITE-21245) Don't store applied revision in Vault
Ivan Bessonov created IGNITE-21245: -- Summary: Don't store applied revision in Vault Key: IGNITE-21245 URL: https://issues.apache.org/jira/browse/IGNITE-21245 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Fix For: 3.0.0-beta2 In a newer local node recovery implementation we stopped relying on Vault data, but didn't remove APPLIED_REV_KEY, which might confuse some developers. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21140) Ignite 3 Disaster Recovery
[ https://issues.apache.org/jira/browse/IGNITE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21140: --- Description: This epic is related to issues that may happen with users when part of their data becomes unavailable for some reasons, like "node is lost", or "part of the storage is lost", etc. Following definitions will be used throughout: * Local partition states. A local property of replica, storage, state machine, etc., associated with the partition: * _Healthy_ State machine is running, everything’s fine. * _Initializing_ Ignite node is online, but the corresponding raft group is yet to complete its initialization. * _Snapshot installation_ Full state transfer is taking place. Once it’s finished, the partition will become _healthy_ or {_}catching-up{_}. Before that, data can’t be read, and log replication is also on pause. * _Catching-up_ Node is in the process of replicating data from the leader, and its data is a little bit in the past. This state can only be observed from the leader, because only the leader has the latest committed index and the state of every peer. * _Broken_ Something’s wrong with the state machine. Some data might be unavailable for reading, log can’t be replicated, and this state won’t be changed automatically without intervention. * Global partition states. A global property of a partition, that specifies its apparent functionality from user’s point of view: * _Available partition_ Healthy partition that can process read and write requests. This means that the majority of peers are healthy at the moment. * _Read-only partition_ Partition that can process read requests, but can’t process write requests. There’s no healthy majority, but there’s at least one alive (healthy/catch-up) peer that can process historical read-only queries. * _Unavailable partition_ Partition that can’t process any requests. 
Building blocks are a set of operations that can be executed by Ignite or by the user in order to improve cluster state. Each building block must either be an automatic action with configurable timeout (if applicable), or a documented API, with mandatory diagnostics/metrics that would allow users to make decisions about these actions. # Offline Ignite node is brought back online, having all recent data. _Not a disaster recovery mechanism, but worth mentioning._ A node with usable data, that doesn’t require full state transfer, will become a peer, will participate in voting and replication, allowing partition to be _available_ if majority is healthy. This is the best case for the user, where they simply restart offline nodes and the cluster continues being operable. # Automatic group scale-down. Should happen when an Ignite node is offline for too long. Not a disaster recovery mechanism, but worth mentioning. Only happens when the majority is online, meaning that user data is safe. # Manual partition restart. Should be performed manually for broken peers. # Manual group peers/learners reconfiguration. Should be performed on a group manually, if the majority is considered permanently lost. # Freshly re-entering the group. Should happen when an Ignite node is returned back to the group, but partition data is missing. # Cleaning the partition data. If, for some reason, we know that a certain partition on a certain node is broken, we may ask Ignite to drop its data and re-enter the group empty (as stated in option 5). Having a dedicated operation for cleaning the partition is preferable, because: ## partition is stored in several storages ## not all of them have a “file per partition” storage format, not even close ## there’s also raft log that should be cleaned, most likely ## maybe raft meta as well # Partial truncation of the log’s suffix. This is a case of partial cleanup of partition data. 
This operation might be useful if we know that there’s junk in the log, but storages are not corrupted, so there’s a chance to save some data. Can be replaced with “clean partition data”. In order for the user to make decisions about manual operations, we must provide partition states for all partitions in all tables/zones. Both global and local states. Global states are more important, because they directly correlate with user experience. Some states will automatically lead to “available” partitions, if the system overall is healthy and we simply wait for some time. For example, we wait until a snapshot installation, or a rebalance is complete, and we’re happy. This is not considered a building block, because it’s a natural artifact of the architecture. Current list is not exhaustive, it consists of basic actions that we could implement that would cover a wide range of potential issues. Any other addition to the list of basic blocks would simply refine it, potentially allowing users to recover faster, or with less data being
[jira] [Updated] (IGNITE-21140) Ignite 3 Disaster Recovery
[ https://issues.apache.org/jira/browse/IGNITE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21140: --- Description: This epic is related to issues that may happen with users when part of their data becomes unavailable for some reasons, like "node is lost", or "part of the storage is lost", etc. Following definitions will be used throughout: * Local partition states. A local property of replica, storage, state machine, etc., associated with the partition: * {_}Healthy{_} State machine is running, everything’s fine. * {_}Initializing{_} Ignite node is online, but the corresponding raft group is yet to complete its initialization. * {_}Snapshot installation{_} Full state transfer is taking place. Once it’s finished, the partition will become _healthy_ or {_}catching-up{_}. Before that, data can’t be read, and log replication is also on pause. * {_}Catching-up{_} Node is in the process of replicating data from the leader, and its data is a little bit in the past. This state can only be observed from the leader, because only the leader has the latest committed index and the state of every peer. * {_}Broken{_} Something’s wrong with the state machine. Some data might be unavailable for reading, log can’t be replicated, and this state won’t be changed automatically without intervention. * Global partition states. A global property of a partition, that specifies its apparent functionality from user’s point of view: * {_}Available partition{_} Healthy partition that can process read and write requests. This means that the majority of peers are healthy at the moment. * {_}Read-only partition{_} Partition that can process read requests, but can’t process write requests. There’s no healthy majority, but there’s at least one alive (healthy/catch-up) peer that can process historical read-only queries. * {_}Unavailable partition {_}Partition that can’t process any requests. 
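The relationship between local and global partition states defined above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `PartitionStates`, the enum names and the `groupSize` parameter are hypothetical and are not taken from the Ignite code base; "majority" is read here as a strict majority of all peers in the group, including offline ones.

```java
import java.util.List;

final class PartitionStates {
    /** Local states, named after the definitions in the description (hypothetical enum). */
    enum LocalState { HEALTHY, INITIALIZING, SNAPSHOT_INSTALLATION, CATCHING_UP, BROKEN }

    /** Global states, named after the definitions in the description (hypothetical enum). */
    enum GlobalState { AVAILABLE, READ_ONLY, UNAVAILABLE }

    /**
     * Derives the global state of a partition.
     *
     * @param groupSize total number of peers in the group, offline peers included (assumption)
     * @param onlinePeerStates local states reported by the peers that are currently online
     */
    static GlobalState globalState(int groupSize, List<LocalState> onlinePeerStates) {
        long healthy = onlinePeerStates.stream()
                .filter(s -> s == LocalState.HEALTHY)
                .count();

        // "Alive" peers are the ones able to serve historical read-only queries.
        long alive = onlinePeerStates.stream()
                .filter(s -> s == LocalState.HEALTHY || s == LocalState.CATCHING_UP)
                .count();

        if (healthy * 2 > groupSize) {
            return GlobalState.AVAILABLE; // a strict majority of peers is healthy
        }

        if (alive > 0) {
            return GlobalState.READ_ONLY; // no healthy majority, but reads are still possible
        }

        return GlobalState.UNAVAILABLE;
    }
}
```

For a group of three peers, two healthy peers give an available partition, a single catching-up peer gives a read-only one, and a group with only broken or initializing peers is unavailable.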
Building blocks are a set of operations that can be executed by Ignite or by the user in order to improve cluster state. Each building block must either be an automatic action with configurable timeout (if applicable), or a documented API, with mandatory diagnostics/metrics that would allow users to make decisions about these actions. # Offline Ignite node is brought back online, having all recent data. {_}Not a disaster recovery mechanism, but worth mentioning.{_} A node with usable data, that doesn’t require full state transfer, will become a peer, will participate in voting and replication, allowing partition to be _available_ if majority is healthy. This is the best case for the user, where they simply restart offline nodes and the cluster continues being operable. # Automatic group scale-down. Should happen when an Ignite node is offline for too long. Not a disaster recovery mechanism, but worth mentioning. Only happens when the majority is online, meaning that user data is safe. # Manual partition restart. Should be performed manually for broken peers. # Manual group peers/learners reconfiguration. Should be performed on a group manually, if the majority is considered permanently lost. # Freshly re-entering the group. Should happen when an Ignite node is returned back to the group, but partition data is missing. # Cleaning the partition data. If, for some reason, we know that a certain partition on a certain node is broken, we may ask Ignite to drop its data and re-enter the group empty (as stated in option 5). Having a dedicated operation for cleaning the partition is preferable, because: - partition is stored in several storages - not all of them have a “file per partition” storage format, not even close - there’s also raft log that should be cleaned, most likely - maybe raft meta as well # Partial truncation of the log’s suffix. This is a case of partial cleanup of partition data. 
This operation might be useful if we know that there’s junk in the log, but storages are not corrupted, so there’s a chance to save some data. Can be replaced with “clean partition data”. In order for the user to make decisions about manual operations, we must provide partition states for all partitions in all tables/zones. Both global and local states. Global states are more important, because they directly correlate with user experience. Some states will automatically lead to “available” partitions, if the system overall is healthy and we simply wait for some time. For example, we wait until a snapshot installation, or a rebalance is complete, and we’re happy. This is not considered a building block, because it’s a natural artifact of the architecture. Current list is not exhaustive, it consists of basic actions that we could implement that would cover a wide range of potential issues. Any other addition to the list of basic blocks would simply refine it, potentially allowing
[jira] [Updated] (IGNITE-21234) Acquired checkpoint read lock waits for schedules checkpoint write unlock sometimes
[ https://issues.apache.org/jira/browse/IGNITE-21234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21234: --- Summary: Acquired checkpoint read lock waits for schedules checkpoint write unlock sometimes (was: Checkpoint read lock waits for checkpoint write unlock sometimes) > Acquired checkpoint read lock waits for schedules checkpoint write unlock > sometimes > --- > > Key: IGNITE-21234 > URL: https://issues.apache.org/jira/browse/IGNITE-21234 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > In a situation where we have "too many dirty pages" we trigger checkpoint and > wait until it starts. This can take seconds, because we have to flush > free-lists before acquiring checkpoint write lock. This can cause severe dips > in performance for no good reason. > I suggest introducing two modes for triggering checkpoints when we have too > many dirty pages: soft threshold and hard threshold. > * soft - trigger checkpoint, but don't wait for its start. Just continue all > operations as usual. Make it like a current threshold - 75% of any existing > memory segment must be dirty. > * hard - trigger checkpoint and wait until it starts. The way it behaves > right now. Make it higher than current threshold - 90% of any existing memory > segment must be dirty. > Maybe we should use different values for thresholds, that should be discussed > during the review -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21234) Checkpoint read lock waits for checkpoint write unlock sometimes
[ https://issues.apache.org/jira/browse/IGNITE-21234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21234: --- Description: In a situation where we have "too many dirty pages" we trigger checkpoint and wait until it starts. This can take seconds, because we have to flush free-lists before acquiring checkpoint write lock. This can cause severe dips in performance for no good reason. I suggest introducing two modes for triggering checkpoints when we have too many dirty pages: soft threshold and hard threshold. * soft - trigger checkpoint, but don't wait for its start. Just continue all operations as usual. Make it like a current threshold - 75% of any existing memory segment must be dirty. * hard - trigger checkpoint and wait until it starts. The way it behaves right now. Make it higher than current threshold - 90% of any existing memory segment must be dirty. Maybe we should use different values for thresholds, that should be discussed during the review was: In a situation where we have "too many dirty pages" we trigger checkpoint and wait until it starts. This can take seconds, because we have to flush free-lists before acquiring checkpoint write lock. This can cause severe dips in performance for no good reason. I suggest introducing two modes for triggering checkpoints when we have too many dirty pages: soft threshold and hard threshold. * soft - trigger checkpoint, but don't wait for its start. Just continue all operations as usual. Make it like a current threshold - 75% of any existing memory segment must be dirty. * hard - trigger checkpoint and wait until it starts. The way it behaves right now. Make it higher than current threshold - 90% of any existing memory segment must be dirty. 
> Checkpoint read lock waits for checkpoint write unlock sometimes > > > Key: IGNITE-21234 > URL: https://issues.apache.org/jira/browse/IGNITE-21234 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > In a situation where we have "too many dirty pages" we trigger checkpoint and > wait until it starts. This can take seconds, because we have to flush > free-lists before acquiring checkpoint write lock. This can cause severe dips > in performance for no good reason. > I suggest introducing two modes for triggering checkpoints when we have too > many dirty pages: soft threshold and hard threshold. > * soft - trigger checkpoint, but don't wait for its start. Just continue all > operations as usual. Make it like a current threshold - 75% of any existing > memory segment must be dirty. > * hard - trigger checkpoint and wait until it starts. The way it behaves > right now. Make it higher than current threshold - 90% of any existing memory > segment must be dirty. > Maybe we should use different values for thresholds, that should be discussed > during the review -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21234) Checkpoint read lock waits for checkpoint write unlock sometimes
Ivan Bessonov created IGNITE-21234: -- Summary: Checkpoint read lock waits for checkpoint write unlock sometimes Key: IGNITE-21234 URL: https://issues.apache.org/jira/browse/IGNITE-21234 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov In a situation where we have "too many dirty pages" we trigger checkpoint and wait until it starts. This can take seconds, because we have to flush free-lists before acquiring checkpoint write lock. This can cause severe dips in performance for no good reason. I suggest introducing two modes for triggering checkpoints when we have too many dirty pages: soft threshold and hard threshold. * soft - trigger checkpoint, but don't wait for its start. Just continue all operations as usual. Make it like a current threshold - 75% of any existing memory segment must be dirty. * hard - trigger checkpoint and wait until it starts. The way it behaves right now. Make it higher than current threshold - 90% of any existing memory segment must be dirty. -- This message was sent by Atlassian Jira (v8.20.10#820010)
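The soft/hard threshold proposal can be illustrated with a small decision function. This is a rough sketch, not the actual checkpoint code: the class name `DirtyPagesThrottle`, the `Action` enum and the method signature are hypothetical, and the 75% and 90% values are the ones suggested in the description, which are explicitly open for discussion.

```java
final class DirtyPagesThrottle {
    /** What the caller should do after a page has been marked dirty (hypothetical enum). */
    enum Action { NONE, TRIGGER_ASYNC, TRIGGER_AND_WAIT }

    /** Soft threshold from the description: trigger a checkpoint, don't wait for its start. */
    static final int SOFT_THRESHOLD_PERCENT = 75;

    /** Hard threshold from the description: trigger a checkpoint and wait until it starts. */
    static final int HARD_THRESHOLD_PERCENT = 90;

    /**
     * Picks an action based on the dirtiest memory segment.
     *
     * @param dirtyPages number of dirty pages per segment
     * @param totalPages total number of pages per segment, same indexing as dirtyPages
     */
    static Action onPageMarkedDirty(long[] dirtyPages, long[] totalPages) {
        boolean soft = false;

        for (int i = 0; i < dirtyPages.length; i++) {
            // Integer arithmetic avoids rounding issues when comparing against a percentage.
            if (dirtyPages[i] * 100 >= totalPages[i] * HARD_THRESHOLD_PERCENT) {
                return Action.TRIGGER_AND_WAIT; // any segment above the hard threshold
            }

            if (dirtyPages[i] * 100 >= totalPages[i] * SOFT_THRESHOLD_PERCENT) {
                soft = true;
            }
        }

        return soft ? Action.TRIGGER_ASYNC : Action.NONE;
    }
}
```

With this split, a writer that crosses only the soft threshold schedules a checkpoint and carries on; it blocks only once a segment is close to being completely dirty.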
[jira] [Updated] (IGNITE-21140) Ignite 3 Disaster Recovery
[ https://issues.apache.org/jira/browse/IGNITE-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21140: --- Description: This epic is related to issues that may happen with users when part of their data becomes unavailable for some reasons, like "node is lost", or "part of the storage is lost", etc. > Ignite 3 Disaster Recovery > -- > > Key: IGNITE-21140 > URL: https://issues.apache.org/jira/browse/IGNITE-21140 > Project: Ignite > Issue Type: Epic >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > This epic is related to issues that may happen with users when part of their > data becomes unavailable for some reasons, like "node is lost", or "part of > the storage is lost", etc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-20051) Add startup recovery to SchemaManager
[ https://issues.apache.org/jira/browse/IGNITE-20051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov resolved IGNITE-20051. Resolution: Fixed > Add startup recovery to SchemaManager > - > > Key: IGNITE-20051 > URL: https://issues.apache.org/jira/browse/IGNITE-20051 > Project: Ignite > Issue Type: Improvement >Reporter: Roman Puchkovskiy >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > Currently, {{SchemaManager}} does not implement a proper recovery procedure > at start. It needs a way to get all tables from the CatalogService (including > the dropped ones). Also, it must make sure that versions of the tables that > were missed due to being offline are added to the schemas storage as a result > of recovery. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21205) Don't store table versions in meta-storage in SchemaManager
[ https://issues.apache.org/jira/browse/IGNITE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21205: --- Description: Current implementation blocks meta-storage watch thread (doesn't allow new watch events to be processed, to be precise) until new schema version is written into the meta-storage. This is an expensive IO operation, and it might introduce unexpected turbulence. Some scenarios greatly suffer from it. For example, we can't process lease updates while we're writing into meta-storage, which shouldn't be the case. was: Current implementation blocks meta-storage watch thread until new schema version is written into the meta-storage. This is an expensive IO operation, and it might introduce unexpected turbulence. Some scenarios greatly suffer from it. For example, we can't process lease updates while we're writing into meta-storage, which shouldn't be the case. > Don't store table versions in meta-storage in SchemaManager > --- > > Key: IGNITE-21205 > URL: https://issues.apache.org/jira/browse/IGNITE-21205 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 20m > Remaining Estimate: 0h > > Current implementation blocks meta-storage watch thread (doesn't allow new > watch events to be processed, to be precise) until new schema version is > written into the meta-storage. This is an expensive IO operation, and it > might introduce unexpected turbulence. > Some scenarios greatly suffer from it. For example, we can't process lease > updates while we're writing into meta-storage, which shouldn't be the case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21205) Don't store table versions in meta-storage in SchemaManager
[ https://issues.apache.org/jira/browse/IGNITE-21205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21205: --- Description: Current implementation blocks meta-storage watch thread until new schema version is written into the meta-storage. This is an expensive IO operation, and it might introduce unexpected turbulence. Some scenarios greatly suffer from it. For example, we can't process lease updates while we're writing into meta-storage, which shouldn't be the case. was:TBD > Don't store table versions in meta-storage in SchemaManager > --- > > Key: IGNITE-21205 > URL: https://issues.apache.org/jira/browse/IGNITE-21205 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > Current implementation blocks meta-storage watch thread until new schema > version is written into the meta-storage. This is an expensive IO operation, > and it might introduce unexpected turbulence. > Some scenarios greatly suffer from it. For example, we can't process lease > updates while we're writing into meta-storage, which shouldn't be the case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21205) Don't store table versions in meta-storage in SchemaManager
Ivan Bessonov created IGNITE-21205: -- Summary: Don't store table versions in meta-storage in SchemaManager Key: IGNITE-21205 URL: https://issues.apache.org/jira/browse/IGNITE-21205 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21204) Use shared rocksdb instance for all TX state storages
Ivan Bessonov created IGNITE-21204: -- Summary: Use shared rocksdb instance for all TX state storages Key: IGNITE-21204 URL: https://issues.apache.org/jira/browse/IGNITE-21204 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Current implementation uses too many resources if you create multiple tables. Table creation time suffers too. We need to use the same approach as "rocksdb" storage engine uses. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21196) PrimaryReplicaEvent handling is inefficient
[ https://issues.apache.org/jira/browse/IGNITE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21196: --- Reviewer: Vladislav Pyatkov (was: Vladislav Pyatkov) > PrimaryReplicaEvent handling is inefficient > --- > > Key: IGNITE-21196 > URL: https://issues.apache.org/jira/browse/IGNITE-21196 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently, every partition replica listener has its own set of instances of > {{PrimaryReplicaEvent}} listeners. In > {{LeaseTracker.UpdateListener#onUpdate}} we create these events in a loop. > > This results in 2 nested loops, which might be extremely inefficient if we > have a lot of replicas on the node. Most iterations will do nothing > because the following condition won't pass: > {{!replicationGroupId.equals(evt.groupId())}} > > I suggest subscribing to these events in {{ReplicaManager}} and performing the > necessary filtering in advance. Such a change will greatly improve the > performance of lease update watch event processing -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21196) PrimaryReplicaEvent handling is inefficient
[ https://issues.apache.org/jira/browse/IGNITE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21196: --- Ignite Flags: (was: Docs Required,Release Notes Required) > PrimaryReplicaEvent handling is inefficient > --- > > Key: IGNITE-21196 > URL: https://issues.apache.org/jira/browse/IGNITE-21196 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently, every partition replica listener has its own set of instances of > {{PrimaryReplicaEvent}} listeners. In > {{LeaseTracker.UpdateListener#onUpdate}} we create these events in a loop. > > This results in 2 nested loops, which might be extremely inefficient if we > have a lot of replicas on the node. Most iterations will do nothing > because the following condition won't pass: > {{!replicationGroupId.equals(evt.groupId())}} > > I suggest subscribing to these events in {{ReplicaManager}} and performing the > necessary filtering in advance. Such a change will greatly improve the > performance of lease update watch event processing -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21198) Optimize memory usage of AbstractEventProducer#fireEvent
[ https://issues.apache.org/jira/browse/IGNITE-21198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21198: --- Ignite Flags: (was: Docs Required,Release Notes Required) > Optimize memory usage of AbstractEventProducer#fireEvent > > > Key: IGNITE-21198 > URL: https://issues.apache.org/jira/browse/IGNITE-21198 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > In the current implementation, most listeners do their work synchronously and > return already completed futures. In such cases there's no point in allocating > the entire array of futures and filling it. > Another reason for not allocating an array right away is the fact that we may > have a big number of listeners, and allocating an array will be expensive and > wasteful. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21198) Optimize memory usage of AbstractEventProducer#fireEvent
Ivan Bessonov created IGNITE-21198: -- Summary: Optimize memory usage of AbstractEventProducer#fireEvent Key: IGNITE-21198 URL: https://issues.apache.org/jira/browse/IGNITE-21198 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov In the current implementation, most listeners do their work synchronously and return already completed futures. In such cases there's no point in allocating the entire array of futures and filling it. Another reason for not allocating an array right away is the fact that we may have a big number of listeners, and allocating an array will be expensive and wasteful. -- This message was sent by Atlassian Jira (v8.20.10#820010)
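One way to read the proposal is "collect only the futures that are not yet complete, and allocate the collection lazily". The sketch below shows that idea in isolation. It is not the real `AbstractEventProducer`: the class name `LazyFireEvent` and the representation of listeners as `Function<E, CompletableFuture<?>>` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class LazyFireEvent {
    /**
     * Notifies all listeners and returns a future that completes when every listener is done.
     * Nothing is allocated for listeners that have already completed successfully.
     */
    static <E> CompletableFuture<Void> fireEvent(E event, List<Function<E, CompletableFuture<?>>> listeners) {
        List<CompletableFuture<?>> pending = null; // allocated on first use only

        for (Function<E, CompletableFuture<?>> listener : listeners) {
            CompletableFuture<?> future = listener.apply(event);

            // Synchronous listeners return a completed future: nothing to wait for.
            // Failed futures are kept so that the failure is propagated to the caller.
            if (future.isDone() && !future.isCompletedExceptionally()) {
                continue;
            }

            if (pending == null) {
                pending = new ArrayList<>();
            }

            pending.add(future);
        }

        if (pending == null) {
            return CompletableFuture.completedFuture(null);
        }

        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }
}
```

When all listeners are synchronous, the only allocation on this path is the already completed result future.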
[jira] [Created] (IGNITE-21196) PrimaryReplicaEvent handling is inefficient
Ivan Bessonov created IGNITE-21196: -- Summary: PrimaryReplicaEvent handling is inefficient Key: IGNITE-21196 URL: https://issues.apache.org/jira/browse/IGNITE-21196 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Assignee: Ivan Bessonov Currently, every partition replica listener has its own set of instances of {{PrimaryReplicaEvent}} listeners. In {{LeaseTracker.UpdateListener#onUpdate}} we create these events in a loop. This results in 2 nested loops, which might be extremely inefficient if we have a lot of replicas on the node. Most iterations will do nothing because the following condition won't pass: {{!replicationGroupId.equals(evt.groupId())}} I suggest subscribing to these events in {{ReplicaManager}} and performing the necessary filtering in advance. Such a change will greatly improve the performance of lease update watch event processing -- This message was sent by Atlassian Jira (v8.20.10#820010)
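The suggested "filtering in advance" could look roughly like a single subscription that routes each event by group id, so that an event costs one map lookup regardless of the number of replicas. The sketch is generic on purpose: `GroupEventDispatcher` and its methods are hypothetical names, not the `ReplicaManager` API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

final class GroupEventDispatcher<G, E> {
    /** One listener per replication group, keyed by group id. */
    private final Map<G, Consumer<E>> listenersByGroup = new ConcurrentHashMap<>();

    void subscribe(G groupId, Consumer<E> listener) {
        listenersByGroup.put(groupId, listener);
    }

    void unsubscribe(G groupId) {
        listenersByGroup.remove(groupId);
    }

    /**
     * Delivers the event to the listener of the given group only.
     *
     * @return true if a listener was registered for the group
     */
    boolean dispatch(G groupId, E event) {
        Consumer<E> listener = listenersByGroup.get(groupId);

        if (listener == null) {
            return false; // no local replica for this group, nothing to do
        }

        listener.accept(event);

        return true;
    }
}
```

Compared with every replica listener checking every event, the work per lease update becomes proportional to the number of events rather than to events multiplied by replicas.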
[jira] [Created] (IGNITE-21140) Ignite 3 Disaster Recovery
Ivan Bessonov created IGNITE-21140: -- Summary: Ignite 3 Disaster Recovery Key: IGNITE-21140 URL: https://issues.apache.org/jira/browse/IGNITE-21140 Project: Ignite Issue Type: Epic Reporter: Ivan Bessonov -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-19819) Lease batches compaction
[ https://issues.apache.org/jira/browse/IGNITE-19819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799313#comment-17799313 ] Ivan Bessonov commented on IGNITE-19819: Please also consider a more space-efficient way to pack values into byte[], I have a quick POC here: [https://github.com/apache/ignite-3/pull/2976] The payload in the POC is about 10 times less than in main. Integrating similar ideas into the proposed improvement could make it better > Lease batches compaction > > > Key: IGNITE-19819 > URL: https://issues.apache.org/jira/browse/IGNITE-19819 > Project: Ignite > Issue Type: Improvement >Reporter: Denis Chudov >Priority: Major > Labels: ignite-3 > > *Motivation* > After IGNITE-19578 leases should be stored as a single batch in meta storage. > However, the size of such a batch is significant and can be reduced. > Each lease contains group name, leaseholder name, left and right timestamp > and couple of boolean flags. > Many leases share the same leaseholder. Also, many leases share the same > right border, as batch of leases are renewed on every iteration of lease > updater and get the same right border. > So, the compacted data structure for all leases could be a map > {code:java} > right border -> set of leaseholders -> set of leases which contain only group > name, left border and flags.{code} > It is important that this data structure is applicable to meta storage > representation, in-memory representation of leases should remain the same. > *Definition of done* > Amount of space required for storing leases is significantly reduced. > *Implementation notes* > The key should be prefix + right border. On each iteration the corresponding > right border should be removed and new one put, so on each iteration there > will be done just one meta storage invoke. > To avoid ABA problem during leases' updates via invokes, entries should be > versioned, this can be done by assigning unique version to each right border > key. 
There are cases when all leaseholders can be removed from some entry and > then other leaseholders are added again (e.g. accepting leases and removing > them from the right border that matches long-term unaccepted leases, and later > adding the regular leases to the same right border). In these cases the entry > should not be removed from meta storage, even though it has no leases, to > preserve the version of the entry. > To avoid merging of batches, lease prolongation should be changed: new right > border should be calculated as current_right_border + lease_interval (now: > current_time + lease_interval ). -- This message was sent by Atlassian Jira (v8.20.10#820010)
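The compacted structure from the motivation section (right border -> leaseholders -> leases without the repeated fields) can be sketched as nested maps. This is only an in-memory illustration of the grouping, with assumptions: the classes `Lease` and `GroupEntry`, the use of `long` timestamps and the single `accepted` flag are hypothetical, and the byte-level meta storage encoding and entry versioning are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompactedLeases {
    /** Full lease, as it is kept in memory (hypothetical, simplified). */
    static final class Lease {
        final String group;
        final String leaseholder;
        final long leftBorder;
        final long rightBorder;
        final boolean accepted;

        Lease(String group, String leaseholder, long leftBorder, long rightBorder, boolean accepted) {
            this.group = group;
            this.leaseholder = leaseholder;
            this.leftBorder = leftBorder;
            this.rightBorder = rightBorder;
            this.accepted = accepted;
        }
    }

    /** What remains of a lease once the right border and the leaseholder are factored out. */
    static final class GroupEntry {
        final String group;
        final long leftBorder;
        final boolean accepted;

        GroupEntry(String group, long leftBorder, boolean accepted) {
            this.group = group;
            this.leftBorder = leftBorder;
            this.accepted = accepted;
        }
    }

    /** Groups leases as: right border -> leaseholder -> entries. */
    static Map<Long, Map<String, List<GroupEntry>>> compact(List<Lease> leases) {
        Map<Long, Map<String, List<GroupEntry>>> result = new TreeMap<>();

        for (Lease lease : leases) {
            result.computeIfAbsent(lease.rightBorder, k -> new TreeMap<>())
                    .computeIfAbsent(lease.leaseholder, k -> new ArrayList<>())
                    .add(new GroupEntry(lease.group, lease.leftBorder, lease.accepted));
        }

        return result;
    }
}
```

Each right border and each leaseholder name is stored once per batch instead of once per lease, which is where the saving is expected to come from.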
[jira] [Created] (IGNITE-21091) Send indexes data during full state transfer
Ivan Bessonov created IGNITE-21091: -- Summary: Send indexes data during full state transfer Key: IGNITE-21091 URL: https://issues.apache.org/jira/browse/IGNITE-21091 Project: Ignite Issue Type: Improvement Reporter: Ivan Bessonov Current full state transfer implementation dictates that receiver will build all secondary indexes on the fly. This might not be efficient: * receiver will have to extract index tuples from each row version * for rows with multiple versions, many of these tuples will be the same, which means that the "extraction" and "insertion" will be performed several times for the same value Of course, it all depends on how many versions each row has, but generally speaking, inserting data one time is always better than inserting it multiple times. It is proposed to send indexes the same way as we send version chains, by using scan operation and copy-on-write semantics during data modifications (on the sender). The specifics of the algorithm are not clear yet. This issue includes investigation of a proper approach that will eliminate * the possibility of data inconsistencies * data leaks * excessive memory consumption on sender -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21076) Creating the table with 1024 partitions is too slow
Ivan Bessonov created IGNITE-21076: -- Summary: Creating the table with 1024 partitions is too slow Key: IGNITE-21076 URL: https://issues.apache.org/jira/browse/IGNITE-21076 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov With the default 25 partitions, creating the table takes a few seconds. While not ideal, it's not too bad. But, when increasing the number of partitions to 1024 (default of Ignite 2), time increases to about 15-20 seconds. With such a small number of partitions, time shouldn't scale this drastically. Most of the time spent on creating a single partition should be a) creating storage and b) leader election. Assuming that, time for 25 and 1024 partitions should not differ this much. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21063: -- Assignee: Ivan Bessonov > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create about 500 tables locally. We > need to research, why it happens. Is there a leak, or we simply use too much > memory. > Main candidate: thread-local marshallers. For some reason, we use too many > threads, I guess? Meta-storage entries may be up to several megabytes in > current implementation. > We should limit the size of cached buffers, and number of threads in general. > Shared pool (priority-queue) of pre-allocated buffers would solve the issue, > they don't have to be thread-local. It's a bit slower, but it's not a problem > until proven otherwise -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM after a while, managing to create about 500 tables locally. We need to research, why it happens. Is there a leak, or we simply use too much memory. Main candidate: thread-local marshallers. For some reason, we use too many threads, I guess? Meta-storage entries may be up to several megabytes in current implementation. We should limit the size of cached buffers, and number of threads in general. Shared pool (priority-queue) of pre-allocated buffers would solve the issue, they don't have to be thread-local. It's a bit slower, but it's not a problem until proven otherwise was:Fails with OOM after a while, managing to create about 500 tables locally. We need to research, why it happens. Is there a leak, or we simply use too much memory > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create about 500 tables locally. We > need to research, why it happens. Is there a leak, or we simply use too much > memory. > Main candidate: thread-local marshallers. For some reason, we use too many > threads, I guess? Meta-storage entries may be up to several megabytes in > current implementation. > We should limit the size of cached buffers, and number of threads in general. > Shared pool (priority-queue) of pre-allocated buffers would solve the issue, > they don't have to be thread-local. It's a bit slower, but it's not a problem > until proven otherwise -- This message was sent by Atlassian Jira (v8.20.10#820010)
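The idea of replacing thread-local buffers with a shared, bounded pool can be sketched as follows. This is a simplified illustration, not the marshaller code: the class `SharedBufferPool` is hypothetical, it uses a plain FIFO `ArrayBlockingQueue` of equally sized buffers in place of the priority queue mentioned in the description, and it returns `null` when the pool is exhausted, leaving it to the caller to wait or to allocate a temporary buffer.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class SharedBufferPool {
    /** Buffers that are currently not in use. The capacity bounds the total memory of the pool. */
    private final BlockingQueue<ByteBuffer> free;

    SharedBufferPool(int bufferCount, int bufferSize) {
        free = new ArrayBlockingQueue<>(bufferCount);

        // All buffers are allocated up front, so memory usage does not depend on the thread count.
        for (int i = 0; i < bufferCount; i++) {
            free.add(ByteBuffer.allocate(bufferSize));
        }
    }

    /** Returns a free buffer, or null if all buffers are in use. */
    ByteBuffer tryAcquire() {
        return free.poll();
    }

    /** Returns a buffer to the pool; buffers that don't fit (e.g. released twice) are dropped. */
    void release(ByteBuffer buffer) {
        buffer.clear();
        free.offer(buffer);
    }

    int available() {
        return free.size();
    }
}
```

The total memory held by the pool is `bufferCount * bufferSize`, independent of how many threads marshal data, at the price of some contention on the shared queue.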
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM after a while, managing to create about 500 tables locally. We need to research, why it happens. Is there a leak, or we simply use too much memory (was: Fails with OOM on TC. We need to research, why it happens. Is there a leak, or we simply use too much memory) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create about 500 tables locally. We > need to research, why it happens. Is there a leak, or we simply use too much > memory -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM on TC. We need to research, why it happens. Is there a leak, or we simply use too much memory (was: Fails with OOM on TC) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM on TC. We need to research, why it happens. Is there a leak, > or we simply use too much memory -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Ignite Flags: (was: Docs Required,Release Notes Required) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM on TC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21063) Cannot create 1000 tables
Ivan Bessonov created IGNITE-21063: -- Summary: Cannot create 1000 tables Key: IGNITE-21063 URL: https://issues.apache.org/jira/browse/IGNITE-21063 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov Fails with OOM on TC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21062) Safe time reordering in partitions
[ https://issues.apache.org/jira/browse/IGNITE-21062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21062: -- Assignee: Ivan Bessonov > Safe time reordering in partitions > -- > > Key: IGNITE-21062 > URL: https://issues.apache.org/jira/browse/IGNITE-21062 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > When creating a lot of tables on a (presumably) slow system, > it's possible to notice a {{Safe time reordering detected > [current=...}} assertion error in the logs. > It happens with safe-time sync commands, in the absence of transactional load. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21062) Safe time reordering in partitions
Ivan Bessonov created IGNITE-21062: -- Summary: Safe time reordering in partitions Key: IGNITE-21062 URL: https://issues.apache.org/jira/browse/IGNITE-21062 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov When creating a lot of tables on a (presumably) slow system, it's possible to notice a {{Safe time reordering detected [current=...}} assertion error in the logs. It happens with safe-time sync commands, in the absence of transactional load. -- This message was sent by Atlassian Jira (v8.20.10#820010)