[jira] [Comment Edited] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2021-07-13 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17379945#comment-17379945
 ] 

Pavel Tupitsyn edited comment on IGNITE-11704 at 7/13/21, 2:59 PM:
---

TC run above is for https://github.com/apache/ignite/pull/9249: "IGNITE-11704 
Add PARTITION_META_PAGE_DELTA_RECORD_V4 WALRecord type placeholder"

Merged to master: 97edc893d5c6bcd458364746c5624a0bf9d1ae81


was (Author: ptupitsyn):
TC run above is for https://github.com/apache/ignite/pull/9249: "IGNITE-11704 
Add PARTITION_META_PAGE_DELTA_RECORD_V4 WALRecord type placeholder"

> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Alexey Scherbakov
>Priority: Major
>  Labels: rebalance
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on deferred delete buffer in order to handle 
> write-remove conflicts during rebalance. Given the limit size of the buffer, 
> this approach is fundamentally flawed, especially in case when persistence is 
> enabled.
> I suggest to extend the logic of data storage to be able to store key 
> tombstones - to keep version for deleted entries. The tombstones will be 
> stored when rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-08 Thread Alexei Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946165#comment-16946165
 ] 

Alexei Scherbakov edited comment on IGNITE-11704 at 10/8/19 7:10 AM:
-

[~jokser] [~sboikov]

I've reviewed changes. Overall looks good, but still I have some questions.

1. My main concern is regarding the necessity of tombstoneBytes 5-bytes object. 
Seems it's possible to implement tombstone by treating absence of value as a 
tombstone.
For example, valLen=0 could be treated as tombstone presense. Doing so we can 
get rid of 5 bytes comparison, and instead do null check:
{noformat}
private Boolean isTombstone(ByteBuffer buf, int offset) {
int valLen = buf.getInt(buf.position() + offset);
if (valLen != tombstoneBytes.length)
return Boolean.FALSE;
...
}
{noformat}

Instead we can do something like {{if (valLen == 0) return Boolean.TRUE}}

2. With new changes in PartitionsEvictManager it's possible to have two tasks 
of different types for the same partition.
Consider a scenario: 
* node finished rebalancing and starts to clear thombstones
* another node joins topology and become an owner for clearing partition.
* eviction is started for already clearing partition.

Probably this should not be allowed.

3. I see changes having no obvious relation to contribution, for example: 
static String cacheGroupMetricsRegistryName(String cacheGrp)
DropCacheContextDuringEvictionTest.java
GridCommandHandlerIndexingTest.java

What's the purpose of these ?

4. Could you clarify the change in 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry#initialValue:

update0 |= (!preload && val == null); ?




was (Author: ascherbakov):
[~jokser] [~sboikov]

I've reviewed changes. Overall looks good, but still I have some questions.

1. My main concern is regarding the necessity of tombstoneBytes 5-bytes object. 
Seems it's possible to implement tombstone by treating absence of value as a 
tombstone.
For example, valLen=0 could be treated as tombstone presense. Doing so we can 
get rid of 5 bytes comparison, and instead do null check:
{noformat}
private Boolean isTombstone(ByteBuffer buf, int offset) {
int valLen = buf.getInt(buf.position() + offset);
if (valLen != tombstoneBytes.length)
return Boolean.FALSE;
...
}
{noformat}

Instead we can do something like {{if (valLen == 0) return true}}

2. With new changes in PartitionsEvictManager it's possible to have two tasks 
of different types for the same partition.
Consider a scenario: 
* node finished rebalancing and starts to clear thombstones
* another node joins topology and become an owner for clearing partition.
* eviction is started for already clearing partition.

Probably this should not be allowed.

3. I see changes having no obvious relation to contribution, for example: 
static String cacheGroupMetricsRegistryName(String cacheGrp)
DropCacheContextDuringEvictionTest.java
GridCommandHandlerIndexingTest.java

What's the purpose of these ?

4. Could you clarify the change in 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry#initialValue:

update0 |= (!preload && val == null); ?



> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on deferred delete buffer in order to handle 
> write-remove conflicts during rebalance. Given the limit size of the buffer, 
> this approach is fundamentally flawed, especially in case when persistence is 
> enabled.
> I suggest to extend the logic of data storage to be able to store key 
> tombstones - to keep version for deleted entries. The tombstones will be 
> stored when rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-08 Thread Alexei Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946165#comment-16946165
 ] 

Alexei Scherbakov edited comment on IGNITE-11704 at 10/8/19 7:04 AM:
-

[~jokser] [~sboikov]

I've reviewed changes. Overall looks good, but still I have some questions.

1. My main concern is regarding the necessity of tombstoneBytes 5-bytes object. 
Seems it's possible to implement tombstone by treating absence of value as a 
tombstone.
For example, valLen=0 could be treated as tombstone presense. Doing so we can 
get rid of 5 bytes comparison, and instead do null check:
{noformat}
private Boolean isTombstone(ByteBuffer buf, int offset) {
int valLen = buf.getInt(buf.position() + offset);
if (valLen != tombstoneBytes.length)
return Boolean.FALSE;
...
}
{noformat}

Instead we can do something like {{if (valLen == 0) return true}}

2. With new changes in PartitionsEvictManager it's possible to have two tasks 
of different types for the same partition.
Consider a scenario: 
* node finished rebalancing and starts to clear thombstones
* another node joins topology and become an owner for clearing partition.
* eviction is started for already clearing partition.

Probably this should not be allowed.

3. I see changes having no obvious relation to contribution, for example: 
static String cacheGroupMetricsRegistryName(String cacheGrp)
DropCacheContextDuringEvictionTest.java
GridCommandHandlerIndexingTest.java

What's the purpose of these ?

4. Could you clarify the change in 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry#initialValue:

update0 |= (!preload && val == null); ?




was (Author: ascherbakov):
[~jokser] [~sboikov]

I've reviewed changes. Overall looks good, but still I have some questions.

1. My main concern is regarding the necessity of tombstoneBytes 5-bytes object. 
Seems it's possible to implement tombstone by treating absence of value as a 
tombstone.
For example, valLen=0 could be treated as tombstone presense. Doing so we can 
get rid of 5 bytes comparison, and instead do null check:
{noformat}
private Boolean isTombstone(ByteBuffer buf, int offset) {
int valLen = buf.getInt(buf.position() + offset);
if (valLen != tombstoneBytes.length)
return Boolean.FALSE;
...
}
{noformat}

Instead we can do something like {{if (valLen == 0) return true}}

2. With new changes in PartitionsEvictManager it's possible to have two tasks 
of different types for the same partition.
Consider a scenario: 
* node finished rebalancing and starts to clear thombstones
* another node joins topology and become an owner for clearing partition.
* eviction is started for already clearing partition.

Probably this should not be allowed.

3. I see changes having no obvious relation to contribution, for example: 
static String cacheGroupMetricsRegistryName(String cacheGrp)
DropCacheContextDuringEvictionTest.java
GridCommandHandlerIndexingTest.java

What's the purpose of these ?

4. Could you explain the modification in 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry#initialValue:

update0 |= (!preload && val == null); ?



> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on deferred delete buffer in order to handle 
> write-remove conflicts during rebalance. Given the limit size of the buffer, 
> this approach is fundamentally flawed, especially in case when persistence is 
> enabled.
> I suggest to extend the logic of data storage to be able to store key 
> tombstones - to keep version for deleted entries. The tombstones will be 
> stored when rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)