[
https://issues.apache.org/jira/browse/IGNITE-13912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17269554#comment-17269554
]
Kirill Tkalenko edited comment on IGNITE-13912 at 1/21/21, 6:58 PM:
--------------------------------------------------------------------
Hi, [~shm]!
The last reserved segment is 11 (it was reserved because of the PME), and I
cannot see it being released, so no segment with an index greater than or
equal to 11 can be deleted.
{noformat}
[2021-01-21T13:47:37,493][DEBUG][sys-#310][FileWriteAheadLogManager] Reserved WAL pointer: WALPointer [idx=11, fileOff=540780430, len=9572]
[2021-01-21T13:47:37,493][WARN ][sys-#310][FileWriteAheadLogManager] Reserved WAL stack
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.reserve(FileWriteAheadLogManager.java:1015) [ignite-core-2.11.0-SNAPSHOT.jar:2.11.0-SNAPSHOT]
{noformat}
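The reservation rule described above can be sketched roughly as follows. This is only an illustration under assumptions: the class `ReservedSegmentSketch` and the method `deletable` are hypothetical names, not part of Ignite's `FileWriteAheadLogManager`, and the real logic may take more state into account.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not Ignite code. */
final class ReservedSegmentSketch {
    /**
     * Returns the archived segment indexes that are lower than the lowest
     * reserved index. Segments at or above the reserved index are kept.
     */
    static List<Long> deletable(List<Long> archivedIdxs, long minReservedIdx) {
        List<Long> res = new ArrayList<>();

        for (long idx : archivedIdxs) {
            // A reserved segment pins itself and every later segment.
            if (idx < minReservedIdx)
                res.add(idx);
        }

        return res;
    }
}
```

With segment 11 reserved, an archive holding segments 6 through 12 would have only segments 6 through 10 eligible for deletion in this sketch.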
Here is an example of the last released segment; all segments after and before it were deleted.
{noformat}
[2021-01-21T13:47:37,497][DEBUG][sys-#310][FileWriteAheadLogManager] Released WAL pointer: WALPointer [idx=6, fileOff=576103521, len=9572]
{noformat}
You need to find out why the PME happened. In any case, without a reproducer
it is impossible to tell what is wrong.
> Incorrect calculation of WAL segments that should be deleted from WAL archive
> -----------------------------------------------------------------------------
>
> Key: IGNITE-13912
> URL: https://issues.apache.org/jira/browse/IGNITE-13912
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Kirill Tkalenko
> Assignee: Kirill Tkalenko
> Priority: Critical
> Fix For: 2.10
>
> Attachments: server1-full-wal-checkpoint.log, wal-checkpoint-logs,
> wal_dir_contents, wal_grows_from_peak.PNG, wal_issue_reproduced.PNG,
> wal_usage.PNG, wal_usage_dec12.PNG, wal_usage_dec22nd_binary.PNG
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently, the WAL segments that should be deleted from the WAL archive are
> calculated incorrectly: we delete only segments whose total size does not
> exceed DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE, whereas we should delete
> segments until the archive size is down to
> DataStorageConfiguration#maxWalArchiveSize *
> IGNITE_THRESHOLD_WAL_ARCHIVE_SIZE_PERCENTAGE. As a result,
> DataStorageConfiguration#maxWalArchiveSize can be exceeded.
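One reading of the intended calculation in the description above is sketched below. It is an assumption-laden illustration, not the actual fix: the class `WalArchiveThresholdSketch` and the method `segmentsToDelete` are hypothetical, segment sizes are passed in oldest first, and the threshold is given as a fraction such as 0.5.

```java
/** Hypothetical illustration only; this is not Ignite code. */
final class WalArchiveThresholdSketch {
    /**
     * Counts how many of the oldest segments have to be removed so that the
     * remaining archive size is at most maxArchiveSize * thresholdPct.
     *
     * @param segmentSizes Sizes in bytes of archived segments, oldest first.
     * @param maxArchiveSize Assumed value of maxWalArchiveSize, in bytes.
     * @param thresholdPct Assumed threshold fraction, for example 0.5.
     */
    static int segmentsToDelete(long[] segmentSizes, long maxArchiveSize, double thresholdPct) {
        long target = (long)(maxArchiveSize * thresholdPct);

        long total = 0;

        for (long size : segmentSizes)
            total += size;

        int cnt = 0;

        // Drop the oldest segments until the archive fits the target size.
        while (cnt < segmentSizes.length && total > target) {
            total -= segmentSizes[cnt];
            cnt++;
        }

        return cnt;
    }
}
```

For example, with ten 64-byte segments, a 512-byte limit and a threshold of 0.5, the target is 256 bytes, so the six oldest segments would be removed in this sketch. Reserved segments are ignored here; a real implementation would also have to stop at the first reserved index.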
--
This message was sent by Atlassian Jira
(v8.3.4#803005)