[
https://issues.apache.org/jira/browse/IGNITE-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631585#comment-16631585
]
ASF GitHub Bot commented on IGNITE-9612:
----------------------------------------
GitHub user ascherbakoff opened a pull request:
https://github.com/apache/ignite/pull/4864
IGNITE-9612 test fix
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gridgain/apache-ignite ignite-9612
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/4864.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4864
----
commit 7e3c341ff4b99d3ca6f10283da539464d45a1c17
Author: ascherbakoff <alexey.scherbakoff@...>
Date: 2018-09-23T16:34:09Z
IGNITE-9612 wip.
commit 5ab5553e989df5971472978d36515cfe1565627b
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-24T10:49:59Z
IGNITE-9612 wip.
commit 5ad45b70e1fe6a921b49aec943bec874dcf64758
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-25T10:30:18Z
IGNITE-9612 wip.
commit 53c99ba9bb02cd83ee9752224ca27a82312b731d
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-25T10:33:15Z
IGNITE-9612 wip.
commit 795f296cfa7cd0783f03d76b1b7066b835a3d4c2
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-25T10:33:34Z
Merge branch 'master' of https://github.com/apache/ignite into ignite-9612
commit 9a5b1519c333953fb24621742ff0ad6aa03a011a
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-26T09:48:25Z
Merge branch 'master' of https://github.com/apache/ignite into ignite-9612
commit b105f7beb8643ad971d33352b8fd3fd90e6bb3f4
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-26T17:20:48Z
IGNITE-9612 wip.
commit db622f6f709d758324f15b6aa582c40c3bbea13d
Author: ascherbakoff <alexey.scherbakoff@...>
Date: 2018-09-27T19:28:57Z
Merge branch 'master' of https://github.com/apache/ignite into ignite-9612
commit 12c36d2771e870c08e6bfc2d3eef561f9ac6c8ca
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-28T08:34:05Z
Merge branch 'master' of https://github.com/apache/ignite into ignite-9612
commit a2c6b48127f9d1a4660c43e05b9302c7c16dc2d1
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-28T08:36:38Z
Merge branch 'ignite-9612' of https://github.com/gridgain/apache-ignite
into ignite-9612
commit 0b2b24a58bcd0a9a6f35536852628ddde4bcb846
Author: Aleksei Scherbakov <alexey.scherbakoff@...>
Date: 2018-09-28T09:04:57Z
IGNITE-9612 wip.
----
> Improve checkpoint mark phase speed.
> ------------------------------------
>
> Key: IGNITE-9612
> URL: https://issues.apache.org/jira/browse/IGNITE-9612
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexei Scherbakov
> Assignee: Alexei Scherbakov
> Priority: Major
> Fix For: 2.7
>
>
> I'm observing regular slow checkpoints due to long mark duration, which is
> not related to dirty pages number:
> {noformat}
> 2018-09-01 14:55:20.408 [INFO
> ][db-checkpoint-thread-#241%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager]
> Checkpoint started [checkpointId=01e0c7bf-842f-4ed6-8589-b4904063434f,
> startPtr=FileWALPointer [idx=19814, fileOff=948996096, len=5233457],
> checkpointLockWait=0ms, checkpointLockHoldTime=951ms,
> walCpRecordFsyncDuration=39ms, pages=78477, reason='timeout']
> 2018-09-01 14:55:21.307 [INFO
> ][db-checkpoint-thread-#241%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager]
> Checkpoint finished [cpId=01e0c7bf-842f-4ed6-8589-b4904063434f, pages=78477,
> markPos=FileWALPointer [idx=19814, fileOff=948996096, len=5233457],
> walSegmentsCleared=0, walSegmentsCovered=[], *markDuration=1002m*s,
> pagesWrite=478ms, fsync=421ms, total=1901ms]
> {noformat}
> {noformat}
> 2018-09-01 14:58:20.355 [INFO
> ][db-checkpoint-thread-#241%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager]
> Checkpoint started [checkpointId=09d1f4bc-d3f3-4a16-b291-89d7fa745ea5,
> startPtr=FileWALPointer [idx=19814, fileOff=1000024208, len=5233457],
> checkpointLockWait=0ms, checkpointLockHoldTime=926ms,
> walCpRecordFsyncDuration=14ms, pages=10837, reason='timeout']
> 2018-09-01 14:58:20.480 [INFO
> ][db-checkpoint-thread-#241%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.GridCacheDatabaseSharedManager]
> Checkpoint finished [cpId=09d1f4bc-d3f3-4a16-b291-89d7fa745ea5, pages=10837,
> markPos=FileWALPointer [idx=19814, fileOff=1000024208, len=5233457],
> walSegmentsCleared=0, walSegmentsCovered=[], *markDuration=943ms*,
> pagesWrite=64ms, fsync=61ms, total=1068ms]
> {noformat}
> Debugging has revealed what this is due to large amount of work required to
> save metadata for metapages and free/reuse lists. Because this is done under
> checkpoint write lock, all other activities are blocked, resulting in
> increased tx and atomic ops latency.
> Simple solution: parallelize metadata processing during mark phase.
> Best way to solve the problem is described in IGNITE-9520.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)