[ https://issues.apache.org/jira/browse/IGNITE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943806#comment-16943806 ]
Ivan Rakov edited comment on IGNITE-6930 at 10/3/19 6:17 PM:
-------------------------------------------------------------

[~alex_pl], I've taken a look. Some comments:
1) testRestoreFreeListCorrectlyAfterRandomStop - why do we need to disable caching here?
2) testFreeListUnderLoadMultipleCheckpoints - what is being tested? I think we need to add a comment that this test is intended to cover the weakened pageId != 0 assertion.
3) MAX_SIZE, STRIPES_COUNT - don't you think we should make these options configurable?
4) How did you choose 64 and 4 as defaults? Can you share some benchmarks? I think 64 might be overkill: in a data load scenario, data pages traverse from the biggest to the lowest buckets in turn. I don't think pages are likely to accumulate heavily in any single bucket; maybe 8 as MAX_SIZE would show the same performance boost.
5) PagesList.PagesCache#flush: do we need to garbage-collect all allocated long lists when we flush the page cache? We can just clear() them and reuse them after the checkpoint. It should reduce GC pressure (see the sketch below).
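For illustration, here is a minimal, self-contained sketch of the clear-and-reuse idea from point 5. The class and method names (PagesCacheStripe, flushTo) are hypothetical and only show the pattern; they do not mirror the actual PagesList.PagesCache code.

{code}
import java.util.function.LongConsumer;

/** Hypothetical stripe of an on-heap page-id cache that keeps its backing array across checkpoints. */
final class PagesCacheStripe {
    private final int maxSize; // analogous to the MAX_SIZE limit discussed above

    private long[] pageIds;    // reused between fill cycles instead of being reallocated
    private int size;

    PagesCacheStripe(int maxSize) {
        this.maxSize = maxSize;
        pageIds = new long[Math.min(maxSize, 16)];
    }

    /** Adds a page id; returns false when the stripe is full so the caller can fall back to the on-disk list. */
    synchronized boolean add(long pageId) {
        if (size == maxSize)
            return false;

        if (size == pageIds.length)
            pageIds = java.util.Arrays.copyOf(pageIds, Math.min(maxSize, pageIds.length * 2));

        pageIds[size++] = pageId;

        return true;
    }

    /** Drains cached page ids to the consumer on checkpoint and clears the stripe, keeping the array for reuse. */
    synchronized void flushTo(LongConsumer target) {
        for (int i = 0; i < size; i++)
            target.accept(pageIds[i]);

        size = 0; // clear() instead of dropping the list and allocating a new one after the checkpoint
    }
}
{code}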
> Optionally to do not write free list updates to WAL
> ---------------------------------------------------
>
>                 Key: IGNITE-6930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6930
>             Project: Ignite
>          Issue Type: Task
>          Components: cache
>            Reporter: Vladimir Ozerov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: IEP-8, performance
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a cache entry is created, we need to update the free list. When an entry is updated, we need to update free list(s) several times. Currently the free list is a persistent structure, so every update to it must be logged to be able to recover after a crash. This may incur significant overhead, especially for small entries.
> E.g. this is what the WAL for a single update looks like. "D" - updates with real data, "F" - free-list management:
> {code}
> 1. [D] DataRecord [writeEntries=[UnwrapDataEntry[k = key, v = [ BinaryObject [idHash=2053299190, hash=1986931360, typeId=-1580729813]], super = [DataEntry [cacheId=94416770, op=UPDATE, writeVer=GridCacheVersion [topVer=122147562, order=1510667560607, nodeOrder=1], partId=0, partCnt=4]]]], super=WALRecord [size=0, chainSize=0, pos=null, type=DATA_RECORD]]
> 2. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000006, grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000006, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
> 3. [D] DataPageInsertRecord [super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=129, chainSize=0, pos=null, type=DATA_PAGE_INSERT_RECORD]]]
> 4. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000008, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
> 5. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710664, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> 6. [D] ReplaceRecord [io=DataLeafIO[ver=1], idx=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000004, super=WALRecord [size=47, chainSize=0, pos=null, type=BTREE_PAGE_REPLACE]]]
> 7. [F] DataPageRemoveRecord [itemId=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=30, chainSize=0, pos=null, type=DATA_PAGE_REMOVE_RECORD]]]
> 8. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000008, grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000008, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
> 9. [F] DataPageSetFreeListPageRecord [freeListPage=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> 10. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000006, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
> 11. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710662, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> {code}
> If you sum up all the space required for the operation (the size in p. 3 is shown incorrectly here), you will see that the data update required ~300 bytes, and so did the free list updates!
> *Proposed solution*
> 1) Optionally do not write free list updates to WAL
> 2) In case of a node restart we start with empty free lists, so data inserts will have to allocate new pages
> 3) When an old data page is read, add it to the free list
> 4) Start a background thread which will iterate over all old data pages and re-create the free list, so that eventually all data pages are tracked (a rough sketch follows).
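To make steps 3 and 4 of the proposal more concrete, a background free-list rebuild could look roughly like the sketch below. Every type here (DataPageStore, FreeList, FreeListRebuildTask) is a hypothetical stand-in, not an Ignite internal API; the ticket does not fix a concrete interface.

{code}
import java.util.Iterator;

/** Hypothetical view of data pages that existed before the restart. */
interface DataPageStore {
    /** Iterates over ids of data pages allocated before the restart. */
    Iterator<Long> oldDataPages();

    /** Returns the remaining free space of the page, in bytes. */
    int freeSpace(long pageId);
}

/** Hypothetical free list that is rebuilt in memory instead of being WAL-logged. */
interface FreeList {
    /** Puts the page into the bucket matching its free space. */
    void addForRecycle(long pageId, int freeSpace);
}

/** Background task that walks old data pages until all of them are tracked by the free list again. */
final class FreeListRebuildTask implements Runnable {
    private final DataPageStore store;
    private final FreeList freeList;

    FreeListRebuildTask(DataPageStore store, FreeList freeList) {
        this.store = store;
        this.freeList = freeList;
    }

    @Override public void run() {
        for (Iterator<Long> it = store.oldDataPages(); it.hasNext(); ) {
            long pageId = it.next();

            int free = store.freeSpace(pageId);

            // Only pages that can still accept entries are worth tracking.
            if (free > 0)
                freeList.addForRecycle(pageId, free);
        }
    }
}
{code}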