[ https://issues.apache.org/jira/browse/IGNITE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943806#comment-16943806 ]
Ivan Rakov edited comment on IGNITE-6930 at 10/3/19 6:17 PM:
-------------------------------------------------------------

[~alex_pl], I've taken a look. Some comments:
1) testRestoreFreeListCorrectlyAfterRandomStop - why do we need to disable caching here?
2) testFreeListUnderLoadMultipleCheckpoints - what is being tested? I think we need to add a comment that this test is intended to cover the weakened pageId != 0 assertion.
3) MAX_SIZE, STRIPES_COUNT - don't you think we should make these options configurable?
4) How did you choose 64 and 4 as defaults? Can you share some benchmarks? I think 64 might be overkill: in a data load scenario, data pages traverse from the biggest to the lowest buckets in turn. I don't think pages are likely to accumulate heavily in any single bucket; maybe 8 as MAX_SIZE would show the same performance boost.
5) PagesList.PagesCache#flush: do we need to garbage-collect all allocated long lists when we flush the page cache? We can just clear() them and reuse them after the checkpoint. It should reduce GC pressure (see the sketch below).
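For illustration, here is a minimal, self-contained sketch of the clear-and-reuse idea from point 5. The class and method names (PagesCacheStripe, flushTo) are hypothetical and only show the pattern; they do not mirror the actual PagesList.PagesCache code.

{code}
import java.util.function.LongConsumer;

/** Hypothetical stripe of an on-heap page-id cache that keeps its backing array across checkpoints. */
final class PagesCacheStripe {
    private final int maxSize; // analogous to the MAX_SIZE limit discussed above

    private long[] pageIds;    // reused between fill cycles instead of being reallocated
    private int size;

    PagesCacheStripe(int maxSize) {
        this.maxSize = maxSize;
        pageIds = new long[Math.min(maxSize, 16)];
    }

    /** Adds a page id; returns false when the stripe is full so the caller can fall back to the on-disk list. */
    synchronized boolean add(long pageId) {
        if (size == maxSize)
            return false;

        if (size == pageIds.length)
            pageIds = java.util.Arrays.copyOf(pageIds, Math.min(maxSize, pageIds.length * 2));

        pageIds[size++] = pageId;

        return true;
    }

    /** Drains cached page ids to the consumer on checkpoint and clears the stripe, keeping the array for reuse. */
    synchronized void flushTo(LongConsumer target) {
        for (int i = 0; i < size; i++)
            target.accept(pageIds[i]);

        size = 0; // clear() instead of dropping the list and allocating a new one after the checkpoint
    }
}
{code}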
> Optionally to do not write free list updates to WAL
> ---------------------------------------------------
>
>                 Key: IGNITE-6930
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6930
>             Project: Ignite
>          Issue Type: Task
>          Components: cache
>            Reporter: Vladimir Ozerov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: IEP-8, performance
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a cache entry is created, we need to update the free list. When an entry is updated, we need to update free list(s) several times. Currently the free list is a persistent structure, so every update to it must be logged to be able to recover after a crash. This may incur significant overhead, especially for small entries.
> E.g. this is what the WAL for a single update looks like. "D" - updates with real data, "F" - free-list management:
> {code}
> 1. [D] DataRecord [writeEntries=[UnwrapDataEntry[k = key, v = [ BinaryObject [idHash=2053299190, hash=1986931360, typeId=-1580729813]], super = [DataEntry [cacheId=94416770, op=UPDATE, writeVer=GridCacheVersion [topVer=122147562, order=1510667560607, nodeOrder=1], partId=0, partCnt=4]]]], super=WALRecord [size=0, chainSize=0, pos=null, type=DATA_RECORD]]
> 2. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000006, grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000006, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
> 3. [D] DataPageInsertRecord [super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=129, chainSize=0, pos=null, type=DATA_PAGE_INSERT_RECORD]]]
> 4. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000008, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
> 5. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710664, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> 6. [D] ReplaceRecord [io=DataLeafIO[ver=1], idx=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000004, super=WALRecord [size=47, chainSize=0, pos=null, type=BTREE_PAGE_REPLACE]]]
> 7. [F] DataPageRemoveRecord [itemId=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=30, chainSize=0, pos=null, type=DATA_PAGE_REMOVE_RECORD]]]
> 8. [F] PagesListRemovePageRecord [rmvdPageId=0001000000000005, pageId=0001000000000008, grpId=94416770, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000008, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_REMOVE_PAGE]]]
> 9. [F] DataPageSetFreeListPageRecord [freeListPage=0, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> 10. [F] PagesListAddPageRecord [dataPageId=0001000000000005, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000006, super=WALRecord [size=37, chainSize=0, pos=null, type=PAGES_LIST_ADD_PAGE]]]
> 11. [F] DataPageSetFreeListPageRecord [freeListPage=281474976710662, super=PageDeltaRecord [grpId=94416770, pageId=0001000000000005, super=WALRecord [size=37, chainSize=0, pos=null, type=DATA_PAGE_SET_FREE_LIST_PAGE]]]
> {code}
> If you sum up all the space required for the operation (the size in p. 3 is shown incorrectly here), you will see that the data update required ~300 bytes, and so did the free list updates!
> *Proposed solution*
> 1) Optionally do not write free list updates to WAL
> 2) In case of a node restart we start with empty free lists, so data inserts will have to allocate new pages
> 3) When an old data page is read, add it to the free list
> 4) Start a background thread which will iterate over all old data pages and re-create the free list, so that eventually all data pages are tracked (a rough sketch follows).
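To make steps 3 and 4 of the proposal more concrete, a background free-list rebuild could look roughly like the sketch below. Every type here (DataPageStore, FreeList, FreeListRebuildTask) is a hypothetical stand-in, not an Ignite internal API; the ticket does not fix a concrete interface.

{code}
import java.util.Iterator;

/** Hypothetical view of data pages that existed before the restart. */
interface DataPageStore {
    /** Iterates over ids of data pages allocated before the restart. */
    Iterator<Long> oldDataPages();

    /** Returns the remaining free space of the page, in bytes. */
    int freeSpace(long pageId);
}

/** Hypothetical free list that is rebuilt in memory instead of being WAL-logged. */
interface FreeList {
    /** Puts the page into the bucket matching its free space. */
    void addForRecycle(long pageId, int freeSpace);
}

/** Background task that walks old data pages until all of them are tracked by the free list again. */
final class FreeListRebuildTask implements Runnable {
    private final DataPageStore store;
    private final FreeList freeList;

    FreeListRebuildTask(DataPageStore store, FreeList freeList) {
        this.store = store;
        this.freeList = freeList;
    }

    @Override public void run() {
        for (Iterator<Long> it = store.oldDataPages(); it.hasNext(); ) {
            long pageId = it.next();

            int free = store.freeSpace(pageId);

            // Only pages that can still accept entries are worth tracking.
            if (free > 0)
                freeList.addForRecycle(pageId, free);
        }
    }
}
{code}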