Guys,
We have benchmarked how checkpoint write speed on SSD depends on
various parameters. The results clearly show that using 4K pages in
durable memory instead of 2K brings a significant speed-up.
I think we should make 4K the default page size.
Ticket with detailed explanation:
https://issues.apache.org/jira/browse/IGNITE-5884
Spoiler: the exact numbers depend on write order and alignment, but
writing 4K pages is at least *3x faster* than writing 2K pages with all
other parameters the same.
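To make the comparison concrete, here is a minimal micro-benchmark
sketch (not the harness from the ticket): it times synchronous random
page writes of both sizes to a scratch file. The file name, file size
and write count are illustrative, and note that a buffered, non-synced
variant would mostly measure the OS page cache instead of the SSD:

    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.Random;

    public class PageWriteBench {
        public static void main(String[] args) throws Exception {
            long fileSize = 1L << 30; // 1 GB scratch file (illustrative)
            int writes = 2000;        // random page writes per run (illustrative)

            for (int pageSize : new int[] {2048, 4096}) {
                ByteBuffer page = ByteBuffer.allocateDirect(pageSize);
                Random rnd = new Random(42); // same offsets sequence for both runs

                // DSYNC forces each write to the device, mimicking synced
                // checkpoint page writes rather than page-cache throughput.
                try (FileChannel ch = FileChannel.open(Paths.get("bench.bin"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                    StandardOpenOption.DSYNC)) {
                    long start = System.nanoTime();

                    for (int i = 0; i < writes; i++) {
                        // Random page-aligned offset within the file.
                        long off = ((rnd.nextLong() & Long.MAX_VALUE)
                            % (fileSize / pageSize)) * pageSize;

                        page.clear();
                        ch.write(page, off);
                    }

                    long ms = (System.nanoTime() - start) / 1_000_000;
                    System.out.printf("%d-byte pages: %d ms (%d writes)%n",
                        pageSize, ms, writes);
                }
            }
        }
    }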
The question is backwards compatibility. If pageSize is not explicitly
set in the user configuration, an attempt to start a "4k default"
Ignite node on top of "2k default" LFS files will fail with the
following exception:
class org.apache.ignite.IgniteCheckedException: Failed to verify store
file (invalid page size) [expectedPageSize=4096, filePageSize=2048]
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.checkFile(FilePageStore.java:206)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:416)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:315)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:287)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:272)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:569)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:487)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.getOrAllocateCacheMetas(GridCacheOffheapManager.java:515)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.initDataStructures(GridCacheOffheapManager.java:86)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.start(IgniteCacheOffheapManagerImpl.java:139)
    at org.apache.ignite.internal.processors.cache.CacheGroupContext.start(CacheGroupContext.java:868)
I think we have two options here:
1) Obvious and safe: provide silent backwards compatibility. We can
implement a check that finds any existing LFS file, reads its pageSize
from the store file header and uses it as the default (a sketch of such
a probe follows below).
2) Less user-friendly, but in my opinion still the better option: crash
the node, but make the error message more informative. We'll let the
user know that the default pageSize was changed to 4k due to the
performance boost discovered on most UNIX-based environments with SSDs
(by far the most popular environment among our users), and recommend
migrating to a 4K-page LFS. A user who still wants to work with 2k
pages can always set the page size explicitly in MemoryConfiguration
and start the node (see the configuration snippet below).
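For option 1, a minimal sketch of what the probe could look like. The
header offset is an assumption (8-byte signature + 4-byte version +
1-byte store type before the 4-byte page size), and the byte order may
differ from readInt's big-endian; both need to be verified against the
actual FilePageStore format:

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.file.Path;

    public class StorePageSizeProbe {
        // Assumed offset of the pageSize field in the store file header.
        // Hypothetical: verify against the real FilePageStore layout.
        private static final int PAGE_SIZE_OFFSET = 8 + 4 + 1;

        /** Reads the page size recorded in an existing LFS store file. */
        public static int readPageSize(Path storeFile) throws IOException {
            try (RandomAccessFile raf = new RandomAccessFile(storeFile.toFile(), "r")) {
                raf.seek(PAGE_SIZE_OFFSET);
                return raf.readInt(); // e.g. 2048 for an old LFS, 4096 for the new default
            }
        }
    }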
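And for completeness, this is how a user would pin the old 2k page size
explicitly via MemoryConfiguration so an existing LFS keeps working:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.configuration.MemoryConfiguration;

    public class Start2kNode {
        public static void main(String[] args) {
            MemoryConfiguration memCfg = new MemoryConfiguration();
            memCfg.setPageSize(2048); // keep the old 2k pages for an existing LFS

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setMemoryConfiguration(memCfg);

            // Starts against the old 2k-page LFS without the page size mismatch error.
            Ignite ignite = Ignition.start(cfg);
        }
    }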
Thoughts?
--
Best Regards,
Ivan Rakov