1. We have DirectIO disabled : does enabling it impact the performance if enabled?
It should increase performance, but it's always worth running some benchmarks before using such features in production.

2. When we disabled throttling, we saw 5x better performance. Struggling load test completed in 1/5th of time. What are its side effects if we keep it disabled.
You can safely disable it. Throttling will still be present in that case, but it will use less intelligent strategies (which may also work incorrectly from time to time).

3. Does MaxMemoryDirectSize have any relation to throughput rate.
I don't know anything regarding this.

From my perspective your checkpoint buffer size is sufficient. Take a look at your log messages: cpBufUsed=133, cpBufTotal=1554645. The buffer is almost empty.
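
To put those two counters in perspective, here is a minimal Java sketch that turns them into a utilization figure. The class and method names are hypothetical, and the 4096-byte page size is an assumption (it is the Ignite default; substitute your own value if you changed it). It also assumes both counters are page counts.

```java
final class CheckpointBufferUsage {
    /** Fraction of the checkpoint buffer in use, from the cpBufUsed / cpBufTotal counters. */
    public static double usedFraction(long cpBufUsed, long cpBufTotal) {
        if (cpBufTotal <= 0)
            throw new IllegalArgumentException("cpBufTotal must be positive");
        return (double) cpBufUsed / cpBufTotal;
    }

    /** Approximate size in bytes of a number of pages, for a given page size. */
    public static long approxBytes(long pages, int pageSizeBytes) {
        return pages * pageSizeBytes;
    }

    public static void main(String[] args) {
        long used = 133;          // cpBufUsed from the throttling log line
        long total = 1_554_645;   // cpBufTotal from the same line
        int pageSize = 4096;      // assumption: default page size

        System.out.printf("buffer in use: %.4f%%%n", 100 * usedFraction(used, total));
        System.out.printf("buffer total:  %.2f GiB%n",
            approxBytes(total, pageSize) / (1024.0 * 1024 * 1024));
    }
}
```

With the numbers from your log the used share comes out well under 0.01%, and under the page-size assumption the total is consistent with the 6gb you configured.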

Increasing checkpoint frequency (checkpointing more often) should spread IO pressure more evenly over time, but as I mentioned before, if you decide to increase or decrease any IO parameter, you had better benchmark how it impacts your setup.
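
When comparing benchmark runs, one figure that is easy to pull out of the logs is the effective checkpoint write rate: pages divided by pagesWrite from the "Checkpoint finished" line. The sketch below is only an illustration; the class name is hypothetical and the regular expressions assume the log format shown in the quoted message.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CheckpointRate {
    private static final Pattern PAGES = Pattern.compile("\\bpages=(\\d+)");
    private static final Pattern PAGES_WRITE = Pattern.compile("\\bpagesWrite=(\\d+)ms");

    /** Pages written per second during one checkpoint, parsed from a "Checkpoint finished" line. */
    public static double pagesPerSecond(String logLine) {
        Matcher pages = PAGES.matcher(logLine);
        Matcher write = PAGES_WRITE.matcher(logLine);
        if (!pages.find() || !write.find())
            throw new IllegalArgumentException("not a 'Checkpoint finished' line");

        long pageCount = Long.parseLong(pages.group(1));
        long writeMs = Long.parseLong(write.group(1));
        if (writeMs == 0)
            throw new IllegalArgumentException("pagesWrite is 0ms");

        return pageCount * 1000.0 / writeMs;
    }

    public static void main(String[] args) {
        // The "Checkpoint finished" line from the quoted log, joined into one string.
        String line = "Checkpoint finished [cpId=11749dc0-fd0d-4b5f-8b9a-510e774fec38, pages=40505, "
            + "markPos=WALPointer [idx=26, fileOff=385214751, len=16683], walSegmentsCovered=[], "
            + "markDuration=47ms, pagesWrite=25018ms, fsync=6ms, total=25100ms]";
        System.out.printf("%.0f pages/sec%n", pagesPerSecond(line));
    }
}
```

Comparing this rate with the markDirty rate reported in the throttling line (2121 pages/sec in your log) before and after each parameter change shows whether the disk is keeping up with incoming writes.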

On 2022/04/17 05:43:57 Surinder Mehra wrote:
> Hey thanks for replying. we haven't configured the storage path so by
> default it should be in the work directory. work, wal, walarchive all three
> are SSDs. I have the following queries.
>
> 1. We have DirectIO disabled : does enabling it impact the performance if
> enabled?
> 2. When we disabled throttling, we saw 5x better performance. Struggling
> load test completed in 1/5th of time. What are its side effects if we keep
> it disabled
> 3. Does MaxMemoryDirectSize have any relation to throughput rate.
> 4. Can the current configuration mentioned in the previous thread be scaled
> further? like increasing WalSegment size beyond 1GB and related size of
> walArchive, checkpointbufferSize and MaxMemoryDirectSize jvm parameter.
> 5. We see now due to throttling disabled, WalArchive size is going beyond
> 50G(WalSegment size 1G and checkpoint buffer size 6G). would decreasing
> checkpoint frequency and/or increasing checkpoint threads count increase
> throughput or impact application writes inversely. Currently checkpointing
> frequency and threads are default
>
>
> On Sun, Apr 17, 2022 at 6:33 AM Ilya Korol <ll...@gmail.com> wrote:
>
> > Hi, From my perspective this looks like your physical storage is not
> > fast enough to handle incoming writes. markDirty speed is 2x times
> > faster that checkpointWrite even in the presence of throttling. You've
> > mentioned that ignite work folder stored on SSD, but what about PDS
> > folder (DataStorageConfiguration.setStoragePath())?
> >
> > Btw, have you tested your setup with DirectIO disabled?
> >
> > On 2022/04/14 10:55:23 Surinder Mehra wrote:
> > > Hi,
> > > We have an application with ignite thick clients which writes to ignite
> > > caches on ignite grid deployed separately. Below is the ignite
> > > configuration per node
> > >
> > > With this configuration, we see throttling happening and
> > checkpointing time
> > > is between 20-30 seconds. Did we miss something in configuration or any
> > > other settings we can enable. Any suggestions will be of great help.
> > >
> > > * 100-200 concurrent writes to 25 node cluster
> > > * #partitions 512
> > > * cache backups = 2
> > > * cache mode partitioned
> > > * syncronizationMode : primary Sync
> > > * Off Heap caches
> > > * Server nodes : 25
> > > * RAM : 64G
> > > * maxmemoryDirectSize : 4G
> > > * Heap: 25G
> > >
> > > * persistenceEnabled: true
> > > * data region size : 24GB
> > > * checkPointingBufferSize: 6gb
> > > * walSegmentSize: 1G
> > > * walBufferSize : 256MB
> > > * walarchiveSize: 24G
> > > * writeThrotlingEnabled: true
> > > * checkPointingfreq : 60 sec
> > > * checkPointingThreads: 4
> > > * DirectIO enabled: true
> > >
> > > SSDs atatched:
> > > work volume : 20G
> > > wal volume : 15G
> > > Wal archive volume : 26G
> > >
> > >
> > > Checkpointing logs:
> > >
> > > [10:27:13,237][INFO][db-checkpoint-thread-#230][Checkpointer] Checkpoint
> > > started [checkpointId=11749dc0-fd0d-4b5f-8b9a-510e774fec38,
> > > startPtr=WALPointer [idx=26, fileOff=385214751, len=16683],
> > > checkpointBeforeLockTime=29ms, checkpointLockWait=0ms,
> > > checkpointListenersExecuteTime=2ms, checkpointLockHoldTime=3ms,
> > > walCpRecordFsyncDuration=11ms, writeCheckpointEntryDuration=3ms,
> > > splitAndSortCpPagesDuration=30ms, pages=40505, reason='timeout']
> > > [10:27:13,242][INFO][sys-stripe-7-#8][PageMemoryImpl] Throttling is
> > applied
> > > to page modifications [percentOfPartTime=0.88, markDirty=2121 pages/sec,
> > > checkpointWrite=1219 pages/sec, estIdealMarkDirty=0 pages/sec,
> > > curDirty=0.00, maxDirty=0.02, avgParkTime=410172 ns, pages:
> > (total=40505,
> > > evicted=0, written=10, synced=0, cpBufUsed=133, cpBufTotal=1554645)]
> > > [10:27:29,935][INFO][grid-timeout-worker-#30][IgniteKernal]
> > > Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> > > ^-- Node [id=214f3c2b, uptime=00:45:00.227]
> > > ^-- Cluster [hosts=45, CPUs=540, servers=25, clients=20, topVer=75,
> > > minorTopVer=0]
> > > ^-- Network [addrs=[127.0.0.1, 192.168.98.141], discoPort=47500,
> > > commPort=47100]
> > > ^-- CPU [CPUs=12, curLoad=3.67%, avgLoad=0.82%, GC=0%]
> > > ^-- Heap [used=5330MB, free=79.18%, comm=20480MB]
> > > ^-- Off-heap memory [used=1019MB, free=95.92%, allocated=24775MB]
> > > ^-- Page memory [pages=257976]
> > > ^-- sysMemPlc region [type=internal, persistence=true,
> > > lazyAlloc=false,
> > > ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.99%,
> > > allocRam=99MB, allocTotal=0MB]
> > > ^-- default region [type=default, persistence=true, lazyAlloc=true,
> > > ... initCfg=24576MB, maxCfg=24576MB, usedRam=1018MB, freeRam=95.86%,
> > > allocRam=24576MB, allocTotal=3820MB]
> > > ^-- metastoreMemPlc region [type=internal, persistence=true,
> > > lazyAlloc=false,
> > > ... initCfg=40MB, maxCfg=100MB, usedRam=1MB, freeRam=98.78%,
> > > allocRam=0MB, allocTotal=1MB]
> > > ^-- TxLog region [type=internal, persistence=true, lazyAlloc=false,
> > > ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
> > > allocRam=99MB, allocTotal=0MB]
> > > ^-- volatileDsMemPlc region [type=user, persistence=false,
> > > lazyAlloc=true,
> > > ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
> > > allocRam=0MB]
> > > ^-- Ignite persistence [used=3821MB]
> > > ^-- Outbound messages queue [size=0]
> > > ^-- Public thread pool [active=0, idle=0, qSize=0]
> > > ^-- System thread pool [active=0, idle=7, qSize=0]
> > > ^-- Striped thread pool [active=0, idle=12, qSize=0]
> > > [10:27:38,261][INFO][db-checkpoint-thread-#230][Checkpointer] Checkpoint
> > > finished [cpId=11749dc0-fd0d-4b5f-8b9a-510e774fec38, pages=40505,
> > > markPos=WALPointer [idx=26, fileOff=385214751, len=16683],
> > > walSegmentsCovered=[], markDuration=47ms, pagesWrite=25018ms, fsync=6ms,
> > > total=25100ms]
> > >
> >
>
