[ 
https://issues.apache.org/jira/browse/IGNITE-7751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-7751:
-------------------------------
    Description: 
Even with write throttling enabled, the checkpoint buffer can still overflow. 
The chance of overflow increases with the number of writing threads. Example stack trace:
{noformat}
2018-02-17 21:00:14.777 [ERROR][sys-stripe-12-#13%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.dht.GridDhtTxRemote] Commit failed.
org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: Commit produced a runtime exception (all transaction entries will be invalidated): GridDhtTxRemote[id=06db48da161-00000000-07c5-23f5-0000-000000000005, concurrency=OPTIMISTIC, isolation=SERIALIZABLE, state=COMMITTING, invalidate=false, rollbackOnly=false, nodeId=da415868-d9b3-48a5-9b56-0706ae60dd3b, duration=60]
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitIfLocked(GridDistributedTxRemoteAdapter.java:739)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitRemoteTx(GridDistributedTxRemoteAdapter.java:813)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:1319)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxFinishRequest(IgniteTxHandler.java:1231)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$600(IgniteTxHandler.java:97)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$7.apply(IgniteTxHandler.java:213)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$7.apply(IgniteTxHandler.java:211)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:499)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ignite.IgniteException: Runtime failure on row: Row@9f0a081[ key: 4694439661580364888, val: com.sbt.bm.ucp.common.dpl.model.party.DUserInfo_DPL_PROXY [idHash=1290746929, hash=400782371, colocationKey=16678, lastChangeDate=1518890414661, userFullName=null, partition_DPL_id=6, bankInfo_DPL_id=4694439661580364888, bankInfo_DPL_colocationKey=16678, ownerId=null, infoFlowChannel_DPL_colocationKey=0, userLogin=reloading, uid=1102030258731339432, isDeleted=false, infoFlowChannel_DPL_id=0, sourceSystem_DPL_id=65, id=4694439661580364888, colocationId=1102030258828706483], ver: GridCacheVersion [topVer=130360309, order=1519034613156, nodeOrder=5] ][ 1102030258731339432, reloading, 4694439661580364888, 0, null, 65, 4694439661580364888, FALSE, 6 ]
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2102)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putx(BPlusTree.java:2049)
        at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:247)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.addToIndex(GridH2Table.java:536)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:468)
        at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:595)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1865)
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:407)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:1343)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1207)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1356)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:345)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3527)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1039)
        at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitIfLocked(GridDistributedTxRemoteAdapter.java:609)
        ... 18 common frames omitted
Caused by: org.apache.ignite.IgniteException: Failed to allocate temporary buffer for checkpoint (increase checkpointPageBufferSize configuration property)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.postWriteLockPage(PageMemoryImpl.java:1293)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeLockPage(PageMemoryImpl.java:1276)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeLock(PageMemoryImpl.java:398)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeLock(PageMemoryImpl.java:393)
        at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeLock(PageHandler.java:398)
        at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:326)
        at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:262)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$11100(BPlusTree.java:82)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.tryInsert(BPlusTree.java:2922)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.access$7600(BPlusTree.java:2610)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2348)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2329)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2329)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2329)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2069)
        ... 32 common frames omitted{noformat}
The problem is that we apply throttling based on checkpoint buffer usage only 
for pages that are present in the current checkpoint:
{noformat}
if (isPageInCheckpoint) {
    int checkpointBufLimit = pageMemory.checkpointBufferPagesSize() * 2 / 3;

    shouldThrottle = pageMemory.checkpointBufferPagesCount() > checkpointBufLimit;
}{noformat}
On the other hand, we reset the backoff counter whenever we don't apply 
throttling, which can happen for a page that is not in the checkpoint:
{noformat}
if (shouldThrottle) {
    int throttleLevel = exponentialBackoffCntr.getAndIncrement();

    LockSupport.parkNanos((long)(STARTING_THROTTLE_NANOS * Math.pow(BACKOFF_RATIO, throttleLevel)));
}
else
    exponentialBackoffCntr.set(0);{noformat}
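
The effect of the reset can be illustrated with a small standalone simulation. This is only a sketch, not the Ignite code: the class name BackoffResetDemo and the values of STARTING_THROTTLE_NANOS and BACKOFF_RATIO are assumptions chosen for illustration. It computes the largest park time a shared counter would produce for 100 throttled writes, first uninterrupted and then interleaved with writes to pages that are not in the checkpoint.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BackoffResetDemo {
    // Assumed values for illustration; the real constants are not given in this issue.
    static final long STARTING_THROTTLE_NANOS = 4_000;
    static final double BACKOFF_RATIO = 1.05;

    /** Mirrors the snippet above, but returns the park time instead of parking. */
    static long onWrite(AtomicInteger cntr, boolean shouldThrottle) {
        if (shouldThrottle) {
            int throttleLevel = cntr.getAndIncrement();

            return (long)(STARTING_THROTTLE_NANOS * Math.pow(BACKOFF_RATIO, throttleLevel));
        }

        cntr.set(0); // any non-throttled write clears the shared counter

        return 0;
    }

    /** Largest park time over n throttled writes, optionally interleaved with non-throttled ones. */
    static long maxPark(int n, boolean interleave) {
        AtomicInteger cntr = new AtomicInteger();
        long max = 0;

        for (int i = 0; i < n; i++) {
            max = Math.max(max, onWrite(cntr, true));

            if (interleave)
                onWrite(cntr, false); // e.g. another thread writes a page not in checkpoint
        }

        return max;
    }

    public static void main(String[] args) {
        System.out.println("uninterrupted max park, ns: " + maxPark(100, false));
        System.out.println("interleaved max park, ns:   " + maxPark(100, true));
    }
}
```

In the interleaved case the counter never gets past level 0, so the park time stays at the starting value and the backoff never grows, which matches the observation that more writing threads make overflow more likely.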
Possible solution: introduce two separate backoff counters, one for pages that 
are in the checkpoint and one for pages that are not.
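
A rough sketch of what two separate counters could look like is given below. It is not a patch: the class SplitBackoffSketch, its method names and the constant values are hypothetical, and how shouldThrottle is computed for each kind of page is left to the caller.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class SplitBackoffSketch {
    // Assumed values for illustration; the real constants are not given in this issue.
    static final long STARTING_THROTTLE_NANOS = 4_000;
    static final double BACKOFF_RATIO = 1.05;

    /** Backoff level for writes to pages that are in the current checkpoint. */
    private final AtomicInteger inCpBackoffCntr = new AtomicInteger();

    /** Backoff level for writes to pages that are not in the current checkpoint. */
    private final AtomicInteger notInCpBackoffCntr = new AtomicInteger();

    /** Returns how long the writer should park; 0 means no throttling. */
    long parkNanosFor(boolean isPageInCheckpoint, boolean shouldThrottle) {
        AtomicInteger cntr = isPageInCheckpoint ? inCpBackoffCntr : notInCpBackoffCntr;

        if (!shouldThrottle) {
            cntr.set(0); // resets only the counter that belongs to this kind of page

            return 0;
        }

        return (long)(STARTING_THROTTLE_NANOS * Math.pow(BACKOFF_RATIO, cntr.getAndIncrement()));
    }

    /** Parks the calling thread according to the counter selected for the page. */
    void onWrite(boolean isPageInCheckpoint, boolean shouldThrottle) {
        long nanos = parkNanosFor(isPageInCheckpoint, shouldThrottle);

        if (nanos > 0)
            LockSupport.parkNanos(nanos);
    }

    public static void main(String[] args) {
        SplitBackoffSketch s = new SplitBackoffSketch();

        long first = s.parkNanosFor(true, true);   // in checkpoint, buffer above limit
        s.parkNanosFor(false, false);              // not in checkpoint, no throttling
        long second = s.parkNanosFor(true, true);  // backoff level is preserved

        System.out.println(first + " -> " + second);
    }
}
```

With this split, a non-throttled write to a page outside the checkpoint no longer clears the backoff level accumulated for in-checkpoint pages.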

 


> Pages Write Throttle mode doesn't protect from checkpoint buffer overflow
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-7751
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7751
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Ivan Rakov
>            Assignee: Ivan Rakov
>            Priority: Critical
>             Fix For: 2.5
>



