Igniters,
At work I often have to solve performance issues with an Ignite cluster
without having access to the source code of the running user application.
It looks like Ignite has limited capabilities for identifying bottlenecks
without extensive profiling on the server and client side (JFR recordings,
sampling profilers, regular thread dumps, etc.), which is not always possible.
Even when profiling data is available, it is not always helpful for
determining several types of bottlenecks, for example, contention on a
single key/partition.
I propose to implement a new feature: lightweight profiling of message
processing.
The feature will provide a view of message processing statistics for each
node and for the grid as a whole.
In short, it's necessary to track each message's execution in the executors
and record statistics such as synchronous execution time in the executor
thread and waiting time in the queue.
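The tracking described above could be sketched as a thin wrapper around each
executor task that captures the enqueue timestamp, the queue wait, and the
synchronous processing time. This is only an illustration under my own naming
(TimedMessageTask, totals), not the actual Ignite internals; a real
implementation would feed the two durations into per-type histograms instead
of a plain counter:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch, not Ignite code: wrap each message task so the executor
// records queue waiting time and synchronous processing time per message type.
final class TimedMessageTask implements Runnable {
    /** Per-message-type processed counter (a real impl would keep histograms). */
    static final Map<String, LongAdder> totals = new ConcurrentHashMap<>();

    private final Runnable delegate;
    private final String msgType;
    private final long enqueueNanos = System.nanoTime(); // captured on enqueue

    TimedMessageTask(String msgType, Runnable delegate) {
        this.msgType = msgType;
        this.delegate = delegate;
    }

    @Override public void run() {
        long start = System.nanoTime();
        long waitNanos = start - enqueueNanos; // time spent waiting in the queue

        try {
            delegate.run();
        }
        finally {
            long procNanos = System.nanoTime() - start; // synchronous execution time

            // Here waitNanos/procNanos would be recorded into time-bin histograms.
            totals.computeIfAbsent(msgType, t -> new LongAdder()).increment();
        }
    }
}
```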
Full description:
1. Implement detailed tracking of message queue waiting and of actual
processing by executors, split into several time bins. Example of detailed
statistics for each processed message:
Processing time (message counts per time bin)

Message                                  | Total      | Average(ms) | < 1 ms     | < 10 ms  | < 30 ms | < 50 ms | < 100 ms | < 250 ms | < 500 ms | < 750 ms | < 1000 ms | > 1000 ms
-----------------------------------------+------------+-------------+------------+----------+---------+---------+----------+----------+----------+----------+-----------+----------
GridNearSingleGetRequest                 | 904311372  | 0.023000    | 904240521  | 57394    | 7242    | 3961    | 1932     | 229      | 61       | 24       | 4         | 4
GridNearSingleGetResponse                | 340134416  | 0.041000    | 340118791  | 11660    | 1167    | 729     | 901      | 1001     | 158      | 8        | 1         | 0
GridNearLockRequest                      | 77088689   | 0.079000    | 77073458   | 11945    | 2299    | 643     | 311      | 31       | 2        | 0        | 0         | 0
GridNearAtomicSingleUpdateInvokeRequest  | 39645752   | 0.298000    | 39580914   | 28222    | 6469    | 4638    | 9870     | 13414    | 2087     | 137      | 1         | 0
GridDhtAtomicSingleUpdateRequest         | 37636829   | 0.277000    | 37579375   | 23247    | 5915    | 4210    | 8954     | 12917    | 2048     | 162      | 1         | 0
GridDhtAtomicDeferredUpdateResponse      | 33580198   | 0.002000    | 33579805   | 337      | 51      | 3       | 1        | 1        | 0        | 0        | 0         | 0
GridNearTxPrepareRequest                 | 21667790   | 0.238000    | 21078069   | 580126   | 1622    | 1261    | 2531     | 3631     | 496      | 40       | 0         | 14
GridDhtTxPrepareResponse                 | 20949873   | 0.316000    | 17130161   | 3803105  | 4615    | 3162    | 4489     | 3721     | 577      | 34       | 1         | 8
GridDhtTxPrepareRequest                  | 20949838   | 0.501000    | 16158732   | 4750217  | 16183   | 5735    | 8472     | 8994     | 1353     | 88       | 11        | 53
GridDhtTxFinishResponse                  | 13835065   | 0.007000    | 13832519   | 2476     | 27      | 28      | 14       | 1        | 0        | 0        | 0         | 0
GridDhtTxFinishRequest                   | 13835028   | 0.547000    | 12084106   | 1736789  | 8971    | 2340    | 1792     | 807      | 118      | 41       | 4         | 60
GridNearTxFinishRequest                  | 13762197   | 0.725000    | 11811828   | 1942499  | 4441    | 1400    | 1201     | 524      | 89       | 34       | 19        | 162
GridDhtAtomicNearResponse                | 2784422    | 0.122000    | 2783393    | 1022     | 5       | 2       | 0        | 0        | 0        | 0        | 0         | 0
GridNearGetRequest                       | 2360483    | 0.484000    | 2345937    | 14129    | 244     | 101     | 64       | 8        | 0        | 0        | 0         | 0
GridNearGetResponse                      | 1984243    | 0.054000    | 1981905    | 2327     | 8       | 1       | 1        | 1        | 0        | 0        | 0         | 0
GridNearTxPrepareResponse                | 192856     | 0.153000    | 192660     | 188      | 1       | 5       | 1        | 1        | 0        | 0        | 0         | 0
GridNearLockResponse                     | 192780     | 0.091000    | 192667     | 107      | 3       | 0       | 3        | 0        | 0        | 0        | 0         | 0
GridNearTxFinishResponse                 | 177        | 0.822000    | 129        | 47       | 1       | 0       | 0        | 0        | 0        | 0        | 0         | 0
GridNearAtomicSingleUpdateRequest        | 124        | 4.803000    | 52         | 53       | 19      | 0       | 0        | 0        | 0        | 0        | 0         | 0
GridNearAtomicUpdateResponse             | 120        | 0.448000    | 110        | 10       | 0       | 0       | 0        | 0        | 0        | 0        | 0         | 0
-----------------------------------------+------------+-------------+------------+----------+---------+---------+----------+----------+----------+----------+-----------+----------
TOTAL                                    | 1544912252 |             | 1531765132 | 12965900 | 59283   | 28219   | 40537    | 45281    | 6989     | 568      | 42        | 301
Queue waiting time (message counts per time bin)

Message                                  | Total      | Average(ms) | < 1 ms     | < 10 ms  | < 30 ms | < 50 ms | < 100 ms | < 250 ms | < 500 ms | < 750 ms | < 1000 ms | > 1000 ms
-----------------------------------------+------------+-------------+------------+----------+---------+---------+----------+----------+----------+----------+-----------+----------
GridNearSingleGetRequest                 | 904311372  | 0.023000    | 903130400  | 1051499  | 33248   | 18049   | 34044    | 38809    | 4714     | 212      | 8         | 389
GridNearSingleGetResponse                | 340134416  | 0.041000    | 339708055  | 419479   | 3732    | 1004    | 1222     | 737      | 117      | 2        | 1         | 67
GridNearLockRequest                      | 77088689   | 0.079000    | 76949792   | 129666   | 2003    | 1624    | 2585     | 2531     | 390      | 18       | 2         | 78
GridNearAtomicSingleUpdateInvokeRequest  | 39645752   | 0.298000    | 39572801   | 53913    | 3949    | 2960    | 5487     | 5583     | 963      | 45       | 5         | 46
GridDhtAtomicSingleUpdateRequest         | 37636829   | 0.277000    | 37561043   | 57294    | 3721    | 2736    | 5339     | 5716     | 879      | 54       | 6         | 41
GridDhtAtomicDeferredUpdateResponse      | 33580198   | 0.002000    | 33512734   | 45873    | 4427    | 3340    | 6342     | 6363     | 1033     | 50       | 10        | 26
GridNearTxPrepareRequest                 | 21667790   | 0.238000    | 21584416   | 76283    | 1681    | 1026    | 1948     | 2027     | 330      | 31       | 2         | 46
GridDhtTxPrepareResponse                 | 20949873   | 0.316000    | 20862632   | 83578    | 1214    | 633     | 821      | 817      | 118      | 26       | 8         | 26
GridDhtTxPrepareRequest                  | 20949838   | 0.501000    | 20867551   | 72264    | 2265    | 1516    | 2773     | 2905     | 479      | 38       | 6         | 41
GridDhtTxFinishResponse                  | 13835065   | 0.007000    | 13808274   | 26487    | 150     | 68      | 55       | 21       | 3        | 1        | 0         | 6
GridDhtTxFinishRequest                   | 13835028   | 0.547000    | 13809380   | 25142    | 188     | 127     | 102      | 59       | 9        | 2        | 4         | 15
GridNearTxFinishRequest                  | 13762197   | 0.725000    | 13743925   | 17365    | 241     | 171     | 278      | 191      | 16       | 4        | 0         | 6
GridDhtAtomicNearResponse                | 2784422    | 0.122000    | 2780600    | 3767     | 29      | 9       | 7        | 9        | 1        | 0        | 0         | 0
GridNearGetRequest                       | 2360483    | 0.484000    | 2355963    | 3762     | 194     | 114     | 184      | 228      | 38       | 0        | 0         | 0
GridNearGetResponse                      | 1984243    | 0.054000    | 1979404    | 4721     | 31      | 9       | 49       | 28       | 1        | 0        | 0         | 0
GridNearTxPrepareResponse                | 192856     | 0.153000    | 192852     | 2        | 0       | 1       | 1        | 0        | 0        | 0        | 0         | 0
GridNearLockResponse                     | 192780     | 0.091000    | 192777     | 3        | 0       | 0       | 0        | 0        | 0        | 0        | 0         | 0
GridNearTxFinishResponse                 | 177        | 0.822000    | 177        | 0        | 0       | 0       | 0        | 0        | 0        | 0        | 0         | 0
GridNearAtomicSingleUpdateRequest        | 124        | 4.803000    | 124        | 0        | 0       | 0       | 0        | 0        | 0        | 0        | 0         | 0
GridNearAtomicUpdateResponse             | 120        | 0.448000    | 119        | 1        | 0       | 0       | 0        | 0        | 0        | 0        | 0         | 0
-----------------------------------------+------------+-------------+------------+----------+---------+---------+----------+----------+----------+----------+-----------+----------
TOTAL                                    | 1544912252 |             | 1542613019 | 2071099  | 57073   | 33387   | 61237    | 66024    | 9091     | 483      | 52        | 787
This will make it possible to understand the cluster load profile and to
detect problems with increased latency in message processing. In general,
all messages should be processed in under 30 ms.
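The time bins in the tables above could be maintained with a simple
fixed-bucket latency histogram. Below is a minimal sketch under my own naming
(TimeBinHistogram); the bucket bounds are taken from the table headers:

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Illustrative sketch: a fixed-bin latency histogram matching the bins above
// (< 1, < 10, < 30, < 50, < 100, < 250, < 500, < 750, < 1000, > 1000 ms).
final class TimeBinHistogram {
    /** Upper bounds of the first nine bins, in milliseconds. */
    private static final long[] BOUNDS_MS = {1, 10, 30, 50, 100, 250, 500, 750, 1000};

    /** One counter per bin; the last bin is the "> 1000 ms" overflow. */
    private final AtomicLongArray bins = new AtomicLongArray(BOUNDS_MS.length + 1);

    /** Records one observed duration into the matching bin. */
    void record(double durationMs) {
        bins.incrementAndGet(binIndex(durationMs));
    }

    /** Maps a duration to its bin index: 0 for < 1 ms, ..., 9 for > 1000 ms. */
    static int binIndex(double durationMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++)
            if (durationMs < BOUNDS_MS[i])
                return i;

        return BOUNDS_MS.length; // overflow bin: > 1000 ms
    }

    /** Current count in the given bin. */
    long count(int bin) {
        return bins.get(bin);
    }
}
```

Atomic per-bin counters keep the hot path to one array increment per message,
which fits the "as lightweight as possible" requirement.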
2. Log messages which wait too long in the queue or take too long to
process (against a configurable threshold).
Track the following: message enqueue time, waiting time in the queue before
processing started, the message at the head of the queue when waiting
started, the queue size when waiting started, total waiting time, total
processing time, queue size after processing, and the message content.
Example:
Slow message: *enqueueTs*=2018-11-27 15:10:22.241, *waitTime*=0.048,
*procTime*=305.186, *messageId*=3a3064a9, *queueSzBefore*=0,
*headMessageId*=null,
*queueSzAfter*=0, *message*=GridNearTxFinishRequest [miniId=1,
mvccSnapshot=null, super=GridDistributedTxFinishRequest
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0],
futId=199a3155761-f379f312-ad4b-4181-acc5-0aacb3391f07, threadId=296,
commitVer=null, invalidate=false, commit=true, baseVer=null, txSize=0,
sys=false, plc=2, subjId=dda703a0-69ee-47cf-9b9a-bf3dc9309feb,
taskNameHash=0, flags=32, syncMode=FULL_SYNC, txState=IgniteTxStateImpl
[activeCacheIds=[644280847], recovery=false, mvccEnabled=false,
txMap=HashSet [IgniteTxEntry [key=KeyCacheObjectImpl [part=8, val=8,
hasValBytes=true], cacheId=644280847, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true],
cacheId=644280847], val=[op=READ, val=null], prevVal=[op=NOOP, val=null],
oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=false,
entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=8, val=8,
hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0,
nodeOrder=0], hash=8, extras=GridCacheObsoleteEntryExtras
[obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]],
flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry
[rdrs=ReaderId[] [], part=8, super=], prepared=0, locked=false,
nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false,
flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
[topVer=154800617, order=1543320621588, nodeOrder=1]], IgniteTxEntry
[key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true],
cacheId=644280847, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=11,
val=11, hasValBytes=true], cacheId=644280847], val=[op=READ, val=null],
prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null],
entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null,
explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [],
filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry
[key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true], val=null,
ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=11,
extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion
[topVer=2147483647, order=0, nodeOrder=0]],
flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry
[rdrs=ReaderId[] [], part=11, super=], prepared=0, locked=false,
nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false,
flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
[topVer=154800617, order=1543320621588, nodeOrder=1]]]],
super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=154800617,
order=1543320621587, nodeOrder=4], committedVers=null, rolledbackVers=null,
cnt=0, super=GridCacheIdMessage [cacheId=0]]]]
This will make it possible to understand which type of activity produces
latency issues and possibly to identify the culprit code.
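The slow-message check above could look roughly like the following. This is a
sketch under assumed names (SlowMessageLogger, a single thresholdMs); the log
line format mirrors the example output:

```java
// Illustrative sketch: after a message is processed, emit a log line if its
// queue wait or processing time exceeded a configured threshold.
final class SlowMessageLogger {
    private final double thresholdMs; // configured slow-message threshold

    SlowMessageLogger(double thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    /** Returns the log line for a slow message, or null if it was fast enough. */
    String check(String enqueueTs, double waitMs, double procMs, String msgId,
        int queueSzBefore, String headMsgId, int queueSzAfter, String msg) {
        if (waitMs < thresholdMs && procMs < thresholdMs)
            return null; // fast message, nothing to log

        return String.format(
            "Slow message: enqueueTs=%s, waitTime=%.3f, procTime=%.3f, " +
            "messageId=%s, queueSzBefore=%d, headMessageId=%s, queueSzAfter=%d, " +
            "message=%s",
            enqueueTs, waitMs, procMs, msgId, queueSzBefore, headMsgId,
            queueSzAfter, msg);
    }
}
```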
3. Track all synchronous waits while processing a message: checkpoint
locks, segment locks, page locks, cold page reads from disk, etc.
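Each such wait could be wrapped in a small try-with-resources scope that
attributes the elapsed time to a named wait type. Again a sketch under my own
naming (WaitTracker), not an existing Ignite API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch: attribute time spent in a named synchronous wait
// (e.g. "checkpointLock", "pageLock", "coldPageRead") to a shared counter.
final class WaitTracker implements AutoCloseable {
    /** Accumulated wait time per wait type, in nanoseconds. */
    static final Map<String, LongAdder> waitNanos = new ConcurrentHashMap<>();

    private final String waitType;
    private final long start = System.nanoTime();

    WaitTracker(String waitType) {
        this.waitType = waitType;
    }

    @Override public void close() {
        waitNanos.computeIfAbsent(waitType, t -> new LongAdder())
            .add(System.nanoTime() - start);
    }
}

// Intended usage inside message processing:
//     try (WaitTracker ignored = new WaitTracker("checkpointLock")) {
//         // acquire the checkpoint read lock here
//     }
```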
4. Allow extensions that collect additional statistics for a specific
message type. For example, for *TxFinishRequest we could track the total
number of entries written per cache and their size in bytes. For
GridNearLockRequest we could track the keys with the longest lock queues to
identify contention.
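The extension point could be as small as the following interface (name and
shape are my assumption, to illustrate the idea):

```java
// Illustrative sketch of a per-message-type statistics extension point.
// A registry would look up the extension by messageType() and invoke
// collect() after each message of that type is processed.
interface MessageStatsExtension<M> {
    /** Message class this extension handles, e.g. GridNearLockRequest. */
    Class<M> messageType();

    /** Called after the message is processed, to collect extra statistics. */
    void collect(M msg);
}
```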
5. Implement additional logging, JMX metrics and a control.sh command for
printing the currently collected profiling data.
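The JMX side could expose something like the following bean (interface and
method names are hypothetical, just to show the intended surface):

```java
// Illustrative sketch of a JMX bean exposing the collected message statistics.
interface MessageStatsMXBean {
    /** Total number of messages processed since the last reset. */
    long getTotalProcessedMessages();

    /** Per-message-type statistics table, as control.sh would print it. */
    String dumpStatistics();

    /** Drops all collected statistics. */
    void reset();
}
```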
6. The implementation should be as lightweight as possible.
Maybe something else.
BTW, I have already started working on this and want to contribute my work
to the community. See the ticket with a PR (not finished) [1]; check
CacheMessageStatsTest and CacheMessageStatsIndexingTest.
[1] https://issues.apache.org/jira/browse/IGNITE-10418
Thoughts?
--
Best regards,
Alexei Scherbakov