Venki Korukanti created SPARK-36236:
---------------------------------------

             Summary: RocksDB state store: Add additional metrics for better 
observability into state store operations
                 Key: SPARK-36236
                 URL: https://issues.apache.org/jira/browse/SPARK-36236
             Project: Spark
          Issue Type: Sub-task
          Components: Structured Streaming
    Affects Versions: 3.1.2
            Reporter: Venki Korukanti


Proposing to add the following new metrics to {{customMetrics}} under 
{{stateOperators}} in the {{StreamingQueryProgress}} event. These metrics give 
better visibility into the RocksDB-based state store in streaming jobs.
 * {{rocksdbGetCount}}: Number of get calls to the DB (doesn’t include gets from 
WriteBatch, the in-memory batch used for staging writes).
 * {{rocksdbPutCount}}: Number of put calls to the DB (doesn’t include puts to 
WriteBatch, the in-memory batch used for staging writes).
 * {{rocksdbTotalBytesReadByGet/rocksdbTotalBytesWrittenByPut}}: Number of 
uncompressed bytes read/written by get/put operations.
 * {{rocksdbReadBlockCacheHitCount/rocksdbReadBlockCacheMissCount}}: Number of 
block cache hits/misses. Together they indicate how effective the RocksDB block 
cache is at avoiding local disk reads.
 * {{rocksdbTotalBytesReadByCompaction/rocksdbTotalBytesWrittenByCompaction}}: 
Number of bytes the compaction process read from disk and wrote to disk.
 * {{rocksdbTotalCompactionTime}}: Time (in ns) taken by compactions (both 
background and the optional compaction initiated during the commit).
 * {{rocksdbWriterStallDuration}}: Time (in ns) the writer has stalled due to a 
background compaction or flushing of the immutable memtables to disk.
 * {{rocksdbTotalBytesReadThroughIterator}}: Some stateful operations 
(such as timeout processing in FlatMapGroupsWithState and watermarking) 
require reading all the data in the DB through an iterator. This metric reports 
the total size of uncompressed data read using the iterator.
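
As a rough illustration of how the metrics listed above might be consumed, the 
sketch below derives a block cache hit ratio, average bytes per get, and writer 
stall time from a progress event that has already been parsed into a dict (for 
example, from the JSON form of {{StreamingQueryProgress}}). The metric names are 
the ones proposed in this issue and may differ in a released Spark version; the 
function name {{summarize_rocksdb_metrics}} and the sample values are 
hypothetical.

```python
from typing import Any, Dict, List, Optional


def summarize_rocksdb_metrics(progress: Dict[str, Any]) -> List[Dict[str, Optional[float]]]:
    """Derive per-operator summaries from the proposed RocksDB custom metrics.

    `progress` is assumed to be a StreamingQueryProgress event parsed from JSON,
    with metric names exactly as proposed in this issue.
    """
    summaries = []
    for op in progress.get("stateOperators", []):
        m = op.get("customMetrics", {})
        hits = m.get("rocksdbReadBlockCacheHitCount", 0)
        misses = m.get("rocksdbReadBlockCacheMissCount", 0)
        gets = m.get("rocksdbGetCount", 0)
        bytes_read = m.get("rocksdbTotalBytesReadByGet", 0)
        lookups = hits + misses
        summaries.append({
            # Fraction of block lookups served from the block cache.
            "blockCacheHitRatio": hits / lookups if lookups else None,
            # Average uncompressed bytes returned per get call.
            "avgBytesPerGet": bytes_read / gets if gets else None,
            # The issue proposes nanoseconds; convert to milliseconds.
            "writerStallMs": m.get("rocksdbWriterStallDuration", 0) / 1e6,
        })
    return summaries


# Hypothetical sample event; the values are made up for illustration.
sample_progress = {
    "stateOperators": [
        {
            "customMetrics": {
                "rocksdbGetCount": 4,
                "rocksdbTotalBytesReadByGet": 400,
                "rocksdbReadBlockCacheHitCount": 90,
                "rocksdbReadBlockCacheMissCount": 10,
                "rocksdbWriterStallDuration": 2_000_000,
            }
        }
    ]
}

summary = summarize_rocksdb_metrics(sample_progress)
```

Ratios are reported as {{None}} when the denominator is zero, so that an idle 
batch is not mistaken for a batch with a 0% hit rate.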



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
