Hi All,

We are running Flink in AWS and are observing some strange behavior. We use Docker containers, EBS for storage, and the RocksDB state backend. We have a few map and value states, with checkpointing every 30 seconds and incremental checkpointing turned on. The issue is that read IOPS and read throughput on the EBS volume increase gradually over time and keep growing. Write throughput and write bytes do not increase nearly as much as reads. The checkpoints are written to durable NFS storage. We are not sure what is causing this constant increase in read throughput, but because of it we run out of EBS burst balance and have to restart the job every once in a while. The EBS read and write metrics are attached. Has anyone encountered this issue, and what could be a possible solution?
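To make the burst-balance problem concrete, here is a rough sketch of how the time until the balance is exhausted can be estimated. It assumes a gp2 volume with the commonly documented credit model (a bucket of 5.4 million I/O credits, a baseline of 3 IOPS per GiB with a 100 IOPS floor). The volume size and the IOPS values in `main` are made-up illustrative numbers, not our measurements, and the class and method names are hypothetical.

```java
public final class BurstBalanceEstimate {
    // Assumed gp2 credit model; check the values for your volume type.
    static final double BUCKET_CREDITS = 5_400_000.0; // full burst bucket, in I/O credits
    static final double IOPS_PER_GIB = 3.0;           // baseline IOPS earned per GiB
    static final double MIN_BASELINE_IOPS = 100.0;    // baseline floor for small volumes

    // Baseline IOPS the volume sustains without spending credits.
    static double baselineIops(double volumeGiB) {
        return Math.max(MIN_BASELINE_IOPS, IOPS_PER_GIB * volumeGiB);
    }

    // Seconds until a full bucket is empty at a constant IOPS rate.
    // Returns infinity when the rate does not exceed the baseline.
    static double secondsUntilEmpty(double volumeGiB, double sustainedIops) {
        double netDrainPerSecond = sustainedIops - baselineIops(volumeGiB);
        if (netDrainPerSecond <= 0) {
            return Double.POSITIVE_INFINITY;
        }
        return BUCKET_CREDITS / netDrainPerSecond;
    }

    public static void main(String[] args) {
        double volumeGiB = 200.0; // hypothetical volume size
        // Hypothetical total IOPS levels, e.g. as reads grow over time.
        for (double iops : new double[] {500, 1000, 2000, 3000}) {
            double hours = secondsUntilEmpty(volumeGiB, iops) / 3600.0;
            System.out.printf("%.0f IOPS -> %.2f hours until burst balance is empty%n", iops, hours);
        }
    }
}
```

The point of the sketch is only that once total IOPS climbs above the baseline, the time to depletion shrinks quickly as reads keep growing, which matches the restarts we are forced to do.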
We have also tried setting the RocksDB options below, but they didn't help.

ColumnFamilyOptions:

    currentOptions.setOptimizeFiltersForHits(true)
        .setWriteBufferSize(536870912)        // 512 MiB
        .setMaxWriteBufferNumber(5)
        .setMinWriteBufferNumberToMerge(2);

DBOptions:

    currentOptions.setMaxBackgroundCompactions(4)
        .setMaxManifestFileSize(1048576)      // 1 MiB
        .setMaxLogFileSize(1048576);          // 1 MiB

Thanks.