Hello Mangat,

Thanks for reporting your observations. I have some questions:

1) Are your global state stores also in-memory, or are they persisted on disk?
2) Are your Kafka brokers and Kafka Streams instances colocated?

Guozhang

On Tue, Aug 10, 2021 at 6:10 AM mangat rai <mangatm...@gmail.com> wrote:

> Hey All,
>
> We are using the low level processor API to create kafka stream
> applications. Each app has 1 or more in-memory state stores with caching
> disabled and changelog enabled. Some of the apps also have global stores.
> We noticed from the node metrics (kubernetes) that the stream applications
> are consuming too much disk IO. On going deeper I found following
>
> 1. Running locally with docker I could see some pretty high disk reads. I
> used `docker stats` and got `BLOCK I/O` as `438MB / 0B`. To compare we did
> only a few gigabytes of Net I/O.
> 2. In kubernetes, `container_fs_reads_bytes_total` gives us pretty big
> numbers whereas `container_fs_writes_bytes_total` is almost negligible.
>
> Now we are *not* using RocksDB. The pattern is not correlated to having a
> global store. I read various documents but I still can't figure out why a
> stream application would perform so much disk read. It's not even writing
> so that rules out the swap space or any buffering etc.
>
> I also noticed that a higher amount of data consumption is directly
> proportional to a higher amount of disk reads. Is it possible that the data
> is zero copied from the network interface to the disk and Kafka app is
> reading from it. I am not aware if there is any mechanism to do that.
>
> I would really appreciate any help in debugging this issue.
>
> Thanks,
> Mangat
>

--
-- Guozhang