Re: RocksDB CPU resource usage

2021-06-17 Thread Padarn Wilson
> *From:* Robert Metzger
> *Sent:* Thursday, June 17, 2021 14:11
> *To:* Padarn Wilson
> *Cc:* JING ZHANG; user
> *Subject:* Re: RocksDB CPU resource usage
>
> If you are able to execute your job locally as well (with enough data),

Re: RocksDB CPU resource usage

2021-06-17 Thread Yun Tang
thread stack.
[1] https://github.com/jvm-profiling-tools/async-profiler
Best
Yun Tang
From: Robert Metzger
Sent: Thursday, June 17, 2021 14:11
To: Padarn Wilson
Cc: JING ZHANG; user
Subject: Re: RocksDB CPU resource usage
If you are able to execute your job loc

Re: RocksDB CPU resource usage

2021-06-17 Thread Robert Metzger
If you are able to execute your job locally as well (with enough data), you can also run it with a profiler and see the CPU cycles spent on serialization (you can also use RocksDB locally).
On Wed, Jun 16, 2021 at 3:51 PM Padarn Wilson wrote:
> Thanks Robert. I think it would be easy enough to

Re: RocksDB CPU resource usage

2021-06-16 Thread Padarn Wilson
Thanks Robert. I think it would be easy enough to test this hypothesis by making the same comparison with some simpler state inside the aggregation window.
On Wed, 16 Jun 2021, 7:58 pm Robert Metzger wrote:
> Depending on the datatypes you are using, seeing 3x more CPU usage seems
> realistic.

Re: RocksDB CPU resource usage

2021-06-16 Thread Robert Metzger
Depending on the datatypes you are using, seeing 3x more CPU usage seems realistic. Serialization can be quite expensive. See also: https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html
Maybe it makes sense to optimize there a bit.
On Tue, Jun 15, 2021 at 5:23 PM JING

Re: RocksDB CPU resource usage

2021-06-15 Thread JING ZHANG
Hi Padarn, After switching the state backend from filesystem to RocksDB, all reads and writes to the backend have to go through de-/serialization to retrieve or store the state objects, which may increase CPU cost. But I'm not sure this is the main reason for the 3x CPU cost in your job. To find out the reason,
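To make the de-/serialization point above concrete, here is a minimal toy sketch, not Flink code. It assumes two hypothetical stores, `HeapStore` (keeps live objects, loosely like a heap-based backend) and `SerializingStore` (keeps bytes, loosely like a backend that stores serialized state), and uses Python's `pickle` purely as a stand-in for a real state serializer. It counts how many serialize/deserialize calls the same aggregation triggers in each case.

```python
import pickle


class HeapStore:
    """Hypothetical store that keeps live objects: no serialization on access."""

    def __init__(self):
        self._objects = {}
        self.serde_calls = 0  # never incremented; kept for comparison

    def get(self, key):
        return self._objects.get(key)

    def put(self, key, value):
        self._objects[key] = value


class SerializingStore:
    """Hypothetical store that keeps bytes: every read and write pays a serde call."""

    def __init__(self):
        self._bytes = {}
        self.serde_calls = 0

    def get(self, key):
        raw = self._bytes.get(key)
        if raw is None:
            return None
        self.serde_calls += 1  # one deserialization per successful read
        return pickle.loads(raw)

    def put(self, key, value):
        self.serde_calls += 1  # one serialization per write
        self._bytes[key] = pickle.dumps(value)


def aggregate(store, events):
    """Read-modify-write a per-key accumulator for each event."""
    for key, value in events:
        acc = store.get(key)
        if acc is None:
            acc = {"count": 0, "values": []}
        acc["count"] += 1
        acc["values"].append(value)
        store.put(key, acc)
    return store


events = [("user-a", 1), ("user-b", 2), ("user-a", 3), ("user-a", 4)]
heap = aggregate(HeapStore(), events)
serialized = aggregate(SerializingStore(), events)

print("heap serde calls:", heap.serde_calls)
print("serializing serde calls:", serialized.serde_calls)
```

Both stores end up with the same accumulators, but only the serializing one does extra work on every access, and that work grows with the size of the state object. This is only an analogy for where extra CPU can go; it says nothing about how large the effect is in a real job.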

RocksDB CPU resource usage

2021-06-15 Thread Padarn Wilson
Hi all, We have a job that we just enabled RocksDB on (instead of the file backend), and we see that the CPU usage is almost 3x greater (we had to increase the taskmanagers 3x to get it to run). I don't really understand this. Is there something we can look at to understand why CPU use is so high? Our