Re: State processor API very slow reading a keyed state with RocksDB

2021-09-10 Thread David Causse
oc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/ReadOptions.html#setFillCache(boolean) -- From: Seth Wiesman Sent: Friday, September 10, 2021 0:58 To: David Causse; user Cc: Piotr Nowojski Subject: Re: State processor A

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Yun Tang
) From: Seth Wiesman Sent: Friday, September 10, 2021 0:58 To: David Causse; user Cc: Piotr Nowojski Subject: Re: State processor API very slow reading a keyed state with RocksDB Hi David, I was also able to reproduce the behavior, but was able to get significant

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Seth Wiesman
Hi David, I was also able to reproduce the behavior, but was able to get significant performance improvements by reducing the number of slots on each TM to 1. My suspicion, as Piotr alluded to, has to do with the different runtime execution of DataSet over DataStream. In particular, Flink's
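
For reference, the slot reduction described in this message is typically a one-line configuration change. The fragment below is a minimal sketch that assumes a cluster configured through flink-conf.yaml; the preview does not show how the cluster in the thread was actually set up.

```yaml
# flink-conf.yaml (assumed deployment; adjust for your own setup)
# One slot per TaskManager, so tasks do not share a TaskManager JVM
# with tasks running in other slots.
taskmanager.numberOfTaskSlots: 1
```

With fewer slots per TaskManager, more TaskManagers are needed to keep the same overall parallelism.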

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Piotr Nowojski
Hi David, I can confirm that I'm able to reproduce this behaviour. I've tried profiling/flame graphs and I was not able to make much sense out of those results. There are no IO/memory bottlenecks that I could notice; it indeed looks like the job is stuck inside RocksDB itself. This might be an

State processor API very slow reading a keyed state with RocksDB

2021-09-08 Thread David Causse
Hi, I'm investigating why a job we use to inspect a Flink state is a lot slower than the bootstrap job used to generate it. I use RocksDB with a simple keyed value state mapping a string key to a long value. Generating the bootstrap state from a CSV file with 100M entries takes a couple