kbendick commented on pull request #2680:
URL: https://github.com/apache/iceberg/pull/2680#issuecomment-987542250


   > I agree with you that having RocksDB is a win, our stateful Flink jobs all 
use the RocksDB backend :)
   > 
   > But I think we need to more clearly communicate the trade-off to the 
user. For example, RocksDB can be an order of magnitude slower than an in-memory 
hash map even with a very fast SSD, depending on the read/write pattern.
   
   For sure. No disagreement. But we're just maybe not there yet, is all I mean. 
Having the ability to spill to disk is a lot of work - which @openinx and 
others have been doing a great job with. But even look at the age of this PR - 
it's moving along, but it's definitely a process.
   
   It's good to be aware of how the need to pass configurations will affect the 
rest of the codebase, as Ryan had mentioned was one of the bigger areas for 
concern. Ideally we can pass configs to rocksdb without too much disruption to 
the rest of the codebase.
   
   Once that has reached its goal, we can worry more about user experience 
when using RocksDB. And I'll happily drop most anything I'm doing to review 
most documentation PRs!
   
   And when the time comes, if you want to write a blogpost about using 
RocksDB, I'd happily review any drafts or be sure it's prominently displayed on 
the Apache Flink website 🙂
   
   For now, let's focus on getting RocksDB or similar available to developers 
within Iceberg and then we can definitely focus on the user experience. It is 
definitely true that there are many situations where it makes less sense than 
remaining in-memory, but having the option sure is nice.
   
   But it's good to be thinking about end user experience always. It's 
definitely one of my biggest concerns with all things too. So many thanks for 
that. 😀


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


