edwardcapriolo edited a comment on pull request #24922: URL: https://github.com/apache/spark/pull/24922#issuecomment-849265104
https://docs.databricks.com/spark/latest/structured-streaming/production.html
https://docs.databricks.com/spark/latest/structured-streaming/production.html#configure-rocksdb-state-store

"As the Spark project continues to grow, I think it is important that we guard against the core becoming a swiss army knife, with too many different configurations for the community to maintain in the long run. In this case we are not only adding a new dependency, but we are also committing the Spark to supporting the specifics of how you are packaging and uploading the RocksDB files forever."

The RocksDB state store has "found its way" into a commercial offering, so it should "find its way" into mainline Spark. Databricks is obviously packaging it as a feature, which proves its value.

If Spark tries to "tiptoe the open source fence", folks will notice and simply switch to Kafka Streams, which is hungry for a user base and offers RocksDB state OUT OF THE BOX. I actually had that conversation today, with someone saying "Spark can't do it, let's use Kafka Streams or Flink".
