[GitHub] [spark] edwardcapriolo edited a comment on pull request #24922: [SPARK-28120][SS] Rocksdb state storage implementation

GitBox Wed, 26 May 2021 19:22:42 -0700


edwardcapriolo edited a comment on pull request #24922:
URL: https://github.com/apache/spark/pull/24922#issuecomment-849265104



   https://docs.databricks.com/spark/latest/structured-streaming/production.html
   
   
https://docs.databricks.com/spark/latest/structured-streaming/production.html#configure-rocksdb-state-store
   
   "As the Spark project continues to grow, I think it is important that we 
guard against the core becoming a swiss army knife, with too many different 
configurations for the community to maintain in the long run. In this case we 
are not only adding a new dependency, but we are also committing the Spark to 
supporting the specifics of how you are packaging and uploading the RocksDB 
files forever."
   
   The RocksDB state store has "found its way" into a commercial offering, so it 
should "find its way" into mainline Spark. Databricks is obviously packaging it 
as a feature, which proves its value.
   
   If Spark tries to "tiptoe along the open source fence", folks will notice 
and simply switch to Kafka Streams, which is hungry for a user base and offers 
RocksDB state OUT OF THE BOX. I actually had that conversation today, with 
someone saying "Spark can't do it, let's use Kafka Streams or Flink."


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
