itsvikramagr commented on a change in pull request #24922: [SPARK-28120][SS]  
Rocksdb state storage implementation
URL: https://github.com/apache/spark/pull/24922#discussion_r333870578
 
 

 ##########
 File path: sql/core/pom.xml
 ##########
 @@ -147,6 +147,12 @@
       <artifactId>mockito-core</artifactId>
       <scope>test</scope>
     </dependency>
+    <!-- RocksDB dependency for Structured Streaming State Store -->
+    <dependency>
+      <groupId>org.rocksdb</groupId>
+      <artifactId>rocksdbjni</artifactId>
 
 Review comment:
   @gatorsmile - what are the alternatives if RocksDB is not the best backend? 
Other streaming technologies such as Flink and Kafka Streams use RocksDB as 
their primary storage engine. 
   
   With the integration in the Spark codebase, we can change the code in any 
way later. If we take the separate jar route, the extensions we can make are 
limited by the current state store contract. For example, @skonto mentioned 
one way we could abstract the state storage implementation to get the best 
out of RocksDB. How can we support such improvements if we take the Spark 
package route?
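
   To make the "current contract" concrete, here is a minimal sketch of how 
an out-of-tree backend would be wired in today. The config key 
`spark.sql.streaming.stateStore.providerClass` is the existing extension 
point; the class name `com.example.RocksDbStateStoreProvider` is hypothetical 
and stands for a provider shipped in a separate jar. Such a provider can only 
implement the existing `StateStoreProvider` / `StateStore` traits, so anything 
that needs a change to those traits cannot be done from the package side.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes a separate jar on the classpath that contains a
// hypothetical provider class implementing Spark's StateStoreProvider trait.
val spark = SparkSession.builder()
  .appName("rocksdb-state-store-sketch")
  // Existing Spark SQL config key that selects the state store provider.
  // The class name is an assumption made for illustration.
  .config("spark.sql.streaming.stateStore.providerClass",
          "com.example.RocksDbStateStoreProvider")
  .getOrCreate()
```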
   
   The current implementation, based on an in-memory hash map, does not scale 
beyond a certain point. How shall we go about solving that? 
   
