[ 
https://issues.apache.org/jira/browse/SAMZA-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388798#comment-14388798
 ] 

Chris Riccomini commented on SAMZA-622:
---------------------------------------

[~jghoman] had a basic consumer/producer that he was hacking on a while ago. 
Maybe he can write a quick description of how he designed it.

bq. Made the mistake of cloning jacob's ticket and unable to change the assignee

Referring to SAMZA-263?

bq. Has someone played with this

Hmm, interesting point. I'm unaware of anyone playing with this. Haven't 
investigated, but first question would be whether the HDFS stuff works with 
RocksDB JNI. If so, this could be interesting. SAMZA-557 might also be loosely 
related to this.

> Persisting Samza State on HDFS
> ------------------------------
>
>                 Key: SAMZA-622
>                 URL: https://issues.apache.org/jira/browse/SAMZA-622
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Vinoth Chandar
>            Assignee: Vinoth Chandar
>
> Samza's state currently lives in Kafka as a (compacted) changelog and in a 
> local RocksDB KV store. 
> It would be nice to save this onto HDFS directly, for the following reasons: 
> - HDFS is a fault-tolerant FS. Thus, Samza tasks can be restarted by 
> relocating the task to where the other copies are.
> - HDFS virtualizes storage and thus, one would not have to worry explicitly 
> about balancing disk usage across different tiers (I don't know what the 
> right word is) in a data flow graph
> - Storing the state in HDFS makes it easier to share it with other 
> processing systems in Hadoop land. 
> RocksDB seems to have an option to store files on HDFS: 
> https://github.com/facebook/rocksdb/tree/master/hdfs (Has someone played with 
> this?) 
> Context: I am working on producing compacted DB snapshots on HDFS for 
> Spark/MR jobs to use, and am thus super interested in this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
