[ 
https://issues.apache.org/jira/browse/SAMZA-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488252#comment-15488252
 ] 

Hai commented on SAMZA-967:
---------------------------

You brought up a good point. There is no guarantee of deterministic 
consumption if repartitioning happens. My point, though, is that we cannot 
solve this problem for Kafka either. Say we repartition a job that reads from 
Kafka and writes to Kafka: how do you guarantee a consistent result today? 
You could argue that a deterministic result after repartitioning is not needed 
for Kafka, which is a stream processing case, but is relevant for HDFS, which 
is essentially a batch processing case. I have to admit that I don't have a 
good solution to your question as of now.
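
To make the concern concrete, here is a minimal sketch of why repartitioning 
breaks deterministic consumption. It assumes a hypothetical hash-modulo 
partitioner (partition_for) and made-up keys; it is not the partitioner used by 
Kafka, HDFS, or Samza, only an illustration of the effect.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hypothetical partitioner: stable CRC32 hash of the key, modulo the partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(keys, num_partitions):
    """Group keys by partition, preserving arrival order within each partition."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return partitions


# Made-up message keys, for illustration only.
keys = [f"member-{i}" for i in range(20)]

before = assign(keys, 4)  # layout with 4 partitions
after = assign(keys, 6)   # layout after repartitioning to 6 partitions

# Keys that land in a different partition once the partition count changes.
moved = [k for k in keys if partition_for(k, 4) != partition_for(k, 6)]

print("partition 0 before:", before[0])
print("partition 0 after: ", after[0])
print("keys that moved:   ", len(moved), "of", len(keys))
```

With a fixed partition count the assignment is repeatable, so a task that 
replays its partition sees the same input. Once the count changes, a task 
reading "partition 0" may see a different set of keys in a different order, 
which is the non-determinism discussed above.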

> Add HDFS system consumer to Samza
> ---------------------------------
>
>                 Key: SAMZA-967
>                 URL: https://issues.apache.org/jira/browse/SAMZA-967
>             Project: Samza
>          Issue Type: Sub-task
>            Reporter: Hai
>            Assignee: Hai
>             Fix For: 0.12.0
>
>         Attachments: HDFSSystemConsumer.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
