[
https://issues.apache.org/jira/browse/SAMZA-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488252#comment-15488252
]
Hai commented on SAMZA-967:
---------------------------
You brought up a good point. There is no guarantee of deterministic
consumption if repartitioning happens. But my point is that we are not able to
solve this problem for Kafka either. Suppose we do repartitioning for a job
that reads from Kafka and writes to Kafka: how do you guarantee a consistent
result today? Well, you could argue that a deterministic result under
repartitioning is not needed in the Kafka case (a stream processing job) but
is relevant in the HDFS case (essentially a batch processing job). I have to
admit that I don't have a good solution to your question as of now.
> Add HDFS system consumer to Samza
> ---------------------------------
>
> Key: SAMZA-967
> URL: https://issues.apache.org/jira/browse/SAMZA-967
> Project: Samza
> Issue Type: Sub-task
> Reporter: Hai
> Assignee: Hai
> Fix For: 0.12.0
>
> Attachments: HDFSSystemConsumer.pdf
>
>