[ 
https://issues.apache.org/jira/browse/SAMZA-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Fang updated SAMZA-225:
---------------------------

    Attachment: SAMZA-225.5.patch

Hi Martin and Chris, thank you very much for the review.

1. Updated based on Martin's latest comments.
2. In terms of data loss, Spark Streaming does have one situation where it 
loses data: a failure that happens after the receiver has received the data 
but before the data is replicated to other nodes. This applies regardless of 
whether Flume or Kafka is the input stream. I mentioned this in the 
fault-tolerance part.

Updated in RB: https://reviews.apache.org/r/23358/

Thank you.

> Write a Spark Streaming comparison
> ----------------------------------
>
>                 Key: SAMZA-225
>                 URL: https://issues.apache.org/jira/browse/SAMZA-225
>             Project: Samza
>          Issue Type: Task
>          Components: docs
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Yan Fang
>         Attachments: SAMZA-225-1.patch, SAMZA-225-on-324-3.patch, 
> SAMZA-225.2.patch, SAMZA-225.3.patch, SAMZA-225.4.patch, SAMZA-225.5.patch, 
> SAMZA-225.patch, SAMZA-324-3.patch
>
>
> We currently have comparison pages for 
> [MUPD8|http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/mupd8.html]
>  and 
> [Storm|http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/storm.html].
>  We should write up a comparison for Spark Streaming as well. It seems like a 
> really nice system, and we might learn some things that we can use by 
> investigating it. It will also be useful to describe any differences we find, 
> just as we've done for the other systems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
