Updated Branches: refs/heads/master f2d660844 -> 1af23c1b9
fix broken image in comparison intro page

Project: http://git-wip-us.apache.org/repos/asf/incubator-samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-samza/commit/1af23c1b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-samza/tree/1af23c1b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-samza/diff/1af23c1b

Branch: refs/heads/master
Commit: 1af23c1b9455913b037dfa3127c7747a991ee78c
Parents: f2d6608
Author: Chris Riccomini <[email protected]>
Authored: Mon Aug 12 10:01:24 2013 -0700
Committer: Chris Riccomini <[email protected]>
Committed: Mon Aug 12 10:01:24 2013 -0700

----------------------------------------------------------------------
 docs/learn/documentation/0.7.0/comparisons/introduction.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-samza/blob/1af23c1b/docs/learn/documentation/0.7.0/comparisons/introduction.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/0.7.0/comparisons/introduction.md b/docs/learn/documentation/0.7.0/comparisons/introduction.md
index bb42d9b..a8a4944 100644
--- a/docs/learn/documentation/0.7.0/comparisons/introduction.md
+++ b/docs/learn/documentation/0.7.0/comparisons/introduction.md
@@ -27,7 +27,7 @@ We have put particular effort into allowing Samza jobs to manage large amounts o
 This means that you can view a Samza job as being both a piece of processing code, but also a co-partitioned "table" of state. This allows rich local queries and scans against this state. These tables are made fault-tolerant by producing a "changelog" stream which is used to restore the state of the table on fail-over. This stream is just another Samza stream; it can even be used as input for other jobs.
-
+
 In our experience most processing flows require joins against other data sources. In the absence of state maintenance, any joining or aggregation has to be done by querying an external data system. This tends to be one or two orders of magnitude slower than sequential processing. For example, per-node throughput for Kafka would easily be in the 100k-500k messages/sec range (depending on message size), but remote queries against a key-value store tend to be closer to 1-5k queries-per-second per node.
