Repository: storm
Updated Branches:
  refs/heads/master 7e718862a -> 7e622d178
Quick fix: typo in FAQ.md

Just a typo.

Project: http://git-wip-us.apache.org/repos/asf/storm/repo
Commit: http://git-wip-us.apache.org/repos/asf/storm/commit/44e0e0a4
Tree: http://git-wip-us.apache.org/repos/asf/storm/tree/44e0e0a4
Diff: http://git-wip-us.apache.org/repos/asf/storm/diff/44e0e0a4

Branch: refs/heads/master
Commit: 44e0e0a491d1db23884eb018852e97e7ca9d0a6b
Parents: bfd1006
Author: MichealShin <[email protected]>
Authored: Wed Nov 29 15:22:50 2017 +0800
Committer: GitHub <[email protected]>
Committed: Wed Nov 29 15:22:50 2017 +0800

----------------------------------------------------------------------
 docs/FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/storm/blob/44e0e0a4/docs/FAQ.md
----------------------------------------------------------------------
diff --git a/docs/FAQ.md b/docs/FAQ.md
index 127c95c..ce9130e 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -123,5 +123,5 @@ You cannot know that all events are collected -- this is an epistemological chal
 
 * Set a time limit using domain knowledge
 * Introduce a _punctuation_: a record known to come after all records in the given time bucket. Trident uses this scheme to know when a batch is complete. If you for instance receive records from a set of sensors, each in order for that sensor, then once all sensors have sent you a 3:02:xx or later timestamp lets you know you can commit.
-* When possible, make your process incremental: each value that comes in makes the answer more an more true. A Trident ReducerAggregator is an operator that takes a prior result and a set of new records and returns a new result. This lets the result be cached and serialized to a datastore; if a server drops off line for a day and then comes back with a full day's worth of data in a rush, the old results will be calmly retrieved and updated.
+* When possible, make your process incremental: each value that comes in makes the answer more and more true. A Trident ReducerAggregator is an operator that takes a prior result and a set of new records and returns a new result. This lets the result be cached and serialized to a datastore; if a server drops off line for a day and then comes back with a full day's worth of data in a rush, the old results will be calmly retrieved and updated.
 * Lambda architecture: Record all events into an archival store (S3, HBase, HDFS) on receipt. in the fast layer, once the time window is clear, process the bucket to get an actionable answer, and ignore everything older than the time window. Periodically run a global aggregation to calculate a "correct" answer.
