Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site ab1f700ca -> 976b0302a


minor: remove duplicate words


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/6a5a0b3c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/6a5a0b3c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/6a5a0b3c

Branch: refs/heads/asf-site
Commit: 6a5a0b3cd77d89b81638bc4787bc635d0e10fda5
Parents: ab1f700
Author: terrencehan(韩亮) <[email protected]>
Authored: Wed Sep 28 17:56:30 2016 +0800
Committer: terrencehan(韩亮) <[email protected]>
Committed: Wed Sep 28 17:56:30 2016 +0800

----------------------------------------------------------------------
 learn/programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/6a5a0b3c/learn/programming-guide.md
----------------------------------------------------------------------
diff --git a/learn/programming-guide.md b/learn/programming-guide.md
index ac18ba6..a7e5f12 100644
--- a/learn/programming-guide.md
+++ b/learn/programming-guide.md
@@ -158,7 +158,7 @@ A `PCollection` is a large, immutable "bag" of elements. There is no upper limit
 
 A `PCollection` can be either **bounded** or **unbounded** in size. A 
**bounded** `PCollection` represents a data set of a known, fixed size, while 
an **unbounded** `PCollection` represents a data set of unlimited size. Whether 
a `PCollection` is bounded or unbounded depends on the source of the data set 
that it represents. Reading from a batch data source, such as a file or a 
database, creates a bounded `PCollection`. Reading from a streaming or 
continously-updating data source, such as Pub/Sub or Kafka, creates an 
unbounded `PCollection` (unless you explicitly tell it not to).
 
-The bounded (or unbounded) nature The bounded (or unbounded) nature of your 
`PCollection` affects how Beam processes your data. A bounded `PCollection` can 
be processed using a batch job, which might read the entire data set once, and 
perform processing in a job of finite length. An unbounded `PCollection` must 
be processed using a streaming job that runs continuously, as the entire 
collection can never be available for processing at any one time.
+The bounded (or unbounded) nature of your `PCollection` affects how Beam 
processes your data. A bounded `PCollection` can be processed using a batch 
job, which might read the entire data set once, and perform processing in a job 
of finite length. An unbounded `PCollection` must be processed using a 
streaming job that runs continuously, as the entire collection can never be 
available for processing at any one time.
 
 When performing an operation that groups elements in an unbounded 
`PCollection`, Beam requires a concept called **Windowing** to divide a 
continuously updating data set into logical windows of finite size.  Beam 
processes each window as a bundle, and processing continues as the data set is 
generated. These logical windows are determined by some characteristic 
associated with a data element, such as a **timestamp**.
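
The windowing idea in the paragraph above can be illustrated with a minimal plain-Python sketch. It does not use the Beam SDK; the function names, the 60-second window size, and the sample events are all hypothetical and chosen only for illustration. It assumes fixed, non-overlapping windows keyed by each element's timestamp.

```python
from collections import defaultdict


def assign_fixed_window(timestamp, size):
    """Return the (start, end) of the fixed window that contains timestamp.

    Hypothetical helper, not a Beam API. The window is half-open: [start, end).
    """
    start = timestamp - (timestamp % size)
    return (start, start + size)


def group_by_window(events, size):
    """Group (timestamp, value) pairs into fixed windows of the given size."""
    windows = defaultdict(list)
    for timestamp, value in events:
        windows[assign_fixed_window(timestamp, size)].append(value)
    return dict(windows)


# Illustrative events as (timestamp in seconds, value); order of arrival
# does not have to match timestamp order.
events = [(3, "a"), (61, "b"), (59, "c"), (125, "d")]
grouped = group_by_window(events, 60)
```

Each element lands in a window based only on its own timestamp, so a late-arriving element such as `(59, "c")` still joins the first window. This is what lets a grouping operation work on finite pieces of a data set that never ends.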
 
