lostluck commented on a change in pull request #12448:
URL: https://github.com/apache/beam/pull/12448#discussion_r464583093



##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~     http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+# Additional Parameters - Window and Timestamp

Review comment:
       "Additional Parameters" is a weird name for this section. It's simply how 
this is implemented in the Go SDK.  There are also other additional parameters 
that aren't covered here, which may get confusing as we add them to the SDK 
(Pane, Timers, State...).
   
   "Windowing" or "Windows and Timestamps" stands fairly well on its own.
   
   

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.
+Bounded data is of a fixed size such as a file or database query.  Unbounded 
data comes
+from a continuously updated source such as a subscription or stream.
+
+A window is a view into a fixed beginning and fixed end to a set of data.  In 
the beam model, windowing subdivides 
+a PCollection according to the timestamps of its individual elements.  This is 
useful
+for unbounded data because it allows the model to work with fixed element 
sizes.  Note that windowing
+is not unique to unbounded data.  The beam model windows all data whether it 
is bounded or unbounded.
+Yet, when you read from a fixed size source such as a file, beam applies the 
same timestamp to all the elements.

Review comment:
       Beam doesn't specify timestamps. It's transform or runner dependent. If 
the framework receives timestamps, it propagates them or updates them as the 
transforms require. 
   
   e.g. "The reading transform applies a timestamp....", not "beam applies the 
timestamp".

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.
+Bounded data is of a fixed size such as a file or database query.  Unbounded 
data comes
+from a continuously updated source such as a subscription or stream.
+
+A window is a view into a fixed beginning and fixed end to a set of data.  In 
the beam model, windowing subdivides 
+a PCollection according to the timestamps of its individual elements.  This is 
useful
+for unbounded data because it allows the model to work with fixed element 
sizes.  Note that windowing
+is not unique to unbounded data.  The beam model windows all data whether it 
is bounded or unbounded.
+Yet, when you read from a fixed size source such as a file, beam applies the 
same timestamp to all the elements.
+
+Beam will include information about the window and timestamp to your elements 
in your DoFn.  All your previous
+lessons' DoFn had this information provided, yet you never made use of it in 
your DoFn parameters.  In this 
+lesson you will.  The simple toy dataset has five git commit messages and 
their timestamps 
+from the [Apache Beam public repository](https://github.com/apache/beam).  
Their timestamps have been
+applied to the PCollection input to simulate an unbounded dataset.

Review comment:
       Probably repeating myself now, but bounded datasets can have timestamps 
as well.
   
   Speaking outside of the context of this lesson:
   Consider that you have a stream of data from Pub/Sub or something similar. 
Each element has its publishing time associated with it. However, data can be 
late*, which means you might emit less-than-accurate results if you want to 
maintain ~1 minute averages or similar. To make the daily graphs correct after 
the fact, you could preserve the incoming data stream, timestamps and all, in 
some files. Afterwards, you could run the same pipeline against those files to 
get the correct running averages throughout the day, just by replacing the 
streaming source transform with the batch source transform, along with the 
respective sinks. Fun, eh?
   
   *which you can configure Beam to handle, but that's not implemented in the 
Go SDK yet.
   

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.
+Bounded data is of a fixed size such as a file or database query.  Unbounded 
data comes
+from a continuously updated source such as a subscription or stream.
+
+A window is a view into a fixed beginning and fixed end to a set of data.  In 
the beam model, windowing subdivides 
+a PCollection according to the timestamps of its individual elements.  This is 
useful
+for unbounded data because it allows the model to work with fixed element 
sizes.  Note that windowing
+is not unique to unbounded data.  The beam model windows all data whether it 
is bounded or unbounded.
+Yet, when you read from a fixed size source such as a file, beam applies the 
same timestamp to all the elements.
+
+Beam will include information about the window and timestamp to your elements 
in your DoFn.  All your previous
+lessons' DoFn had this information provided, yet you never made use of it in 
your DoFn parameters.  In this 

Review comment:
       I'd say "available" rather than "provided".

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.

Review comment:
       Bounded vs Unbounded is orthogonal to windowing/event times. There's no 
need to understand it to understand the other. Windowing is useful and 
available to both kinds of PCollection. I'd recommend not mentioning it at all 
at this juncture.

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.
+Bounded data is of a fixed size such as a file or database query.  Unbounded 
data comes
+from a continuously updated source such as a subscription or stream.
+
+A window is a view into a fixed beginning and fixed end to a set of data.  In 
the beam model, windowing subdivides 
+a PCollection according to the timestamps of its individual elements.  This is 
useful
+for unbounded data because it allows the model to work with fixed element 
sizes.  Note that windowing

Review comment:
       WRT elements, "size" refers to how many bytes an element takes up. You 
probably mean counts.
   Windowing doesn't set things to fixed sizes, element sizes, or even 
counts. 
   
   WRT bounded/unbounded, note that the text is saying "It's true for A!", 
"It's also true for not A!", "It's true for both A and not A!"
   So my recommendation is to not mention A at all.
   
   

##########
File path: 
learning/katas/go/core_transforms/additional_parameters/additional_parameters/task.md
##########
@@ -0,0 +1,84 @@
+
+# Additional Parameters - Window and Timestamp
+
+This lesson introduces the concept of windowing and timestamped PCollection 
elements.
+Before discussing windowing, we need to distinguish bounded from unbounded 
data.
+Bounded data is of a fixed size such as a file or database query.  Unbounded 
data comes
+from a continuously updated source such as a subscription or stream.
+
+A window is a view into a fixed beginning and fixed end to a set of data.  In 
the beam model, windowing subdivides 
+a PCollection according to the timestamps of its individual elements.  This is 
useful
+for unbounded data because it allows the model to work with fixed element 
sizes.  Note that windowing
+is not unique to unbounded data.  The beam model windows all data whether it 
is bounded or unbounded.
+Yet, when you read from a fixed size source such as a file, beam applies the 
same timestamp to all the elements.
+
+Beam will include information about the window and timestamp to your elements 
in your DoFn.  All your previous

Review comment:
       Beam can pass information about....




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

