robertwb commented on code in PR #27437:
URL: https://github.com/apache/beam/pull/27437#discussion_r1260346360


##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -46,6 +46,50 @@ import "google/protobuf/struct.proto";
 import "google/protobuf/timestamp.proto";
 import "google/protobuf/duration.proto";
 
+
+// Describes transforms necessary to execute Beam over the FnAPI but are
+// implementation details rather than part of the core model.
+message FnApiTransforms {
+  enum Runner {
+    // DataSource is a Root Transform, and a source of data for downstream
+    // transforms in the same ProcessBundleDescriptor. 
+    // It represents a logical PCollection.

Review Comment:
   I might say "It represents reading a PCollection from an external source 
(typically the Runner)" or "represents a stream of values coming in from an 
external source/over a data channel" as it's not the PCollection itself, but 
rather a descriptor of how to get the PCollection. 



##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -46,6 +46,50 @@ import "google/protobuf/struct.proto";
 import "google/protobuf/timestamp.proto";
 import "google/protobuf/duration.proto";
 
+
+// Describes transforms necessary to execute Beam over the FnAPI but are
+// implementation details rather than part of the core model.
+message FnApiTransforms {
+  enum Runner {
+    // DataSource is a Root Transform, and a source of data for downstream
+    // transforms in the same ProcessBundleDescriptor. 
+    // It represents a logical PCollection.
+    //
+    // The DataSource transform is implemented in each SDK but not explicitly
+    // provided during pipeline construction. A runner inserts the transform
+    // in ProcessBundleDescriptors to indicate where the bundle
+    // can retrieve data for an associated ProcessBundleRequest.
+    // Data for the same request will be retrieved with the matching 
instruction ID,
+    // and transform ID determined by the runner.
+    //
+    // The DataSource transform will take a stream of bytes from the remote
+    // source for the matching instruction ID and decode them as windowed
+    // values using the provided coder ID, which must be for a WINDOWED_VALUE 
or
+    // PARAM_WINDOWED_VALUE coder.
+    //
+    // Payload: RemoteGrpcPort
+    DATA_SOURCE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = 
"beam:runner:source:v1"];
+
+    // DataSink is a transform that sends data to a remote port
+    // ProcessBundleDescriptor. It represents a logical PCollection.

Review Comment:
   "remote port ProcessBundleDescriptor?"
   
   Maybe "DataSink is a transform that sends PCollection elements to a remote 
port using the Data API."



##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -46,6 +46,50 @@ import "google/protobuf/struct.proto";
 import "google/protobuf/timestamp.proto";
 import "google/protobuf/duration.proto";
 
+
+// Describes transforms necessary to execute Beam over the FnAPI but are
+// implementation details rather than part of the core model.
+message FnApiTransforms {
+  enum Runner {
+    // DataSource is a Root Transform, and a source of data for downstream
+    // transforms in the same ProcessBundleDescriptor. 
+    // It represents a logical PCollection.
+    //
+    // The DataSource transform is implemented in each SDK but not explicitly
+    // provided during pipeline construction. A runner inserts the transform
+    // in ProcessBundleDescriptors to indicate where the bundle
+    // can retrieve data for an associated ProcessBundleRequest.
+    // Data for the same request will be retrieved with the matching 
instruction ID,
+    // and transform ID determined by the runner.
+    //
+    // The DataSource transform will take a stream of bytes from the remote
+    // source for the matching instruction ID and decode them as windowed
+    // values using the provided coder ID, which must be for a WINDOWED_VALUE 
or
+    // PARAM_WINDOWED_VALUE coder.

Review Comment:
   We may allow other kinds of acceptable coders in the future, but you could 
say "must be a windowed value coder" generically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to