lostluck commented on code in PR #29590:
URL: https://github.com/apache/beam/pull/29590#discussion_r1419781152


##########
sdks/go/pkg/beam/core/runtime/exec/translate.go:
##########
@@ -411,11 +413,11 @@ func (b *builder) makePCollection(id string) (*PCollection, error) {
 }
 
 func (b *builder) newPCollectionNode(id string, out Node) (*PCollection, error) {
-       ec, _, err := b.makeCoderForPCollection(id)
+       ec, wc, err := b.makeCoderForPCollection(id)
        if err != nil {
                return nil, err
        }
-       u := &PCollection{UID: b.idgen.New(), Out: out, PColID: id, Coder: ec, Seed: rand.Int63()}
+       u := &PCollection{UID: b.idgen.New(), Out: out, PColID: id, Coder: ec, WindowCoder: wc, Seed: rand.Int63(), dataSampler: b.dataSampler}

Review Comment:
   1. Correct. That's the intent.
   2. Probably not. Except right at the DataSource, we don't have bytes, we
have elements, so we must encode them to get the bytes.
   
   Regarding 2: the implementation I suggested, with two separate paths for
when sampling is available versus not, avoids duplicate encodings, so it's
about as good as we're going to get. We also don't sample that frequently
(outside of the streaming case with tiny bundle sizes).
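
   For illustration only, here is a minimal sketch of the two-path idea,
not the code in this PR. Every name in it (`node`, `sampler`,
`countingWriter`, `encodeString`) is a hypothetical stand-in rather than a
type from the `exec` package. It assumes a sampled element needs both a size
estimate and, when a data sampler is present, a copy of its encoded bytes.
Either way the element is encoded exactly once.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// encodeString is a hypothetical element coder: it writes the raw string bytes.
func encodeString(elm string, w io.Writer) error {
	_, err := io.WriteString(w, elm)
	return err
}

// countingWriter discards what it is given and only records the byte count.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// sampler is a hypothetical sink for the encoded bytes of sampled elements.
type sampler struct{ samples [][]byte }

func (s *sampler) add(b []byte) { s.samples = append(s.samples, b) }

// node is a simplified stand-in for a PCollection node.
type node struct {
	sampler   *sampler // nil when data sampling is not available
	sizeSum   int64    // running total of sampled encoded sizes
	encodings int      // number of encoder calls, tracked for illustration
}

// sample records the encoded size of elm, encoding it exactly once on
// either path.
func (n *node) sample(elm string) error {
	n.encodings++
	if n.sampler == nil {
		// Path 1: no data sampler. Only the size matters, so the
		// bytes are counted and thrown away.
		var cw countingWriter
		if err := encodeString(elm, &cw); err != nil {
			return err
		}
		n.sizeSum += cw.n
		return nil
	}
	// Path 2: data sampler available. Encode into a buffer and reuse
	// the same bytes for both the size estimate and the sample.
	var buf bytes.Buffer
	if err := encodeString(elm, &buf); err != nil {
		return err
	}
	n.sizeSum += int64(buf.Len())
	n.sampler.add(buf.Bytes())
	return nil
}

func main() {
	plain := &node{}
	sampled := &node{sampler: &sampler{}}
	for _, n := range []*node{plain, sampled} {
		if err := n.sample("hello"); err != nil {
			panic(err)
		}
	}
	fmt.Println(plain.sizeSum, plain.encodings)
	fmt.Println(sampled.sizeSum, sampled.encodings, len(sampled.sampler.samples))
}
```

   In this sketch the buffer is local to each call, so the sampler can keep
the slice without copying it. A version that reuses a buffer would need to
copy before handing the bytes off.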



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
