lostluck commented on a change in pull request #12426:
URL: https://github.com/apache/beam/pull/12426#discussion_r465189010
##########
File path:
model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml
##########
@@ -384,3 +384,31 @@ nested: false
examples:
"\x02\x01\x02\x01": {f_bool: True, f_bytes: null}
"\x02\x00\x00\x04ab\x00c": {f_bool: False, f_bytes: "ab\0c"}
+
+---
+
+# Binary data generated with the python SDK:
+#
+# import typing
+# import apache_beam as beam
+# class Test(typing.NamedTuple):
+#   f_map: typing.Mapping[str,int]
+# schema = beam.typehints.schemas.named_tuple_to_schema(Test)
+# coder = beam.coders.row_coder.RowCoder(schema)
+# print("payload = %s" % schema.SerializeToString())
+# examples = (Test(f_map={}),
+#             Test(f_map={"foo": 9001, "bar": 9223372036854775807}),
+#             Test(f_map={"everything": None, "is": None, "null!": None, "¯\_(ツ)_/¯": None}))
+# for example in examples:
+#   print("example = %s" % coder.encode(example))
+coder:
+ urn: "beam:coder:row:v1"
+ # f_map: map<str, nullable int64>
+ payload: "\n\x15\n\x05f_map\x1a\x0c*\n\n\x02\x10\x07\x12\x04\x08\x01\x10\x04\x12$d8c8f969-14e6-457f-a8b5-62a1aec7f1cd"
+ # map ordering is non-deterministic
+ non_deterministic: True
+nested: false
Review comment:
As it stands, this is confusing for SDK authors writing tests against
standard_coders.yaml. Now that I have the Go tests written, I need to explicitly
ignore the nested field for the row coders, because they're all set to
nested: false rather than nested: true.
This is per my thread on the dev list:
https://lists.apache.org/thread.html/r7da098363e6ce607ce96f9fbedb08f9f4757bedd68846aaeba5dd4f0%40%3Cdev.beam.apache.org%3E
Portability only ever supports nested coders. The semantics of
standard_coders.yaml say that
```
# nested: a boolean meaning whether the coder was used in the nested context. Missing means to
#   test both contexts, a shorthand for when the coder is invariant across context.
```
https://github.com/apache/beam/blob/587dde57cbb2b0095a1fa04b59798d1b62c66f18/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L24
That is, nested: false means that the outermost encoding has the length
prefix if necessary.
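To make the quoted rule concrete, here is a minimal sketch of how a test harness might read the `nested` field of one standard_coders.yaml entry. The function name `contexts_to_test` and the dict-shaped `spec` are hypothetical, chosen only for illustration; this is not taken from any SDK's actual test code.

```python
from typing import Any, Dict, List


def contexts_to_test(spec: Dict[str, Any]) -> List[bool]:
    """Return the nested-context flags to exercise for one YAML entry.

    Hypothetical helper: `spec` is assumed to be the parsed YAML document
    for a single coder entry. A missing `nested` key means "test both
    contexts", per the comment at the top of standard_coders.yaml.
    """
    nested = spec.get("nested")
    if nested is None:
        # Shorthand for a coder that is invariant across contexts.
        return [False, True]
    return [bool(nested)]
```

Under this reading, an entry with `nested: false` is only ever checked in the outer context, which is why a harness that only supports the nested context has to special-case it.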
Structurally, there's never a reason for a single schema value to have the
wrapped length prefix (it's orthogonal to this aspect of the encoding, as any
sub-component is always nested as needed), so it's not included in the various
payload examples.
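For reference, a minimal sketch of the length prefix that a nested sub-component carries, assuming the usual base-128 varint length encoding. The helper names `encode_varint` and `length_prefix` are hypothetical and not part of any SDK.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            # More bytes follow: set the continuation bit.
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)


def length_prefix(payload: bytes) -> bytes:
    """Prepend the varint-encoded length, as a nested bytes value would have."""
    return encode_varint(len(payload)) + payload
```

The f_bytes field in the existing row example is nested in exactly this way: `length_prefix(b"ab\x00c")` gives `b"\x04ab\x00c"`, the tail of the `"\x02\x00\x00\x04ab\x00c"` example, while the row value as a whole carries no such prefix.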
So, I reiterate: why is this nested: false, instead of nested: true, if the
encoding is going to be identical in both contexts?