claudevdm commented on code in PR #38108:
URL: https://github.com/apache/beam/pull/38108#discussion_r3106214201
##########
sdks/python/apache_beam/typehints/row_type_test.py:
##########
@@ -33,17 +33,18 @@ class RowTypeTest(unittest.TestCase):
@staticmethod
def _check_key_type_and_count(x) -> int:
key_type = type(x[0])
- if not row_type._user_type_is_generated(key_type):
- raise RuntimeError("Expect type after GBK to be generated user type")
+ if row_type._user_type_is_generated(key_type):
+ raise RuntimeError("Type after GBK not preserved, get generated type")
+ if not hasattr(key_type, row_type._BEAM_SCHEMA_ID):
+ raise RuntimeError("Type after GBK missing Beam schema ID")
return len(x[1])
def test_group_by_key_namedtuple(self):
MyNamedTuple = typing.NamedTuple(
"MyNamedTuple", [("id", int), ("name", str)])
- beam.coders.typecoders.registry.register_coder(
- MyNamedTuple, beam.coders.RowCoder)
+ beam.coders.typecoders.registry.register_row(MyNamedTuple)
Review Comment:
I am not sure the divegence between `
beam.coders.typecoders.registry.register_coder` and
`beam.coders.typecoders.registry.register_row` is a good thing. Most users
probably want the benefit of this PR but rely on `register_coder`.
Can we either
1. make register_coder act the same as register_row OR
2. Deprecate register_coder(T, RowCoder) with a warning. DeprecationWarning:
use register_row(T) instead when the coder class is RowCoder
##########
sdks/python/apache_beam/coders/typecoders.py:
##########
@@ -133,6 +134,21 @@ def register_coder(
self._register_coder_internal(
self._normalize_typehint_type(typehint_type), typehint_coder_class)
+ def register_row(self, typehint_type: Any) -> None:
Review Comment:
What do you think of adding a decorator like @beam.row for conveinence?
Something like
```
@beam.row
@dataclass
class Order:
id: int
amount: float
def row(cls):
cls._beam_row_pending = True
return cls
```
Then when registering coders or RowTypeConstraint.from_user_type check for
this attribute?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]