TheNeuralBit commented on code in PR #22066: URL: https://github.com/apache/beam/pull/22066#discussion_r912143324
##########
sdks/python/apache_beam/typehints/row_type.py:
##########
@@ -17,19 +17,94 @@
 # pytype: skip-file

+from __future__ import annotations
+
+from typing import List
+from typing import Optional
+from typing import Sequence
+from typing import Tuple
+
 from apache_beam.typehints import typehints
+from apache_beam.typehints.native_type_compatibility import match_is_named_tuple
+
+# Name of the attribute added to user types (existing and generated) to store
+# the corresponding schema ID
+_BEAM_SCHEMA_ID = "_beam_schema_id"

 class RowTypeConstraint(typehints.TypeConstraint):
-  def __init__(self, fields):
-    self._fields = tuple(fields)
+  def __init__(self, fields: List[Tuple[str, type]], user_type=None):
+    """For internal use only, no backwards comatibility guaratees. See
+    https://beam.apache.org/documentation/programming-guide/#schemas-for-pl-types
+    for guidance on creating PCollections with inferred schemas.
+
+    Note RowTypeConstraint does not currently store functions for converting
+    to/from the user type. Currently we only support a few types that satisfy

Review Comment:
   Thanks for asking this. I originally wrote this docstring when I was adding support for dataclasses. I was trying to communicate that both dataclass instances and NamedTuple instances can be constructed (to) and consumed (from) in the same way. We assume that these will work elsewhere (e.g. in `RowCoder`), so I wanted to document the assumption here.

   Anyway that framing was confusing for this PR since I dropped the dataclass support, so I rephrased it. Hopefully that helps.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
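
To illustrate the assumption described in the review comment above, here is a minimal sketch of how NamedTuple and dataclass instances can be constructed and consumed in the same way. It is not Beam's implementation: `PointTuple`, `PointData`, `FIELD_NAMES`, `to_user_type`, and `from_user_type` are hypothetical names introduced only for this example, and the sketch assumes construction by keyword arguments and consumption by attribute access.

```python
import dataclasses
from typing import NamedTuple


class PointTuple(NamedTuple):
  x: int
  y: int


@dataclasses.dataclass
class PointData:
  x: int
  y: int


# Hypothetical field list, standing in for the names a schema would carry.
FIELD_NAMES = ("x", "y")


def to_user_type(user_type, values):
  """Construct an instance by passing field values as keyword arguments."""
  return user_type(**dict(zip(FIELD_NAMES, values)))


def from_user_type(instance):
  """Consume an instance by reading each field as an attribute."""
  return tuple(getattr(instance, name) for name in FIELD_NAMES)


# The same two helpers work unchanged for both kinds of user type.
for user_type in (PointTuple, PointData):
  instance = to_user_type(user_type, (1, 2))
  assert from_user_type(instance) == (1, 2)
```

Because both helpers rely only on keyword construction and attribute access, no per-type conversion functions need to be stored, which is the property the docstring was trying to document.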
