TheNeuralBit commented on code in PR #23296:
URL: https://github.com/apache/beam/pull/23296#discussion_r975656793


##########
sdks/python/apache_beam/typehints/batch.py:
##########
@@ -35,6 +35,7 @@
 from typing import TypeVar
 
 import numpy as np
+import torch

Review Comment:
   Could you make this a separate module, following the pattern I just used for pandas: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/pandas_type_compatibility.py

   That way it can be kept separate from everything else here, and only loaded if `torch` is available.

##########
sdks/python/apache_beam/typehints/batch.py:
##########
@@ -35,6 +35,7 @@
 from typing import TypeVar
 
 import numpy as np
+import torch

Review Comment:
   One issue with that is I haven't found a good way to re-use the BatchConverterTest logic; for now it's just duplicated in `pandas_type_compatibility_test`.

##########
sdks/python/apache_beam/typehints/batch_test.py:
##########
@@ -54,6 +64,17 @@
     'element_typehint': str,
     'batch': ["foo" * (i % 5) + str(i) for i in range(1000)],
 },
+{
+    'batch_typehint': torch.Tensor,
+    'element_typehint': torch.Tensor,

Review Comment:
   Interesting. It could be problematic to allow this as an element type, though, since it's unclear what the data type is. For now, could we always represent scalars as a 0-dim PytorchTensor, i.e. `PytorchTensor[torch.int32, ()]`? There could also be a wrapper for this, like `PytorchScalar[torch.int32]`.
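
The first review comment asks for the torch-specific code to live in its own module, like `pandas_type_compatibility.py`, so the rest of the typehints package imports cleanly without torch installed. A minimal sketch of that guarded-import pattern is below; the name `make_torch_batch_converter` is illustrative, not Beam's actual API.

```python
# Sketch (hypothetical names): guard the optional torch dependency behind
# a conditional import, so this module can be skipped when torch is absent.
try:
  import torch
  TORCH_AVAILABLE = True
except ImportError:
  TORCH_AVAILABLE = False


def make_torch_batch_converter(typehint):
  # Hypothetical factory: only hand out torch-based converters when the
  # import above actually succeeded.
  if not TORCH_AVAILABLE:
    raise ValueError('torch is not installed; cannot handle %r' % (typehint,))
  return typehint  # placeholder for real converter construction
```

Callers (or a registry in `batch.py`) can then catch the `ValueError` and fall back to other converters, which is how the pandas module stays separable.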
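
The last comment's point is that a bare `torch.Tensor` element typehint loses the dtype, while a 0-dim tensor keeps it. Since torch is an optional dependency, the sketch below illustrates the idea with NumPy (already imported in this module); torch's 0-dim tensors behave analogously.

```python
import numpy as np

# A 0-dim array has an empty shape but still carries a concrete dtype,
# so a "scalar" element stays typed -- the property the reviewer wants
# from `PytorchTensor[torch.int32, ()]`.
scalar = np.array(3, dtype=np.int32)

print(scalar.shape)  # ()
print(scalar.ndim)   # 0
print(scalar.dtype)  # int32
```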
