Abacn commented on code in PR #17604:
URL: https://github.com/apache/beam/pull/17604#discussion_r871464993
##########
sdks/python/apache_beam/io/fileio.py:
##########
@@ -835,10 +839,15 @@ def finish_bundle(self):
class _RemoveDuplicates(beam.DoFn):
-
+ """Internal DoFn that filters out filenames already seen (even though the file
+ has updated)."""
COUNT_STATE = CombiningValueStateSpec('count', combine_fn=sum)
- def process(self, element, count_state=beam.DoFn.StateParam(COUNT_STATE)):
+ def process(
+ self,
+ element: Tuple[str, filesystem.FileMetadata],
Review Comment:
I got these warnings:
```
WARNING:apache_beam.transforms.core:Key coder FastPrimitivesCoder for
transform <ParDo(PTransform) label=[RemoveAlreadyRead]> with stateful DoFn may
not be deterministic.
This may cause incorrect behavior for complex key types.
Consider adding an input type hint for this transform.
WARNING:apache_beam.transforms.core:Key coder FastPrimitivesCoder for
transform <ParDo(PTransform) label=[RemoveOldAlreadyRead]> with stateful DoFn
may not be deterministic.
This may cause incorrect behavior for complex key types.
Consider adding an input type hint for this transform.
```
I tried these type hints, but the warnings persist. This may be related to the
flaky tests. How can I add type hints properly to resolve these warnings?