Abacn commented on issue #20137:
URL: https://github.com/apache/beam/issues/20137#issuecomment-1324331946

   Another duplicated blob-listing operation happens here
   
https://github.com/apache/beam/blob/9da27671cdc8b3df2c548d92a4b2e34f5e0aaa0f/sdks/python/apache_beam/io/filebasedsource.py#L144
   and
   
https://github.com/apache/beam/blob/9da27671cdc8b3df2c548d92a4b2e34f5e0aaa0f/sdks/python/apache_beam/io/filebasedsource.py#L202
   
   For FileBasedSource, get_range_tracker first calls _get_concat_source, which 
fetches the file list once. estimate_size then performs another fetch. (If 
validate is set to True, there is one more fetch on top of that.)
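
To make the repeated listing concrete, here is a minimal, self-contained sketch. It does not use Beam's actual code: `CountingLister`, `ToySource`, and `CachedToySource` are hypothetical names, and only the method names `get_range_tracker`, `_get_concat_source`, `estimate_size`, and the `validate` flag are borrowed from the comment above. It assumes each of those paths lists the file pattern on its own, and shows one possible way to reuse a single listing.

```python
class CountingLister:
    """Hypothetical stand-in for a remote file system; counts list calls."""

    def __init__(self, files):
        self._files = dict(files)  # path -> size in bytes
        self.calls = 0

    def match(self, pattern):
        # Each call models one (potentially expensive) blob listing.
        self.calls += 1
        prefix = pattern.rstrip("*")
        return [(p, s) for p, s in sorted(self._files.items())
                if p.startswith(prefix)]


class ToySource:
    """Toy source in which every code path lists the pattern again."""

    def __init__(self, lister, pattern, validate=True):
        self._lister = lister
        self._pattern = pattern
        if validate:
            self._validate()

    def _list(self):
        return self._lister.match(self._pattern)

    def _validate(self):
        if not self._list():
            raise IOError("No files found for " + self._pattern)

    def _get_concat_source(self):
        return [path for path, _ in self._list()]

    def get_range_tracker(self):
        # Toy "tracker": just the index range over the matched files.
        return (0, len(self._get_concat_source()))

    def estimate_size(self):
        return sum(size for _, size in self._list())


class CachedToySource(ToySource):
    """Same toy source, but the listing is fetched once and reused."""

    def _list(self):
        if not hasattr(self, "_cached"):
            self._cached = self._lister.match(self._pattern)
        return self._cached


files = {"gs://bucket/a.txt": 10, "gs://bucket/b.txt": 20}

naive_lister = CountingLister(files)
naive = ToySource(naive_lister, "gs://bucket/*", validate=True)
naive.get_range_tracker()
naive.estimate_size()
print(naive_lister.calls)   # 3: validate, get_range_tracker, estimate_size

cached_lister = CountingLister(files)
cached = CachedToySource(cached_lister, "gs://bucket/*", validate=True)
cached.get_range_tracker()
cached.estimate_size()
print(cached_lister.calls)  # 1: the first listing is reused
```

The trade-off in the cached variant is freshness: files that appear after the first listing are not seen, so whether this kind of reuse is acceptable depends on how the source is expected to behave.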


