Hello. I have a pipeline built on Apache Beam 2.13.0 with Python 3.7.3. The pipeline takes about 5 hours to ingest 2 sets of approximately 70,000 JSON objects using the DirectRunner.
I want to find out which transforms are taking the most time so that I can improve the code's performance. I found the module below for profiling, but it does not seem to report the time spent in each transform.

https://beam.apache.org/releases/pydoc/2.13.0/apache_beam.utils.profiler.html

Is there a module I could use to measure the speed of each transform? If not, I would appreciate some help with how to measure it.

Best Regards,
Yu Watanabe

--
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
[email protected]
LinkedIn: <https://www.linkedin.com/in/yuwatanabe1>
Twitter: <https://twitter.com/yuwtennis>
