Hello,

I have a pipeline built on Apache Beam 2.13.0 using Python 3.7.3.
The pipeline takes about 5 hours to ingest 2 sets of approximately 70,000
JSON objects using the DirectRunner.

I want to find out which transforms are taking the most time so that I can
improve the code for better performance. I found the module below for
profiling, but it does not seem to report the time spent in each transform.

https://beam.apache.org/releases/pydoc/2.13.0/apache_beam.utils.profiler.html

Is there a module I could use to measure how long each transform takes?
If not, I would appreciate any advice on how to measure the time spent in
each transform.
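
To make the question concrete, the kind of per-step timing I have in mind
looks roughly like the sketch below. It is plain Python and uses no Beam API.
The names timed, STEP_TIMINGS, report and parse_json are hypothetical and only
for illustration. I am assuming the wrapped function would be the one passed
to a transform such as a Map, and I have not checked how a module-level
accumulator like this behaves once the runner serializes the function.

```python
import json
import time
from collections import defaultdict
from functools import wraps

# Hypothetical accumulator: step name -> [call count, total seconds]
STEP_TIMINGS = defaultdict(lambda: [0, 0.0])


def timed(step_name):
    """Wrap a per-element function and accumulate its wall-clock time."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the call even if fn raised an exception.
                entry = STEP_TIMINGS[step_name]
                entry[0] += 1
                entry[1] += time.perf_counter() - start
        return wrapper
    return decorator


@timed("parse_json")  # hypothetical step name
def parse_json(line):
    """Example per-element function: decode one JSON string."""
    return json.loads(line)


def report():
    """Return (step, calls, total seconds) rows, slowest step first."""
    rows = [(name, calls, total) for name, (calls, total) in STEP_TIMINGS.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Calling report() after a run would list the steps ordered by total time,
which is the per-transform breakdown I am hoping an existing module can give.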

Best Regards,
Yu Watanabe

-- 
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
[email protected]
LinkedIn: https://www.linkedin.com/in/yuwatanabe1
Twitter: https://twitter.com/yuwtennis
