I have created this gist [1] which shows a simple reproducer and also provides coverage data when using prefork and solo pool cls.
That shows that when using prefork the coverage data is simply not captured. [1] https://gist.github.com/elyezer/2d8225843451364cbb19af08f4f52de3 On Tue, Aug 30, 2016 at 4:03 PM, Brian Bouterse <[email protected]> wrote: > Using the solo thread pool is good to get a one-off coverage report. The > one you linked to [3] can be used to talk about the state of pulp smash at > that point in time. I spot checked the report and the areas that I knew > were incorrectly reported as uncovered before are shown as covered in [3] > so that is great. > > For a permanent fix to the coverage problem, I recommend not putting the > workaround in place on Jenkins. The test environment needs to run exactly > the same as the production infrastructure. As one example, there could be > meaningful problems that occur in forking celery environments which would > not be discovered in the test environment which would not fork. > > In talking with @asksol he indicates that this is not a known issue in > celery. Can an issue with a simple reproducer be filed against celery and > we can pick up the technical conversation on that bug? Maybe reply on this > thread with a link? > > Thank you for all the work that you've done on this issue. This is not an > easy problem. I really appreciate your effort here. > > -Brian > > > > On 08/29/2016 08:40 PM, Elyezer Rezende wrote: > >> All, >> >> During this month of August I've been working on making the coverage >> report we have for Pulp Automation better. I got stuck many times during >> the process but I've finally found something that helped improving it. >> >> The current coverage report is not reliable since any of the Celery >> tasks have its coverage data being captured. Because that we had some >> guesses on what could be causing that, we suspected that for some reason >> the coverage hook was not being installed and so the data not being >> captured, also we suspected that coverage was not playing good on how >> Celery forks its processes. >> >> With those two guesses in mind, I've read Celery and Coverage source >> code and also got some help from the community. This allowed me to >> discover that Celery uses a forked multiprocessing library called >> Billiard [1], knowing that then I started checking if coverage.py was >> able to capture coverage data multiprocessing processes [2] and created >> a sample Django + Celery application to test some assumptions. The first >> one I tested was if coverage.py was really able to capture coverage data >> from decorated functions (since celery tasks are decorated functions) >> ran by the celery workers, the first try failed since I had to make a >> call from a celery task and celery tasks was not being captured at all >> which made also the simple decorated function data not be captured. So >> my first assumption have failed and I got stuck one more time on the >> coverage data not being captured issue. Continuing the research I found >> the POOL_CLS configuration for the celery workers, that allowed me to >> change how celery deals with concurrency and it happened that using >> "solo" or "threads" the coverage data was finally captured. >> >> Next I checked how coverage.py deals with the multiprocessing built-in >> library and discovered that it does a patch [3]. That made me realize >> that maybe something similar for the Billiard library will be necessary. >> Since just changing the POOL_CLS config works and is easier than >> creating a patch for the Billiard plugin to capture coverage data I >> dropped this and will proceed only if necessary. >> >> I checked Pulp code and updated the services that spawn celery processes >> to use threads as the POOL_CLS (this also required to install the >> threadpool Python package), then I tried to run the Pulp Smash >> test_sync_publish for RPM over API, and it was passing before the change >> and started failing due to celery workers not being able to proper >> handle the process' signals since they are only received by the parent >> process. I did not investigate more about why it was failing and gave >> "solo" a try, it happened that it worked and I was able to proper >> capture coverage data for the pulp.server.async.tasks module which was >> not being captured before, finally! >> >> I went ahead and ran all the Pulp Smash tests which produced the >> coverage report [4], that has more captured data from the one available >> on Jenkins. Now I need your help on identifying if there is any missing >> data to be captured still. Please let me know if you think any module >> should have more coverage data than presented by the updated report. >> >> Finally I am about to send a PR to the pulp repository in order to >> update the coverage hook with the changes needed in order to set the >> POOL_CLS to solo. The pulp worker template [5] and the >> pulp_resource_manager.service [6] need to be patched, again, please let >> me know if I am missing something and should patch any other file. >> >> PS.: I would like to thank all the help I received from Pulp developers >> with all my newbie questions about Pulp design. A special thanks to Sean >> who created the first version of the coverage data and provided some >> initial input about what was missing, Brian who spent about an hour with >> me explaining how Pulp workers and resource manager work and also to >> provide some insights about what could be the cause for the coverage >> data not being captured and Jeremy who listened me all the times I was >> frustrated by being stuck. I hope this improved coverage report will >> bring us value and the information we need in order to ensure Pulp is >> well tested by Pulp Smash. >> >> [1] https://github.com/celery/billiard >> [2] http://coverage.readthedocs.io/en/coverage-4.2/subprocess.html >> [3] >> https://bitbucket.org/ned/coveragepy/src/257e52793fb0f28853d >> dca679f67b158107262bf/coverage/multiproc.py >> [4] https://dl.dropboxusercontent.com/u/109792/pulp-2-11-coverag >> e-report-20160829/index.html >> [5] https://github.com/pulp/pulp/blob/master/server/pulp/server/ >> async/manage_workers.py#L23-L25 >> [6] https://github.com/pulp/pulp/blob/master/server/usr/lib/syst >> emd/system/pulp_resource_manager.service#L10-L12 >> >> -- >> Elyézer Rezende >> Senior Quality Engineer >> irc: elyezer >> >> >> _______________________________________________ >> Pulp-dev mailing list >> [email protected] >> https://www.redhat.com/mailman/listinfo/pulp-dev >> >> > _______________________________________________ > Pulp-dev mailing list > [email protected] > https://www.redhat.com/mailman/listinfo/pulp-dev > -- Elyézer Rezende Senior Quality Engineer irc: elyezer
_______________________________________________ Pulp-dev mailing list [email protected] https://www.redhat.com/mailman/listinfo/pulp-dev
