Hey, responses inline!
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, 21 August 2020 22:53, Sander Striker <[email protected]> wrote:

> +1 on retiring WSL1 runners. Do you anticipate any unique behavior in
> WSL2 that we should take into account?

The only unique behavior I can see is if users keep their BuildStream
elements / cache on the Windows filesystem. However:

1) The sharing is done through the Samba protocol, so we could test for
   network shares if we really needed to.
2) This setup is discouraged by Microsoft; their stance is that you should
   keep the data on your WSL filesystem.

So all in all I think we do not need to cater for this specific case.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, 21 August 2020 23:57, Chandan Singh <[email protected]> wrote:

> Hi Ben,
>
> I'll let you decide what to do with WSL as I don't have enough context
> there :) But the plan generally seems good to me.

Seems good!

> > I believe that this is not enough of a reason to keep WSL1 tests, and
> > that, when we have moved to GitHub Actions, we should be able to have
> > Mac tests instead, which would bring better value.
>
> This is a bit of a sidetrack, but I tried quickly adding a Mac test
> environment using Actions and the GitHub-provided runners. Here are my
> observations.
>
> Python 3.8 (which is the default now) uses `spawn` as the default
> multiprocessing method. This wreaks havoc with our testsuite and
> everything hangs. An example of this behavior can be seen in this job
> that ran for 50 mins without doing anything:
> https://github.com/cs-shadow/buildstream/runs/1014073454.
>
> Although `fork` is technically considered unsafe on macOS (and
> Windows), it does work for the most part. At least more so than
> `spawn`. So, I wonder if we should force the multiprocessing method to
> `fork` in BuildStream?
>
> Things look better on Python 3.7. Here, most tests pass but about 10%
> of the tests fail where we fail to fork correctly. Here is an example
> of such a job: https://github.com/cs-shadow/buildstream/runs/1014350157.
> The failures look something like:
>
> BuildStream exited with code -1 for invocation:
> Program stderr was:
> [--:--:--][ ][ main:core activity ] START Build
> [--:--:--][ ][ main:core activity ] START Loading elements
> objc[4443]: +[__NSCFConstantString initialize] may have been in
> progress in another thread when fork() was called.
> objc[4443]: +[__NSCFConstantString initialize] may have been in
> progress in another thread when fork() was called. We cannot safely
> call it or ignore it in the fork() child process. Crashing instead.
> Set a breakpoint on objc_initializeAfterForkError to debug.
> BUG: Message handling out of sync, unable to retrieve failure message
> for element base.bst
> [--:--:--][ecf8572c][ main:base.bst ] ERROR Internal job process
> unexpectedly died with exit code -6
>
> Would any of our resident multiprocessing experts have any thoughts on
> how to handle this correctly?

I think the most correct solution would be to get rid of our multiprocessed
scheduler, which is something I have been trying to make work for more than
two months now. I'd advocate for postponing the `fork` decision a bit and
seeing how far a non-multiprocessed solution can go; if that works out, the
problem should simply go away. My branch with my current work is at [0].

Cheers!
Ben

[0]: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1982
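
On point 1 above, if we ever did want to detect elements or a cache
directory sitting on a Windows drive, a rough sketch of such a check
follows. This is purely illustrative: `is_on_windows_mount` is a
hypothetical helper, not a BuildStream API, and it assumes Windows drives
show up in /proc/mounts with the `drvfs` (WSL1) or `9p` (WSL2) filesystem
type.

    import os

    def is_on_windows_mount(path):
        """Best-effort check: is `path` on a Windows drive mounted into WSL?"""
        path = os.path.realpath(path)
        longest_mount, fstype = "", ""
        try:
            with open("/proc/mounts", "r", encoding="utf-8") as mounts:
                for line in mounts:
                    fields = line.split()
                    if len(fields) < 3:
                        continue
                    mount_point, mount_fstype = fields[1], fields[2]
                    prefix = mount_point.rstrip("/") + "/"
                    inside = path == mount_point or path.startswith(prefix)
                    # Keep the most specific mount point containing the path.
                    if inside and len(mount_point) > len(longest_mount):
                        longest_mount, fstype = mount_point, mount_fstype
        except OSError:
            # Not a Linux-style system, or /proc is unavailable.
            return False
        return fstype in ("drvfs", "9p")

Something along these lines could gate a warning when the project directory
or cache root lives on such a mount, if we ever decide it is worth handling.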

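On the `spawn` vs `fork` question above, forcing the start method would be a
small change; a minimal sketch follows, with the caveat that `get_job_context`
is an invented name and only `multiprocessing.set_start_method` /
`multiprocessing.get_context` are real standard-library calls. For the objc
crash in the quoted log, one commonly cited workaround is exporting
OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES in the runner's environment before
Python starts, though whether that is acceptable for CI is a separate
question.

    import multiprocessing
    import sys

    def get_job_context():
        """Return a multiprocessing context that uses fork()."""
        if sys.platform != "win32":
            try:
                # The global start method can only be set once per process;
                # if something already fixed it, fall back to a local context.
                multiprocessing.set_start_method("fork")
            except RuntimeError:
                pass
        # A "fork" context is independent of the global default; on Windows
        # this raises ValueError because fork() is unavailable there.
        return multiprocessing.get_context("fork")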