Hi,

On Tuesday 04 October 2011, Simon Urbanek wrote:
> I don't see why this should be anything new - this is already happening
> since both packages that were folded into parallel (snow and multicore)
> are well known and well used.
>
> In multicore we were explicitly warning about this and also working around
> issues where possible (e.g. the Mac GUI, for example). Judging by the
> widespread use of multicore and the absence of problem reports related to
> GUIs, my impression would be that this aspect is not really a problem
> (more below). We get more users confused about the inability to perform
> side-effects than this, for example.

Well, some users do heed the advice to address their problem reports to the
package / GUI maintainers, especially if they find that the problem only
occurs with the GUI loaded, not in a "plain" R session. We have had a problem
report about using mclapply() in the RKWard bug tracker for a while already.

> In general, there are two main issues that can be addressed by the GUI:
>
> a) shared file descriptors. This is a problem if the GUI uses FDs for
> communication and they are not closed in the child instance. You don't
> want both the child and the parent to process those FDs. E.g., closeAll()
> can be used to work around that issue and with parallel there could be an
> easier interface for this given that it's in core R.
>
> b) event loop. If the GUI hooks into the event loop then, obviously, this
> is only intended to be run from the master. multicore was already
> disabling the event loop hook for AQUA, but it was hard to provide a more
> comprehensive solution since it needed cooperation of R. In parallel it's
> much easier, because it can modify R to allow the event loop conditionally
> and thus only in the master process.

For me, the problem set was: multiple threads plus mutexes; linking to a
library that installs a SIGCHLD handler; and code waiting for the
"communicator" thread to negotiate something with the frontend, except that
this thread doesn't exist in the fork()ed child process...

After spending the day debugging, I think I have finally solved the key
issues for RKWard. That also means the issue is mostly painless for me now.
However, addressing fork()-related issues is not always a trivial exercise,
and I continue to think that it could be useful for maintainers of
"problematic" packages to have a way to stop or warn users who run mcfork(),
whether directly or indirectly.

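To illustrate the "thread doesn't exist in the fork()ed child" part: the following is a minimal sketch in Python, not RKWard or R code. It assumes a Unix system where `os.fork()` is available; the "communicator" thread here is only a stand-in that idles, and the function name `thread_counts_across_fork` is made up for this example. It counts the live threads before the fork and again inside the child.

```python
import os
import threading
import warnings


def thread_counts_across_fork():
    """Return (threads in parent, threads in child) around an os.fork() call."""
    stop = threading.Event()
    # Stand-in for a GUI "communicator" thread; it just idles until told to stop.
    worker = threading.Thread(target=stop.wait, name="communicator", daemon=True)
    worker.start()

    read_fd, write_fd = os.pipe()
    parent_count = threading.active_count()

    with warnings.catch_warnings():
        # Newer Python versions warn about fork() in a multi-threaded process;
        # that warning is silenced here because it is the point of the demo.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()

    if pid == 0:
        # Child: only the thread that called fork() was carried over.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os.close(write_fd)
        os._exit(0)  # leave without running handlers inherited from the parent

    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        child_count = int(pipe.read())
    os.waitpid(pid, 0)

    stop.set()
    worker.join()
    return parent_count, child_count


if __name__ == "__main__":
    parent, child = thread_counts_across_fork()
    print(f"threads in parent: {parent}, threads in child: {child}")
```

Any code in the child that waits for the missing thread to answer would block forever, which is the kind of hang described above.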
> The whole point of parallel is that it can do more than an external
> package, so I think you're going about it the wrong way - you should be
> talking to us much earlier so whatever your constraints in RKWard can be
> possibly addressed by the infrastructure. Also note that a lot of this
> should be seamless, a lot of users don't care what the infrastructure is,
> they just want their task to run in parallel, they don't care about
> mcfork() and the like - the choices will be made for them, because there
> is no fork on Windows, for example.

Exactly. I want the choice to be made for the user, where reasonably
possible. My point is that, in this case, knowing whether you're on Windows
or a Unix is not enough to decide which technique to use. Reliably
enumerating all corner cases where forking could be a problem on Unix is
probably next to impossible. The developers responsible for those corner
cases have a decent chance of being aware of the problem, though. And thus I
think it would be a good idea if they had a standard way of informing
library(parallel), and any third party using library(parallel), that there
is a problem with forking.

Regards
Thomas
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel