Hi,

>> 1) UB from the libev point of view, or
>> 2) works with certain versions of libev on certain OSes, but may break
>> without warning, or
>> 3) is fully supported and is a part of the public API contract.
>
> It's certainly 1 or 2.

I am glad this has now been stated clearly. Remember the discussion a
while ago, when a patch was submitted to work around the situation where
libev runs out of file descriptors? Back then you suggested a long jump,
but today you admit it is unreliable.

> Even if it were supported under some circumstances, I'd say whatever thing
> you are trying to do is illdesigned - you'd have to wrap every call, and
> libev is not the only source of errors - if libev runs out of memory for
> example, then the program might crash at any time (e.g. because of kernel
> OOM, or because it needs more stack space or…).

Let’s talk about memory. Indeed, recovering from a stack overflow is
troublesome at best. However, with careful programming it is possible to
avoid this kind of error. It has become customary in some projects to
omit checks for allocation failures. The rationale is that there is
typically plenty of RAM available, and that with kernel overcommit an
allocation failure is deferred until the supposedly allocated memory is
actually accessed, at which point the OOM killer kicks in. This is a
reasonable programming model, though not a universal one. You surely
won’t assume that an embedded system has plenty of RAM. It is also
possible to turn overcommit off.

Let’s consider networking or the file system next. It is commonly
understood that errors will inevitably happen and must be handled.
Luckily, it is not so hard to handle a socket or file I/O error.

The case of descriptors running out falls somewhere in between. If I
were to consider adding resilience to fd shortage in a library like
libev, I would ask the following questions:

A) How common is such a situation?
B) Is it possible to avoid it by e.g. carefully controlling the resource
   usage?
C) Does a reasonable error recovery strategy exist, and how hard is it
   to implement?

My answers:

A) Quite common. The default limit on Linux is quite low (~1000) and can
be lowered.
A program that makes heavy use of network connections is rather likely
to hit the limit. Besides, this is exactly the kind of program that
benefits most from the performance libev offers.

B) Not quite. Controlling fd usage manually is hard because many
existing libraries consume descriptors internally. Besides, it is
complex and feels awkward.

C) It depends, but generally yes, especially in a network server that is
already prepared to handle connection errors.

> If you want to catch errors and do something sensible, the libev
> errorhandler is not your solution. In fact, no in-process solution exists
> - you should have a watchdog that does the right thing in fatal situations
> such as this.
>
> Just saying... all this sounds like some inane customer feature request
> form a customer who doesn't know what he is doing and wants to go headlong
> through the nearest wall.

Maybe I am missing something, and a piece of advice would be greatly
appreciated. We are spawning a new thread with a dedicated event loop.
Sometimes this fails due to fd shortage when we set up the ev_async
watcher used for communication with the thread. We would like to shut
the thread down cleanly and deliver an error to the procedure waiting
for the thread’s completion. That procedure is fully prepared to handle
the error (maybe it attempts to restart processing without the dedicated
thread, or it just throws its hands in the air).

Regards,
Nick

_______________________________________________
libev mailing list
[email protected]
http://lists.schmorp.de/mailman/listinfo/libev
