I'm looking for advice. First, a short description of the situation: a main control program forks off a large number of children (which may have children of their own, and so on). Each of these descendants creates a data file and then reports back to the control process, which is responsible for uploading all the data files to a server using libcurl.

Only the controller process uses libcurl. Subprocesses communicate with the controller via sockets but these do not involve curl, just regular connect/send/close system calls. These "incoming" sockets are multiplexed with a select loop. The controller uses the synchronous easy API to upload files, and this creates a choke point since files can come in asynchronously faster than they can go out synchronously. I'm trying to fix that bottleneck, and as far as I can tell I have three options:

1. Fork a new process to do each upload, still using the easy API. This is kind of heavyweight but could work. The problem is that I use SIGCHLD to keep track of child processes, and the extra SIGCHLDs from upload processes would confuse that bookkeeping. This could probably be worked around, at the cost of making some already complex code painfully complex.

2. Use threads with easy handles. As I understand it, each thread would need a dedicated curl handle and I'd need to maintain a pool of worker threads. I have little experience with threaded programming, so I can't judge how well this would work or how hard it would be.

3. Use the multi API. I'm leaning this way because it seems the most "curl-ish" solution. The problem I fear here is that I already have a select loop with a lot of file descriptors in play for the incoming data. The idea of managing two select loops in parallel feels painfully tricky. I'm not sure if it's possible to have just one loop and distinguish between 'input' and 'output' sockets.
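
To make the workaround in option 1 concrete, here is a rough, Unix-only sketch of what I have in mind: record the pid of every upload process in a table, and when reaping children check each pid against that table so that upload exits are not counted as worker exits. The names (upload_pids, remember_upload, forget_upload, reap_children) and the fixed table size are made up for illustration, and I'm assuming the SIGCHLD handler only sets a flag while the main loop does the actual reaping.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_UPLOADS 64                  /* arbitrary size for this sketch */

static pid_t upload_pids[MAX_UPLOADS];  /* 0 marks a free slot */

/* Call in the parent right after fork()ing an upload process.
   Returns 0 on success, -1 if the table is full. */
static int remember_upload(pid_t pid)
{
    for (int i = 0; i < MAX_UPLOADS; i++) {
        if (upload_pids[i] == 0) {
            upload_pids[i] = pid;
            return 0;
        }
    }
    return -1;
}

/* Returns 1 and frees the slot if pid was an upload process, else 0. */
static int forget_upload(pid_t pid)
{
    if (pid <= 0)
        return 0;
    for (int i = 0; i < MAX_UPLOADS; i++) {
        if (upload_pids[i] == pid) {
            upload_pids[i] = 0;
            return 1;
        }
    }
    return 0;
}

/* Reap every child that has already exited, without blocking, and
   count upload processes separately from ordinary worker children. */
static void reap_children(int *workers_done, int *uploads_done)
{
    int status;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        if (forget_upload(pid))
            (*uploads_done)++;   /* upload finished: skip worker bookkeeping */
        else
            (*workers_done)++;   /* existing child bookkeeping would go here */
    }
}
```

Because reap_children() runs from the main loop rather than from the signal handler, an upload child cannot be reaped before remember_upload() has recorded it, provided the call follows fork() directly.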

An additional point is that this must work on both Unix and Windows. Solutions #1 (process creation) and #2 (thread creation) would have to be implemented differently for each, so that's another argument for the multi API.

Does anyone have a happy experience to report with any of these methods, or preferably even sample code? I'd be especially grateful for guidance on #3, using the multi API in the presence of an existing select loop.

Thanks,
MB
