On 12 Jul 2010, at 12:18 PM, Daniel Stenberg wrote:

On Thu, 8 Jul 2010, G Drukier wrote:

(Please don't reply to a subject as a shortcut to start a new thread, you'll end up as a reply in the existing thread in clients and web archives and more, and it is messy.)

My apologies. I've put this in a new thread, which I hope won't mess things up too much more as it's still the start of the discussion.


We're working on an application where we're taking a video feed off of an IP camera via HTTP or RTSP and then processing the video in memory without writing to disk. In order to maximize performance we'd prefer to not have to copy the incoming data from the libcurl internal buffer.

So, has there actually been any work in implementing the zero-copy interface?

No.

I've been poking around the code-base, but I'm not that familiar with it yet. Does anyone have any guidance they could give in implementing zero-copy?

My vision of a zero copy interface would be that you provide buffers in advance to libcurl, and as it goes about and stores data into the buffers it'll ask for more and use those accordingly.

That way, it would only be a matter of updating the main "buffer pointer" at the suitable place in the code to not point to the internal buffer all the time but instead point to the correct new one.

Possibly it could be made as another callback: getbuffer()

getbuffer() gets called when libcurl needs a new buffer, and the buffer you provide to libcurl with that callback must be able to hold at least CURL_MAX_WRITE_SIZE bytes. When libcurl calls the write callback, it will pass on a pointer to within that buffer and a length. Note that it MAY not point to the first byte of the passed-in buffer.
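To make the getbuffer() idea concrete, here is a rough sketch in C of what the application side might look like. None of this exists in libcurl: the getbuffer_cb type, the frame_pool struct and the fake_receive() stand-in are all invented for illustration, and SKETCH_MAX_WRITE_SIZE only mirrors CURL_MAX_WRITE_SIZE (16384 in curl/curl.h) so that the sketch compiles without libcurl. The write callback keeps the shape it already has today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors CURL_MAX_WRITE_SIZE (16384 in curl/curl.h); redefined here so the
   sketch builds without libcurl. */
#define SKETCH_MAX_WRITE_SIZE 16384
#define POOL_SLOTS 4

/* HYPOTHETICAL callback type, not part of libcurl. Returns a buffer of at
   least SKETCH_MAX_WRITE_SIZE bytes, or NULL if none is available. */
typedef void *(*getbuffer_cb)(void *userdata);

/* Same shape as the existing libcurl write callback. */
typedef size_t (*write_cb)(char *ptr, size_t size, size_t nmemb,
                           void *userdata);

/* Application-owned pool of receive buffers (hypothetical). */
struct frame_pool {
    char slots[POOL_SLOTS][SKETCH_MAX_WRITE_SIZE];
    int next;              /* index of the next unused slot */
    const char *last_ptr;  /* where the most recent chunk landed */
    size_t last_len;       /* length of the most recent chunk */
};

/* Hands the next unused slot to the library. */
static void *pool_getbuffer(void *userdata)
{
    struct frame_pool *pool = userdata;
    if (pool->next >= POOL_SLOTS)
        return NULL;
    return pool->slots[pool->next++];
}

/* Records where the data is; nothing is copied out of the buffer. */
static size_t pool_write(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct frame_pool *pool = userdata;
    pool->last_ptr = ptr;
    pool->last_len = size * nmemb;
    return size * nmemb;
}

/* Stand-in for the library's receive path: ask for a buffer, fill it at some
   offset, then call the write callback with a pointer INTO that buffer. */
static int fake_receive(getbuffer_cb getbuf, write_cb on_write, void *userdata,
                        const char *wire, size_t wire_len, size_t offset)
{
    char *buf = getbuf(userdata);
    if (!buf || offset + wire_len > SKETCH_MAX_WRITE_SIZE)
        return -1;
    memcpy(buf + offset, wire, wire_len);  /* stands in for recv() into buf */
    return on_write(buf + offset, 1, wire_len, userdata) == wire_len ? 0 : -1;
}

/* Usage: one chunk arrives and is delivered 8 bytes into the first slot. */
static int demo_zero_copy(void)
{
    static struct frame_pool pool;  /* static: zero-initialised, off the stack */
    if (fake_receive(pool_getbuffer, pool_write, &pool, "JPEG", 4, 8) != 0)
        return -1;
    if (pool.last_ptr != pool.slots[0] + 8 || pool.last_len != 4)
        return -1;
    return memcmp(pool.last_ptr, "JPEG", 4) == 0 ? 0 : -1;
}
```

The non-zero offset in the demo is there on purpose: it shows why the write callback has to use the pointer it is given rather than assume the data starts at the first byte of the buffer it handed out.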

What would you say about that concept?

The way I've done this in the past is to have the code acquiring the data, which in this case would be libcurl, assign an appropriate block of memory itself for each incoming piece of data and pass that on. The problem with that approach is knowing what allocator is being used so that the buffer can be subsequently deallocated properly. This problem is alleviated by your approach in which the user allocates and provides the buffer.

If I understand the rest of your proposal, the user would set the zero-copy option, and then, when libcurl hits the location where it needs to write, it calls the getbuffer callback to get the buffer. When it is done reading, it then calls the write callback. This seemed inefficient to me, until I thought about it further in the context of how libcurl works.

I would have suggested instead that the setopt mechanism be used to set the location of the buffer to write to. The buffer would be subject to the CURL_MAX_WRITE_SIZE minimum. Then the data gets read and the write callback gets called as usual. The user then does what he likes with the memory and, if desired, allocates new memory. The problem, as I then realized, is that the write callback doesn't have the handle, and so can't run setopt. Further, although I've only used the easy interface until now, I imagine that having only one buffer is going to cause a problem for multi.

So your proposal makes sense in that it minimizes the amount of modification needed, and leaves the memory allocation problems in the user's hands. It would require one or two new options: one to signify that zero-copy is to be used, and one to set the getbuffer callback. Alternatively, and more economically, setting the latter would imply the former; the default state would be a NULL callback, which bypasses the zero-copy code entirely.
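As a rough illustration of the single-option variant, where a NULL callback means the zero-copy path is skipped, the library-side choice of buffer might look something like the following C sketch. The transfer struct, its field names and pick_receive_buffer() are made up for this message and do not correspond to anything in the libcurl source. Falling back to the internal buffer when the callback returns NULL is also only an assumption; failing the transfer would be an equally plausible policy.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors CURL_MAX_WRITE_SIZE (16384 in curl/curl.h); redefined here so the
   sketch builds without libcurl. */
#define SKETCH_MAX_WRITE_SIZE 16384

/* HYPOTHETICAL callback type, not part of libcurl. */
typedef void *(*getbuffer_cb)(void *userdata);

/* HYPOTHETICAL slice of per-transfer state; field names are invented. */
struct transfer {
    char internal[SKETCH_MAX_WRITE_SIZE];  /* the usual internal buffer */
    getbuffer_cb getbuffer;                /* NULL by default: zero-copy off */
    void *getbuffer_data;                  /* passed back to the callback */
};

/* Decide where the next read should land. */
static char *pick_receive_buffer(struct transfer *t)
{
    if (t->getbuffer) {
        char *user = t->getbuffer(t->getbuffer_data);
        if (user)
            return user;
        /* ASSUMPTION: fall back to the internal buffer if the application
           has nothing to offer right now. */
    }
    return t->internal;
}

/* Trivial callback for the demo: hands back whatever was registered. */
static void *hand_out(void *userdata)
{
    return userdata;
}

/* Usage: unset callback keeps today's behaviour, set callback redirects. */
static int demo_bypass(void)
{
    static struct transfer t;  /* zero-initialised, so getbuffer == NULL */
    static char mine[SKETCH_MAX_WRITE_SIZE];

    if (pick_receive_buffer(&t) != t.internal)
        return -1;

    t.getbuffer = hand_out;
    t.getbuffer_data = mine;
    return pick_receive_buffer(&t) == mine ? 0 : -1;
}
```

Because each transfer carries its own callback and user pointer, this arrangement would not suffer from the single shared buffer problem mentioned above for the multi interface.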

A question though. If the buffer is larger than CURL_MAX_WRITE_SIZE, would the getbuffer callback need to notify libcurl that this is the case? Or is CURL_MAX_WRITE_SIZE a hard(ish) limit, and libcurl shouldn't be called upon to do more than this?

At the other limit, currently if the user wants a smaller buffer, and more frequent calls to the write callback, the CURLOPT_BUFFERSIZE option is available, but its satisfaction is not guaranteed. What I'm concerned about is excessive demands on memory in the zero-copy case where the incoming data chunks are small with respect to CURL_MAX_WRITE_SIZE.
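To put rough numbers on that concern, here is a small C calculation. It assumes the worst case, which the proposal does not actually settle: that every incoming chunk ends up in its own full-size buffer, and that the application holds on to those buffers for a while. The 1448-byte chunk is just an illustrative figure of about one TCP segment's worth of payload, not a measurement, and SKETCH_MAX_WRITE_SIZE mirrors CURL_MAX_WRITE_SIZE (16384 in curl/curl.h).

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors CURL_MAX_WRITE_SIZE (16384 in curl/curl.h); redefined here so the
   sketch builds without libcurl. */
#define SKETCH_MAX_WRITE_SIZE 16384

/* Bytes reserved if each held chunk occupies its own full-size buffer
   (ASSUMPTION: one buffer per chunk, the worst case). */
static size_t reserved_bytes(size_t chunks_held)
{
    return chunks_held * SKETCH_MAX_WRITE_SIZE;
}

/* Bytes that actually carry data for the same number of chunks. */
static size_t payload_bytes(size_t chunks_held, size_t chunk_len)
{
    return chunks_held * chunk_len;
}

/* Share of each buffer that carries data, as a whole percentage
   (rounded down). */
static unsigned fill_percent(size_t chunk_len)
{
    return (unsigned)(chunk_len * 100 / SKETCH_MAX_WRITE_SIZE);
}

/* Usage: 100 held chunks of 1448 bytes each (illustrative figures). */
static int demo_overhead(void)
{
    if (reserved_bytes(100) != 1638400)       /* 100 * 16384 */
        return -1;
    if (payload_bytes(100, 1448) != 144800)   /* 100 * 1448 */
        return -1;
    return fill_percent(1448) == 8 ? 0 : -1;  /* 144800 / 16384, floored */
}
```

Under those assumptions roughly 1.6 MB would be reserved to hold about 145 kB of data. If libcurl were instead able to keep filling the same buffer across several reads, the overhead would be much smaller.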

Gordon
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
