On 2019-04-15 at 22:31+02:00, Daniel Stenberg wrote:
On Mon, 15 Apr 2019, Nicolas Roeser via curl-library wrote:

My problem is that I do not know where the boundary between header and body is if the download has been aborted. To make things worse, I have the feeling that it may be difficult to properly detect.

I read your email several times and I can't figure out *why* you need to detect that boundary yourself. Why can't you use the different callbacks for header and body as then you can simply lean on libcurl's detection that it always does?

What I wanted to do at first was to enable CURLOPT_HEADER and to stuff all data received by the write callback into one buffer. After the transmission was complete, I planned to split that buffer into header and body. I wanted to do that mainly because the existing code which I am working on did it this way.

But after a _lot_ of thinking and some experiments, I see that your suggestion is *much* better. In the header callback, I will save the headers which may be needed after completion of the transmission. And I will disable CURLOPT_HEADER. Then the write callback will receive only the body of the final response, which is fine.
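
A minimal sketch of that split, for illustration only. The two callbacks use the signatures libcurl expects for CURLOPT_HEADERFUNCTION and CURLOPT_WRITEFUNCTION; struct membuf and membuf_append are hypothetical helpers, not part of libcurl, and the sketch deliberately avoids including curl.h so it stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not part of libcurl. */
struct membuf {
    char  *data;
    size_t len;
};

/* Append n bytes and keep the buffer NUL-terminated.
 * Returns 0 on success, -1 on allocation failure. */
static int membuf_append(struct membuf *b, const char *src, size_t n)
{
    char *p = realloc(b->data, b->len + n + 1);
    if (!p)
        return -1;
    if (n)
        memcpy(p + b->len, src, n);
    b->data = p;
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Shape of a CURLOPT_WRITEFUNCTION callback. With CURLOPT_HEADER
 * disabled, only body bytes are delivered here. */
static size_t body_cb(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    size_t n = size * nmemb;
    assert(userdata != NULL);
    /* Returning a count different from n tells libcurl to abort. */
    return membuf_append(userdata, ptr, n) == 0 ? n : 0;
}

/* Shape of a CURLOPT_HEADERFUNCTION callback; libcurl calls it once
 * per complete header line. */
static size_t header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    size_t n = size * nitems;
    assert(userdata != NULL);
    return membuf_append(userdata, buffer, n) == 0 ? n : 0;
}
```

In a real program these would be registered with curl_easy_setopt() using CURLOPT_HEADERFUNCTION/CURLOPT_HEADERDATA and CURLOPT_WRITEFUNCTION/CURLOPT_WRITEDATA, each with its own struct membuf, and CURLOPT_HEADER would be left at 0L.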


I would like to clear the receive buffer each time the client starts reading a new resource.

And that is not before you invoke curl? When you ask libcurl to follow a redirect, the only body that is sent to the write callback is that of the URL that isn't itself a redirect.

Ahh, many thanks for clearing this up! I had not understood that because I had been looking at the number of downloaded octets reported by the progress callback. This number is always 0 while headers are processed (which is OK). When a redirecting resource is read, the callback may report a higher number (the size of the body of the redirecting document, even though this is not sent to the write callback). And when the redirection is followed and processing of the headers of the target resource starts, the number drops to 0 again.

I had been confused because I had assumed that the number would be monotonically increasing, and would report the number of octets processed by the write callback (more or less).
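
To make the observed behaviour concrete, here is a small sketch of a tracker that notices when the downloaded-octets counter falls back, which under the behaviour described above would indicate that libcurl has moved on to the next response in a redirect chain. struct dl_tracker and dl_tracker_update are hypothetical names; the idea is to call the update function from the progress callback with its "downloaded so far" argument. long long stands in for libcurl's curl_off_t so the sketch compiles without curl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker; not part of libcurl. */
struct dl_tracker {
    long long last_dlnow; /* most recent counter value seen */
    int       restarts;   /* how often the counter fell back */
};

/* Feed in each reported value. Returns 1 if the counter dropped
 * compared with the previous call, otherwise 0. */
static int dl_tracker_update(struct dl_tracker *t, long long dlnow)
{
    int restarted;

    assert(t != NULL);
    restarted = dlnow < t->last_dlnow;
    if (restarted)
        t->restarts++;
    t->last_dlnow = dlnow;
    return restarted;
}
```

So the counter is best read as per-response progress, not as a running total of what the write callback has received.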


I first thought that I might disable CURLOPT_HEADER and handle some headers differently from what is done now. But this seems not to help with my problem of identifying when to clear my receive buffer as long as CURLOPT_FOLLOWLOCATION is on.

Do you mean a receive buffer for the *headers* of the final non-redirect URL? If so, then I presume you can just detect a 2xx response code and take that as start of the last set of headers.

Will implement something along these lines, thanks!
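
One possible shape for this, as a hedged sketch. It is a variation on the suggestion above: instead of waiting for a 2xx code, it clears the saved headers on every status line, so that after the transfer the buffer holds the headers of the last response whatever its code was, and the stored code can then be checked for 2xx. struct header_state and parse_status are hypothetical names; the fixed 4096-byte buffer and the silent truncation are simplifications for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transfer state; not part of libcurl. */
struct header_state {
    char   headers[4096]; /* headers of the most recent response */
    size_t len;
    int    status;        /* its status code, 0 if none seen yet */
};

/* Parse "HTTP/<version> <code> ..." from a line that is not
 * NUL-terminated. Returns the code, or 0 for any other line. */
static int parse_status(const char *line, size_t n)
{
    size_t i = 5;
    int code = 0;
    int digits = 0;

    if (n < 5 || memcmp(line, "HTTP/", 5) != 0)
        return 0;
    while (i < n && line[i] != ' ')   /* skip the version token */
        i++;
    while (i < n && line[i] == ' ')
        i++;
    while (digits < 3 && i < n && line[i] >= '0' && line[i] <= '9') {
        code = code * 10 + (line[i] - '0');
        i++;
        digits++;
    }
    return code;
}

/* Shape of a CURLOPT_HEADERFUNCTION callback. */
static size_t header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    struct header_state *st = userdata;
    size_t n = size * nitems;
    int code = parse_status(buffer, n);

    assert(st != NULL);
    if (code != 0) {       /* a new response begins: drop the old headers */
        st->len = 0;
        st->status = code;
    }
    if (st->len + n < sizeof st->headers) {  /* sketch: drop lines that do not fit */
        memcpy(st->headers + st->len, buffer, n);
        st->len += n;
        st->headers[st->len] = '\0';
    }
    return n;
}
```

After the transfer, a status in the range 200 to 299 means the buffer holds the headers of a successful final response. libcurl can also report the final code itself through curl_easy_getinfo() with CURLINFO_RESPONSE_CODE.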


I have a feeling that the write callback function will never be called with data from two HTTP responses at once (that is, will never cross redirections).

I'm not following this. How can there be two HTTP responses at once?

Sorry, I phrased that badly. I meant that it could be called _once_ and be passed data from _two_ HTTP responses that have _arrived in succession_ (like a response with a redirection and the final response), so that a single call would handle data spanning two responses. Anyway, never mind: now I know that the write callback receives only the last body, and that I can handle the headers in the header callback, without CURLOPT_HEADER.

Many thanks again!
--
Nico

Nicolas Roeser
kiz – Information Systems Department, Ulm University
