Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
Hello, Gianluca Cannata, le mer. 04 févr. 2026 08:52:29 +0100, a ecrit: > I agree that libcurl is the best option for handling HTTP requests. I > am already exploring a few possible implementations: > > 1. Blocking: One curl_easy handle per node performing blocking requests. That would be a useful first step that matches the curent httpfs behavior. > 2. Non-Blocking: One. curl_easy handle per node managed by a global > curl_multi handle for non-blocking IO Once the first step is done and working, that can be a useful second step to improve performance. > 3. Custom Library: A dedicated libhttpfs using a ring buffer that > wraps libnetfs and exposes HTTP-specific callbacks through libcurl I don't really see the point of this approach? To take an example, in netfs_attempt_read you just want to submit an asynchronous request to libcurl and return MIG_NO_REPLY, and have a callback called by libcurl later to send the result when libcurl has received it. Samuel
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
Hi Samuel, Sorry for the late reply; I have been very busy lately and forgot to get back to you. I agree that libcurl is the best option for handling HTTP requests. I am already exploring a few possible implementations: 1. Blocking: One curl_easy handle per node performing blocking requests. 2. Non-Blocking: One. curl_easy handle per node managed by a global curl_multi handle for non-blocking IO 3. Custom Library: A dedicated libhttpfs using a ring buffer that wraps libnetfs and exposes HTTP-specific callbacks through libcurl What do you think ? Il giorno mar 27 gen 2026 alle ore 00:47 Samuel Thibault ha scritto: > > Hello, > > Thinking about these: shouldn't we just use libcurl? I don't see a > reason to maintain http/https code when curl can probably provide just > what we need. > > Samuel
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
Hello, Thinking about these: shouldn't we just use libcurl? I don't see a reason to maintain http/https code when curl can probably provide just what we need. Samuel
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
On 26/01/2026 21:28, Gianluca Cannata wrote:
Il giorno lun 26 gen 2026 alle ore 07:35 Amos Jeffries
ha scritto:
On 26/01/2026 01:05, Gianluca Cannata wrote:
Hi Amos,
thank you for your feedback.
I have tried to address all of your comments.
Let me know what you think.
Gianluca
Index: httpfs/http.c
===
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -34,6 +34,8 @@
#include
#include
+#define MAX_HTTP_STATUS_LINE 256
+
Still need to document that this value is relatively arbitrary. We only
use the first 13 bytes of the first line.
What value do you suggest ?
The number is fine, what is needed is a comment that the value is
arbitrary with a minimum of 13, so others are free to alter it if they
want but know not to go too small.
Technically the HTTP "Reason Phrase" string is unlimited length.
How to protect against it ?
I would have the caller function doing all the I/O and passing the
buffer to this function once it thinks a status line has been received.
/* do a DNS lookup for HOST and store result in *DEST */
error_t lookup_host (const char *host, struct sockaddr_storage *dest,
socklen_t *dest_len, int *socktype, int *protocol)
{
@@ -100,34 +102,73 @@ error_t lookup_host (const char *host, s
return 0;
}
-/* read the first line from the socket and return an error_t error */
+/* read the first line from the socket or return an error_t error */
static error_t translate_http_status (int fd, ssize_t *nread)
{
- char buf[32];
+ char buf[MAX_HTTP_STATUS_LINE];
+ size_t i = 0;
+ ssize_t n;
+ char c;
+ int status;
+
+ *nread = 0;
+
+ /* 1. byte-per-byte read until \n */
+ while (i < sizeof (buf) - 1) {
+ n = read (fd, &c, 1);
+ if (n <= 0) return EIO;
+ buf[i++] = c;
+ if (c == '\n') break;
+ }
I suspect this is heading in a bad direction. AFAIK, system calls are
quite expensive.
We know that a minimum of 13 bytes are needed for this function, so you
can start with that many and reduce by the amount actually received
until that much is read in.
Ok.
After which, you can do whatever to discard the remainder of the line.
No need to even put that part in buf.
How to discard data ? There is a portable way of doing it ?
Loop on a read() call, but do not increment the place in the buffer
which the data is being put. So each cycle it "forgets" what was
received previously. Until nothing, or the EOL delimiter arrives.
If this function keeps doing the read() operations you will need to be
careful not to over-read.
If you have the caller doing the I/O then you can read larger chunks and
shift the buffer contents to keep any of the header section that came in.
HTH
Amos
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
Il giorno lun 26 gen 2026 alle ore 07:35 Amos Jeffries
ha scritto:
>
> On 26/01/2026 01:05, Gianluca Cannata wrote:
> > Hi Amos,
> >
> > thank you for your feedback.
> >
> > I have tried to address all of your comments.
> >
> > Let me know what you think.
> >
> > Gianluca
>
> > Index: httpfs/http.c
> > ===
> > --- httpfs.orig/http.c
> > +++ httpfs/http.c
> > @@ -34,6 +34,8 @@
> > #include
> > #include
> >
> > +#define MAX_HTTP_STATUS_LINE 256
> > +
>
> Still need to document that this value is relatively arbitrary. We only
> use the first 13 bytes of the first line.
What value do you suggest ?
> Technically the HTTP "Reason Phrase" string is unlimited length.
How to protect against it ?
>
> > /* do a DNS lookup for HOST and store result in *DEST */
> > error_t lookup_host (const char *host, struct sockaddr_storage *dest,
> > socklen_t *dest_len, int *socktype, int *protocol)
> > {
> > @@ -100,34 +102,73 @@ error_t lookup_host (const char *host, s
> > return 0;
> > }
> >
> > -/* read the first line from the socket and return an error_t error */
> > +/* read the first line from the socket or return an error_t error */
> > static error_t translate_http_status (int fd, ssize_t *nread)
> > {
> > - char buf[32];
> > + char buf[MAX_HTTP_STATUS_LINE];
> > + size_t i = 0;
> > + ssize_t n;
> > + char c;
> > + int status;
> > +
> > + *nread = 0;
> > +
> > + /* 1. byte-per-byte read until \n */
> > + while (i < sizeof (buf) - 1) {
> > + n = read (fd, &c, 1);
> > + if (n <= 0) return EIO;
> > + buf[i++] = c;
> > + if (c == '\n') break;
> > + }
> >
>
> I suspect this is heading in a bad direction. AFAIK, system calls are
> quite expensive.
>
> We know that a minimum of 13 bytes are needed for this function, so you
> can start with that many and reduce by the amount actually received
> until that much is read in.
Ok.
> After which, you can do whatever to discard the remainder of the line.
> No need to even put that part in buf.
How to discard data ? There is a portable way of doing it ?
> AFAICS the header are currently being dropped so a bulk read to drop the
> status as well is fine ... for now.
>
>
>
> > - *nread = read (fd, buf, sizeof (buf) - 1);
> > - if (*nread < 12) return EIO;
> > - buf[*nread] = '\0';
> > + buf[i] = '\0';
> > + *nread = i;
> >
> > - if (strncmp (buf, "HTTP/", 5) != 0)
> > - return EPROTO;
> > + /* 2. Positional validation: "HTTP/1.N SSS" */
> > + int valid = (i >= 12) &&
> > + (strncmp (buf, "HTTP/1.", 7) == 0) && /* "HTTP/1."
> > */
>
> We know that buf has more than 7 bytes/octets.
> Use memcmp() rather than strncmp().
>
>
> > + (buf[7] >= '0' && buf[7] <= '9') && /* N (digit) */
> > + (buf[8] == ' ') && /* SP */
> > + (buf[9] >= '0' && buf[9] <= '9') && /* S (digit 1)
> > */
> > + (buf[10] >= '0' && buf[10] <= '9') && /* S (digit
> > 2) */
> > + (buf[11] >= '0' && buf[11] <= '9') && /* S (digit
> > 3) */
> > + (buf[12] == ' ' || buf[12] == '\r' || buf[12] ==
> > '\n'); /* End of line */
>
> Sadly bare-CR is a thing from malicious services. We do have to check
> for "\r\n" together instead of just an '\r'.
>
> So IMO the above should be:
>
>(i >= 13) &&
>(memcmp (buf, "HTTP/1.", 7) == 0 && isdigit(buf[7])) && // HTTP version
>(buf[8] == ' ') && // SP delimiter
>(buf[9] >= '1' && buf[9] <= '5' && isdigit(buf[10]) &&
> isdigit(buf[10])) && // status code
> (buf[12] == ' ' || (buf[12] == '\r' && buf[13] == 'n') || buf[12] ==
> '\n'); // SP delimiter or End of Line
>
> Note, with the change to which characters buf[9] is compared against we
> are no longer needing the extra (status < 100 || status >= 600) check below.
>
> >
> > - int status = atoi (buf + 9);
> > + if (!valid) return EPROTO;
> > +
> > + /* 3. HTTP status code conversion */
> > + status = atoi (buf + 9);
> >
> > if (debug_flag)
> > - fprintf (stderr, "HTTP Status: %d\n", status);
> > + fprintf (stderr, "HTTP Protocol Verified. Status: %d\n",
> > status);
> > +
> > + /* 4. RFC 9110 - POSIX mappings */
> > + if (status < 100 || status >= 600) return EPROTO;
> > + if (status < 200) return EIO; /* 1xx not supported */
> > + if (status < 300) return 0; /* 2xx Success */
> > +
> > + if (status < 400) { /* 3xx Redirection */
> > + switch (status) {
> > + case 301: case 302: case 303: case 307: case 308:
> > return EAGAIN;
> > + default: return EIO;
> > + }
> > + }
> > +
> > + if (status < 500) { /* 4xx Client Errors */
> > +
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
On 26/01/2026 01:05, Gianluca Cannata wrote:
Hi Amos,
thank you for your feedback.
I have tried to address all of your comments.
Let me know what you think.
Gianluca
Index: httpfs/http.c
===
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -34,6 +34,8 @@
#include
#include
+#define MAX_HTTP_STATUS_LINE 256
+
Still need to document that this value is relatively arbitrary. We only
use the first 13 bytes of the first line.
Technically the HTTP "Reason Phrase" string is unlimited length.
/* do a DNS lookup for HOST and store result in *DEST */
error_t lookup_host (const char *host, struct sockaddr_storage *dest, socklen_t *dest_len, int *socktype, int *protocol)
{
@@ -100,34 +102,73 @@ error_t lookup_host (const char *host, s
return 0;
}
-/* read the first line from the socket and return an error_t error */
+/* read the first line from the socket or return an error_t error */
static error_t translate_http_status (int fd, ssize_t *nread)
{
- char buf[32];
+ char buf[MAX_HTTP_STATUS_LINE];
+ size_t i = 0;
+ ssize_t n;
+ char c;
+ int status;
+
+ *nread = 0;
+
+ /* 1. byte-per-byte read until \n */
+ while (i < sizeof (buf) - 1) {
+ n = read (fd, &c, 1);
+ if (n <= 0) return EIO;
+ buf[i++] = c;
+ if (c == '\n') break;
+ }
I suspect this is heading in a bad direction. AFAIK, system calls are
quite expensive.
We know that a minimum of 13 bytes are needed for this function, so you
can start with that many and reduce by the amount actually received
until that much is read in.
After which, you can do whatever to discard the remainder of the line.
No need to even put that part in buf.
AFAICS the header are currently being dropped so a bulk read to drop the
status as well is fine ... for now.
- *nread = read (fd, buf, sizeof (buf) - 1);
- if (*nread < 12) return EIO;
- buf[*nread] = '\0';
+ buf[i] = '\0';
+ *nread = i;
- if (strncmp (buf, "HTTP/", 5) != 0)
- return EPROTO;
+ /* 2. Positional validation: "HTTP/1.N SSS" */
+ int valid = (i >= 12) &&
+ (strncmp (buf, "HTTP/1.", 7) == 0) && /* "HTTP/1." */
We know that buf has more than 7 bytes/octets.
Use memcmp() rather than strncmp().
+ (buf[7] >= '0' && buf[7] <= '9') && /* N (digit) */
+ (buf[8] == ' ') && /* SP */
+ (buf[9] >= '0' && buf[9] <= '9') && /* S (digit 1) */
+ (buf[10] >= '0' && buf[10] <= '9') && /* S (digit 2) */
+ (buf[11] >= '0' && buf[11] <= '9') && /* S (digit 3) */
+ (buf[12] == ' ' || buf[12] == '\r' || buf[12] == '\n');
/* End of line */
Sadly bare-CR is a thing from malicious services. We do have to check
for "\r\n" together instead of just an '\r'.
So IMO the above should be:
(i >= 13) &&
(memcmp (buf, "HTTP/1.", 7) == 0 && isdigit(buf[7])) && // HTTP version
(buf[8] == ' ') && // SP delimiter
(buf[9] >= '1' && buf[9] <= '5' && isdigit(buf[10]) &&
isdigit(buf[10])) && // status code
(buf[12] == ' ' || (buf[12] == '\r' && buf[13] == 'n') || buf[12] ==
'\n'); // SP delimiter or End of Line
Note, with the change to which characters buf[9] is compared against we
are no longer needing the extra (status < 100 || status >= 600) check below.
- int status = atoi (buf + 9);
+ if (!valid) return EPROTO;
+
+ /* 3. HTTP status code conversion */
+ status = atoi (buf + 9);
if (debug_flag)
- fprintf (stderr, "HTTP Status: %d\n", status);
+ fprintf (stderr, "HTTP Protocol Verified. Status: %d\n",
status);
+
+ /* 4. RFC 9110 - POSIX mappings */
+ if (status < 100 || status >= 600) return EPROTO;
+ if (status < 200) return EIO; /* 1xx not supported */
+ if (status < 300) return 0; /* 2xx Success */
+
+ if (status < 400) { /* 3xx Redirection */
+ switch (status) {
+ case 301: case 302: case 303: case 307: case 308:
return EAGAIN;
+ default: return EIO;
+ }
+ }
+
+ if (status < 500) { /* 4xx Client Errors */
+ switch (status) {
+ case 401: case 402: case 403: case 405:
+ case 407: case 421: case 451: return EACCES;
+ case 404: case 410: case 406: case 412: case 415: case
428: return ENOENT;
+ case 426: return EPROTO;
+ default: return EINVAL;
+ }
+ }
- switch (status)
- {
- case 200: return 0;
- case 301:
- case 302: return EAGAIN;
- case 401:
- case 403: return EACCES;
-
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
Hi Amos,
thank you for your feedback.
I have tried to address all of your comments.
Let me know what you think.
Gianluca
Index: httpfs/http.c
===
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -34,6 +34,8 @@
#include
#include
+#define MAX_HTTP_STATUS_LINE 256
+
/* do a DNS lookup for HOST and store result in *DEST */
error_t lookup_host (const char *host, struct sockaddr_storage *dest, socklen_t *dest_len, int *socktype, int *protocol)
{
@@ -100,34 +102,73 @@ error_t lookup_host (const char *host, s
return 0;
}
-/* read the first line from the socket and return an error_t error */
+/* read the first line from the socket or return an error_t error */
static error_t translate_http_status (int fd, ssize_t *nread)
{
- char buf[32];
+ char buf[MAX_HTTP_STATUS_LINE];
+ size_t i = 0;
+ ssize_t n;
+ char c;
+ int status;
+
+ *nread = 0;
+
+ /* 1. byte-per-byte read until \n */
+ while (i < sizeof (buf) - 1) {
+ n = read (fd, &c, 1);
+ if (n <= 0) return EIO;
+ buf[i++] = c;
+ if (c == '\n') break;
+ }
- *nread = read (fd, buf, sizeof (buf) - 1);
- if (*nread < 12) return EIO;
- buf[*nread] = '\0';
+ buf[i] = '\0';
+ *nread = i;
- if (strncmp (buf, "HTTP/", 5) != 0)
- return EPROTO;
+ /* 2. Positional validation: "HTTP/1.N SSS" */
+ int valid = (i >= 12) &&
+ (strncmp (buf, "HTTP/1.", 7) == 0) && /* "HTTP/1." */
+ (buf[7] >= '0' && buf[7] <= '9') && /* N (digit) */
+ (buf[8] == ' ') && /* SP */
+ (buf[9] >= '0' && buf[9] <= '9') && /* S (digit 1) */
+ (buf[10] >= '0' && buf[10] <= '9') && /* S (digit 2) */
+ (buf[11] >= '0' && buf[11] <= '9') && /* S (digit 3) */
+ (buf[12] == ' ' || buf[12] == '\r' || buf[12] == '\n'); /* End of line */
- int status = atoi (buf + 9);
+ if (!valid) return EPROTO;
+
+ /* 3. HTTP status code conversion */
+ status = atoi (buf + 9);
if (debug_flag)
- fprintf (stderr, "HTTP Status: %d\n", status);
+ fprintf (stderr, "HTTP Protocol Verified. Status: %d\n", status);
+
+ /* 4. RFC 9110 - POSIX mappings */
+ if (status < 100 || status >= 600) return EPROTO;
+ if (status < 200) return EIO; /* 1xx not supported */
+ if (status < 300) return 0; /* 2xx Success */
+
+ if (status < 400) { /* 3xx Redirection */
+ switch (status) {
+ case 301: case 302: case 303: case 307: case 308: return EAGAIN;
+ default: return EIO;
+ }
+ }
+
+ if (status < 500) { /* 4xx Client Errors */
+ switch (status) {
+ case 401: case 402: case 403: case 405:
+ case 407: case 421: case 451: return EACCES;
+ case 404: case 410: case 406: case 412: case 415: case 428: return ENOENT;
+ case 426: return EPROTO;
+ default: return EINVAL;
+ }
+ }
- switch (status)
- {
- case 200: return 0;
- case 301:
- case 302: return EAGAIN;
- case 401:
- case 403: return EACCES;
- case 404: return ENOENT;
- case 410: return ENOENT;
- default:
- return (status >= 500) ? EIO : EINVAL;
+ /* 5xx Server Errors */
+ switch (status) {
+ case 501: case 505: case 510: return EINVAL;
+ case 503: return EACCES;
+ default: return EIO;
}
}
Re: [PATCH 1/2] httpfs: handle 3xx, 4xx, 5xx HTTP status codes
My 2c below,
On 25/01/2026 04:18, Gianluca Cannata wrote:
Hi Samuel,
This is the first patch that strives to handle HTTP status codes other
than 200 OK.
Only 4xx and 5xx are actually handled in their own way.
Let me know what you think.
Sincerely,
Gianluca
>
Index: httpfs/http.c
===
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -100,6 +100,37 @@ error_t lookup_host (const char *host, s
return 0;
}
+/* read the first line from the socket and return an error_t error */
+static error_t translate_http_status (int fd, ssize_t *nread)
+{
+ char buf[32];
What using magic numbers, it is best to document what the value means.
In this case the buffer needs to store at least the protocol
first-bytes. Which are "HTTP/1.N SSS ".
Also, something to consider. In modern HTTP the "reason phrase" is
optional. As are the headers. Which makes the minimum valid HTTP/1.x
response only 13 bytes.
This function may read up to
If more than that are read in by this function, any following data will
be lost.
+
+ *nread = read (fd, buf, sizeof (buf) - 1);
+ if (*nread < 12) return EIO;
+ buf[*nread] = '\0';
+
+ if (strncmp (buf, "HTTP/", 5) != 0)
+ return EPROTO;
With modern standard changing to non-ASCII the HTTP/1 response syntax is
exactly this:
* the 7-octet string "HTTP/1."
* Then a DIGIT ('0'-'9'),
* Then an SP (0x20),
* Then three DIGIT ('0'-'9'),
* Then one of: SP (0x20), or LF (0x0A), or CRLF (0x0D0A)
Anything not matching that exact pattern is invalid HTTP and should
produce EPROTO.
+
+ int status = atoi (buf + 9);
+
+ if (debug_flag)
+ fprintf (stderr, "HTTP Status: %d\n", status);
+
The HTTP specification, RFC 9110:
"a client MUST understand the class of any status code, as indicated
by the first digit, and treat an unrecognized status code as being
equivalent to the x00 status code of that class."
I think it might be better for comprehension to have an if statement
checking which range the status is part of, with a switch inside those
for the specially handled code.
eg.
if (status < 100 || 600 < status)
return EPROTO;
else if (status < 200)
return EIO; // Informational Responses not supported yet
else if (status < 300)
return 0;
else if (status < 400)
{
switch (status)
{
case 301:
case 302:
case 303:
case 307:
case 308:
return EIO; // Redirection is not supported yet
default:
return EINVAL;
}
}
else if (status < 400)
...
+ switch (status)
+ {
+ case 200: return 0;
The 205 and 210 need some consideration. They should either map to the
same as 200, or possibly ENOENT depending on how the error produced by
this function is intended to be used.
+ case 301:
+ case 302: return EAGAIN;
The new 303, 307 and 308 should also have the same handling.
However, AFAIK these status do not actually mean the same thing that
EAGAIN does. In their case a *different* URL should be tried on the
followup, not the same one repeated.
In HTTP the "try again later" meaning is given by 429 and 503 status.
Otherwise the "Retry-After" header has to be looked for - and this can
appear in any 300-599 status code.
+ case 401:
+ case 403: return EACCES;
The 402, 405, 407, 421, 451, 503, and 511 are all about wrong or broken
access and should also map to EACCES.
+ case 404: return ENOENT;
+ case 410: return ENOENT;
The 406, 412, 415, 428 mean the specific resource representation being
requested does not exist and thus ENOENT.
Also:
The status 501, 505, and 510 are about the server not accepting what the
client has sent - thus map to EINVAL.
The status 306, 417, and 422 all mean transport issues - thus map to EIO.
Status 426 and 428 should map to EPROTO.
+ default:
+ return (status >= 500) ? EIO : EINVAL;
If the status parsed is <100 or >= 600, or 418 that should produce
EPROTO while this translator is is still sending HTTP/1.0.
FYI: Handling 1xx properly is one of the things needed to use "1.1"
version labels.
+ }
+}
+
/* open a connection with the remote web server */
error_t open_connection(struct netnode *node, int *fd,off_t *head_len)
{
@@ -115,10 +146,6 @@ error_t open_connection(struct netnode *
size_t towrite;
char buffer[4096];
ssize_t bytes_read;
- char *token,*mesg;
- int code;
- char delimiters0[] = " ";
- char delimiters1[] = "\n";
/* 1. Target selection.
* If ip_addr (proxy global variable) is set, we use it.
@@ -165,27 +192,24 @@ error_t open_connection(struct netnode *
fprintf(stderr,"Could not send an HTTP request to host\n");
return errno;
}
+
+
