On Thu, Apr 18, 2024 at 3:58 PM Graham Leggett <minf...@sharp.fm> wrote: > > On 18 Apr 2024, at 13:15, Yann Ylavic <ylavic....@gmail.com> wrote: > >> Indeed at least APR_BUFFER_MAX and buf->size of should be of the same >> signedness. > > > APR_BUFFER_MAX represents the size limit visible outside the API. This is > always positive.
I meant the same signedness of the type, not the value itself. If APR_BUFFER_MAX is APR_SIZE_MAX/2 it's a size_t (thus unsigned), while buf->size is an apr_off_t (thus signed), not the same types. > > buf->size is an internal implementation detail. This is invisible. > > The only reason we can see inside apr_buffer_t is because we want to allocate > them on the stack, and make arrays of them, so code outside needs to know the > size. Fair enough, though there probably should be big warning a the struct apr_buffer_t definition to not use ->size directly. > >> But let me plead again for a much simpler ->size of type apr_size_t, >> checked against APR_BUFFER_MAX=APR_SIZE_MAX/2 wherever an apr_buffer_t >> is initialized, using the high bit of ->size for string vs plain >> buffer, and then getting rid of off_t/ssize_t plus all the fancy >> signed arithmetics in the apr_buffer code (we don't care about the >> sizeof(off_t) or anything like that anymore).. > > > I looked at it, and it was more confusing. > > Right now, positive is a memory buffer, negative is a string. I don't know > why we would need to make it more complicated than that. I think we can hardly make more complicated than: APR_DECLARE(apr_size_t) apr_buffer_len(const apr_buffer_t *buf) { if (buf->size < 0) { return (apr_size_t)((-buf->size) - 1); } else { return (apr_size_t)buf->size; } } ... With an apr_size_t ->size it could be: APR_DECLARE(apr_size_t) apr_buffer_len(const apr_buffer_t *buf) { return buf->size & APR_BUFFER_MAX; } which I find much more simpler (and efficient) personally. Another exemple: APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf, char *str, apr_ssize_t len) { if (len > APR_BUFFER_MAX) { return APR_EINVAL; } if (!str) { buf->d.str = NULL; buf->size = 0; } else if (len < 0) { apr_size_t slen = strlen(str); if (slen <= APR_BUFFER_MAX) { buf->d.str = str; buf->size = -(apr_off_t)(slen + 1); } else { buf->d.str = NULL; buf->size = 0; } } else { buf->d.str = str; buf->size = -(len + 1); } return APR_SUCCESS; } could be: APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf, char *str, apr_size_t len) { if (len == APR_BUFFER_STRING) { len = str ? strlen(str) : 0; } if (len > APR_BUFFER_MAX) { return APR_EINVAL; } if (!str) { buf->d.str = NULL; buf->size = 0; } else { buf->d.str = str; buf->size = len; } buf->size |= ~APR_BUFFER_MAX; return APR_SUCCESS; } Should be the same kind of simplification everywhere I suppose. > >> Currently apr_buffer_str_make(mystring, strlen(mystring)) is UB, the >> API is just broken IMHO. > > Can you clarify why? What, specifically, is broken, so I can fix it? Passing/converting an unsigned value to a signed one where the unsigned value cannot be represented in the signed one is undefined behaviour in C. So with: APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf, char *str, apr_ssize_t len); one cannot simply do (w/o UB): apr_size_t len = strlen(mystr); apr_buffer_str_set(buf, mystr, len); without first checking that len <= APR_BUFFER_MAX, which is not really user friendly. Whereas with the proposed: APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf, char *str, apr_size_t len); the will return EINVAL if len > APR_BUFFER_MAX without the user having to do the check to avoid UB. Regards; Yann.