On Thu, Apr 18, 2024 at 3:58 PM Graham Leggett <minf...@sharp.fm> wrote:
>
> On 18 Apr 2024, at 13:15, Yann Ylavic <ylavic....@gmail.com> wrote:
>
>> Indeed at least APR_BUFFER_MAX and buf->size of should be of the same
>> signedness.
>
>
> APR_BUFFER_MAX represents the size limit visible outside the API. This is 
> always positive.

I meant the same signedness of the type, not the value itself.
If APR_BUFFER_MAX is APR_SIZE_MAX/2 it's a size_t (thus unsigned),
while buf->size is an apr_off_t (thus signed), not the same types.

>
> buf->size is an internal implementation detail. This is invisible.
>
> The only reason we can see inside apr_buffer_t is because we want to allocate 
> them on the stack, and make arrays of them, so code outside needs to know the 
> size.

Fair enough, though there probably should be big warning a the struct
apr_buffer_t definition to not use ->size directly.

>
>> But let me plead again for a much simpler ->size of type apr_size_t,
>> checked against APR_BUFFER_MAX=APR_SIZE_MAX/2 wherever an apr_buffer_t
>> is initialized, using the high bit of ->size for string vs plain
>> buffer, and then getting rid of off_t/ssize_t plus all the fancy
>> signed arithmetics in the apr_buffer code (we don't care about the
>> sizeof(off_t) or anything like that anymore)..
>
>
> I looked at it, and it was more confusing.
>
> Right now, positive is a memory buffer, negative is a string. I don't know 
> why we would need to make it more complicated than that.

I think we can hardly make more complicated than:
APR_DECLARE(apr_size_t) apr_buffer_len(const apr_buffer_t *buf)
{
    if (buf->size < 0) {
        return (apr_size_t)((-buf->size) - 1);
    }
    else {
        return (apr_size_t)buf->size;
    }
}
...

With an apr_size_t ->size it could be:
APR_DECLARE(apr_size_t) apr_buffer_len(const apr_buffer_t *buf)
{
    return buf->size & APR_BUFFER_MAX;
}
which I find much more simpler (and efficient) personally.

Another exemple:

APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf,
                                             char *str, apr_ssize_t len)
{

    if (len > APR_BUFFER_MAX) {
        return APR_EINVAL;
    }

    if (!str) {
        buf->d.str = NULL;
        buf->size = 0;
    }
    else if (len < 0) {
        apr_size_t slen = strlen(str);
        if (slen <= APR_BUFFER_MAX) {
            buf->d.str = str;
            buf->size = -(apr_off_t)(slen + 1);
        }
        else {
            buf->d.str = NULL;
            buf->size = 0;
        }
    }
    else {
        buf->d.str = str;
        buf->size = -(len + 1);
    }

    return APR_SUCCESS;
}

could be:
APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf,
                                             char *str, apr_size_t len)
{
    if (len == APR_BUFFER_STRING) {
        len = str ? strlen(str) : 0;
    }
    if (len > APR_BUFFER_MAX) {
        return APR_EINVAL;
    }

    if (!str) {
        buf->d.str = NULL;
        buf->size = 0;
    }
    else {
        buf->d.str = str;
        buf->size = len;
    }
    buf->size |= ~APR_BUFFER_MAX;

    return APR_SUCCESS;
}

Should be the same kind of simplification everywhere I suppose.

>
>> Currently apr_buffer_str_make(mystring, strlen(mystring)) is UB, the
>> API is just broken IMHO.
>
> Can you clarify why? What, specifically, is broken, so I can fix it?

Passing/converting an unsigned value to a signed one where the
unsigned value cannot be represented in the signed one is undefined
behaviour in C.

So with:
  APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf,
                                               char *str, apr_ssize_t len);
one cannot simply do (w/o UB):
  apr_size_t len = strlen(mystr);
  apr_buffer_str_set(buf, mystr, len);
without first checking that len <= APR_BUFFER_MAX, which is not really
user friendly.

Whereas with the proposed:
  APR_DECLARE(apr_status_t) apr_buffer_str_set(apr_buffer_t *buf,
                                               char *str, apr_size_t len);
the will return EINVAL if len > APR_BUFFER_MAX without the user having
to do the check to avoid UB.


Regards;
Yann.

Reply via email to