On 08 Jul 2018, at 4:23 PM, Yann Ylavic <ylavic....@gmail.com> wrote:
>> +static apr_status_t apr_json_decode_boolean(apr_json_scanner_t * self, int >> *retval) >> +{ >> + if (self->p >= self->e) >> + return APR_EOF; >> + >> + if (self->e - self->p >= 4 && strncmp("true", self->p, 4) == 0 && >> + (self->p == self->e || > > (self->p == self) is always false since (self->e - self->p >= 4), > (self->e - self->p == 4) maybe? > >> + (!isalnum(((unsigned char *)self->p)[4]) && >> + ((unsigned char *)self->p)[4] != '_'))) { > >> + self->p += 4; >> + *retval = 1; >> + return APR_SUCCESS; >> + } >> + else if (self->e - self->p >= 5 && strncmp("false", self->p, 5) == 0 && >> + (self->p == self->e || > > Likewise. This extra check is redundant, as we catch trailing rubbish elsewhere. Fixed in r 1835366. >> + (!isalnum(((unsigned char *)self->p)[5]) && >> + ((unsigned char *)self->p)[5] != '_'))) { >> + self->p += 5; >> + *retval = 0; >> + return APR_SUCCESS; >> + } >> + return APR_BADCH; >> +} >> + >> +static apr_status_t apr_json_decode_number(apr_json_scanner_t * self, >> apr_json_value_t * retval) >> +{ > [] > > Determining 'treat_as_float' looks costly so far. I disagree. The code does a single pass sanity walk, which will then reject any bogus input precisely on that character. We then run exactly one of strtod or strtol to get what we already know is a valid number. >> + >> + if (treat_as_float) { >> + retval->type = APR_JSON_DOUBLE; >> + retval->value.dnumber = strtod(self->p, NULL); >> + } >> + else { >> + retval->type = APR_JSON_LONG; >> + retval->value.lnumber = strtol(self->p, NULL, 10); >> + } > > Can't we simply try strtol() and then strtod() if the former failed? I would rather not. That adds a failure path into the success path, and passes the problem across to detecting the possible options for detecting the difference between the end of a long, the end of a float, or trailing bogus characters. The current approach gives a clear and unambiguous behaviour. >> + if (self->flags & APR_JSON_FLAGS_WHITESPACE) { >> + if (len) { >> + *space = s = apr_palloc(self->pool, len + 1); >> + >> + while (self->p < self->e && isspace(*(unsigned char *) >> self->p)) { >> + *s++ = *self->p++; >> + } >> + *s = 0; >> + >> + } > > A single pass (and auto-doubling apr_array of char) could work here too. We’re talking about a few characters of whitespace typically, I would rather keep this tight and allocate exactly the amount of RAM we need, especially if this allocation hangs around until the end of a request in something like httpd. Regards, Graham —