On Sat, Sep 16, 2017 at 3:37 AM, Eric Covener <cove...@gmail.com> wrote:
> On Sat, Dec 29, 2012 at 8:23 PM, <s...@apache.org> wrote:
>>
>> +/*
>> + * If strict mode ever becomes the default, this should be folded into
>> + * fix_hostname_non_v6()
>> + */
>> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
>> +                                          int logonly)
>> +{
>> +    char *ch;
>> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>> +
>> +    for (ch = host; *ch; ch++) {
>> +        if (!apr_isascii(*ch)) {
>> +            goto bad;
>> +        }
>> +        else if (apr_isalpha(*ch) || *ch == '-') {
>> +            is_dotted_decimal = 0;
>> +        }
>> +        else if (ch[0] == '.') {
>> +            dots++;
>> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
>> +                leading_zeroes = 1;
>> +        }
>> +        else if (!apr_isdigit(*ch)) {
>> +            /* also takes care of multiple Host headers by denying commas */
>> +            goto bad;
>> +        }
>> +    }
>> +    if (is_dotted_decimal) {
>> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
>> +            leading_zeroes = 1;
>> +        if (leading_zeroes || dots != 3) {
>> +            /* RFC 3986 7.4 */
>> +            goto bad;
>> +        }
>> +    }
>> +    else {
>> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
>> +        while (ch > host && *ch != '.')
>> +            ch--;
>> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
>> +            goto bad;
>> +    }
>> +    return APR_SUCCESS;
>> +
>> +bad:
>> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
>> +                  "[strict] Invalid host name '%s'%s%.6s",
>> +                  host, *ch ? ", problem near: " : "", ch);
>> +    if (logonly)
>> +        return APR_SUCCESS;
>> +    return APR_EINVAL;
>> +}
>
> (sorry for the necromancy of this very old commit)
>
> Re: the 1123 2.1 reference a dozen lines from the end of the function:
> RFC 1123 2.1 seems to say the opposite. Just a bug or something over
> my head?
>
>    2.1  Host Names and Numbers
>
>       The syntax of a legal Internet host name was specified in RFC-952
>       [DNS:4].  One aspect of host name syntax is hereby changed: the
>       restriction on the first character is relaxed to allow either a
>       letter or a digit.  Host software MUST support this more liberal
>       syntax.

RFC 1123 2.1 seems to be about the first character of the host name, while the code checks the first character of the TLD. Are there any TLDs that start with a digit?
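
To make the distinction concrete, here is a minimal sketch in plain C (no APR) that separates the two checks. Both helper names are made up for illustration and do not exist in httpd. The first one is only my reading of what the else-branch of strict_hostname_check() does; the second is the relaxed first-character rule from the quoted RFC text.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not part of httpd.
 * Mirrors the else-branch of strict_hostname_check(): find the last '.'
 * and require the character after it to be a letter. A host with no dot,
 * or with nothing after the last dot, is accepted, as in the quoted code.
 * Returns 1 if accepted, 0 if rejected.
 */
static int last_label_starts_with_letter(const char *host)
{
    const char *dot = strrchr(host, '.');

    if (dot == NULL || dot[1] == '\0')
        return 1;
    return isalpha((unsigned char)dot[1]) != 0;
}

/*
 * Hypothetical helper, not part of httpd.
 * The relaxation quoted from RFC 1123 2.1: the first character of the
 * host name may be a letter or a digit.
 * Returns 1 if accepted, 0 if rejected.
 */
static int first_char_is_letter_or_digit(const char *host)
{
    return isalnum((unsigned char)host[0]) != 0;
}
```

With these, "3com.com" passes both checks (leading digit in the host name, TLD starting with a letter), while something like "example.1abc" passes the first-character rule but is rejected by the last-label check. So the two rules look at different positions and do not contradict each other.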