On Sat, Sep 16, 2017 at 9:48 AM, Yann Ylavic <[email protected]> wrote:
> On Sat, Sep 16, 2017 at 3:37 AM, Eric Covener <[email protected]> wrote:
>> On Sat, Dec 29, 2012 at 8:23 PM, <[email protected]> wrote:
>>>
>>> +/*
>>> + * If strict mode ever becomes the default, this should be folded into
>>> + * fix_hostname_non_v6()
>>> + */
>>> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
>>> +                                          int logonly)
>>> +{
>>> +    char *ch;
>>> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>>> +
>>> +    for (ch = host; *ch; ch++) {
>>> +        if (!apr_isascii(*ch)) {
>>> +            goto bad;
>>> +        }
>>> +        else if (apr_isalpha(*ch) || *ch == '-') {
>>> +            is_dotted_decimal = 0;
>>> +        }
>>> +        else if (ch[0] == '.') {
>>> +            dots++;
>>> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
>>> +                leading_zeroes = 1;
>>> +        }
>>> +        else if (!apr_isdigit(*ch)) {
>>> +            /* also takes care of multiple Host headers by denying commas */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    if (is_dotted_decimal) {
>>> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
>>> +            leading_zeroes = 1;
>>> +        if (leading_zeroes || dots != 3) {
>>> +            /* RFC 3986 7.4 */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    else {
>>> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
>>> +        while (ch > host && *ch != '.')
>>> +            ch--;
>>> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
>>> +            goto bad;
>>> +    }
>>> +    return APR_SUCCESS;
>>> +
>>> +bad:
>>> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
>>> +                  "[strict] Invalid host name '%s'%s%.6s",
>>> +                  host, *ch ? ", problem near: " : "", ch);
>>> +    if (logonly)
>>> +        return APR_SUCCESS;
>>> +    return APR_EINVAL;
>>> +}
>>
>> (sorry for the necromancy of this very old commit)
>>
>> Re: the 1123 2.1 reference a dozen lines from the end of the function:
>> RFC 1123 2.1 seems to say the opposite. Just a bug or something over
>> my head?
>>
>> 2.1 Host Names and Numbers
>>
>> The syntax of a legal Internet host name was specified in RFC-952
>> [DNS:4]. One aspect of host name syntax is hereby changed: the
>> restriction on the first character is relaxed to allow either a
>> letter or a digit. Host software MUST support this more liberal
>> syntax.
>
> RFC 1123 2.1 seems to be about the first character of the host, while
> the code checks the first one of the TLD. Are there TLDs starting with
> a digit?

I see, thanks. The basis in 1123 is a bit later in 2.1 but doesn't
really seem normative:

    If a dotted-decimal number can be entered without such
    identifying delimiters, then a full syntactic check must be
    made, because a segment of a host domain name is now allowed
    to begin with a digit and could legally be entirely numeric
    (see Section 6.1.2.4). However, a valid host name can never
    have the dotted-decimal form #.#.#.#, since at least the
    highest-level component label will be alphabetic.

The 6.1.2.4 reference is likely an error, since that section is about
compression.

It seems like we'd reject "1foo" but accept "1foo.com", but I am not
sure whether this warrants an exception or reconsidering the check.

(In the case that had me looking, a high TCP port was used as both the
hostname AND the port in the Host header, so it is clearly someone
else's bug at the core.)
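
To make it easier to try inputs like the ones above, here is a rough
standalone sketch of the quoted logic. It is not the httpd code: the
name strict_host_ok() is made up, the APR character macros are swapped
for <ctype.h> plus a plain "> 127" test, and logging and the logonly
flag are dropped.

```c
#include <ctype.h>

/*
 * Hypothetical standalone rendering of the quoted check.
 * Returns 1 if the host would pass, 0 if it would reach the "bad" label.
 */
static int strict_host_ok(const char *host)
{
    const char *ch;
    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;

    for (ch = host; *ch; ch++) {
        unsigned char c = (unsigned char)*ch;

        if (c > 127) {
            return 0;                   /* stand-in for !apr_isascii() */
        }
        else if (isalpha(c) || c == '-') {
            is_dotted_decimal = 0;
        }
        else if (c == '.') {
            dots++;
            if (ch[1] == '0' && isdigit((unsigned char)ch[2]))
                leading_zeroes = 1;
        }
        else if (!isdigit(c)) {
            return 0;                   /* commas, colons, spaces, ... */
        }
    }

    if (is_dotted_decimal) {
        /* Only digits and dots were seen: require the #.#.#.# shape */
        if (host[0] == '.'
            || (host[0] == '0' && isdigit((unsigned char)host[1])))
            leading_zeroes = 1;
        return !(leading_zeroes || dots != 3);
    }

    /* ch points at the terminating NUL; walk back to the last dot, if any */
    while (ch > host && *ch != '.')
        ch--;
    if (ch[0] == '.' && ch[1] != '\0' && !isalpha((unsigned char)ch[1]))
        return 0;                       /* last label starts with a non-letter */
    return 1;
}
```

With this sketch, "foo.1com" and an all-digit name such as "12345" come
back 0, while "1foo.com" and "127.0.0.1" come back 1. A dotless "1foo"
also comes back 1 here, because the backward scan never finds a '.', so
that case may be worth re-checking against the real function.
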
--
Eric Covener
[email protected]