One more comment / idea.
The 'cookie_domain' comes from an HTTP Set-Cookie response header and thus is
(must be) toASCII() encoded (= punycode). Of course this has to be checked when
normalizing the incoming cookie data. A cookie domain containing non-ASCII
characters should simply be dropped.
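
A minimal sketch of such a check, assuming the cookie domain arrives as a
NUL-terminated byte string. The helper name domain_is_ascii() is made up for
illustration and is not an existing Wget or libpsl function:

```c
#include <stdbool.h>

/* Hypothetical helper: returns true only if every byte of the domain
   is 7-bit ASCII, i.e. the string can be in toASCII() (punycode) form.
   It does not validate the punycode itself. */
static bool
domain_is_ascii (const char *domain)
{
  const unsigned char *p;

  for (p = (const unsigned char *) domain; *p; p++)
    if (*p >= 0x80)
      return false;   /* non-ASCII byte: caller should drop the cookie */

  return true;
}
```

The caller would discard the cookie whenever this returns false, before any
PSL lookup happens.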
The whole check only works when 'host' is also in toASCII() (punycode) form.
Assuming this, psl_str_to_utf8lower() just reduces to an ASCII lowercase
converter.
If Wget converted every domain name to punycode + lowercase on input, many
conversions would fall away and the case-insensitive functions would not be
needed (e.g. we could call strcmp instead of strcasecmp, the calls to
psl_str_to_utf8lower() would fall away, etc.).
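
Roughly what I have in mind, as a sketch only. It assumes the domain is
already in punycode (pure ASCII) form; ascii_lowercase() and domains_equal()
are hypothetical names, not existing Wget functions:

```c
#include <string.h>

/* Hypothetical helper: lowercase an ASCII (punycode) domain in place.
   Deliberately locale-independent, unlike tolower(). */
static void
ascii_lowercase (char *s)
{
  for (; *s; s++)
    if (*s >= 'A' && *s <= 'Z')
      *s += 'a' - 'A';
}

/* Once both sides have been normalized at input time, a plain
   strcmp() is enough; strcasecmp() is no longer needed. */
static int
domains_equal (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}
```

The normalization would run once when a domain name enters Wget, so every
later comparison can stay a simple byte-wise one.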
What do you think?
Tim
On Monday 07 July 2014 17:08:48 Darshit Shah wrote:
> + if (psl_str_to_utf8lower (cookie_domain, NULL, NULL,&cookie_domain_lower) == PSL_SUCCESS &&
> + psl_str_to_utf8lower (host, NULL, NULL, &host_lower) == PSL_SUCCESS)
> + {
> + is_acceptable = psl_is_cookie_domain_acceptable (psl, host_lower, cookie_domain_lower);
> + }
> + else
> + {
> + DEBUGP (("libpsl unable to parse domain name. "
> + "Falling back to simple heuristics.\n"));
> + goto no_psl;
> + }