Hi, You're absolutely right, I have merged the commits into one patch, and removed the trailing whitespaces in the patch. Please find it attached, I hope it's okay now.
I have fixed the error which was breaking the build when configured with "--without-ssl". I can't figure out why the build fails on the other two cases. I am unable to replicate them as well. The tests which are failing in the build are Test-cookie and Test-cookie-401, both of which run successfully when I execute make check. Am I missing something here? As for testing the code, should I follow the way its being done in Test-*.py files? Thanks, Kush On Sun, Jan 24, 2016 at 6:14 PM, Darshit Shah <[email protected]> wrote: > Hi Kushagra, > > Thanks for the patches! A couple of remarks on your patches before I > dive into the code: > > 1. Please merge your changes into fewer, more logical patches. Making > small patches when working locally is a good idea, but when you submit > them for merging, they should be reorganized into logical changes. In > the case of your patches, one adds the new function, and the next > patch removes it. We don't need this in the git history. > 2. Your patches have trailing whitespace in them. Please configure > your editor to either alert you about this, or fix it silently. Most > good text editors have these features. You can even configure your Git > to complain if you try to commit anything with whitespace errors. > Trailing whitespaces can be a major pain when merging larger patches > or branches in the future. > 3. A good patch would also contain a test case that fails before. but > passes after the patch is applied. This way we can verify the > correctness of the patch and ensure that no regressions occur in the > future. > > Apart from these, the build failed on Travis with the --without-ssl > configure option. That is an issue worth looking into. > The other two builds on travis failed inside the multi-byte locale > tests. It's probably some subtle bug, but we'll have to fix that as > well. > > The code itself looks good. Will provide a deeper review once the > larger, more obvious issues have been sorted. > > > On 24 January 2016 at 12:38, Kushagra Singh > <[email protected]> wrote: > > I have added the first two out the three recommendations in the draft. > The > > third one is relevant when cookies have to be removed in case the total > > number of cookies hit a predefined upper bound, I'm not sure whether we > > do that in wget? > > > > As you mentioned, I had to change some method prototypes to get the uri > > scheme. I made sure that I replaced all instances of those function calls > > with the right call. The tests run fine, so hopefully I haven't broken > > anything. > > > > I am attaching the patch files, please review them. > > > > Thanks, > > Kush > > > > On Sun, Jan 24, 2016 at 4:39 AM, Darshit Shah <[email protected]> wrote: > > > >> On 23 January 2016 at 23:36, Kushagra Singh > >> <[email protected]> wrote: > >> > Thanks a lot for the help! > >> > > >> > I've made some progress, but have a couple of more questions > >> > > >> > - I can't manage to find the http-only-flag in the cookie struct, do > we > >> not > >> > store this? > >> Since Wget supports only HTTP, this is not required. The HttpOnly > >> attribute prevents access to script code, but since Wget never > >> executes them it is not necessary at all. Although, it may be a good > >> idea to explicitly store the flag for Wget saves the cookies to a > >> file. Maybe, we should add this. > >> > >> > - The draft asks to check whether the "scheme" component of the > >> > "request-uri" denotes a secure protocol or not. Currently I am > checking > >> > using "#ifdef HAVE_SSL". I am not sure whether this is the right way > to > >> do > >> > so, since having SSL with wget does not necessarily mean that the > current > >> > connection is secure. > >> > >> Ideally, a code base should have as few #ifdef statements as possible. > >> They make reading the code very difficult for a human. That said, in > >> this scenario it is the absolute wrong technique. You will want to > >> access the scheme from the request URI. Find a way to access this > >> information, you may need to change some method prototypes to make > >> this happen. > >> > >> > - To check whether there exists a cookie whose domain, domain-matches > the > >> > domain of a new cookie, we should iterate through the chains returned > by > >> > find_chains_of_host right? > >> > >> That ought to work, I think. > >> > >> > > >> > Regards, > >> > Kush > >> > >> > >> > >> -- > >> Thanking You, > >> Darshit Shah > >> > > > > -- > Thanking You, > Darshit Shah >
From 127e502a012567221fd792e256655006849fada8 Mon Sep 17 00:00:00 2001 From: kush789 <[email protected]> Date: Wed, 27 Jan 2016 03:34:31 +0530 Subject: [PATCH] Added recomendations of draft-west-leave-secure-cookies-alone --- src/cookies.c | 198 +++++++++++++++++++++++++++++++++++++--------------------- src/cookies.h | 5 +- src/http.c | 2 +- 3 files changed, 131 insertions(+), 74 deletions(-) diff --git a/src/cookies.c b/src/cookies.c index 81ecfa5..156bd61 100644 --- a/src/cookies.c +++ b/src/cookies.c @@ -350,7 +350,8 @@ discard_matching_cookie (struct cookie_jar *jar, struct cookie *cookie) filled. */ static struct cookie * -parse_set_cookie (const char *set_cookie, bool silent) +parse_set_cookie (const char *set_cookie, enum url_scheme scheme, + bool silent) { const char *ptr = set_cookie; struct cookie *cookie = cookie_new (); @@ -439,8 +440,30 @@ parse_set_cookie (const char *set_cookie, bool silent) } else if (TOKEN_IS (name, "secure")) { - /* ignore value completely */ - cookie->secure = 1; +#ifdef HAVE_SSL + if (scheme == SCHEME_HTTPS) + /* Ignore value completely since secure is a value-less + attribute*/ + cookie->secure = 1; + else + { + /* Deleting cookie since secure only flag is set but connection + is not secure. */ + if (!silent) + logprintf (LOG_NOTQUIET, + _("Trying to create secure only cookie, but connection is not secure.\nSet-Cookie : %s\n"), + quotearg_style (escape_quoting_style, set_cookie)); + delete_cookie (cookie); + return NULL; + } +#else + if (!silent) + logprintf (LOG_NOTQUIET, + _("Trying to create secure only cookie, wget configured with\" --without-ssl.\nSet-Cookie : %s\n"), + quotearg_style (escape_quoting_style, set_cookie)); + delete_cookie (cookie); + return NULL; +#endif } /* else: Ignore unrecognized attribute. */ } @@ -699,17 +722,84 @@ check_path_match (const char *cookie_path, const char *path) s = PS_newstr; \ } while (0) +/* Return a count of how many times CHR occurs in STRING. */ + +static int +count_char (const char *string, char chr) +{ + const char *p; + int count = 0; + for (p = string; *p; p++) + if (*p == chr) + ++count; + return count; +} + +/* Find the cookie chains whose domains match HOST and store them to + DEST. + + A cookie chain is the head of a list of cookies that belong to a + host/domain. Given HOST "img.search.xemacs.org", this function + will return the chains for "img.search.xemacs.org", + "search.xemacs.org", and "xemacs.org" -- those of them that exist + (if any), that is. + + DEST should be large enough to accept (in the worst case) as many + elements as there are domain components of HOST. */ + +static int +find_chains_of_host (struct cookie_jar *jar, const char *host, + struct cookie *dest[]) +{ + int dest_count = 0; + int passes, passcnt; + + /* Bail out quickly if there are no cookies in the jar. */ + if (!hash_table_count (jar->chains)) + return 0; + + if (numeric_address_p (host)) + /* If host is an IP address, only check for the exact match. */ + passes = 1; + else + /* Otherwise, check all the subdomains except the top-level (last) + one. As a domain with N components has N-1 dots, the number of + passes equals the number of dots. */ + passes = count_char (host, '.'); + + passcnt = 0; + + /* Find chains that match HOST, starting with exact match and + progressing to less specific domains. For instance, given HOST + fly.srk.fer.hr, first look for fly.srk.fer.hr's chain, then + srk.fer.hr's, then fer.hr's. */ + while (1) + { + struct cookie *chain = hash_table_get (jar->chains, host); + if (chain) + dest[dest_count++] = chain; + if (++passcnt >= passes) + break; + host = strchr (host, '.') + 1; + } + + return dest_count; +} /* Process the HTTP `Set-Cookie' header. This results in storing the cookie or discarding a matching one, or ignoring it completely, all depending on the contents. */ + void cookie_handle_set_cookie (struct cookie_jar *jar, const char *host, int port, - const char *path, const char *set_cookie) + const char *path, enum url_scheme scheme, + const char *set_cookie) { - struct cookie *cookie; + struct cookie *cookie, *old_cookie; + struct cookie **chains; + int chain_count, i; cookies_now = time (NULL); /* Wget's paths don't begin with '/' (blame rfc1808), but cookie @@ -717,7 +807,7 @@ cookie_handle_set_cookie (struct cookie_jar *jar, simply prepend slash to PATH. */ PREPEND_SLASH (path); - cookie = parse_set_cookie (set_cookie, false); + cookie = parse_set_cookie (set_cookie, scheme, false); if (!cookie) goto out; @@ -767,6 +857,31 @@ cookie_handle_set_cookie (struct cookie_jar *jar, } } +#ifdef HAVE_SSL + if ((cookie->secure == 0) && (scheme != SCHEME_HTTPS)) + { + /* If an old cookie exists such that the all of the following are + true, then discard the new cookie. + + - The "domain" domain-matches the domain of the new cookie + - The "name" matches the "name" of the new cookie + - Secure-only flag of old cookie is set */ + + chains = alloca_array (struct cookie *, 1 + count_char (host, '.')); + chain_count = find_chains_of_host (jar, host, chains); + + for (i = 0; i < chain_count; i++) + for (old_cookie = chains[i]; old_cookie; old_cookie = old_cookie->next) + { + if (!cookie_expired_p(old_cookie) + && !(old_cookie->domain_exact + && 0 != strcasecmp (host, old_cookie->domain)) + && 0 == strcasecmp(old_cookie->attr, cookie->attr) + && 1 == old_cookie->secure) + goto out; + } + } +#endif /* Now store the cookie, or discard an existing cookie, if discarding was requested. */ @@ -788,70 +903,6 @@ cookie_handle_set_cookie (struct cookie_jar *jar, previously stored cookies. Entry point is `build_cookies_request'. */ -/* Return a count of how many times CHR occurs in STRING. */ - -static int -count_char (const char *string, char chr) -{ - const char *p; - int count = 0; - for (p = string; *p; p++) - if (*p == chr) - ++count; - return count; -} - -/* Find the cookie chains whose domains match HOST and store them to - DEST. - - A cookie chain is the head of a list of cookies that belong to a - host/domain. Given HOST "img.search.xemacs.org", this function - will return the chains for "img.search.xemacs.org", - "search.xemacs.org", and "xemacs.org" -- those of them that exist - (if any), that is. - - DEST should be large enough to accept (in the worst case) as many - elements as there are domain components of HOST. */ - -static int -find_chains_of_host (struct cookie_jar *jar, const char *host, - struct cookie *dest[]) -{ - int dest_count = 0; - int passes, passcnt; - - /* Bail out quickly if there are no cookies in the jar. */ - if (!hash_table_count (jar->chains)) - return 0; - - if (numeric_address_p (host)) - /* If host is an IP address, only check for the exact match. */ - passes = 1; - else - /* Otherwise, check all the subdomains except the top-level (last) - one. As a domain with N components has N-1 dots, the number of - passes equals the number of dots. */ - passes = count_char (host, '.'); - - passcnt = 0; - - /* Find chains that match HOST, starting with exact match and - progressing to less specific domains. For instance, given HOST - fly.srk.fer.hr, first look for fly.srk.fer.hr's chain, then - srk.fer.hr's, then fer.hr's. */ - while (1) - { - struct cookie *chain = hash_table_get (jar->chains, host); - if (chain) - dest[dest_count++] = chain; - if (++passcnt >= passes) - break; - host = strchr (host, '.') + 1; - } - - return dest_count; -} - /* If FULL_PATH begins with PREFIX, return the length of PREFIX, zero otherwise. */ @@ -1410,6 +1461,9 @@ test_cookies (void) }; int i; + /* The uri scheme doesn't make a difference in these tests */ + enum url_scheme scheme = SCHEME_HTTP; + for (i = 0; i < countof (tests_succ); i++) { int ind; @@ -1417,7 +1471,7 @@ test_cookies (void) const char **expected = tests_succ[i].results; struct cookie *c; - c = parse_set_cookie (data, true); + c = parse_set_cookie (data, scheme, true); if (!c) { printf ("NULL cookie returned for valid data: %s\n", data); @@ -1457,7 +1511,7 @@ test_cookies (void) { struct cookie *c; char *data = tests_fail[i]; - c = parse_set_cookie (data, true); + c = parse_set_cookie (data, scheme, true); if (c) printf ("Failed to report error on invalid data: %s\n", data); } diff --git a/src/cookies.h b/src/cookies.h index b4281e1..d565b09 100644 --- a/src/cookies.h +++ b/src/cookies.h @@ -31,13 +31,16 @@ as that of the covered work. */ #ifndef COOKIES_H #define COOKIES_H +#include "url.h" + struct cookie_jar; struct cookie_jar *cookie_jar_new (void); void cookie_jar_delete (struct cookie_jar *); void cookie_handle_set_cookie (struct cookie_jar *, const char *, int, - const char *, const char *); + const char *, enum url_scheme, + const char *); char *cookie_header (struct cookie_jar *, const char *, int, const char *, bool); diff --git a/src/http.c b/src/http.c index 8916d2b..f7d42b2 100644 --- a/src/http.c +++ b/src/http.c @@ -3276,7 +3276,7 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy, { char *set_cookie; BOUNDED_TO_ALLOCA (scbeg, scend, set_cookie); cookie_handle_set_cookie (wget_cookie_jar, u->host, u->port, - u->path, set_cookie); + u->path, u->scheme, set_cookie); } } -- 1.9.1
