Hi,

You're absolutely right, I have merged the commits into one patch, and
removed the trailing whitespaces in the patch. Please find it attached, I
hope it's okay now.

I have fixed the error which was breaking the build when configured with
"--without-ssl". I can't figure out why the build fails on the other two
cases. I am unable to replicate them as well. The tests which are failing
in the build are Test-cookie and Test-cookie-401, both of which run
successfully when I execute make check. Am I missing something here?

As for testing the code, should I follow the way its being done in
Test-*.py files?

Thanks,
Kush

On Sun, Jan 24, 2016 at 6:14 PM, Darshit Shah <[email protected]> wrote:

> Hi Kushagra,
>
> Thanks for the patches! A couple of remarks on your patches before I
> dive into the code:
>
> 1. Please merge your changes into fewer, more logical patches. Making
> small patches when working locally is a good idea, but when you submit
> them for merging, they should be reorganized into logical changes. In
> the case of your patches, one adds the new function, and the next
> patch removes it. We don't need this in the git history.
> 2. Your patches have trailing whitespace in them. Please configure
> your editor to either alert you about this, or fix it silently. Most
> good text editors have these features. You can even configure your Git
> to complain if you try to commit anything with whitespace errors.
> Trailing whitespaces can be a major pain when merging larger patches
> or branches in the future.
> 3. A good patch would also contain a test case that fails before. but
> passes after the patch is applied. This way we can verify the
> correctness of the patch and ensure that no regressions occur in the
> future.
>
> Apart from these, the build failed on Travis with the --without-ssl
> configure option. That is an issue worth looking into.
> The other two builds on travis failed inside the multi-byte locale
> tests. It's probably some subtle bug, but we'll have to fix that as
> well.
>
> The code itself looks good. Will provide a deeper review once the
> larger, more obvious issues have been sorted.
>
>
> On 24 January 2016 at 12:38, Kushagra Singh
> <[email protected]> wrote:
> > I have added the first two out the three recommendations in the draft.
> The
> > third one is relevant when cookies have to be removed in case the total
> > number of cookies hit a predefined upper bound, I'm not sure whether we
> > do that in wget?
> >
> > As you mentioned, I had to change some method prototypes to get the uri
> > scheme. I made sure that I replaced all instances of those function calls
> > with the right call. The tests run fine, so hopefully I haven't broken
> > anything.
> >
> > I am attaching the patch files, please review them.
> >
> > Thanks,
> > Kush
> >
> > On Sun, Jan 24, 2016 at 4:39 AM, Darshit Shah <[email protected]> wrote:
> >
> >> On 23 January 2016 at 23:36, Kushagra Singh
> >> <[email protected]> wrote:
> >> > Thanks a lot for the help!
> >> >
> >> > I've made some progress, but have a couple of more questions
> >> >
> >> > - I can't manage to find the http-only-flag in the cookie struct, do
> we
> >> not
> >> > store this?
> >> Since Wget supports only HTTP, this is not required. The HttpOnly
> >> attribute prevents access to script code, but since Wget never
> >> executes them it is not necessary at all. Although, it may be a good
> >> idea to explicitly store the flag for Wget saves the cookies to a
> >> file. Maybe, we should add this.
> >>
> >> > - The draft asks to check whether the "scheme" component of the
> >> > "request-uri" denotes a secure protocol or not. Currently I am
> checking
> >> > using "#ifdef HAVE_SSL". I am not sure whether this is the right way
> to
> >> do
> >> > so, since having SSL with wget does not necessarily mean that the
> current
> >> > connection is secure.
> >>
> >> Ideally, a code base should have as few #ifdef statements as possible.
> >> They make reading the code very difficult for a human. That said, in
> >> this scenario it is the absolute wrong technique. You will want to
> >> access the scheme from the request URI. Find a way to access this
> >> information, you may need to change some method prototypes to make
> >> this happen.
> >>
> >> > - To check whether there exists a cookie whose domain, domain-matches
> the
> >> > domain of a new cookie, we should iterate through the chains returned
> by
> >> > find_chains_of_host right?
> >>
> >> That ought to work, I think.
> >>
> >> >
> >> > Regards,
> >> > Kush
> >>
> >>
> >>
> >> --
> >> Thanking You,
> >> Darshit Shah
> >>
>
>
>
> --
> Thanking You,
> Darshit Shah
>
From 127e502a012567221fd792e256655006849fada8 Mon Sep 17 00:00:00 2001
From: kush789 <[email protected]>
Date: Wed, 27 Jan 2016 03:34:31 +0530
Subject: [PATCH] Added recomendations of draft-west-leave-secure-cookies-alone

---
 src/cookies.c | 198 +++++++++++++++++++++++++++++++++++++---------------------
 src/cookies.h |   5 +-
 src/http.c    |   2 +-
 3 files changed, 131 insertions(+), 74 deletions(-)

diff --git a/src/cookies.c b/src/cookies.c
index 81ecfa5..156bd61 100644
--- a/src/cookies.c
+++ b/src/cookies.c
@@ -350,7 +350,8 @@ discard_matching_cookie (struct cookie_jar *jar, struct cookie *cookie)
    filled.  */
 
 static struct cookie *
-parse_set_cookie (const char *set_cookie, bool silent)
+parse_set_cookie (const char *set_cookie, enum url_scheme scheme,
+                  bool silent)
 {
   const char *ptr = set_cookie;
   struct cookie *cookie = cookie_new ();
@@ -439,8 +440,30 @@ parse_set_cookie (const char *set_cookie, bool silent)
         }
       else if (TOKEN_IS (name, "secure"))
         {
-          /* ignore value completely */
-          cookie->secure = 1;
+#ifdef HAVE_SSL
+          if (scheme == SCHEME_HTTPS)
+          /* Ignore value completely since secure is a value-less
+             attribute*/
+            cookie->secure = 1;
+          else
+            {
+              /* Deleting cookie since secure only flag is set but connection
+                 is not secure. */
+              if (!silent)
+                  logprintf (LOG_NOTQUIET,
+                            _("Trying to create secure only cookie, but connection is not secure.\nSet-Cookie : %s\n"),
+                              quotearg_style (escape_quoting_style, set_cookie));
+              delete_cookie (cookie);
+              return NULL;
+            }
+#else
+          if (!silent)
+              logprintf (LOG_NOTQUIET,
+                        _("Trying to create secure only cookie, wget configured with\" --without-ssl.\nSet-Cookie : %s\n"),
+                          quotearg_style (escape_quoting_style, set_cookie));
+          delete_cookie (cookie);
+          return NULL;
+#endif
         }
       /* else: Ignore unrecognized attribute. */
     }
@@ -699,17 +722,84 @@ check_path_match (const char *cookie_path, const char *path)
   s = PS_newstr;                                                \
 } while (0)
 
+/* Return a count of how many times CHR occurs in STRING. */
+
+static int
+count_char (const char *string, char chr)
+{
+  const char *p;
+  int count = 0;
+  for (p = string; *p; p++)
+    if (*p == chr)
+      ++count;
+  return count;
+}
+
+/* Find the cookie chains whose domains match HOST and store them to
+   DEST.
+
+   A cookie chain is the head of a list of cookies that belong to a
+   host/domain.  Given HOST "img.search.xemacs.org", this function
+   will return the chains for "img.search.xemacs.org",
+   "search.xemacs.org", and "xemacs.org" -- those of them that exist
+   (if any), that is.
+
+   DEST should be large enough to accept (in the worst case) as many
+   elements as there are domain components of HOST.  */
+
+static int
+find_chains_of_host (struct cookie_jar *jar, const char *host,
+                     struct cookie *dest[])
+{
+  int dest_count = 0;
+  int passes, passcnt;
+
+  /* Bail out quickly if there are no cookies in the jar.  */
+  if (!hash_table_count (jar->chains))
+    return 0;
+
+  if (numeric_address_p (host))
+    /* If host is an IP address, only check for the exact match. */
+    passes = 1;
+  else
+    /* Otherwise, check all the subdomains except the top-level (last)
+       one.  As a domain with N components has N-1 dots, the number of
+       passes equals the number of dots.  */
+    passes = count_char (host, '.');
+
+  passcnt = 0;
+
+  /* Find chains that match HOST, starting with exact match and
+     progressing to less specific domains.  For instance, given HOST
+     fly.srk.fer.hr, first look for fly.srk.fer.hr's chain, then
+     srk.fer.hr's, then fer.hr's.  */
+  while (1)
+    {
+      struct cookie *chain = hash_table_get (jar->chains, host);
+      if (chain)
+        dest[dest_count++] = chain;
+      if (++passcnt >= passes)
+        break;
+      host = strchr (host, '.') + 1;
+    }
+
+  return dest_count;
+}
 
 /* Process the HTTP `Set-Cookie' header.  This results in storing the
    cookie or discarding a matching one, or ignoring it completely, all
    depending on the contents.  */
 
+
 void
 cookie_handle_set_cookie (struct cookie_jar *jar,
                           const char *host, int port,
-                          const char *path, const char *set_cookie)
+                          const char *path, enum url_scheme scheme,
+                          const char *set_cookie)
 {
-  struct cookie *cookie;
+  struct cookie *cookie, *old_cookie;
+  struct cookie **chains;
+  int chain_count, i;
   cookies_now = time (NULL);
 
   /* Wget's paths don't begin with '/' (blame rfc1808), but cookie
@@ -717,7 +807,7 @@ cookie_handle_set_cookie (struct cookie_jar *jar,
      simply prepend slash to PATH.  */
   PREPEND_SLASH (path);
 
-  cookie = parse_set_cookie (set_cookie, false);
+  cookie = parse_set_cookie (set_cookie, scheme, false);
   if (!cookie)
     goto out;
 
@@ -767,6 +857,31 @@ cookie_handle_set_cookie (struct cookie_jar *jar,
         }
     }
 
+#ifdef HAVE_SSL
+  if ((cookie->secure == 0) && (scheme != SCHEME_HTTPS))
+    {
+      /* If an old cookie exists such that the all of the following are
+         true, then discard the new cookie.
+
+         - The "domain" domain-matches the domain of the new cookie
+         - The "name" matches the "name" of the new cookie
+         - Secure-only flag of old cookie is set */
+
+      chains = alloca_array (struct cookie *, 1 + count_char (host, '.'));
+      chain_count = find_chains_of_host (jar, host, chains);
+
+      for (i = 0; i < chain_count; i++)
+        for (old_cookie = chains[i]; old_cookie; old_cookie = old_cookie->next)
+          {
+            if (!cookie_expired_p(old_cookie)
+                && !(old_cookie->domain_exact
+                     && 0 != strcasecmp (host, old_cookie->domain))
+                && 0 == strcasecmp(old_cookie->attr, cookie->attr)
+                && 1 == old_cookie->secure)
+              goto out;
+          }
+    }
+#endif
   /* Now store the cookie, or discard an existing cookie, if
      discarding was requested.  */
 
@@ -788,70 +903,6 @@ cookie_handle_set_cookie (struct cookie_jar *jar,
    previously stored cookies.  Entry point is
    `build_cookies_request'.  */
 
-/* Return a count of how many times CHR occurs in STRING. */
-
-static int
-count_char (const char *string, char chr)
-{
-  const char *p;
-  int count = 0;
-  for (p = string; *p; p++)
-    if (*p == chr)
-      ++count;
-  return count;
-}
-
-/* Find the cookie chains whose domains match HOST and store them to
-   DEST.
-
-   A cookie chain is the head of a list of cookies that belong to a
-   host/domain.  Given HOST "img.search.xemacs.org", this function
-   will return the chains for "img.search.xemacs.org",
-   "search.xemacs.org", and "xemacs.org" -- those of them that exist
-   (if any), that is.
-
-   DEST should be large enough to accept (in the worst case) as many
-   elements as there are domain components of HOST.  */
-
-static int
-find_chains_of_host (struct cookie_jar *jar, const char *host,
-                     struct cookie *dest[])
-{
-  int dest_count = 0;
-  int passes, passcnt;
-
-  /* Bail out quickly if there are no cookies in the jar.  */
-  if (!hash_table_count (jar->chains))
-    return 0;
-
-  if (numeric_address_p (host))
-    /* If host is an IP address, only check for the exact match. */
-    passes = 1;
-  else
-    /* Otherwise, check all the subdomains except the top-level (last)
-       one.  As a domain with N components has N-1 dots, the number of
-       passes equals the number of dots.  */
-    passes = count_char (host, '.');
-
-  passcnt = 0;
-
-  /* Find chains that match HOST, starting with exact match and
-     progressing to less specific domains.  For instance, given HOST
-     fly.srk.fer.hr, first look for fly.srk.fer.hr's chain, then
-     srk.fer.hr's, then fer.hr's.  */
-  while (1)
-    {
-      struct cookie *chain = hash_table_get (jar->chains, host);
-      if (chain)
-        dest[dest_count++] = chain;
-      if (++passcnt >= passes)
-        break;
-      host = strchr (host, '.') + 1;
-    }
-
-  return dest_count;
-}
-
 /* If FULL_PATH begins with PREFIX, return the length of PREFIX, zero
    otherwise.  */
 
@@ -1410,6 +1461,9 @@ test_cookies (void)
   };
   int i;
 
+  /* The uri scheme doesn't make a difference in these tests */
+  enum url_scheme scheme = SCHEME_HTTP;
+
   for (i = 0; i < countof (tests_succ); i++)
     {
       int ind;
@@ -1417,7 +1471,7 @@ test_cookies (void)
       const char **expected = tests_succ[i].results;
       struct cookie *c;
 
-      c = parse_set_cookie (data, true);
+      c = parse_set_cookie (data, scheme, true);
       if (!c)
         {
           printf ("NULL cookie returned for valid data: %s\n", data);
@@ -1457,7 +1511,7 @@ test_cookies (void)
     {
       struct cookie *c;
       char *data = tests_fail[i];
-      c = parse_set_cookie (data, true);
+      c = parse_set_cookie (data, scheme, true);
       if (c)
         printf ("Failed to report error on invalid data: %s\n", data);
     }
diff --git a/src/cookies.h b/src/cookies.h
index b4281e1..d565b09 100644
--- a/src/cookies.h
+++ b/src/cookies.h
@@ -31,13 +31,16 @@ as that of the covered work.  */
 #ifndef COOKIES_H
 #define COOKIES_H
 
+#include "url.h"
+
 struct cookie_jar;
 
 struct cookie_jar *cookie_jar_new (void);
 void cookie_jar_delete (struct cookie_jar *);
 
 void cookie_handle_set_cookie (struct cookie_jar *, const char *, int,
-                               const char *, const char *);
+                               const char *, enum url_scheme,
+                               const char *);
 char *cookie_header (struct cookie_jar *, const char *, int,
                      const char *, bool);
 
diff --git a/src/http.c b/src/http.c
index 8916d2b..f7d42b2 100644
--- a/src/http.c
+++ b/src/http.c
@@ -3276,7 +3276,7 @@ gethttp (struct url *u, struct http_stat *hs, int *dt, struct url *proxy,
         {
           char *set_cookie; BOUNDED_TO_ALLOCA (scbeg, scend, set_cookie);
           cookie_handle_set_cookie (wget_cookie_jar, u->host, u->port,
-                                    u->path, set_cookie);
+                                    u->path, u->scheme, set_cookie);
         }
     }
 
-- 
1.9.1

Reply via email to