Whenever an HTTP server sends cookies, we have to check them for validity
before we accept them. The Mozilla Public Suffix List (PSL[0]) provides a set
of rules that allow us to detect some forms of domain misuse (which would
otherwise leak cookies to unrelated sites, e.g. login information).
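
To illustrate the kind of misuse meant here, a minimal sketch follows. It is
not how libpsl or Wget implement the check: the three-entry suffix table and
all function names are made up for the example, wildcard and exception rules
are ignored, and inputs are assumed to be lowercase ASCII without a leading
dot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Tiny illustrative excerpt (an assumption for this sketch). The real PSL
   has thousands of rules, including wildcard and exception rules. */
static const char *const demo_suffixes[] = { "com", "uk", "co.uk" };

/* True if DOMAIN is listed verbatim in the demo table. */
static bool
is_public_suffix (const char *domain)
{
  size_t n = sizeof demo_suffixes / sizeof demo_suffixes[0];

  for (size_t i = 0; i < n; i++)
    if (strcmp (domain, demo_suffixes[i]) == 0)
      return true;
  return false;
}

/* True if HOST equals DOMAIN or is a subdomain of it. */
static bool
host_within_domain (const char *host, const char *domain)
{
  size_t hlen = strlen (host), dlen = strlen (domain);

  if (hlen == dlen)
    return strcmp (host, domain) == 0;

  /* A longer host must end in ".DOMAIN", not merely in "DOMAIN". */
  return hlen > dlen
         && host[hlen - dlen - 1] == '.'
         && strcmp (host + hlen - dlen, domain) == 0;
}

/* Reject cookies scoped to a public suffix such as "co.uk"; such a cookie
   would be sent to every site registered below that suffix. */
static bool
cookie_domain_acceptable (const char *cookie_domain, const char *host)
{
  if (is_public_suffix (cookie_domain))
    return false;
  return host_within_domain (host, cookie_domain);
}
```

With this table, a cookie for "co.uk" sent by shop.example.co.uk is rejected,
while one for "example.co.uk" is accepted.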

Here is a patch that allows Wget to automatically load the latest PSL, if
provided by a distribution/package.
Using the PSL in DAFSA[1] format is recommended - Debian provides it in its
latest 'publicsuffix' package. A plain text PSL still works, but needs a bunch
of parsing and processing, while the DAFSA format doesn't (just a read - and
it is ready to use).
Libpsl[2] 0.14.+ provides a tool to compile a plain text PSL into DAFSA format.

I chose a configure option to allow package maintainers to set a default PSL
file at build time. If it can't be read, the code falls back to the built-in
data of libpsl.
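
The load-once-with-fallback idea can be sketched in isolation as below. This
is a simplified stand-in, not the patch itself: the suffix_list type, both
loader functions, and the path "/example/psl.dafsa" are hypothetical
placeholders for the corresponding libpsl calls and the configured PSL_FILE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a loaded suffix list; not a libpsl type. */
typedef struct
{
  const char *source;           /* where the data came from */
} suffix_list;

static int load_attempts;       /* how often the file loader ran */

/* Hypothetical file loader: pretends only one path is readable. */
static const suffix_list *
load_from_file (const char *path)
{
  static const suffix_list from_file = { "file" };

  load_attempts++;
  if (strcmp (path, "/example/psl.dafsa") == 0)
    return &from_file;
  return NULL;
}

/* Hypothetical built-in data, always available in this sketch. */
static const suffix_list *
load_builtin (void)
{
  static const suffix_list builtin = { "builtin" };

  return &builtin;
}

/* Try the configured file once; if it can't be read, fall back to the
   built-in data. Later calls reuse the cached result, so the file is not
   reopened for every cookie. A NULL result would tell the caller to use
   simple heuristics instead. */
static const suffix_list *
get_suffix_list (const char *configured_path)
{
  static bool initialized;
  static const suffix_list *list;

  assert (configured_path != NULL);

  if (!initialized)
    {
      initialized = true;
      list = load_from_file (configured_path);
      if (!list)
        list = load_builtin ();
    }
  return list;
}
```

The static flag is set before any loading is attempted, so a failed load is
remembered as well and is not retried on every call.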

Please review and comment.

Regards, Tim

[0] https://publicsuffix.org/
[1] https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton
[2] https://github.com/rockdaboot/libpsl
From 11e69c7c8c56efd075d492d0b0e977b16a83c64e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tim Rühsen?= <tim.rueh...@gmx.de>
Date: Thu, 11 Aug 2016 15:16:24 +0200
Subject: [PATCH] Improve PSL cookie checking

* configure.ac: Add --with-psl-file to set a PSL file
* src/cookies.c (check_domain_match): Load PSL_FILE with
  fallback to built-in data.

This change allows package maintainers to make Wget use the latest
PSL (DAFSA or plain text), without updating libpsl itself.

E.g. Debian now comes with a DAFSA binary within the 'publicsuffix'
package which allows very fast loading (no parsing or processing needed).
---
 configure.ac  |  7 ++++++-
 src/cookies.c | 43 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index cce20c6..20cdbd2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -329,6 +329,11 @@ AS_IF([test "x$with_libpsl" != xno], [
   ])
 ])

+# Check for custom PSL file
+AC_ARG_WITH(psl-file,
+  AC_HELP_STRING([--with-psl-file=[PATH]], [path to PSL file (plain text or DAFSA)]),
+  PSL_FILE=$withval AC_DEFINE_UNQUOTED([PSL_FILE], ["$PSL_FILE"], [path to PSL file (plain text or DAFSA)]))
+
 AS_IF([test x"$with_zlib" != xno], [
   with_zlib=yes
   PKG_CHECK_MODULES([ZLIB], zlib, [
@@ -823,7 +828,7 @@ AC_MSG_NOTICE([Summary of build options:
   Libs:              $LIBS
   SSL:               $with_ssl
   Zlib:              $with_zlib
-  PSL:               $with_libpsl
+  PSL:               $with_libpsl $PSL_FILE
   Digest:            $ENABLE_DIGEST
   NTLM:              $ENABLE_NTLM
   OPIE:              $ENABLE_OPIE
diff --git a/src/cookies.c b/src/cookies.c
index 767b284..d45f4ef 100644
--- a/src/cookies.c
+++ b/src/cookies.c
@@ -526,19 +526,52 @@ check_domain_match (const char *cookie_domain, const char *host)
 {

 #ifdef HAVE_LIBPSL
+  static int init_psl;
+  static const psl_ctx_t *psl;
+
   char *cookie_domain_lower = NULL;
   char *host_lower = NULL;
-  const psl_ctx_t *psl;
   int is_acceptable;

   DEBUGP (("cdm: 1"));
-  if (!(psl = psl_builtin()))
+  if (!init_psl)
     {
-      DEBUGP (("\nlibpsl not built with a public suffix list. "
-               "Falling back to simple heuristics.\n"));
-      goto no_psl;
+      init_psl = 1;
+
+#ifdef PSL_FILE
+      /* If PSL_FILE is a DAFSA file, loading is very fast */
+      if ((psl = psl_load_file (PSL_FILE)))
+        goto have_psl;
+
+      DEBUGP (("\nPSL: %s not found or not readable. "
+               "Falling back to built-in data.\n", quote (PSL_FILE)));
+#endif
+
+      if ((psl = psl_builtin ()) && !psl_builtin_outdated ())
+        goto have_psl;
+
+      DEBUGP (("\nPSL: built-in data outdated. "
+               "Trying to load data from %s.\n",
+              quote (psl_builtin_filename ())));
+
+      if ((psl = psl_load_file(psl_builtin_filename ())))
+        goto have_psl;
+
+      DEBUGP (("\nPSL: %s not found or not readable. "
+               "Falling back to built-in data.\n",
+              quote (psl_builtin_filename ())));
+
+      if (!(psl = psl_builtin ()))
+        {
+          DEBUGP (("\nPSL: libpsl not built with a public suffix list. "
+                   "Falling back to simple heuristics.\n"));
+          goto no_psl;
+        }
     }
+  else if (!psl)
+    goto no_psl;

+have_psl:
   if (psl_str_to_utf8lower (cookie_domain, NULL, NULL, &cookie_domain_lower) == PSL_SUCCESS &&
       psl_str_to_utf8lower (host, NULL, NULL, &host_lower) == PSL_SUCCESS)
     {
--
2.8.1
