Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary

2016-09-13 Thread Matthew White
On Tue, 13 Sep 2016 16:21:37 +0200
Tim Ruehsen  wrote:

> On Tuesday, September 13, 2016 4:08:30 PM CEST Matthew White wrote:
> > On Tue, 13 Sep 2016 09:29:27 +0200
> > 
> > Tim Ruehsen  wrote:
> > > On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote:
> > > > On Mon, 12 Sep 2016 21:20:54 +0200
> > > > 
> > > > Tim Rühsen  wrote:
> > > > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote:
> > > > > > > From: Tim Ruehsen 
> > > > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200
> > > > > > > 
> > > > > > > > +  char *basename = name;
> > > > > > > > +
> > > > > > > > +  while ((name = strstr (basename, "/")))
> > > > > > > > +basename = name + 1;
> > > > > > > 
> > > > > > > Could you use strrchr() ? something like
> > > > > > > 
> > > > > > > char *basename = strrchr (name, '/');
> > > > > > > 
> > > > > > > if (basename)
> > > > > > > 
> > > > > > >   basename += 1;
> > > > > > > 
> > > > > > > else
> > > > > > > 
> > > > > > >   basename = name;
> > > > > > 
> > > > > > I think we want to use ISSEP, no?  Otherwise Windows file names with
> > > > > > backslashes will misfire.
> > > > > 
> > > > > Good point. What about device names ?
> > > > > 
> > > > > So maybe base_name() from Gnulib module 'dirname' is the right choice
> > > > > !?
> > > > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html
> > > > 
> > > > What if Gnulib's base_name() returns "./"?
> > > > 
> > > > libmetalink's metalink_check_safe_path() rejects relative paths:
> > > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> > > > 
> > > > Also, basename is used to point to an existing memory location,
> > > > base_name()
> > > > instead allocates new space. This is not a biggy, but we should keep it
> > > > in
> > > > mind to amend properly.
> > > > 
> > > > lib/basename.c (base_name)
> > > > --
> > > > 
> > > >   /* On systems with drive letters, "a/b:c" must return "./b:c" rather
> > > >   
> > > >  than "b:c" to avoid confusion with a drive letter.  On systems
> > > >  with pure POSIX semantics, this is not an issue.  */
> > > > 
> > > > --
> > > > 
> > > > Suggestions?
> > > 
> > > ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation.
> > > base_name() is "complete" (the macros check more than just WINDOWS) and we
> > > automatically get improvements from upstream - but it calls malloc().
> > > 
> > > There is also last_component() which returns a pointer to the basename
> > > within your filename. This is basically what you do.
> > > 
> > > Anyways, this last component (basename) may still hold a device prefix -
> > > you have to check that with either HAS_DEVICE() (only defined in certain
> > > environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN()
> > > which gives 2 if your basename has a leading device prefix. And you
> > > should do this check in a loop to catch names like 'C:D:xxx'. If you
> > > don't do, we likely get a CVE assigned ;-)
> > 
> > Thanks Tim! Yours is a super smart solution!
> > 
> > I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive
> > letters on the basename (the declaration of last_component() is in
> > src/metalink.h):
> > 
> > #include "dosname.h"
> > 
> > char *
> > get_metalink_basename (char *name)
> > {
> >   char *basename;
> > 
> >   if (!name)
> > return NULL;
> > 
> >   basename = last_component (name);
> > 
> >   while (FILE_SYSTEM_PREFIX_LEN (basename))
> > basename += 2;
> > 
> >   return metalink_check_safe_path (basename) ? basename : NULL;
> > }
> > 
> > [make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is
> > ok]
> > 
> > WDYT?
> 
> Looks great !
> 
> To be absolutely future ready, don't assume FILE_SYSTEM_PREFIX_LEN to return 
> 0 
> or 2 (even if the current code is like that). 
> 
> while ((n = FILE_SYSTEM_PREFIX_LEN (basename)))
>   basename += n;
> 
> and you are on the safe side.

Thanks, you're beyond me! Fix implemented.

> 
> Regards, Tim

Regards,
Matthew

-- 
Matthew White 


pgp_OzItGI347.pgp
Description: PGP signature


Re: [Bug-wget] wget bug re cookies

2016-09-13 Thread Tim Ruehsen
On Tuesday, September 13, 2016 11:46:39 PM CEST David Oppenheim wrote:
> wget has long had a problem with cookies - it does not handle
> them the same as browsers. As a consequence it is often not usable
> in sites with session logins other than very simple logins,
> for example SiteMinder controlled web sites.
> 
> Specifically, wget treats a received cookie that does not include
> an explicit domain as having a domain of the host (good) but
> requiring an exact match before that cookie is sent in later replies
> (bad - not like browsers).
> 
> So a cookie from foo.com that doesn't set the domain will only
> be sent by wget back to foo.com, not to xx.foo.com
> 
> This follows RFC 6265 (see section 4.1.2.3).  But is not the way
> today's browsers work.
> 
> The issue is simply fixed in cookies.c by removing one line :
> 
>   /* Sanitize parts of cookie. */
> 
>   if (!cookie->domain)
> {
>   cookie->domain = xstrdup (host);
>     cookie->domain_exact = 1;   // This code obeys RFC 6265 but is
> not the way real browsers behave
> 
> 
> I'm happy to discuss further if you're interested.
> 
> I think this change would make wget immensely more useful (and
> it's very useful already, thanks !)


IMO, it would make wget immensely more insecure :')

We don't have to argue, others made a pretty good job in discussion all the 
cookie RFCs and testing different types of browsers.
From what I can read here 
(http://erik.io/blog/2014/03/04/definitive-guide-to-cookie-domains/), only 
Internet Explorer isn't RFC conform regarding cookies 
without domain (but year 2014 is *very* long ago regarding security issues).

So if you really want 'Internet Explorer Insecurity' (and I can guess that 
there some use cases), we should protect the RFC behavior by a new switch 
(e.g. like --internet-explorer-insecurity[=on] with default to off).
You might have a better name... ;-)

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary

2016-09-13 Thread Tim Ruehsen
On Tuesday, September 13, 2016 4:08:30 PM CEST Matthew White wrote:
> On Tue, 13 Sep 2016 09:29:27 +0200
> 
> Tim Ruehsen  wrote:
> > On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote:
> > > On Mon, 12 Sep 2016 21:20:54 +0200
> > > 
> > > Tim Rühsen  wrote:
> > > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote:
> > > > > > From: Tim Ruehsen 
> > > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200
> > > > > > 
> > > > > > > +  char *basename = name;
> > > > > > > +
> > > > > > > +  while ((name = strstr (basename, "/")))
> > > > > > > +basename = name + 1;
> > > > > > 
> > > > > > Could you use strrchr() ? something like
> > > > > > 
> > > > > > char *basename = strrchr (name, '/');
> > > > > > 
> > > > > > if (basename)
> > > > > > 
> > > > > >   basename += 1;
> > > > > > 
> > > > > > else
> > > > > > 
> > > > > >   basename = name;
> > > > > 
> > > > > I think we want to use ISSEP, no?  Otherwise Windows file names with
> > > > > backslashes will misfire.
> > > > 
> > > > Good point. What about device names ?
> > > > 
> > > > So maybe base_name() from Gnulib module 'dirname' is the right choice
> > > > !?
> > > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html
> > > 
> > > What if Gnulib's base_name() returns "./"?
> > > 
> > > libmetalink's metalink_check_safe_path() rejects relative paths:
> > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> > > 
> > > Also, basename is used to point to an existing memory location,
> > > base_name()
> > > instead allocates new space. This is not a biggy, but we should keep it
> > > in
> > > mind to amend properly.
> > > 
> > > lib/basename.c (base_name)
> > > --
> > > 
> > >   /* On systems with drive letters, "a/b:c" must return "./b:c" rather
> > >   
> > >  than "b:c" to avoid confusion with a drive letter.  On systems
> > >  with pure POSIX semantics, this is not an issue.  */
> > > 
> > > --
> > > 
> > > Suggestions?
> > 
> > ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation.
> > base_name() is "complete" (the macros check more than just WINDOWS) and we
> > automatically get improvements from upstream - but it calls malloc().
> > 
> > There is also last_component() which returns a pointer to the basename
> > within your filename. This is basically what you do.
> > 
> > Anyways, this last component (basename) may still hold a device prefix -
> > you have to check that with either HAS_DEVICE() (only defined in certain
> > environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN()
> > which gives 2 if your basename has a leading device prefix. And you
> > should do this check in a loop to catch names like 'C:D:xxx'. If you
> > don't do, we likely get a CVE assigned ;-)
> 
> Thanks Tim! Yours is a super smart solution!
> 
> I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive
> letters on the basename (the declaration of last_component() is in
> src/metalink.h):
> 
> #include "dosname.h"
> 
> char *
> get_metalink_basename (char *name)
> {
>   char *basename;
> 
>   if (!name)
> return NULL;
> 
>   basename = last_component (name);
> 
>   while (FILE_SYSTEM_PREFIX_LEN (basename))
> basename += 2;
> 
>   return metalink_check_safe_path (basename) ? basename : NULL;
> }
> 
> [make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is
> ok]
> 
> WDYT?

Looks great !

To be absolutely future ready, don't assume FILE_SYSTEM_PREFIX_LEN to return 0 
or 2 (even if the current code is like that). 

while ((n = FILE_SYSTEM_PREFIX_LEN (basename)))
  basename += n;

and you are on the safe side.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary

2016-09-13 Thread Matthew White
On Tue, 13 Sep 2016 09:29:27 +0200
Tim Ruehsen  wrote:

> On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote:
> > On Mon, 12 Sep 2016 21:20:54 +0200
> > 
> > Tim Rühsen  wrote:
> > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote:
> > > > > From: Tim Ruehsen 
> > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200
> > > > > 
> > > > > > +  char *basename = name;
> > > > > > +
> > > > > > +  while ((name = strstr (basename, "/")))
> > > > > > +basename = name + 1;
> > > > > 
> > > > > Could you use strrchr() ? something like
> > > > > 
> > > > > char *basename = strrchr (name, '/');
> > > > > 
> > > > > if (basename)
> > > > > 
> > > > >   basename += 1;
> > > > > 
> > > > > else
> > > > > 
> > > > >   basename = name;
> > > > 
> > > > I think we want to use ISSEP, no?  Otherwise Windows file names with
> > > > backslashes will misfire.
> > > 
> > > Good point. What about device names ?
> > > 
> > > So maybe base_name() from Gnulib module 'dirname' is the right choice !?
> > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html
> > 
> > What if Gnulib's base_name() returns "./"?
> > 
> > libmetalink's metalink_check_safe_path() rejects relative paths:
> > https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> > 
> > Also, basename is used to point to an existing memory location, base_name()
> > instead allocates new space. This is not a biggy, but we should keep it in
> > mind to amend properly.
> > 
> > lib/basename.c (base_name)
> > --
> >   /* On systems with drive letters, "a/b:c" must return "./b:c" rather
> >  than "b:c" to avoid confusion with a drive letter.  On systems
> >  with pure POSIX semantics, this is not an issue.  */
> > --
> > 
> > Suggestions?
> 
> ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation.
> base_name() is "complete" (the macros check more than just WINDOWS) and we 
> automatically get improvements from upstream - but it calls malloc().
> 
> There is also last_component() which returns a pointer to the basename within 
> your filename. This is basically what you do.
> 
> Anyways, this last component (basename) may still hold a device prefix - you 
> have to check that with either HAS_DEVICE() (only defined in certain 
> environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() 
> which 
> gives 2 if your basename has a leading device prefix. And you should do this 
> check in a loop to catch names like 'C:D:xxx'. If you don't do, we likely get 
> a CVE assigned ;-)

Thanks Tim! Yours is a super smart solution!

I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive letters 
on the basename (the declaration of last_component() is in src/metalink.h):

#include "dosname.h"

char *
get_metalink_basename (char *name)
{
  char *basename;

  if (!name)
return NULL;

  basename = last_component (name);

  while (FILE_SYSTEM_PREFIX_LEN (basename))
basename += 2;

  return metalink_check_safe_path (basename) ? basename : NULL;
}

[make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is ok]

WDYT?

> 
> Regards, Tim

Regards,
Matthew

-- 
Matthew White 


pgpB0PYP4SWIt.pgp
Description: PGP signature


[Bug-wget] wget bug re cookies

2016-09-13 Thread David Oppenheim
wget has long had a problem with cookies - it does not handle
them the same as browsers. As a consequence it is often not usable
in sites with session logins other than very simple logins,
for example SiteMinder controlled web sites.

Specifically, wget treats a received cookie that does not include 
an explicit domain as having a domain of the host (good) but 
requiring an exact match before that cookie is sent in later replies 
(bad - not like browsers).

So a cookie from foo.com that doesn't set the domain will only
be sent by wget back to foo.com, not to xx.foo.com

This follows RFC 6265 (see section 4.1.2.3).  But is not the way
today's browsers work.

The issue is simply fixed in cookies.c by removing one line :

  /* Sanitize parts of cookie. */

  if (!cookie->domain)
{
  cookie->domain = xstrdup (host);
    cookie->domain_exact = 1;   // This code obeys RFC 6265 but is not 
the way real browsers behave


I'm happy to discuss further if you're interested.

I think this change would make wget immensely more useful (and
it's very useful already, thanks !)




[Bug-wget] WordPress downloaded site

2016-09-13 Thread Michael

Hello wget people!

I have downloaded my WordPress site using wget and it works like a charm.
(thank you)!

As WordPress uses directory names as meaningful file names, wget does create
a directory with  and inserts there index.html with correct
relative directory path. (../).

I want that this behavior will be kept for the top directory only and when
wget recourse, it will create .html files out of it. (With
proper links in the site of course).

Is this feature exists? If not, how complicated it is to program it?

Best regards,

Michael


Re: [Bug-wget] [PATCH 04/25] Bugfix: Keep the download progress when alternating metalink:url

2016-09-13 Thread Matthew White
On Sun, 11 Sep 2016 22:04:41 +0200
Giuseppe Scrivano  wrote:

> Matthew White  writes:
> 
> > From f9fc03c0788675275041d0876d3e7fffd3f50eee Mon Sep 17 00:00:00 2001
> > From: Matthew White 
> > Date: Thu, 4 Aug 2016 11:35:42 +0200
> > Subject: [PATCH 04/25] Bugfix: Keep the download progress when alternating
> >  metalink:url
> >
> > * src/metalink.c (retrieve_from_metalink): On download error, resume
> >   output_stream with the next mres->url. Keep fully downloaded files
> >   started with --continue, otherwise rename/remove the file
> >
> > Before this patch, with --continue, existing and/or fully retrieved
> > files which fail the sanity tests were renamed (--keep-badhash), or
> > removed.
> >
> > This patch ensures that --continue doesn't rename/remove existing
> > and/or fully retrieved files which fail the sanity tests.
> > ---
> 
> ACK

Patch 12/25 is now squased in:
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00086.html

The series of patches is now all amended/rebased. There are still changes to do.

Posting after final decisions are taken about open topics in this series of 
patches.

> 
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgp1UbwBOQybg.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 12/25] New test: --continue shall keep fully retrieved Metalink files (HTTP 416)

2016-09-13 Thread Matthew White
On Sun, 11 Sep 2016 22:51:27 +0200
Giuseppe Scrivano  wrote:

> Hi Matthew,
> 
> Matthew White  writes:
> 
> > [Coverity Scan is ok, make syntax-check is ok, make check-valgrind is ok, 
> > contrib/check-hard is ok]
> >
> > This introduces a new Metalink test to verify that when --continue is used, 
> > after a HTTP 416 answer a file that fails a sanity test isn't 
> > renamed/removed.
> >
> > The following description is verbatim from the patch:
> > -
> > Ensure that --continue doesn't rename/remove existing and/or fully
> > retrieved files which fail the sanity tests.
> > -
> 
> can this test be included together with the patch that introduces this change?

After amending, squashed with Patch 04/25, and inserted its update in the Patch 
11/25.
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00020.html
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00027.html

The series of patches is now all amended/rebased. There are still changes to do.

> 
> > +MetaXml = re.sub (r'{{FILE1_HASH}}', File1_sha256, MetaXml)
> > +MetaXml = re.sub (r'{{FILE2_HASH}}', File2_sha256, MetaXml)
> > +MetaXml = re.sub (r'{{FILE3_HASH}}', File3_sha256, MetaXml)
> > +MetaXml = re.sub (r'{{FILE4_HASH}}', File4_sha256, MetaXml)
> > +MetaXml = re.sub (r'{{FILE5_HASH}}', File5_sha256, MetaXml)
> > +MetaXml = re.sub (r'{{SRV_HOST}}', srv_host, MetaXml)
> > +MetaXml = re.sub (r'{{SRV_PORT}}', str (srv_port), MetaXml)
> > +MetaFile.content = MetaXml
> 
> would just MetaXml.replace be enough here as well?

Thanks, fixed here and in other patches.

Posting after final decisions are taken about open topics in this series of 
patches.

> 
> Thanks,
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgpm9_4Wyy5kg.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 01/25] Add two Metalink/XML tests

2016-09-13 Thread Matthew White
On Tue, 13 Sep 2016 10:10:49 +0200
Giuseppe Scrivano  wrote:

> Hi Matthew,
> 
> Matthew White  writes:
> 
> > Next patches add more conditions to testenv/Test-metalink-xml-relpath.py.
> >
> > Also variants are introduced, like:
> > * testenv/Test-metalink-xml-trust.py
> > * testenv/Test-metalink-xml-homepath.py
> > * testenv/Test-metalink-xml-homepath-trust.py
> > * testenv/Test-metalink-xml-abspath-trust.py
> > * testenv/Test-metalink-xml-relpath-trust.py
> >
> > And there will be other tests with other meanings too.
> >
> > So, I don't know if it's a good idea to mix specific tests together.
> >
> > This topic also applies to Patch 05/25, Patch 10/25, and Patch 17/25:
> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00046.html
> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00050.html
> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00057.html
> >
> > WDYT?
> 
> Could we have a base file, that is shared by these files?  I agree it is
> good to have a separate file for each different test, but if the
> difference is only for one line, I think we can refactor them a bit.  It
> will make maintainance easier later.

I'll work on it.

> 
> Regards,
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgpHumFkSxT3S.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 01/25] Add two Metalink/XML tests

2016-09-13 Thread Giuseppe Scrivano
Hi Matthew,

Matthew White  writes:

> Next patches add more conditions to testenv/Test-metalink-xml-relpath.py.
>
> Also variants are introduced, like:
> * testenv/Test-metalink-xml-trust.py
> * testenv/Test-metalink-xml-homepath.py
> * testenv/Test-metalink-xml-homepath-trust.py
> * testenv/Test-metalink-xml-abspath-trust.py
> * testenv/Test-metalink-xml-relpath-trust.py
>
> And there will be other tests with other meanings too.
>
> So, I don't know if it's a good idea to mix specific tests together.
>
> This topic also applies to Patch 05/25, Patch 10/25, and Patch 17/25:
> http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00046.html
> http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00050.html
> http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00057.html
>
> WDYT?

Could we have a base file, that is shared by these files?  I agree it is
good to have a separate file for each different test, but if the
difference is only for one line, I think we can refactor them a bit.  It
will make maintainance easier later.

Regards,
Giuseppe



Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary

2016-09-13 Thread Tim Ruehsen
On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote:
> On Mon, 12 Sep 2016 21:20:54 +0200
> 
> Tim Rühsen  wrote:
> > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote:
> > > > From: Tim Ruehsen 
> > > > Date: Mon, 12 Sep 2016 13:00:32 +0200
> > > > 
> > > > > +  char *basename = name;
> > > > > +
> > > > > +  while ((name = strstr (basename, "/")))
> > > > > +basename = name + 1;
> > > > 
> > > > Could you use strrchr() ? something like
> > > > 
> > > > char *basename = strrchr (name, '/');
> > > > 
> > > > if (basename)
> > > > 
> > > >   basename += 1;
> > > > 
> > > > else
> > > > 
> > > >   basename = name;
> > > 
> > > I think we want to use ISSEP, no?  Otherwise Windows file names with
> > > backslashes will misfire.
> > 
> > Good point. What about device names ?
> > 
> > So maybe base_name() from Gnulib module 'dirname' is the right choice !?
> > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html
> 
> What if Gnulib's base_name() returns "./"?
> 
> libmetalink's metalink_check_safe_path() rejects relative paths:
> https://tools.ietf.org/html/rfc5854#section-4.1.2.1
> 
> Also, basename is used to point to an existing memory location, base_name()
> instead allocates new space. This is not a biggy, but we should keep it in
> mind to amend properly.
> 
> lib/basename.c (base_name)
> --
>   /* On systems with drive letters, "a/b:c" must return "./b:c" rather
>  than "b:c" to avoid confusion with a drive letter.  On systems
>  with pure POSIX semantics, this is not an issue.  */
> --
> 
> Suggestions?

ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation.
base_name() is "complete" (the macros check more than just WINDOWS) and we 
automatically get improvements from upstream - but it calls malloc().

There is also last_component() which returns a pointer to the basename within 
your filename. This is basically what you do.

Anyways, this last component (basename) may still hold a device prefix - you 
have to check that with either HAS_DEVICE() (only defined in certain 
environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() which 
gives 2 if your basename has a leading device prefix. And you should do this 
check in a loop to catch names like 'C:D:xxx'. If you don't do, we likely get 
a CVE assigned ;-)

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 05/25] Update Metalink/XML tests and add a new test for home paths

2016-09-13 Thread Matthew White
On Sun, 11 Sep 2016 22:06:50 +0200
Giuseppe Scrivano  wrote:

> Matthew White  writes:
> 
> > +
> 
> same as before, can be just another test in the same file, not a
> complete new test file.

Metalink tests are specialized and may be revisioned later on adding more 
conditions. I like the idea of different sub-tests for different things.

WDYT?

Please, see the comments on Patch 01/25:
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00074.html

> 
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgpMHkBiWUjIS.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 10/25] New document: Metalink/XML and Metalink/HTTP standard reference

2016-09-13 Thread Matthew White
On Sun, 11 Sep 2016 22:45:09 +0200
Giuseppe Scrivano  wrote:

> Matthew White  writes:
> 
> > [Coverity Scan is ok, make syntax-check is ok, make check-valgrind is ok, 
> > contrib/check-hard is ok]
> >
> > This introduces the new document doc/metalink-standard.txt.
> >
> > The purpose of the document is to serve as a Metalink/XML and Metalink/HTTP 
> > standard reference of the Metalink functionalities currently implemented in 
> > Wget.
> >
> > Regards,
> > Matthew
> >
> > -- 
> > Matthew White 
> >
> > From 40442c885ab06dbef19caeef6bc4ba22a26dbb31 Mon Sep 17 00:00:00 2001
> > From: Matthew White 
> > Date: Fri, 19 Aug 2016 13:17:34 +0200
> > Subject: [PATCH 10/25] New document: Metalink/XML and Metalink/HTTP standard
> >  reference
> >
> > * doc/metalink-standard.txt: New doc. Implemented and recommended
> >   Metalink/XML and Metalink/HTTP standard features
> > ---
> >  doc/metalink-standard.txt | 156 
> > ++
> >  1 file changed, 156 insertions(+)
> >  create mode 100644 doc/metalink-standard.txt
> >
> > diff --git a/doc/metalink-standard.txt b/doc/metalink-standard.txt
> > new file mode 100644
> > index 000..d00c384
> > --- /dev/null
> > +++ b/doc/metalink-standard.txt
> > @@ -0,0 +1,156 @@
> > +GNU Wget Metalink recommended behaviour
> > +
> > +  Metalink/XML and Metalink/HTTP standard reference
> > +
> > +
> > +1. Security features
> > +
> > +
> > +Only metalink:file elements with safe "name" fields shall be accepted
> > +[1 #section-4.1.2.1]. If unsafe metalink:file elements are saved, any
> > +related test shall fail (see '2. Tests').
> > +
> > +By design, libmetalink rejects unsafe metalink:file elements [3]:
> > +* lib/metalink_helper.c (metalink_check_safe_path): Verify path
> > +
> > +1.1 Exceptions
> > +==
> > +
> > +The option --directory-prefix could allow to use an absolute, relative
> > +or home path.
> > +
> > +2. Tests
> > +
> > +
> > +Saving a file to an unexpected path poses a security problem. We must
> > +ensure that Wget's automated tests never modify the root and the home
> > +paths or descend/escalate to a relative path unexpectedly.
> > +
> > +2.1 Metalink/XML implemented tests
> > +==
> > +
> > +* testenv/Test-metalink-xml.py: Accept safe paths
> > +* testenv/Test-metalink-xml-abspath.py: Reject absolute paths
> > +* testenv/Test-metalink-xml-relpath.py: Reject relative paths
> > +* testenv/Test-metalink-xml-homepath.py: Reject home paths
> 
> ACK with these tests merged together.

All the Metalink tests are specialized, hence each test has a different file 
name and content.

I don't know if it's a good idea to merge the tests together. But this is just 
my opinion.

WDYT?

Please, see the comments on Patch 01/25:
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00074.html

> 
> Regards,
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgp58naq3p1aU.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 16/25] Bugfix: Remove surrounding quotes from Metalink/HTTP key's value

2016-09-13 Thread Matthew White
On Sun, 11 Sep 2016 23:10:55 +0200
Giuseppe Scrivano  wrote:

> Matthew White  writes:
> 
> > +/*
> > +  Remove the quotation surrounding a string.
> > +
> > +  The string is permanently modified.
> > + */
> > +void
> > +dequote_metalink_string (char **str)
> > +{
> > +  char *new, *beg, *end;
> > +  size_t str_len, new_len;
> > +
> > +  if (!str || !*str)
> > +return;
> > +
> > +  str_len = strlen (*str); /* current string length */
> > +
> > +  if (str_len < 2)
> > +return;
> > +
> > +  new_len = str_len - 2;   /* predict dequoted length */
> > +
> > +  beg = *str; /* begin of current string */
> > +  end = *str + (str_len - 1); /* end of current string */
> > +
> > +  /* Verify if the current string is surrounded by quotes.  */
> > +  if (!(*beg == '\"' && *end == '\"') && !(*beg == '\'' && *end == '\''))
> > +return;
> 
> could we just verify that (*str[0] == '\"' || *str[0] == '\'') and in
> case return before computing the strlen?

Yes we can.

Fixed, see below. Posting after final decisions are taken about open topics in 
this series of patches.

void
dequote_metalink_string (char **str)
{
  char *new;
  size_t str_len;

  if (!str || !*str || ((*str)[0] != '\"' && (*str)[0] != '\''))
return;

  str_len = strlen (*str); /* current string length */

  /* Verify if the current string is surrounded by quotes.  */
  if (str_len < 2 || (*str)[0] != (*str)[str_len - 1])
return;

  /* Dequoted string.  */
  new = xmemdup0 (*str + 1, str_len - 2);
  xfree (*str);
  *str = new;
}

WDYT?

> 
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgpsS9o_aiOV3.pgp
Description: PGP signature