Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary
On Tue, 13 Sep 2016 16:21:37 +0200 Tim Ruehsen wrote: > On Tuesday, September 13, 2016 4:08:30 PM CEST Matthew White wrote: > > On Tue, 13 Sep 2016 09:29:27 +0200 > > > > Tim Ruehsen wrote: > > > On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote: > > > > On Mon, 12 Sep 2016 21:20:54 +0200 > > > > > > > > Tim Rühsen wrote: > > > > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote: > > > > > > > From: Tim Ruehsen > > > > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200 > > > > > > > > > > > > > > > + char *basename = name; > > > > > > > > + > > > > > > > > + while ((name = strstr (basename, "/"))) > > > > > > > > +basename = name + 1; > > > > > > > > > > > > > > Could you use strrchr() ? something like > > > > > > > > > > > > > > char *basename = strrchr (name, '/'); > > > > > > > > > > > > > > if (basename) > > > > > > > > > > > > > > basename += 1; > > > > > > > > > > > > > > else > > > > > > > > > > > > > > basename = name; > > > > > > > > > > > > I think we want to use ISSEP, no? Otherwise Windows file names with > > > > > > backslashes will misfire. > > > > > > > > > > Good point. What about device names ? > > > > > > > > > > So maybe base_name() from Gnulib module 'dirname' is the right choice > > > > > !? > > > > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html > > > > > > > > What if Gnulib's base_name() returns "./"? > > > > > > > > libmetalink's metalink_check_safe_path() rejects relative paths: > > > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1 > > > > > > > > Also, basename is used to point to an existing memory location, > > > > base_name() > > > > instead allocates new space. This is not a biggy, but we should keep it > > > > in > > > > mind to amend properly. > > > > > > > > lib/basename.c (base_name) > > > > -- > > > > > > > > /* On systems with drive letters, "a/b:c" must return "./b:c" rather > > > > > > > > than "b:c" to avoid confusion with a drive letter. On systems > > > > with pure POSIX semantics, this is not an issue. */ > > > > > > > > -- > > > > > > > > Suggestions? > > > > > > ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation. > > > base_name() is "complete" (the macros check more than just WINDOWS) and we > > > automatically get improvements from upstream - but it calls malloc(). > > > > > > There is also last_component() which returns a pointer to the basename > > > within your filename. This is basically what you do. > > > > > > Anyways, this last component (basename) may still hold a device prefix - > > > you have to check that with either HAS_DEVICE() (only defined in certain > > > environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() > > > which gives 2 if your basename has a leading device prefix. And you > > > should do this check in a loop to catch names like 'C:D:xxx'. If you > > > don't do, we likely get a CVE assigned ;-) > > > > Thanks Tim! Yours is a super smart solution! > > > > I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive > > letters on the basename (the declaration of last_component() is in > > src/metalink.h): > > > > #include "dosname.h" > > > > char * > > get_metalink_basename (char *name) > > { > > char *basename; > > > > if (!name) > > return NULL; > > > > basename = last_component (name); > > > > while (FILE_SYSTEM_PREFIX_LEN (basename)) > > basename += 2; > > > > return metalink_check_safe_path (basename) ? basename : NULL; > > } > > > > [make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is > > ok] > > > > WDYT? > > Looks great ! > > To be absolutely future ready, don't assume FILE_SYSTEM_PREFIX_LEN to return > 0 > or 2 (even if the current code is like that). > > while ((n = FILE_SYSTEM_PREFIX_LEN (basename))) > basename += n; > > and you are on the safe side. Thanks, you're beyond me! Fix implemented. > > Regards, Tim Regards, Matthew -- Matthew White pgp_OzItGI347.pgp Description: PGP signature
Re: [Bug-wget] wget bug re cookies
On Tuesday, September 13, 2016 11:46:39 PM CEST David Oppenheim wrote: > wget has long had a problem with cookies - it does not handle > them the same as browsers. As a consequence it is often not usable > in sites with session logins other than very simple logins, > for example SiteMinder controlled web sites. > > Specifically, wget treats a received cookie that does not include > an explicit domain as having a domain of the host (good) but > requiring an exact match before that cookie is sent in later replies > (bad - not like browsers). > > So a cookie from foo.com that doesn't set the domain will only > be sent by wget back to foo.com, not to xx.foo.com > > This follows RFC 6265 (see section 4.1.2.3). But is not the way > today's browsers work. > > The issue is simply fixed in cookies.c by removing one line : > > /* Sanitize parts of cookie. */ > > if (!cookie->domain) > { > cookie->domain = xstrdup (host); > cookie->domain_exact = 1; // This code obeys RFC 6265 but is > not the way real browsers behave > > > I'm happy to discuss further if you're interested. > > I think this change would make wget immensely more useful (and > it's very useful already, thanks !) IMO, it would make wget immensely more insecure :') We don't have to argue, others made a pretty good job in discussion all the cookie RFCs and testing different types of browsers. From what I can read here (http://erik.io/blog/2014/03/04/definitive-guide-to-cookie-domains/), only Internet Explorer isn't RFC conform regarding cookies without domain (but year 2014 is *very* long ago regarding security issues). So if you really want 'Internet Explorer Insecurity' (and I can guess that there some use cases), we should protect the RFC behavior by a new switch (e.g. like --internet-explorer-insecurity[=on] with default to off). You might have a better name... ;-) Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary
On Tuesday, September 13, 2016 4:08:30 PM CEST Matthew White wrote: > On Tue, 13 Sep 2016 09:29:27 +0200 > > Tim Ruehsen wrote: > > On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote: > > > On Mon, 12 Sep 2016 21:20:54 +0200 > > > > > > Tim Rühsen wrote: > > > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote: > > > > > > From: Tim Ruehsen > > > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200 > > > > > > > > > > > > > + char *basename = name; > > > > > > > + > > > > > > > + while ((name = strstr (basename, "/"))) > > > > > > > +basename = name + 1; > > > > > > > > > > > > Could you use strrchr() ? something like > > > > > > > > > > > > char *basename = strrchr (name, '/'); > > > > > > > > > > > > if (basename) > > > > > > > > > > > > basename += 1; > > > > > > > > > > > > else > > > > > > > > > > > > basename = name; > > > > > > > > > > I think we want to use ISSEP, no? Otherwise Windows file names with > > > > > backslashes will misfire. > > > > > > > > Good point. What about device names ? > > > > > > > > So maybe base_name() from Gnulib module 'dirname' is the right choice > > > > !? > > > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html > > > > > > What if Gnulib's base_name() returns "./"? > > > > > > libmetalink's metalink_check_safe_path() rejects relative paths: > > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1 > > > > > > Also, basename is used to point to an existing memory location, > > > base_name() > > > instead allocates new space. This is not a biggy, but we should keep it > > > in > > > mind to amend properly. > > > > > > lib/basename.c (base_name) > > > -- > > > > > > /* On systems with drive letters, "a/b:c" must return "./b:c" rather > > > > > > than "b:c" to avoid confusion with a drive letter. On systems > > > with pure POSIX semantics, this is not an issue. */ > > > > > > -- > > > > > > Suggestions? > > > > ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation. > > base_name() is "complete" (the macros check more than just WINDOWS) and we > > automatically get improvements from upstream - but it calls malloc(). > > > > There is also last_component() which returns a pointer to the basename > > within your filename. This is basically what you do. > > > > Anyways, this last component (basename) may still hold a device prefix - > > you have to check that with either HAS_DEVICE() (only defined in certain > > environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() > > which gives 2 if your basename has a leading device prefix. And you > > should do this check in a loop to catch names like 'C:D:xxx'. If you > > don't do, we likely get a CVE assigned ;-) > > Thanks Tim! Yours is a super smart solution! > > I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive > letters on the basename (the declaration of last_component() is in > src/metalink.h): > > #include "dosname.h" > > char * > get_metalink_basename (char *name) > { > char *basename; > > if (!name) > return NULL; > > basename = last_component (name); > > while (FILE_SYSTEM_PREFIX_LEN (basename)) > basename += 2; > > return metalink_check_safe_path (basename) ? basename : NULL; > } > > [make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is > ok] > > WDYT? Looks great ! To be absolutely future ready, don't assume FILE_SYSTEM_PREFIX_LEN to return 0 or 2 (even if the current code is like that). while ((n = FILE_SYSTEM_PREFIX_LEN (basename))) basename += n; and you are on the safe side. Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary
On Tue, 13 Sep 2016 09:29:27 +0200 Tim Ruehsen wrote: > On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote: > > On Mon, 12 Sep 2016 21:20:54 +0200 > > > > Tim Rühsen wrote: > > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote: > > > > > From: Tim Ruehsen > > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200 > > > > > > > > > > > + char *basename = name; > > > > > > + > > > > > > + while ((name = strstr (basename, "/"))) > > > > > > +basename = name + 1; > > > > > > > > > > Could you use strrchr() ? something like > > > > > > > > > > char *basename = strrchr (name, '/'); > > > > > > > > > > if (basename) > > > > > > > > > > basename += 1; > > > > > > > > > > else > > > > > > > > > > basename = name; > > > > > > > > I think we want to use ISSEP, no? Otherwise Windows file names with > > > > backslashes will misfire. > > > > > > Good point. What about device names ? > > > > > > So maybe base_name() from Gnulib module 'dirname' is the right choice !? > > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html > > > > What if Gnulib's base_name() returns "./"? > > > > libmetalink's metalink_check_safe_path() rejects relative paths: > > https://tools.ietf.org/html/rfc5854#section-4.1.2.1 > > > > Also, basename is used to point to an existing memory location, base_name() > > instead allocates new space. This is not a biggy, but we should keep it in > > mind to amend properly. > > > > lib/basename.c (base_name) > > -- > > /* On systems with drive letters, "a/b:c" must return "./b:c" rather > > than "b:c" to avoid confusion with a drive letter. On systems > > with pure POSIX semantics, this is not an issue. */ > > -- > > > > Suggestions? > > ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation. > base_name() is "complete" (the macros check more than just WINDOWS) and we > automatically get improvements from upstream - but it calls malloc(). > > There is also last_component() which returns a pointer to the basename within > your filename. This is basically what you do. > > Anyways, this last component (basename) may still hold a device prefix - you > have to check that with either HAS_DEVICE() (only defined in certain > environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() > which > gives 2 if your basename has a leading device prefix. And you should do this > check in a loop to catch names like 'C:D:xxx'. If you don't do, we likely get > a CVE assigned ;-) Thanks Tim! Yours is a super smart solution! I rewrote src/metalink.c (get_metalink_basename) to skip prefix drive letters on the basename (the declaration of last_component() is in src/metalink.h): #include "dosname.h" char * get_metalink_basename (char *name) { char *basename; if (!name) return NULL; basename = last_component (name); while (FILE_SYSTEM_PREFIX_LEN (basename)) basename += 2; return metalink_check_safe_path (basename) ? basename : NULL; } [make syntax-check is ok, make check-valgrind is ok, contrib/check-hard is ok] WDYT? > > Regards, Tim Regards, Matthew -- Matthew White pgpB0PYP4SWIt.pgp Description: PGP signature
[Bug-wget] wget bug re cookies
wget has long had a problem with cookies - it does not handle them the same as browsers. As a consequence it is often not usable in sites with session logins other than very simple logins, for example SiteMinder controlled web sites. Specifically, wget treats a received cookie that does not include an explicit domain as having a domain of the host (good) but requiring an exact match before that cookie is sent in later replies (bad - not like browsers). So a cookie from foo.com that doesn't set the domain will only be sent by wget back to foo.com, not to xx.foo.com This follows RFC 6265 (see section 4.1.2.3). But is not the way today's browsers work. The issue is simply fixed in cookies.c by removing one line : /* Sanitize parts of cookie. */ if (!cookie->domain) { cookie->domain = xstrdup (host); cookie->domain_exact = 1; // This code obeys RFC 6265 but is not the way real browsers behave I'm happy to discuss further if you're interested. I think this change would make wget immensely more useful (and it's very useful already, thanks !)
[Bug-wget] WordPress downloaded site
Hello wget people! I have downloaded my WordPress site using wget and it works like a charm. (thank you)! As WordPress uses directory names as meaningful file names, wget does create a directory with and inserts there index.html with correct relative directory path. (../). I want that this behavior will be kept for the top directory only and when wget recourse, it will create .html files out of it. (With proper links in the site of course). Is this feature exists? If not, how complicated it is to program it? Best regards, Michael
Re: [Bug-wget] [PATCH 04/25] Bugfix: Keep the download progress when alternating metalink:url
On Sun, 11 Sep 2016 22:04:41 +0200 Giuseppe Scrivano wrote: > Matthew White writes: > > > From f9fc03c0788675275041d0876d3e7fffd3f50eee Mon Sep 17 00:00:00 2001 > > From: Matthew White > > Date: Thu, 4 Aug 2016 11:35:42 +0200 > > Subject: [PATCH 04/25] Bugfix: Keep the download progress when alternating > > metalink:url > > > > * src/metalink.c (retrieve_from_metalink): On download error, resume > > output_stream with the next mres->url. Keep fully downloaded files > > started with --continue, otherwise rename/remove the file > > > > Before this patch, with --continue, existing and/or fully retrieved > > files which fail the sanity tests were renamed (--keep-badhash), or > > removed. > > > > This patch ensures that --continue doesn't rename/remove existing > > and/or fully retrieved files which fail the sanity tests. > > --- > > ACK Patch 12/25 is now squased in: http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00086.html The series of patches is now all amended/rebased. There are still changes to do. Posting after final decisions are taken about open topics in this series of patches. > > Giuseppe Regards, Matthew -- Matthew White pgp1UbwBOQybg.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 12/25] New test: --continue shall keep fully retrieved Metalink files (HTTP 416)
On Sun, 11 Sep 2016 22:51:27 +0200 Giuseppe Scrivano wrote: > Hi Matthew, > > Matthew White writes: > > > [Coverity Scan is ok, make syntax-check is ok, make check-valgrind is ok, > > contrib/check-hard is ok] > > > > This introduces a new Metalink test to verify that when --continue is used, > > after a HTTP 416 answer a file that fails a sanity test isn't > > renamed/removed. > > > > The following description is verbatim from the patch: > > - > > Ensure that --continue doesn't rename/remove existing and/or fully > > retrieved files which fail the sanity tests. > > - > > can this test be included together with the patch that introduces this change? After amending, squashed with Patch 04/25, and inserted its update in the Patch 11/25. http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00020.html http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00027.html The series of patches is now all amended/rebased. There are still changes to do. > > > +MetaXml = re.sub (r'{{FILE1_HASH}}', File1_sha256, MetaXml) > > +MetaXml = re.sub (r'{{FILE2_HASH}}', File2_sha256, MetaXml) > > +MetaXml = re.sub (r'{{FILE3_HASH}}', File3_sha256, MetaXml) > > +MetaXml = re.sub (r'{{FILE4_HASH}}', File4_sha256, MetaXml) > > +MetaXml = re.sub (r'{{FILE5_HASH}}', File5_sha256, MetaXml) > > +MetaXml = re.sub (r'{{SRV_HOST}}', srv_host, MetaXml) > > +MetaXml = re.sub (r'{{SRV_PORT}}', str (srv_port), MetaXml) > > +MetaFile.content = MetaXml > > would just MetaXml.replace be enough here as well? Thanks, fixed here and in other patches. Posting after final decisions are taken about open topics in this series of patches. > > Thanks, > Giuseppe Regards, Matthew -- Matthew White pgpm9_4Wyy5kg.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 01/25] Add two Metalink/XML tests
On Tue, 13 Sep 2016 10:10:49 +0200 Giuseppe Scrivano wrote: > Hi Matthew, > > Matthew White writes: > > > Next patches add more conditions to testenv/Test-metalink-xml-relpath.py. > > > > Also variants are introduced, like: > > * testenv/Test-metalink-xml-trust.py > > * testenv/Test-metalink-xml-homepath.py > > * testenv/Test-metalink-xml-homepath-trust.py > > * testenv/Test-metalink-xml-abspath-trust.py > > * testenv/Test-metalink-xml-relpath-trust.py > > > > And there will be other tests with other meanings too. > > > > So, I don't know if it's a good idea to mix specific tests together. > > > > This topic also applies to Patch 05/25, Patch 10/25, and Patch 17/25: > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00046.html > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00050.html > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00057.html > > > > WDYT? > > Could we have a base file, that is shared by these files? I agree it is > good to have a separate file for each different test, but if the > difference is only for one line, I think we can refactor them a bit. It > will make maintainance easier later. I'll work on it. > > Regards, > Giuseppe Regards, Matthew -- Matthew White pgpHumFkSxT3S.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 01/25] Add two Metalink/XML tests
Hi Matthew, Matthew White writes: > Next patches add more conditions to testenv/Test-metalink-xml-relpath.py. > > Also variants are introduced, like: > * testenv/Test-metalink-xml-trust.py > * testenv/Test-metalink-xml-homepath.py > * testenv/Test-metalink-xml-homepath-trust.py > * testenv/Test-metalink-xml-abspath-trust.py > * testenv/Test-metalink-xml-relpath-trust.py > > And there will be other tests with other meanings too. > > So, I don't know if it's a good idea to mix specific tests together. > > This topic also applies to Patch 05/25, Patch 10/25, and Patch 17/25: > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00046.html > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00050.html > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00057.html > > WDYT? Could we have a base file, that is shared by these files? I agree it is good to have a separate file for each different test, but if the difference is only for one line, I think we can refactor them a bit. It will make maintainance easier later. Regards, Giuseppe
Re: [Bug-wget] [PATCH 09/25] Enforce Metalink file name verification, strip directory if necessary
On Tuesday, September 13, 2016 5:13:10 AM CEST Matthew White wrote: > On Mon, 12 Sep 2016 21:20:54 +0200 > > Tim Rühsen wrote: > > On Montag, 12. September 2016 20:18:30 CEST Eli Zaretskii wrote: > > > > From: Tim Ruehsen > > > > Date: Mon, 12 Sep 2016 13:00:32 +0200 > > > > > > > > > + char *basename = name; > > > > > + > > > > > + while ((name = strstr (basename, "/"))) > > > > > +basename = name + 1; > > > > > > > > Could you use strrchr() ? something like > > > > > > > > char *basename = strrchr (name, '/'); > > > > > > > > if (basename) > > > > > > > > basename += 1; > > > > > > > > else > > > > > > > > basename = name; > > > > > > I think we want to use ISSEP, no? Otherwise Windows file names with > > > backslashes will misfire. > > > > Good point. What about device names ? > > > > So maybe base_name() from Gnulib module 'dirname' is the right choice !? > > See https://www.gnu.org/software/gnulib/manual/html_node/basename.html > > What if Gnulib's base_name() returns "./"? > > libmetalink's metalink_check_safe_path() rejects relative paths: > https://tools.ietf.org/html/rfc5854#section-4.1.2.1 > > Also, basename is used to point to an existing memory location, base_name() > instead allocates new space. This is not a biggy, but we should keep it in > mind to amend properly. > > lib/basename.c (base_name) > -- > /* On systems with drive letters, "a/b:c" must return "./b:c" rather > than "b:c" to avoid confusion with a drive letter. On systems > with pure POSIX semantics, this is not an issue. */ > -- > > Suggestions? ISSEP is "homebrewed" and incomplete but doesn't need a memory allocation. base_name() is "complete" (the macros check more than just WINDOWS) and we automatically get improvements from upstream - but it calls malloc(). There is also last_component() which returns a pointer to the basename within your filename. This is basically what you do. Anyways, this last component (basename) may still hold a device prefix - you have to check that with either HAS_DEVICE() (only defined in certain environments, needs to guarded by #ifdef) or by FILE_SYSTEM_PREFIX_LEN() which gives 2 if your basename has a leading device prefix. And you should do this check in a loop to catch names like 'C:D:xxx'. If you don't do, we likely get a CVE assigned ;-) Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 05/25] Update Metalink/XML tests and add a new test for home paths
On Sun, 11 Sep 2016 22:06:50 +0200 Giuseppe Scrivano wrote: > Matthew White writes: > > > + > > same as before, can be just another test in the same file, not a > complete new test file. Metalink tests are specialized and may be revisioned later on adding more conditions. I like the idea of different sub-tests for different things. WDYT? Please, see the comments on Patch 01/25: http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00074.html > > Giuseppe Regards, Matthew -- Matthew White pgpMHkBiWUjIS.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 10/25] New document: Metalink/XML and Metalink/HTTP standard reference
On Sun, 11 Sep 2016 22:45:09 +0200 Giuseppe Scrivano wrote: > Matthew White writes: > > > [Coverity Scan is ok, make syntax-check is ok, make check-valgrind is ok, > > contrib/check-hard is ok] > > > > This introduces the new document doc/metalink-standard.txt. > > > > The purpose of the document is to serve as a Metalink/XML and Metalink/HTTP > > standard reference of the Metalink functionalities currently implemented in > > Wget. > > > > Regards, > > Matthew > > > > -- > > Matthew White > > > > From 40442c885ab06dbef19caeef6bc4ba22a26dbb31 Mon Sep 17 00:00:00 2001 > > From: Matthew White > > Date: Fri, 19 Aug 2016 13:17:34 +0200 > > Subject: [PATCH 10/25] New document: Metalink/XML and Metalink/HTTP standard > > reference > > > > * doc/metalink-standard.txt: New doc. Implemented and recommended > > Metalink/XML and Metalink/HTTP standard features > > --- > > doc/metalink-standard.txt | 156 > > ++ > > 1 file changed, 156 insertions(+) > > create mode 100644 doc/metalink-standard.txt > > > > diff --git a/doc/metalink-standard.txt b/doc/metalink-standard.txt > > new file mode 100644 > > index 000..d00c384 > > --- /dev/null > > +++ b/doc/metalink-standard.txt > > @@ -0,0 +1,156 @@ > > +GNU Wget Metalink recommended behaviour > > + > > + Metalink/XML and Metalink/HTTP standard reference > > + > > + > > +1. Security features > > + > > + > > +Only metalink:file elements with safe "name" fields shall be accepted > > +[1 #section-4.1.2.1]. If unsafe metalink:file elements are saved, any > > +related test shall fail (see '2. Tests'). > > + > > +By design, libmetalink rejects unsafe metalink:file elements [3]: > > +* lib/metalink_helper.c (metalink_check_safe_path): Verify path > > + > > +1.1 Exceptions > > +== > > + > > +The option --directory-prefix could allow to use an absolute, relative > > +or home path. > > + > > +2. Tests > > + > > + > > +Saving a file to an unexpected path poses a security problem. We must > > +ensure that Wget's automated tests never modify the root and the home > > +paths or descend/escalate to a relative path unexpectedly. > > + > > +2.1 Metalink/XML implemented tests > > +== > > + > > +* testenv/Test-metalink-xml.py: Accept safe paths > > +* testenv/Test-metalink-xml-abspath.py: Reject absolute paths > > +* testenv/Test-metalink-xml-relpath.py: Reject relative paths > > +* testenv/Test-metalink-xml-homepath.py: Reject home paths > > ACK with these tests merged together. All the Metalink tests are specialized, hence each test has a different file name and content. I don't know if it's a good idea to merge the tests together. But this is just my opinion. WDYT? Please, see the comments on Patch 01/25: http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00074.html > > Regards, > Giuseppe Regards, Matthew -- Matthew White pgp58naq3p1aU.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 16/25] Bugfix: Remove surrounding quotes from Metalink/HTTP key's value
On Sun, 11 Sep 2016 23:10:55 +0200 Giuseppe Scrivano wrote: > Matthew White writes: > > > +/* > > + Remove the quotation surrounding a string. > > + > > + The string is permanently modified. > > + */ > > +void > > +dequote_metalink_string (char **str) > > +{ > > + char *new, *beg, *end; > > + size_t str_len, new_len; > > + > > + if (!str || !*str) > > +return; > > + > > + str_len = strlen (*str); /* current string length */ > > + > > + if (str_len < 2) > > +return; > > + > > + new_len = str_len - 2; /* predict dequoted length */ > > + > > + beg = *str; /* begin of current string */ > > + end = *str + (str_len - 1); /* end of current string */ > > + > > + /* Verify if the current string is surrounded by quotes. */ > > + if (!(*beg == '\"' && *end == '\"') && !(*beg == '\'' && *end == '\'')) > > +return; > > could we just verify that (*str[0] == '\"' || *str[0] == '\'') and in > case return before computing the strlen? Yes we can. Fixed, see below. Posting after final decisions are taken about open topics in this series of patches. void dequote_metalink_string (char **str) { char *new; size_t str_len; if (!str || !*str || ((*str)[0] != '\"' && (*str)[0] != '\'')) return; str_len = strlen (*str); /* current string length */ /* Verify if the current string is surrounded by quotes. */ if (str_len < 2 || (*str)[0] != (*str)[str_len - 1]) return; /* Dequoted string. */ new = xmemdup0 (*str + 1, str_len - 2); xfree (*str); *str = new; } WDYT? > > Giuseppe Regards, Matthew -- Matthew White pgpsS9o_aiOV3.pgp Description: PGP signature