[Bug-wget] bug #48811: netrc password wins over interactive --ask-password

2016-09-16 Thread Wajda, Piotr
I've tried to tackle another bug. I've moved reading from .netrc file to 
the end, when no user and no passwd is defined (for both http and ftp).


Please review.

Thanks
Piotr
diff --git a/src/ftp.c b/src/ftp.c
index 39f20fa..359cbce 100644
--- a/src/ftp.c
+++ b/src/ftp.c
@@ -360,11 +360,10 @@ getftp (struct url *u, struct url *original_url,
   *qtyread = restval;

   user = u->user;
-  passwd = u->passwd;
-  search_netrc (u->host, (const char **), (const char **), 1);
   user = user ? user : (opt.ftp_user ? opt.ftp_user : opt.user);
+  passwd = opt.passwd ? opt.passwd : (u->passwd ? u->passwd : opt.ftp_passwd);
+  if(!user && !passwd) search_netrc (u->host, (const char **), (const 
char **), 1);
   if (!user) user = "anonymous";
-  passwd = passwd ? passwd : (opt.ftp_passwd ? opt.ftp_passwd : opt.passwd);
   if (!passwd) passwd = "-wget@";

   dtsock = -1;
diff --git a/src/http.c b/src/http.c
index 7e2c4ec..9de5cc0 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1877,10 +1877,9 @@ initialize_request (const struct url *u, struct 
http_stat *hs, int *dt, struct u

   /* Find the username and password for authentication. */
   *user = u->user;
-  *passwd = u->passwd;
-  search_netrc (u->host, (const char **)user, (const char **)passwd, 0);
   *user = *user ? *user : (opt.http_user ? opt.http_user : opt.user);
-  *passwd = *passwd ? *passwd : (opt.http_passwd ? opt.http_passwd : 
opt.passwd);
+  *passwd = opt.passwd ? opt.passwd : (u->passwd ? u->passwd : 
opt.http_passwd);
+  if (!*user && !*passwd) search_netrc (u->host, (const char **)user, (const 
char **)passwd, 0);

   /* We only do "site-wide" authentication with "global" user/password
* values unless --auth-no-challange has been requested; URL user/password


[Bug-wget] bug #46584: wget --spider always returns zero exit status

2016-09-16 Thread Wajda, Piotr

Hi,
I'd like to start contributing to wget. I've chosen 
http://savannah.gnu.org/bugs/index.php?46584 for a good start.


Please let me know if attached patch is sane.

Thanks
Piotr
diff --git a/src/ftp.c b/src/ftp.c
index 39f20fa..e05d57b 100644
--- a/src/ftp.c
+++ b/src/ftp.c
@@ -1191,6 +1191,7 @@ Error in server response, closing control 
connection.\n"));
   if (opt.spider)
 {
   bool exists = false;
+  bool all_exist = true;
   struct fileinfo *f;
   uerr_t _res = ftp_get_listing (u, original_url, con, );
   /* Set the DO_RETR command flag again, because it gets unset when
@@ -1206,6 +1207,8 @@ Error in server response, closing control 
connection.\n"));
 {
   exists = true;
   break;
+} else {
+  all_exist = false;
 }
   f = f->next;
 }
@@ -1226,7 +1229,11 @@ Error in server response, closing control 
connection.\n"));
   con->csock = -1;
   fd_close (dtsock);
   fd_close (local_sock);
-  return RETRFINISHED;
+  if(all_exist) {
+  return RETRFINISHED;
+  } else {
+  return FTPNSFOD;
+  }
 }

   if (opt.verbose)


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Matthew White
On Fri, 16 Sep 2016 10:15:17 +0200
Tim Ruehsen  wrote:

> On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote:
> > On Thu, 15 Sep 2016 16:48:01 +0200
> > 
> > Tim Ruehsen  wrote:
> > > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> > > > On Thu, 15 Sep 2016 09:11:31 +0200
> > > > 
> > > > Giuseppe Scrivano  wrote:
> > > > > Hi Matthew,
> > > > > 
> > > > > Matthew White  writes:
> > > > > >> > Function amended to modify *name in place.
> > > > > >> > 
> > > > > >> > Followed Tim's suggestions for Patch 09/25 about different
> > > > > >> > environments compatibility, the function now uses
> > > > > >> > last_component()
> > > > > >> > to detect the basename:
> > > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > > > > >> > 
> > > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the
> > > > > >> > result
> > > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > > > > >> > 'C:D:' and return only 'file'?
> > > > > >> > 
> > > > > >> > /*
> > > > > >> > 
> > > > > >> >   Replace/remove the basename of a file name.
> > > > > >> >   
> > > > > >> >   The file name is permanently modified.
> > > > > >> >   
> > > > > >> >   Always set NAME to a string, even an empty one.
> > > > > >> >   
> > > > > >> >   Use REF's basename as replacement.  If REF is NULL or if it
> > > > > >> >   doesn't
> > > > > >> >   provide a valid basename candidate, then remove NAME's
> > > > > >> >   basename.
> > > > > >> > 
> > > > > >> > */
> > > > > >> > void
> > > > > >> > replace_metalink_basename (char **name, char *ref)
> > > > > >> > {
> > > > > >> 
> > > > > >> is it something we could do using only dirname and basename?  What
> > > > > >> you
> > > > > >> need here is "dirname(name) + basename(ref)"?
> > > > > > 
> > > > > > You asked to avoid superfluous memory allocations, right?
> > > > > 
> > > > > yes, and that is still my idea.  I was just wondering if the cost of
> > > > > these extra memory allocations was worth.  Is it enough to use
> > > > > last_component() to get it working on Windows?
> > > > 
> > > > Extra memory allocations shouldn't be a concern any longer, they are
> > > > removed in the amended function definition previously posted:
> > > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> > > > 
> > > > About base_name() and last_component() multi-environment compatibility:
> > > > * lib/basename.c (base_name): Call last_component() to find the
> > > > basename,
> > > > allocate the basename (prefix with './' if required) *
> > > > lib/basename-lgpl.c
> > > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> > > > isolate the basename * lib/dosname.h: Define the macros
> > > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system
> > > > environments
> > > > 
> > > > Forgive this copy and paste, but my question about
> > > > replace_metalink_basename() is "if *name is NULL and ref is like
> > > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> > > > the drive letters 'C:D:' and return only 'file'?".
> > > 
> > > Yes, if you use the result as file name to any file/system calls.
> > 
> > Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN)
> > from the resulting file name, when the file name is a bare basename.
> > 
> > Use the dirname from *name and the basename from ref to compute the
> > resulting file name.
> > 
> >   *name  +  ref   -> result
> >   -
> >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> 
> Just make sure that no file name beginning with letter+colon is used for 
> system 
> calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
> Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
> functions to percent escape special characters in file names, depending on 
> the 
> OS it is built on.

Thanks. Function amended to always remove prefix drive letters if required 
(i.e. on w32 environments) for safety reasons.

About %-escaping, in src/url.c there is 'static void append_uri_pathel()'. I 
can't find references to opt.restrict_files_os in other places (except 
src/init.c and src/options.h).

Could be useful to write an extern function to do file name %-escaping?

The reason of this function is to replace an existing basename with another 
basename. Is it appropriate to do a file name verification here?

The results in the table below regards w32 environments:

  *name + ref-> result
  
  NULL  

Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Eli Zaretskii
> From: Tim Ruehsen 
> Cc: mehw.is...@inventati.org, bug-wget@gnu.org
> Date: Fri, 16 Sep 2016 11:15:31 +0200
> 
> > So if wget needs to create or open such files, it needs to replace the
> > colon with some other character, like '!'.
> 
> That is what I meant with 'Wget has functions to percent escape special 
> characters...'. It is not only colons. And it depends on the OS (and/or file 
> system).

OK, so the problem should not exist, good.

> From https://en.wikipedia.org/wiki/Comparison_of_file_systems:
> "MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < 
> | 
> and NUL in file and directory names across all filesystems. Unices and Linux 
> disallow the characters / and NUL in file and directory names across all 
> filesystems."

This is a better reference, JFYI:

  
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Tim Ruehsen
On Friday, September 16, 2016 11:39:15 AM CEST Eli Zaretskii wrote:
> > From: Tim Ruehsen 
> > Date: Fri, 16 Sep 2016 10:15:17 +0200
> > Cc: bug-wget@gnu.org
> > 
> > >   *name  +  ref   -> result
> > >   -
> > >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> > >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> > >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> > >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> > 
> > Just make sure that no file name beginning with letter+colon is used for
> > system calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good
> > idea). Either you strip the 'C:D:', or percent escape ':' on Windows.
> > Wget has functions to percent escape special characters in file names,
> > depending on the OS it is built on.
> 
> (I've lost track of this discussion, and don't understand the context
> well enough to get back on track, so please bear with me.)
> 
> Windows filesystems will not allow file names that have embedded colon
> characters, except if that colon is part of the drive specification at
> the beginning of a file name, as in "D:/dir/file".  File names like
> the 2 last results above are not allowed, and cannot be created or
> opened.
> 
> So if wget needs to create or open such files, it needs to replace the
> colon with some other character, like '!'.

That is what I meant with 'Wget has functions to percent escape special 
characters...'. It is not only colons. And it depends on the OS (and/or file 
system).

From https://en.wikipedia.org/wiki/Comparison_of_file_systems:
"MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < | 
and NUL in file and directory names across all filesystems. Unices and Linux 
disallow the characters / and NUL in file and directory names across all 
filesystems."

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Eli Zaretskii
> From: Tim Ruehsen 
> Date: Fri, 16 Sep 2016 10:15:17 +0200
> Cc: bug-wget@gnu.org
> 
> >   *name  +  ref   -> result
> >   -
> >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> 
> Just make sure that no file name beginning with letter+colon is used for 
> system 
> calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
> Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
> functions to percent escape special characters in file names, depending on 
> the 
> OS it is built on.

(I've lost track of this discussion, and don't understand the context
well enough to get back on track, so please bear with me.)

Windows filesystems will not allow file names that have embedded colon
characters, except if that colon is part of the drive specification at
the beginning of a file name, as in "D:/dir/file".  File names like
the 2 last results above are not allowed, and cannot be created or
opened.

So if wget needs to create or open such files, it needs to replace the
colon with some other character, like '!'.

Again, apologies if this comment makes no sense in the context of
whatever you've been discussing.



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Tim Ruehsen
On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote:
> On Thu, 15 Sep 2016 16:48:01 +0200
> 
> Tim Ruehsen  wrote:
> > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> > > On Thu, 15 Sep 2016 09:11:31 +0200
> > > 
> > > Giuseppe Scrivano  wrote:
> > > > Hi Matthew,
> > > > 
> > > > Matthew White  writes:
> > > > >> > Function amended to modify *name in place.
> > > > >> > 
> > > > >> > Followed Tim's suggestions for Patch 09/25 about different
> > > > >> > environments compatibility, the function now uses
> > > > >> > last_component()
> > > > >> > to detect the basename:
> > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > > > >> > 
> > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the
> > > > >> > result
> > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > > > >> > 'C:D:' and return only 'file'?
> > > > >> > 
> > > > >> > /*
> > > > >> > 
> > > > >> >   Replace/remove the basename of a file name.
> > > > >> >   
> > > > >> >   The file name is permanently modified.
> > > > >> >   
> > > > >> >   Always set NAME to a string, even an empty one.
> > > > >> >   
> > > > >> >   Use REF's basename as replacement.  If REF is NULL or if it
> > > > >> >   doesn't
> > > > >> >   provide a valid basename candidate, then remove NAME's
> > > > >> >   basename.
> > > > >> > 
> > > > >> > */
> > > > >> > void
> > > > >> > replace_metalink_basename (char **name, char *ref)
> > > > >> > {
> > > > >> 
> > > > >> is it something we could do using only dirname and basename?  What
> > > > >> you
> > > > >> need here is "dirname(name) + basename(ref)"?
> > > > > 
> > > > > You asked to avoid superfluous memory allocations, right?
> > > > 
> > > > yes, and that is still my idea.  I was just wondering if the cost of
> > > > these extra memory allocations was worth.  Is it enough to use
> > > > last_component() to get it working on Windows?
> > > 
> > > Extra memory allocations shouldn't be a concern any longer, they are
> > > removed in the amended function definition previously posted:
> > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> > > 
> > > About base_name() and last_component() multi-environment compatibility:
> > > * lib/basename.c (base_name): Call last_component() to find the
> > > basename,
> > > allocate the basename (prefix with './' if required) *
> > > lib/basename-lgpl.c
> > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> > > isolate the basename * lib/dosname.h: Define the macros
> > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system
> > > environments
> > > 
> > > Forgive this copy and paste, but my question about
> > > replace_metalink_basename() is "if *name is NULL and ref is like
> > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> > > the drive letters 'C:D:' and return only 'file'?".
> > 
> > Yes, if you use the result as file name to any file/system calls.
> 
> Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN)
> from the resulting file name, when the file name is a bare basename.
> 
> Use the dirname from *name and the basename from ref to compute the
> resulting file name.
> 
>   *name  +  ref   -> result
>   -
>   NULL   + "foo/C:D:file" -> "file" [bare basename]
>   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
>   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
>   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]

Just make sure that no file name beginning with letter+colon is used for system 
calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
functions to percent escape special characters in file names, depending on the 
OS it is built on.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.