Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Matthew White
On Fri, 16 Sep 2016 10:15:17 +0200
Tim Ruehsen  wrote:

> On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote:
> > On Thu, 15 Sep 2016 16:48:01 +0200
> > 
> > Tim Ruehsen  wrote:
> > > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> > > > On Thu, 15 Sep 2016 09:11:31 +0200
> > > > 
> > > > Giuseppe Scrivano  wrote:
> > > > > Hi Matthew,
> > > > > 
> > > > > Matthew White  writes:
> > > > > >> > Function amended to modify *name in place.
> > > > > >> > 
> > > > > >> > Followed Tim's suggestions for Patch 09/25 about different
> > > > > >> > environments compatibility, the function now uses
> > > > > >> > last_component()
> > > > > >> > to detect the basename:
> > > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > > > > >> > 
> > > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the
> > > > > >> > result
> > > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > > > > >> > 'C:D:' and return only 'file'?
> > > > > >> > 
> > > > > >> > /*
> > > > > >> > 
> > > > > >> >   Replace/remove the basename of a file name.
> > > > > >> >   
> > > > > >> >   The file name is permanently modified.
> > > > > >> >   
> > > > > >> >   Always set NAME to a string, even an empty one.
> > > > > >> >   
> > > > > >> >   Use REF's basename as replacement.  If REF is NULL or if it
> > > > > >> >   doesn't
> > > > > >> >   provide a valid basename candidate, then remove NAME's
> > > > > >> >   basename.
> > > > > >> > 
> > > > > >> > */
> > > > > >> > void
> > > > > >> > replace_metalink_basename (char **name, char *ref)
> > > > > >> > {
> > > > > >> 
> > > > > >> is it something we could do using only dirname and basename?  What
> > > > > >> you
> > > > > >> need here is "dirname(name) + basename(ref)"?
> > > > > > 
> > > > > > You asked to avoid superfluous memory allocations, right?
> > > > > 
> > > > > yes, and that is still my idea.  I was just wondering if the cost of
> > > > > these extra memory allocations was worth.  Is it enough to use
> > > > > last_component() to get it working on Windows?
> > > > 
> > > > Extra memory allocations shouldn't be a concern any longer, they are
> > > > removed in the amended function definition previously posted:
> > > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> > > > 
> > > > About base_name() and last_component() multi-environment compatibility:
> > > > * lib/basename.c (base_name): Call last_component() to find the
> > > > basename,
> > > > allocate the basename (prefix with './' if required) *
> > > > lib/basename-lgpl.c
> > > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> > > > isolate the basename * lib/dosname.h: Define the macros
> > > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system
> > > > environments
> > > > 
> > > > Forgive this copy and paste, but my question about
> > > > replace_metalink_basename() is "if *name is NULL and ref is like
> > > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> > > > the drive letters 'C:D:' and return only 'file'?".
> > > 
> > > Yes, if you use the result as file name to any file/system calls.
> > 
> > Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN)
> > from the resulting file name, when the file name is a bare basename.
> > 
> > Use the dirname from *name and the basename from ref to compute the
> > resulting file name.
> > 
> >   *name  +  ref   -> result
> >   -
> >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> 
> Just make sure that no file name beginning with letter+colon is used for 
> system 
> calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
> Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
> functions to percent escape special characters in file names, depending on 
> the 
> OS it is built on.

Thanks. Function amended to always remove prefix drive letters if required 
(i.e. on w32 environments) for safety reasons.

About %-escaping, in src/url.c there is 'static void append_uri_pathel()'. I 
can't find references to opt.restrict_files_os in other places (except 
src/init.c and src/options.h).

Could be useful to write an extern function to do file name %-escaping?

The reason of this function is to replace an existing basename with another 
basename. Is it appropriate to do a file name verification here?

The results in the table below regards w32 environments:

  *name + ref-> result
  
  NULL  

Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Eli Zaretskii
> From: Tim Ruehsen 
> Cc: mehw.is...@inventati.org, bug-wget@gnu.org
> Date: Fri, 16 Sep 2016 11:15:31 +0200
> 
> > So if wget needs to create or open such files, it needs to replace the
> > colon with some other character, like '!'.
> 
> That is what I meant with 'Wget has functions to percent escape special 
> characters...'. It is not only colons. And it depends on the OS (and/or file 
> system).

OK, so the problem should not exist, good.

> From https://en.wikipedia.org/wiki/Comparison_of_file_systems:
> "MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < 
> | 
> and NUL in file and directory names across all filesystems. Unices and Linux 
> disallow the characters / and NUL in file and directory names across all 
> filesystems."

This is a better reference, JFYI:

  
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Tim Ruehsen
On Friday, September 16, 2016 11:39:15 AM CEST Eli Zaretskii wrote:
> > From: Tim Ruehsen 
> > Date: Fri, 16 Sep 2016 10:15:17 +0200
> > Cc: bug-wget@gnu.org
> > 
> > >   *name  +  ref   -> result
> > >   -
> > >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> > >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> > >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> > >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> > 
> > Just make sure that no file name beginning with letter+colon is used for
> > system calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good
> > idea). Either you strip the 'C:D:', or percent escape ':' on Windows.
> > Wget has functions to percent escape special characters in file names,
> > depending on the OS it is built on.
> 
> (I've lost track of this discussion, and don't understand the context
> well enough to get back on track, so please bear with me.)
> 
> Windows filesystems will not allow file names that have embedded colon
> characters, except if that colon is part of the drive specification at
> the beginning of a file name, as in "D:/dir/file".  File names like
> the 2 last results above are not allowed, and cannot be created or
> opened.
> 
> So if wget needs to create or open such files, it needs to replace the
> colon with some other character, like '!'.

That is what I meant with 'Wget has functions to percent escape special 
characters...'. It is not only colons. And it depends on the OS (and/or file 
system).

From https://en.wikipedia.org/wiki/Comparison_of_file_systems:
"MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < | 
and NUL in file and directory names across all filesystems. Unices and Linux 
disallow the characters / and NUL in file and directory names across all 
filesystems."

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Eli Zaretskii
> From: Tim Ruehsen 
> Date: Fri, 16 Sep 2016 10:15:17 +0200
> Cc: bug-wget@gnu.org
> 
> >   *name  +  ref   -> result
> >   -
> >   NULL   + "foo/C:D:file" -> "file" [bare basename]
> >   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
> >   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
> >   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]
> 
> Just make sure that no file name beginning with letter+colon is used for 
> system 
> calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
> Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
> functions to percent escape special characters in file names, depending on 
> the 
> OS it is built on.

(I've lost track of this discussion, and don't understand the context
well enough to get back on track, so please bear with me.)

Windows filesystems will not allow file names that have embedded colon
characters, except if that colon is part of the drive specification at
the beginning of a file name, as in "D:/dir/file".  File names like
the 2 last results above are not allowed, and cannot be created or
opened.

So if wget needs to create or open such files, it needs to replace the
colon with some other character, like '!'.

Again, apologies if this comment makes no sense in the context of
whatever you've been discussing.



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-16 Thread Tim Ruehsen
On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote:
> On Thu, 15 Sep 2016 16:48:01 +0200
> 
> Tim Ruehsen  wrote:
> > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> > > On Thu, 15 Sep 2016 09:11:31 +0200
> > > 
> > > Giuseppe Scrivano  wrote:
> > > > Hi Matthew,
> > > > 
> > > > Matthew White  writes:
> > > > >> > Function amended to modify *name in place.
> > > > >> > 
> > > > >> > Followed Tim's suggestions for Patch 09/25 about different
> > > > >> > environments compatibility, the function now uses
> > > > >> > last_component()
> > > > >> > to detect the basename:
> > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > > > >> > 
> > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the
> > > > >> > result
> > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > > > >> > 'C:D:' and return only 'file'?
> > > > >> > 
> > > > >> > /*
> > > > >> > 
> > > > >> >   Replace/remove the basename of a file name.
> > > > >> >   
> > > > >> >   The file name is permanently modified.
> > > > >> >   
> > > > >> >   Always set NAME to a string, even an empty one.
> > > > >> >   
> > > > >> >   Use REF's basename as replacement.  If REF is NULL or if it
> > > > >> >   doesn't
> > > > >> >   provide a valid basename candidate, then remove NAME's
> > > > >> >   basename.
> > > > >> > 
> > > > >> > */
> > > > >> > void
> > > > >> > replace_metalink_basename (char **name, char *ref)
> > > > >> > {
> > > > >> 
> > > > >> is it something we could do using only dirname and basename?  What
> > > > >> you
> > > > >> need here is "dirname(name) + basename(ref)"?
> > > > > 
> > > > > You asked to avoid superfluous memory allocations, right?
> > > > 
> > > > yes, and that is still my idea.  I was just wondering if the cost of
> > > > these extra memory allocations was worth.  Is it enough to use
> > > > last_component() to get it working on Windows?
> > > 
> > > Extra memory allocations shouldn't be a concern any longer, they are
> > > removed in the amended function definition previously posted:
> > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> > > 
> > > About base_name() and last_component() multi-environment compatibility:
> > > * lib/basename.c (base_name): Call last_component() to find the
> > > basename,
> > > allocate the basename (prefix with './' if required) *
> > > lib/basename-lgpl.c
> > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> > > isolate the basename * lib/dosname.h: Define the macros
> > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system
> > > environments
> > > 
> > > Forgive this copy and paste, but my question about
> > > replace_metalink_basename() is "if *name is NULL and ref is like
> > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> > > the drive letters 'C:D:' and return only 'file'?".
> > 
> > Yes, if you use the result as file name to any file/system calls.
> 
> Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN)
> from the resulting file name, when the file name is a bare basename.
> 
> Use the dirname from *name and the basename from ref to compute the
> resulting file name.
> 
>   *name  +  ref   -> result
>   -
>   NULL   + "foo/C:D:file" -> "file" [bare basename]
>   "foobar"   + "foo/C:D:file" -> "file" [bare basename]
>   "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
>   "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?]

Just make sure that no file name beginning with letter+colon is used for system 
calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). 
Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has 
functions to percent escape special characters in file names, depending on the 
OS it is built on.

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-15 Thread Matthew White
On Thu, 15 Sep 2016 16:48:01 +0200
Tim Ruehsen  wrote:

> On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> > On Thu, 15 Sep 2016 09:11:31 +0200
> > 
> > Giuseppe Scrivano  wrote:
> > > Hi Matthew,
> > > 
> > > Matthew White  writes:
> > > >> > Function amended to modify *name in place.
> > > >> > 
> > > >> > Followed Tim's suggestions for Patch 09/25 about different
> > > >> > environments compatibility, the function now uses last_component()
> > > >> > to detect the basename:
> > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > > >> > 
> > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result
> > > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > > >> > 'C:D:' and return only 'file'?
> > > >> > 
> > > >> > /*
> > > >> > 
> > > >> >   Replace/remove the basename of a file name.
> > > >> >   
> > > >> >   The file name is permanently modified.
> > > >> >   
> > > >> >   Always set NAME to a string, even an empty one.
> > > >> >   
> > > >> >   Use REF's basename as replacement.  If REF is NULL or if it doesn't
> > > >> >   provide a valid basename candidate, then remove NAME's basename.
> > > >> > 
> > > >> > */
> > > >> > void
> > > >> > replace_metalink_basename (char **name, char *ref)
> > > >> > {
> > > >> 
> > > >> is it something we could do using only dirname and basename?  What you
> > > >> need here is "dirname(name) + basename(ref)"?
> > > > 
> > > > You asked to avoid superfluous memory allocations, right?
> > > 
> > > yes, and that is still my idea.  I was just wondering if the cost of
> > > these extra memory allocations was worth.  Is it enough to use
> > > last_component() to get it working on Windows?
> > 
> > Extra memory allocations shouldn't be a concern any longer, they are removed
> > in the amended function definition previously posted:
> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> > 
> > About base_name() and last_component() multi-environment compatibility:
> > * lib/basename.c (base_name): Call last_component() to find the basename,
> > allocate the basename (prefix with './' if required) * lib/basename-lgpl.c
> > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> > isolate the basename * lib/dosname.h: Define the macros
> > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system environments
> > 
> > Forgive this copy and paste, but my question about
> > replace_metalink_basename() is "if *name is NULL and ref is like
> > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> > the drive letters 'C:D:' and return only 'file'?".
> 
> Yes, if you use the result as file name to any file/system calls.

Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN) 
from the resulting file name, when the file name is a bare basename.

Use the dirname from *name and the basename from ref to compute the resulting 
file name.

  *name  +  ref   -> result
  -
  NULL   + "foo/C:D:file" -> "file" [bare basename]
  "foobar"   + "foo/C:D:file" -> "file" [bare basename]
  "dir/old"  + "foo/C:D:file" -> "dir/C:D:file"
  "C:D:file/old" + "foo/E:F:new"  -> "C:D:file/E:F:new" [is this ok?] 

void
replace_metalink_basename (char **name, char *ref)
{
  int n;
  char *new, *basename;

  if (!name)
return;

  /* Strip old basename.  */
  if (*name)
{
  basename = last_component (*name);

  if (basename == *name)
xfree (*name);
  else
*basename = '\0';
}

  /* New basename from file name reference.  */
  if (ref)
{
  ref = last_component (ref);

  if (!*name)
while ((n = FILE_SYSTEM_PREFIX_LEN (ref)))
  ref += n;
}

  /* Replace the old basename.  */
  new = aprintf ("%s%s", *name ? *name : "", ref ? ref : "");
  xfree (*name);
  *name = new;
}

> 
> > PS: I'm currently working to create a common python class for the Metalink
> > tests (testenv), see Patch 01/25
> 
> :thumbs up:

All right!

> 
> Regards, Tim

Regards,
Matthew

-- 
Matthew White 


pgpBRFXhcVYFy.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-15 Thread Tim Ruehsen
On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote:
> On Thu, 15 Sep 2016 09:11:31 +0200
> 
> Giuseppe Scrivano  wrote:
> > Hi Matthew,
> > 
> > Matthew White  writes:
> > >> > Function amended to modify *name in place.
> > >> > 
> > >> > Followed Tim's suggestions for Patch 09/25 about different
> > >> > environments compatibility, the function now uses last_component()
> > >> > to detect the basename:
> > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> > >> > 
> > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result
> > >> > will be 'C:D:file'; is it advisable to remove the drive letters
> > >> > 'C:D:' and return only 'file'?
> > >> > 
> > >> > /*
> > >> > 
> > >> >   Replace/remove the basename of a file name.
> > >> >   
> > >> >   The file name is permanently modified.
> > >> >   
> > >> >   Always set NAME to a string, even an empty one.
> > >> >   
> > >> >   Use REF's basename as replacement.  If REF is NULL or if it doesn't
> > >> >   provide a valid basename candidate, then remove NAME's basename.
> > >> > 
> > >> > */
> > >> > void
> > >> > replace_metalink_basename (char **name, char *ref)
> > >> > {
> > >> 
> > >> is it something we could do using only dirname and basename?  What you
> > >> need here is "dirname(name) + basename(ref)"?
> > > 
> > > You asked to avoid superfluous memory allocations, right?
> > 
> > yes, and that is still my idea.  I was just wondering if the cost of
> > these extra memory allocations was worth.  Is it enough to use
> > last_component() to get it working on Windows?
> 
> Extra memory allocations shouldn't be a concern any longer, they are removed
> in the amended function definition previously posted:
> http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html
> 
> About base_name() and last_component() multi-environment compatibility:
> * lib/basename.c (base_name): Call last_component() to find the basename,
> allocate the basename (prefix with './' if required) * lib/basename-lgpl.c
> (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to
> isolate the basename * lib/dosname.h: Define the macros
> FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system environments
> 
> Forgive this copy and paste, but my question about
> replace_metalink_basename() is "if *name is NULL and ref is like
> 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove
> the drive letters 'C:D:' and return only 'file'?".

Yes, if you use the result as file name to any file/system calls.

> PS: I'm currently working to create a common python class for the Metalink
> tests (testenv), see Patch 01/25

:thumbs up:

Regards, Tim


signature.asc
Description: This is a digitally signed message part.


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-15 Thread Matthew White
On Thu, 15 Sep 2016 09:11:31 +0200
Giuseppe Scrivano  wrote:

> Hi Matthew,
> 
> Matthew White  writes:
> 
> >> > Function amended to modify *name in place.
> >> >
> >> > Followed Tim's suggestions for Patch 09/25 about different environments 
> >> > compatibility, the function now uses last_component() to detect the 
> >> > basename:
> >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> >> >
> >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will 
> >> > be 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and 
> >> > return only 'file'?
> >> >
> >> > /*
> >> >   Replace/remove the basename of a file name.
> >> >
> >> >   The file name is permanently modified.
> >> >
> >> >   Always set NAME to a string, even an empty one.
> >> >
> >> >   Use REF's basename as replacement.  If REF is NULL or if it doesn't
> >> >   provide a valid basename candidate, then remove NAME's basename.
> >> > */
> >> > void
> >> > replace_metalink_basename (char **name, char *ref)
> >> > {
> >> 
> >> is it something we could do using only dirname and basename?  What you
> >> need here is "dirname(name) + basename(ref)"?
> >
> > You asked to avoid superfluous memory allocations, right?
> 
> yes, and that is still my idea.  I was just wondering if the cost of
> these extra memory allocations was worth.  Is it enough to use
> last_component() to get it working on Windows?

Extra memory allocations shouldn't be a concern any longer, they are removed in 
the amended function definition previously posted:
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html

About base_name() and last_component() multi-environment compatibility: 
* lib/basename.c (base_name): Call last_component() to find the basename, 
allocate the basename (prefix with './' if required)
* lib/basename-lgpl.c (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN 
and ISSLASH to isolate the basename
* lib/dosname.h: Define the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to work 
on different system environments

Forgive this copy and paste, but my question about replace_metalink_basename() 
is "if *name is NULL and ref is like 'dir/C:D:file', the result will be 
'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return only 
'file'?".

PS: I'm currently working to create a common python class for the Metalink 
tests (testenv), see Patch 01/25 
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00085.html .

> 
> Thanks,
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgp5rmu9O_nMW.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-15 Thread Giuseppe Scrivano
Hi Matthew,

Matthew White  writes:

>> > Function amended to modify *name in place.
>> >
>> > Followed Tim's suggestions for Patch 09/25 about different environments 
>> > compatibility, the function now uses last_component() to detect the 
>> > basename:
>> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
>> >
>> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be 
>> > 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return 
>> > only 'file'?
>> >
>> > /*
>> >   Replace/remove the basename of a file name.
>> >
>> >   The file name is permanently modified.
>> >
>> >   Always set NAME to a string, even an empty one.
>> >
>> >   Use REF's basename as replacement.  If REF is NULL or if it doesn't
>> >   provide a valid basename candidate, then remove NAME's basename.
>> > */
>> > void
>> > replace_metalink_basename (char **name, char *ref)
>> > {
>> 
>> is it something we could do using only dirname and basename?  What you
>> need here is "dirname(name) + basename(ref)"?
>
> You asked to avoid superfluous memory allocations, right?

yes, and that is still my idea.  I was just wondering if the cost of
these extra memory allocations was worth.  Is it enough to use
last_component() to get it working on Windows?

Thanks,
Giuseppe



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-14 Thread Matthew White
On Wed, 14 Sep 2016 13:03:58 +0200
Giuseppe Scrivano  wrote:

> Hi Matthew,
> 
> Matthew White  writes:
> 
> > On Sun, 11 Sep 2016 23:39:09 +0200
> > Giuseppe Scrivano  wrote:
> >
> >> Hi Matthew,
> >> 
> >> Matthew White  writes:
> >> 
> >> > +void
> >> > +replace_metalink_basename (char **name, char *ref)
> >> > +{
> >> > +  size_t dir_len = 0;
> >> > +  char *p, *dir, *file, *new;
> >> > +
> >> > +  if (!name)
> >> > +return;
> >> > +
> >> > +  /* New basename from file name reference.  */
> >> > +  file = ref;
> >> > +  if (file)
> >> > +{
> >> > +  p = strrchr (file, '/');
> >> > +  if (p)
> >> > +file = p + 1;
> >> > +}
> >> > +
> >> > +  /* Old directory.  */
> >> > +  dir = NULL;
> >> > +  if (*name)
> >> > +{
> >> > +  p = strrchr (*name, '/');
> >> > +  if (p)
> >> > +dir_len = (p - *name) + 1;
> >> > +}
> >> > +  dir = xstrndup (*name, dir_len);
> >> 
> >> I'll review this patch more in details, but one small thing here, could
> >> we modify name in place and avoid a memory allocation for dir?
> >> 
> >> name[dir_len] = '\0';
> >
> > Function amended to modify *name in place.
> >
> > Followed Tim's suggestions for Patch 09/25 about different environments 
> > compatibility, the function now uses last_component() to detect the 
> > basename:
> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
> >
> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be 
> > 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return 
> > only 'file'?
> >
> > /*
> >   Replace/remove the basename of a file name.
> >
> >   The file name is permanently modified.
> >
> >   Always set NAME to a string, even an empty one.
> >
> >   Use REF's basename as replacement.  If REF is NULL or if it doesn't
> >   provide a valid basename candidate, then remove NAME's basename.
> > */
> > void
> > replace_metalink_basename (char **name, char *ref)
> > {
> 
> is it something we could do using only dirname and basename?  What you
> need here is "dirname(name) + basename(ref)"?

You asked to avoid superfluous memory allocations, right?

> >> I'll review this patch more in details, but one small thing here, could
> >> we modify name in place and avoid a memory allocation for dir?
> >> 
> >> name[dir_len] = '\0';

dir_name() allocates new memory and it may add a '.' (append_dot), there's a 
WARNING disclaimer in this patch:

+/* Insert the current "Directory Options".  */
+if (metalink->origin)
+  {
+/* WARNING: Do not use lib/dirname.c (dir_name) to
+   get the directory name, it may append a dot '.'
+   character to the directory name. */
+metadir = xstrdup (planname);
+replace_metalink_basename (, NULL);
+  }

base_name() allocates new memory, it was also discussed in Patch 09/25 
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00070.html .

> 
> Regards,
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgpS4DwtFlqeP.pgp
Description: PGP signature


Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-14 Thread Giuseppe Scrivano
Hi Matthew,

Matthew White  writes:

> On Sun, 11 Sep 2016 23:39:09 +0200
> Giuseppe Scrivano  wrote:
>
>> Hi Matthew,
>> 
>> Matthew White  writes:
>> 
>> > +void
>> > +replace_metalink_basename (char **name, char *ref)
>> > +{
>> > +  size_t dir_len = 0;
>> > +  char *p, *dir, *file, *new;
>> > +
>> > +  if (!name)
>> > +return;
>> > +
>> > +  /* New basename from file name reference.  */
>> > +  file = ref;
>> > +  if (file)
>> > +{
>> > +  p = strrchr (file, '/');
>> > +  if (p)
>> > +file = p + 1;
>> > +}
>> > +
>> > +  /* Old directory.  */
>> > +  dir = NULL;
>> > +  if (*name)
>> > +{
>> > +  p = strrchr (*name, '/');
>> > +  if (p)
>> > +dir_len = (p - *name) + 1;
>> > +}
>> > +  dir = xstrndup (*name, dir_len);
>> 
>> I'll review this patch more in details, but one small thing here, could
>> we modify name in place and avoid a memory allocation for dir?
>> 
>> name[dir_len] = '\0';
>
> Function amended to modify *name in place.
>
> Followed Tim's suggestions for Patch 09/25 about different environments 
> compatibility, the function now uses last_component() to detect the basename:
> http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html
>
> NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be 
> 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return 
> only 'file'?
>
> /*
>   Replace/remove the basename of a file name.
>
>   The file name is permanently modified.
>
>   Always set NAME to a string, even an empty one.
>
>   Use REF's basename as replacement.  If REF is NULL or if it doesn't
>   provide a valid basename candidate, then remove NAME's basename.
> */
> void
> replace_metalink_basename (char **name, char *ref)
> {

is it something we could do using only dirname and basename?  What you
need here is "dirname(name) + basename(ref)"?

Regards,
Giuseppe



Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml

2016-09-14 Thread Matthew White
On Sun, 11 Sep 2016 23:39:09 +0200
Giuseppe Scrivano  wrote:

> Hi Matthew,
> 
> Matthew White  writes:
> 
> > +void
> > +replace_metalink_basename (char **name, char *ref)
> > +{
> > +  size_t dir_len = 0;
> > +  char *p, *dir, *file, *new;
> > +
> > +  if (!name)
> > +return;
> > +
> > +  /* New basename from file name reference.  */
> > +  file = ref;
> > +  if (file)
> > +{
> > +  p = strrchr (file, '/');
> > +  if (p)
> > +file = p + 1;
> > +}
> > +
> > +  /* Old directory.  */
> > +  dir = NULL;
> > +  if (*name)
> > +{
> > +  p = strrchr (*name, '/');
> > +  if (p)
> > +dir_len = (p - *name) + 1;
> > +}
> > +  dir = xstrndup (*name, dir_len);
> 
> I'll review this patch more in details, but one small thing here, could
> we modify name in place and avoid a memory allocation for dir?
> 
> name[dir_len] = '\0';

Function amended to modify *name in place.

Followed Tim's suggestions for Patch 09/25 about different environments 
compatibility, the function now uses last_component() to detect the basename:
http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html

NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be 
'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return only 
'file'?

/*
  Replace/remove the basename of a file name.

  The file name is permanently modified.

  Always set NAME to a string, even an empty one.

  Use REF's basename as replacement.  If REF is NULL or if it doesn't
  provide a valid basename candidate, then remove NAME's basename.
*/
void
replace_metalink_basename (char **name, char *ref)
{
  char *new, *basename;

  if (!name)
return;

  /* Strip old basename.  */
  if (*name)
{
  basename = last_component (*name);

  if (basename == *name)
xfree (*name);
  else
*basename = '\0';
}

  /* New basename from file name reference.  */
  if (ref)
ref = last_component (ref);

  /* Replace the old basename.  */
  new = aprintf ("%s%s", *name ? *name : "", ref ? ref : "");
  xfree (*name);
  *name = new;
}

> 
> Giuseppe

Regards,
Matthew

-- 
Matthew White 


pgprJOVF1ZTGo.pgp
Description: PGP signature