Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Fri, 16 Sep 2016 10:15:17 +0200 Tim Ruehsenwrote: > On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote: > > On Thu, 15 Sep 2016 16:48:01 +0200 > > > > Tim Ruehsen wrote: > > > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote: > > > > On Thu, 15 Sep 2016 09:11:31 +0200 > > > > > > > > Giuseppe Scrivano wrote: > > > > > Hi Matthew, > > > > > > > > > > Matthew White writes: > > > > > >> > Function amended to modify *name in place. > > > > > >> > > > > > > >> > Followed Tim's suggestions for Patch 09/25 about different > > > > > >> > environments compatibility, the function now uses > > > > > >> > last_component() > > > > > >> > to detect the basename: > > > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > > > > >> > > > > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the > > > > > >> > result > > > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters > > > > > >> > 'C:D:' and return only 'file'? > > > > > >> > > > > > > >> > /* > > > > > >> > > > > > > >> > Replace/remove the basename of a file name. > > > > > >> > > > > > > >> > The file name is permanently modified. > > > > > >> > > > > > > >> > Always set NAME to a string, even an empty one. > > > > > >> > > > > > > >> > Use REF's basename as replacement. If REF is NULL or if it > > > > > >> > doesn't > > > > > >> > provide a valid basename candidate, then remove NAME's > > > > > >> > basename. > > > > > >> > > > > > > >> > */ > > > > > >> > void > > > > > >> > replace_metalink_basename (char **name, char *ref) > > > > > >> > { > > > > > >> > > > > > >> is it something we could do using only dirname and basename? What > > > > > >> you > > > > > >> need here is "dirname(name) + basename(ref)"? > > > > > > > > > > > > You asked to avoid superfluous memory allocations, right? > > > > > > > > > > yes, and that is still my idea. I was just wondering if the cost of > > > > > these extra memory allocations was worth. Is it enough to use > > > > > last_component() to get it working on Windows? > > > > > > > > Extra memory allocations shouldn't be a concern any longer, they are > > > > removed in the amended function definition previously posted: > > > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html > > > > > > > > About base_name() and last_component() multi-environment compatibility: > > > > * lib/basename.c (base_name): Call last_component() to find the > > > > basename, > > > > allocate the basename (prefix with './' if required) * > > > > lib/basename-lgpl.c > > > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to > > > > isolate the basename * lib/dosname.h: Define the macros > > > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system > > > > environments > > > > > > > > Forgive this copy and paste, but my question about > > > > replace_metalink_basename() is "if *name is NULL and ref is like > > > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove > > > > the drive letters 'C:D:' and return only 'file'?". > > > > > > Yes, if you use the result as file name to any file/system calls. > > > > Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN) > > from the resulting file name, when the file name is a bare basename. > > > > Use the dirname from *name and the basename from ref to compute the > > resulting file name. > > > > *name + ref -> result > > - > > NULL + "foo/C:D:file" -> "file" [bare basename] > > "foobar" + "foo/C:D:file" -> "file" [bare basename] > > "dir/old" + "foo/C:D:file" -> "dir/C:D:file" > > "C:D:file/old" + "foo/E:F:new" -> "C:D:file/E:F:new" [is this ok?] > > Just make sure that no file name beginning with letter+colon is used for > system > calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). > Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has > functions to percent escape special characters in file names, depending on > the > OS it is built on. Thanks. Function amended to always remove prefix drive letters if required (i.e. on w32 environments) for safety reasons. About %-escaping, in src/url.c there is 'static void append_uri_pathel()'. I can't find references to opt.restrict_files_os in other places (except src/init.c and src/options.h). Could be useful to write an extern function to do file name %-escaping? The reason of this function is to replace an existing basename with another basename. Is it appropriate to do a file name verification here? The results in the table below regards w32 environments: *name + ref-> result NULL
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
> From: Tim Ruehsen> Cc: mehw.is...@inventati.org, bug-wget@gnu.org > Date: Fri, 16 Sep 2016 11:15:31 +0200 > > > So if wget needs to create or open such files, it needs to replace the > > colon with some other character, like '!'. > > That is what I meant with 'Wget has functions to percent escape special > characters...'. It is not only colons. And it depends on the OS (and/or file > system). OK, so the problem should not exist, good. > From https://en.wikipedia.org/wiki/Comparison_of_file_systems: > "MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < > | > and NUL in file and directory names across all filesystems. Unices and Linux > disallow the characters / and NUL in file and directory names across all > filesystems." This is a better reference, JFYI: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Friday, September 16, 2016 11:39:15 AM CEST Eli Zaretskii wrote: > > From: Tim Ruehsen> > Date: Fri, 16 Sep 2016 10:15:17 +0200 > > Cc: bug-wget@gnu.org > > > > > *name + ref -> result > > > - > > > NULL + "foo/C:D:file" -> "file" [bare basename] > > > "foobar" + "foo/C:D:file" -> "file" [bare basename] > > > "dir/old" + "foo/C:D:file" -> "dir/C:D:file" > > > "C:D:file/old" + "foo/E:F:new" -> "C:D:file/E:F:new" [is this ok?] > > > > Just make sure that no file name beginning with letter+colon is used for > > system calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good > > idea). Either you strip the 'C:D:', or percent escape ':' on Windows. > > Wget has functions to percent escape special characters in file names, > > depending on the OS it is built on. > > (I've lost track of this discussion, and don't understand the context > well enough to get back on track, so please bear with me.) > > Windows filesystems will not allow file names that have embedded colon > characters, except if that colon is part of the drive specification at > the beginning of a file name, as in "D:/dir/file". File names like > the 2 last results above are not allowed, and cannot be created or > opened. > > So if wget needs to create or open such files, it needs to replace the > colon with some other character, like '!'. That is what I meant with 'Wget has functions to percent escape special characters...'. It is not only colons. And it depends on the OS (and/or file system). From https://en.wikipedia.org/wiki/Comparison_of_file_systems: "MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < | and NUL in file and directory names across all filesystems. Unices and Linux disallow the characters / and NUL in file and directory names across all filesystems." Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
> From: Tim Ruehsen> Date: Fri, 16 Sep 2016 10:15:17 +0200 > Cc: bug-wget@gnu.org > > > *name + ref -> result > > - > > NULL + "foo/C:D:file" -> "file" [bare basename] > > "foobar" + "foo/C:D:file" -> "file" [bare basename] > > "dir/old" + "foo/C:D:file" -> "dir/C:D:file" > > "C:D:file/old" + "foo/E:F:new" -> "C:D:file/E:F:new" [is this ok?] > > Just make sure that no file name beginning with letter+colon is used for > system > calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). > Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has > functions to percent escape special characters in file names, depending on > the > OS it is built on. (I've lost track of this discussion, and don't understand the context well enough to get back on track, so please bear with me.) Windows filesystems will not allow file names that have embedded colon characters, except if that colon is part of the drive specification at the beginning of a file name, as in "D:/dir/file". File names like the 2 last results above are not allowed, and cannot be created or opened. So if wget needs to create or open such files, it needs to replace the colon with some other character, like '!'. Again, apologies if this comment makes no sense in the context of whatever you've been discussing.
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Thursday, September 15, 2016 7:56:45 PM CEST Matthew White wrote: > On Thu, 15 Sep 2016 16:48:01 +0200 > > Tim Ruehsenwrote: > > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote: > > > On Thu, 15 Sep 2016 09:11:31 +0200 > > > > > > Giuseppe Scrivano wrote: > > > > Hi Matthew, > > > > > > > > Matthew White writes: > > > > >> > Function amended to modify *name in place. > > > > >> > > > > > >> > Followed Tim's suggestions for Patch 09/25 about different > > > > >> > environments compatibility, the function now uses > > > > >> > last_component() > > > > >> > to detect the basename: > > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > > > >> > > > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the > > > > >> > result > > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters > > > > >> > 'C:D:' and return only 'file'? > > > > >> > > > > > >> > /* > > > > >> > > > > > >> > Replace/remove the basename of a file name. > > > > >> > > > > > >> > The file name is permanently modified. > > > > >> > > > > > >> > Always set NAME to a string, even an empty one. > > > > >> > > > > > >> > Use REF's basename as replacement. If REF is NULL or if it > > > > >> > doesn't > > > > >> > provide a valid basename candidate, then remove NAME's > > > > >> > basename. > > > > >> > > > > > >> > */ > > > > >> > void > > > > >> > replace_metalink_basename (char **name, char *ref) > > > > >> > { > > > > >> > > > > >> is it something we could do using only dirname and basename? What > > > > >> you > > > > >> need here is "dirname(name) + basename(ref)"? > > > > > > > > > > You asked to avoid superfluous memory allocations, right? > > > > > > > > yes, and that is still my idea. I was just wondering if the cost of > > > > these extra memory allocations was worth. Is it enough to use > > > > last_component() to get it working on Windows? > > > > > > Extra memory allocations shouldn't be a concern any longer, they are > > > removed in the amended function definition previously posted: > > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html > > > > > > About base_name() and last_component() multi-environment compatibility: > > > * lib/basename.c (base_name): Call last_component() to find the > > > basename, > > > allocate the basename (prefix with './' if required) * > > > lib/basename-lgpl.c > > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to > > > isolate the basename * lib/dosname.h: Define the macros > > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system > > > environments > > > > > > Forgive this copy and paste, but my question about > > > replace_metalink_basename() is "if *name is NULL and ref is like > > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove > > > the drive letters 'C:D:' and return only 'file'?". > > > > Yes, if you use the result as file name to any file/system calls. > > Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN) > from the resulting file name, when the file name is a bare basename. > > Use the dirname from *name and the basename from ref to compute the > resulting file name. > > *name + ref -> result > - > NULL + "foo/C:D:file" -> "file" [bare basename] > "foobar" + "foo/C:D:file" -> "file" [bare basename] > "dir/old" + "foo/C:D:file" -> "dir/C:D:file" > "C:D:file/old" + "foo/E:F:new" -> "C:D:file/E:F:new" [is this ok?] Just make sure that no file name beginning with letter+colon is used for system calls on Windows (e.g. open("C:D:file/E:F:new", ...) is not a good idea). Either you strip the 'C:D:', or percent escape ':' on Windows. Wget has functions to percent escape special characters in file names, depending on the OS it is built on. Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Thu, 15 Sep 2016 16:48:01 +0200 Tim Ruehsenwrote: > On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote: > > On Thu, 15 Sep 2016 09:11:31 +0200 > > > > Giuseppe Scrivano wrote: > > > Hi Matthew, > > > > > > Matthew White writes: > > > >> > Function amended to modify *name in place. > > > >> > > > > >> > Followed Tim's suggestions for Patch 09/25 about different > > > >> > environments compatibility, the function now uses last_component() > > > >> > to detect the basename: > > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > > >> > > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result > > > >> > will be 'C:D:file'; is it advisable to remove the drive letters > > > >> > 'C:D:' and return only 'file'? > > > >> > > > > >> > /* > > > >> > > > > >> > Replace/remove the basename of a file name. > > > >> > > > > >> > The file name is permanently modified. > > > >> > > > > >> > Always set NAME to a string, even an empty one. > > > >> > > > > >> > Use REF's basename as replacement. If REF is NULL or if it doesn't > > > >> > provide a valid basename candidate, then remove NAME's basename. > > > >> > > > > >> > */ > > > >> > void > > > >> > replace_metalink_basename (char **name, char *ref) > > > >> > { > > > >> > > > >> is it something we could do using only dirname and basename? What you > > > >> need here is "dirname(name) + basename(ref)"? > > > > > > > > You asked to avoid superfluous memory allocations, right? > > > > > > yes, and that is still my idea. I was just wondering if the cost of > > > these extra memory allocations was worth. Is it enough to use > > > last_component() to get it working on Windows? > > > > Extra memory allocations shouldn't be a concern any longer, they are removed > > in the amended function definition previously posted: > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html > > > > About base_name() and last_component() multi-environment compatibility: > > * lib/basename.c (base_name): Call last_component() to find the basename, > > allocate the basename (prefix with './' if required) * lib/basename-lgpl.c > > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to > > isolate the basename * lib/dosname.h: Define the macros > > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system environments > > > > Forgive this copy and paste, but my question about > > replace_metalink_basename() is "if *name is NULL and ref is like > > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove > > the drive letters 'C:D:' and return only 'file'?". > > Yes, if you use the result as file name to any file/system calls. Function amended to remove prefix drive letters (aka FILE_SYSTEM_PREFIX_LEN) from the resulting file name, when the file name is a bare basename. Use the dirname from *name and the basename from ref to compute the resulting file name. *name + ref -> result - NULL + "foo/C:D:file" -> "file" [bare basename] "foobar" + "foo/C:D:file" -> "file" [bare basename] "dir/old" + "foo/C:D:file" -> "dir/C:D:file" "C:D:file/old" + "foo/E:F:new" -> "C:D:file/E:F:new" [is this ok?] void replace_metalink_basename (char **name, char *ref) { int n; char *new, *basename; if (!name) return; /* Strip old basename. */ if (*name) { basename = last_component (*name); if (basename == *name) xfree (*name); else *basename = '\0'; } /* New basename from file name reference. */ if (ref) { ref = last_component (ref); if (!*name) while ((n = FILE_SYSTEM_PREFIX_LEN (ref))) ref += n; } /* Replace the old basename. */ new = aprintf ("%s%s", *name ? *name : "", ref ? ref : ""); xfree (*name); *name = new; } > > > PS: I'm currently working to create a common python class for the Metalink > > tests (testenv), see Patch 01/25 > > :thumbs up: All right! > > Regards, Tim Regards, Matthew -- Matthew White pgpBRFXhcVYFy.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Thursday, September 15, 2016 4:16:44 PM CEST Matthew White wrote: > On Thu, 15 Sep 2016 09:11:31 +0200 > > Giuseppe Scrivanowrote: > > Hi Matthew, > > > > Matthew White writes: > > >> > Function amended to modify *name in place. > > >> > > > >> > Followed Tim's suggestions for Patch 09/25 about different > > >> > environments compatibility, the function now uses last_component() > > >> > to detect the basename: > > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > >> > > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result > > >> > will be 'C:D:file'; is it advisable to remove the drive letters > > >> > 'C:D:' and return only 'file'? > > >> > > > >> > /* > > >> > > > >> > Replace/remove the basename of a file name. > > >> > > > >> > The file name is permanently modified. > > >> > > > >> > Always set NAME to a string, even an empty one. > > >> > > > >> > Use REF's basename as replacement. If REF is NULL or if it doesn't > > >> > provide a valid basename candidate, then remove NAME's basename. > > >> > > > >> > */ > > >> > void > > >> > replace_metalink_basename (char **name, char *ref) > > >> > { > > >> > > >> is it something we could do using only dirname and basename? What you > > >> need here is "dirname(name) + basename(ref)"? > > > > > > You asked to avoid superfluous memory allocations, right? > > > > yes, and that is still my idea. I was just wondering if the cost of > > these extra memory allocations was worth. Is it enough to use > > last_component() to get it working on Windows? > > Extra memory allocations shouldn't be a concern any longer, they are removed > in the amended function definition previously posted: > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html > > About base_name() and last_component() multi-environment compatibility: > * lib/basename.c (base_name): Call last_component() to find the basename, > allocate the basename (prefix with './' if required) * lib/basename-lgpl.c > (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to > isolate the basename * lib/dosname.h: Define the macros > FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system environments > > Forgive this copy and paste, but my question about > replace_metalink_basename() is "if *name is NULL and ref is like > 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove > the drive letters 'C:D:' and return only 'file'?". Yes, if you use the result as file name to any file/system calls. > PS: I'm currently working to create a common python class for the Metalink > tests (testenv), see Patch 01/25 :thumbs up: Regards, Tim signature.asc Description: This is a digitally signed message part.
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Thu, 15 Sep 2016 09:11:31 +0200 Giuseppe Scrivanowrote: > Hi Matthew, > > Matthew White writes: > > >> > Function amended to modify *name in place. > >> > > >> > Followed Tim's suggestions for Patch 09/25 about different environments > >> > compatibility, the function now uses last_component() to detect the > >> > basename: > >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > >> > > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will > >> > be 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and > >> > return only 'file'? > >> > > >> > /* > >> > Replace/remove the basename of a file name. > >> > > >> > The file name is permanently modified. > >> > > >> > Always set NAME to a string, even an empty one. > >> > > >> > Use REF's basename as replacement. If REF is NULL or if it doesn't > >> > provide a valid basename candidate, then remove NAME's basename. > >> > */ > >> > void > >> > replace_metalink_basename (char **name, char *ref) > >> > { > >> > >> is it something we could do using only dirname and basename? What you > >> need here is "dirname(name) + basename(ref)"? > > > > You asked to avoid superfluous memory allocations, right? > > yes, and that is still my idea. I was just wondering if the cost of > these extra memory allocations was worth. Is it enough to use > last_component() to get it working on Windows? Extra memory allocations shouldn't be a concern any longer, they are removed in the amended function definition previously posted: http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00094.html About base_name() and last_component() multi-environment compatibility: * lib/basename.c (base_name): Call last_component() to find the basename, allocate the basename (prefix with './' if required) * lib/basename-lgpl.c (last_component): Use the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to isolate the basename * lib/dosname.h: Define the macros FILE_SYSTEM_PREFIX_LEN and ISSLASH to work on different system environments Forgive this copy and paste, but my question about replace_metalink_basename() is "if *name is NULL and ref is like 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return only 'file'?". PS: I'm currently working to create a common python class for the Metalink tests (testenv), see Patch 01/25 http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00085.html . > > Thanks, > Giuseppe Regards, Matthew -- Matthew White pgp5rmu9O_nMW.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
Hi Matthew, Matthew Whitewrites: >> > Function amended to modify *name in place. >> > >> > Followed Tim's suggestions for Patch 09/25 about different environments >> > compatibility, the function now uses last_component() to detect the >> > basename: >> > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html >> > >> > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be >> > 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return >> > only 'file'? >> > >> > /* >> > Replace/remove the basename of a file name. >> > >> > The file name is permanently modified. >> > >> > Always set NAME to a string, even an empty one. >> > >> > Use REF's basename as replacement. If REF is NULL or if it doesn't >> > provide a valid basename candidate, then remove NAME's basename. >> > */ >> > void >> > replace_metalink_basename (char **name, char *ref) >> > { >> >> is it something we could do using only dirname and basename? What you >> need here is "dirname(name) + basename(ref)"? > > You asked to avoid superfluous memory allocations, right? yes, and that is still my idea. I was just wondering if the cost of these extra memory allocations was worth. Is it enough to use last_component() to get it working on Windows? Thanks, Giuseppe
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Wed, 14 Sep 2016 13:03:58 +0200 Giuseppe Scrivanowrote: > Hi Matthew, > > Matthew White writes: > > > On Sun, 11 Sep 2016 23:39:09 +0200 > > Giuseppe Scrivano wrote: > > > >> Hi Matthew, > >> > >> Matthew White writes: > >> > >> > +void > >> > +replace_metalink_basename (char **name, char *ref) > >> > +{ > >> > + size_t dir_len = 0; > >> > + char *p, *dir, *file, *new; > >> > + > >> > + if (!name) > >> > +return; > >> > + > >> > + /* New basename from file name reference. */ > >> > + file = ref; > >> > + if (file) > >> > +{ > >> > + p = strrchr (file, '/'); > >> > + if (p) > >> > +file = p + 1; > >> > +} > >> > + > >> > + /* Old directory. */ > >> > + dir = NULL; > >> > + if (*name) > >> > +{ > >> > + p = strrchr (*name, '/'); > >> > + if (p) > >> > +dir_len = (p - *name) + 1; > >> > +} > >> > + dir = xstrndup (*name, dir_len); > >> > >> I'll review this patch more in details, but one small thing here, could > >> we modify name in place and avoid a memory allocation for dir? > >> > >> name[dir_len] = '\0'; > > > > Function amended to modify *name in place. > > > > Followed Tim's suggestions for Patch 09/25 about different environments > > compatibility, the function now uses last_component() to detect the > > basename: > > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > > > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be > > 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return > > only 'file'? > > > > /* > > Replace/remove the basename of a file name. > > > > The file name is permanently modified. > > > > Always set NAME to a string, even an empty one. > > > > Use REF's basename as replacement. If REF is NULL or if it doesn't > > provide a valid basename candidate, then remove NAME's basename. > > */ > > void > > replace_metalink_basename (char **name, char *ref) > > { > > is it something we could do using only dirname and basename? What you > need here is "dirname(name) + basename(ref)"? You asked to avoid superfluous memory allocations, right? > >> I'll review this patch more in details, but one small thing here, could > >> we modify name in place and avoid a memory allocation for dir? > >> > >> name[dir_len] = '\0'; dir_name() allocates new memory and it may add a '.' (append_dot), there's a WARNING disclaimer in this patch: +/* Insert the current "Directory Options". */ +if (metalink->origin) + { +/* WARNING: Do not use lib/dirname.c (dir_name) to + get the directory name, it may append a dot '.' + character to the directory name. */ +metadir = xstrdup (planname); +replace_metalink_basename (, NULL); + } base_name() allocates new memory, it was also discussed in Patch 09/25 http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00070.html . > > Regards, > Giuseppe Regards, Matthew -- Matthew White pgpS4DwtFlqeP.pgp Description: PGP signature
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
Hi Matthew, Matthew Whitewrites: > On Sun, 11 Sep 2016 23:39:09 +0200 > Giuseppe Scrivano wrote: > >> Hi Matthew, >> >> Matthew White writes: >> >> > +void >> > +replace_metalink_basename (char **name, char *ref) >> > +{ >> > + size_t dir_len = 0; >> > + char *p, *dir, *file, *new; >> > + >> > + if (!name) >> > +return; >> > + >> > + /* New basename from file name reference. */ >> > + file = ref; >> > + if (file) >> > +{ >> > + p = strrchr (file, '/'); >> > + if (p) >> > +file = p + 1; >> > +} >> > + >> > + /* Old directory. */ >> > + dir = NULL; >> > + if (*name) >> > +{ >> > + p = strrchr (*name, '/'); >> > + if (p) >> > +dir_len = (p - *name) + 1; >> > +} >> > + dir = xstrndup (*name, dir_len); >> >> I'll review this patch more in details, but one small thing here, could >> we modify name in place and avoid a memory allocation for dir? >> >> name[dir_len] = '\0'; > > Function amended to modify *name in place. > > Followed Tim's suggestions for Patch 09/25 about different environments > compatibility, the function now uses last_component() to detect the basename: > http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html > > NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be > 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return > only 'file'? > > /* > Replace/remove the basename of a file name. > > The file name is permanently modified. > > Always set NAME to a string, even an empty one. > > Use REF's basename as replacement. If REF is NULL or if it doesn't > provide a valid basename candidate, then remove NAME's basename. > */ > void > replace_metalink_basename (char **name, char *ref) > { is it something we could do using only dirname and basename? What you need here is "dirname(name) + basename(ref)"? Regards, Giuseppe
Re: [Bug-wget] [PATCH 20/25] New option --metalink-index to process Metalink application/metalink4+xml
On Sun, 11 Sep 2016 23:39:09 +0200 Giuseppe Scrivanowrote: > Hi Matthew, > > Matthew White writes: > > > +void > > +replace_metalink_basename (char **name, char *ref) > > +{ > > + size_t dir_len = 0; > > + char *p, *dir, *file, *new; > > + > > + if (!name) > > +return; > > + > > + /* New basename from file name reference. */ > > + file = ref; > > + if (file) > > +{ > > + p = strrchr (file, '/'); > > + if (p) > > +file = p + 1; > > +} > > + > > + /* Old directory. */ > > + dir = NULL; > > + if (*name) > > +{ > > + p = strrchr (*name, '/'); > > + if (p) > > +dir_len = (p - *name) + 1; > > +} > > + dir = xstrndup (*name, dir_len); > > I'll review this patch more in details, but one small thing here, could > we modify name in place and avoid a memory allocation for dir? > > name[dir_len] = '\0'; Function amended to modify *name in place. Followed Tim's suggestions for Patch 09/25 about different environments compatibility, the function now uses last_component() to detect the basename: http://lists.gnu.org/archive/html/bug-wget/2016-09/msg00083.html NOTES: if *name is NULL and ref is like 'dir/C:D:file', the result will be 'C:D:file'; is it advisable to remove the drive letters 'C:D:' and return only 'file'? /* Replace/remove the basename of a file name. The file name is permanently modified. Always set NAME to a string, even an empty one. Use REF's basename as replacement. If REF is NULL or if it doesn't provide a valid basename candidate, then remove NAME's basename. */ void replace_metalink_basename (char **name, char *ref) { char *new, *basename; if (!name) return; /* Strip old basename. */ if (*name) { basename = last_component (*name); if (basename == *name) xfree (*name); else *basename = '\0'; } /* New basename from file name reference. */ if (ref) ref = last_component (ref); /* Replace the old basename. */ new = aprintf ("%s%s", *name ? *name : "", ref ? ref : ""); xfree (*name); *name = new; } > > Giuseppe Regards, Matthew -- Matthew White pgprJOVF1ZTGo.pgp Description: PGP signature