libmetalink: Metalink C library

2008-06-13 Thread Tatsuhiro Tsujikawa
Hi,

I read about the planned Metalink support in wget, so I am posting this message.
I've started a project named libmetalink, which is a Metalink library
written in C. It is intended to let programs written in C add Metalink
functionality, such as parsing Metalink XML files.
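
(A rough sketch of how a C program might call into such a library; the names
metalink_parse_file, metalink_t, metalink_file_t and metalink_delete below are
placeholders for illustration, not the actual libmetalink API, which is still
taking shape.)

  /* Hypothetical usage sketch -- the header path, function and type names
   * are assumptions, not the real libmetalink interface. */
  #include <stdio.h>
  #include <metalink/metalink.h>   /* assumed header location */

  int main(int argc, char **argv)
  {
      metalink_t *ml = NULL;

      if (argc < 2) {
          fprintf(stderr, "usage: %s FILE.metalink\n", argv[0]);
          return 1;
      }

      /* Parse the Metalink XML document into an in-memory structure. */
      if (metalink_parse_file(argv[1], &ml) != 0) {
          fprintf(stderr, "failed to parse %s\n", argv[1]);
          return 1;
      }

      /* Walk the described files and print name and size. */
      for (metalink_file_t **fp = ml->files; fp && *fp; ++fp)
          printf("%s (%lld bytes)\n", (*fp)->name, (long long)(*fp)->size);

      metalink_delete(ml);   /* free the parsed structure */
      return 0;
  }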

The project is hosted on google-code:
http://code.google.com/p/libmetalink/
and the license is the MIT license, which is compatible with the GPLv3 that
wget currently uses.

The project has just started, so it still lacks documentation, a road map, etc.
I'll add these over the course of development.

If anyone is interested in this project, please inform me.

Thanks,

Tatsuhiro Tsujikawa



RE: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread Tony Lewis
Micah Cowan wrote:

 Unfortunately, nothing really comes to mind. If you'd like, you could
 file a feature request at
 https://savannah.gnu.org/bugs/?func=additem&group=wget, for an option
 asking Wget to treat URLs case-insensitively.

To have the effect that Allan seeks, I think the option would have to convert 
all URIs to lower case at an appropriate point in the process. I think you 
probably want to send the original case to the server (just in case it really 
does matter to the server). If you're going to treat different case URIs as 
matching then the lower-case version will have to be stored in the hash. The 
most important part (from the perspective that Allan voices) is that the 
versions written to disk use lower case characters.
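
(For illustration, a minimal sketch of the step described above: keep the
original URI untouched for the request, and derive a lower-cased copy to use as
the hash/blacklist key and the local file name. The helper below is
hypothetical, not a Wget function.)

  #include <ctype.h>
  #include <stdlib.h>
  #include <string.h>

  /* Hypothetical helper: return a newly allocated lower-cased copy of a
   * URI, to be used only as a hash key / on-disk name.  The original
   * string is left alone so the request sent to the server keeps its
   * original case. */
  static char *uri_lowercase_dup(const char *uri)
  {
      size_t len = strlen(uri);
      char *copy = malloc(len + 1);
      if (!copy)
          return NULL;
      for (size_t i = 0; i <= len; i++)   /* includes the terminating NUL */
          copy[i] = (char) tolower((unsigned char) uri[i]);
      return copy;
  }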

Tony



Re: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread mm w
standard: URLs are case-insensitive

you can adapt your software because some people don't respect the standard,
but we are not in the '90s anymore; let people doing crappy things deal with
their crappy world

Cheers!

On Fri, Jun 13, 2008 at 2:08 PM, Tony Lewis [EMAIL PROTECTED] wrote:
 Micah Cowan wrote:

 Unfortunately, nothing really comes to mind. If you'd like, you could
 file a feature request at
 https://savannah.gnu.org/bugs/?func=additem&group=wget, for an option
 asking Wget to treat URLs case-insensitively.

 To have the effect that Allan seeks, I think the option would have to convert 
 all URIs to lower case at an appropriate point in the process. I think you 
 probably want to send the original case to the server (just in case it really 
 does matter to the server). If you're going to treat different case URIs as 
 matching then the lower-case version will have to be stored in the hash. The 
 most important part (from the perspective that Allan voices) is that the 
 versions written to disk use lower case characters.

 Tony





-- 
-mmw


Re: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread Micah Cowan

Tony Lewis wrote:
 Micah Cowan wrote:
 
 Unfortunately, nothing really comes to mind. If you'd like, you
 could file a feature request at 
 https://savannah.gnu.org/bugs/?func=additem&group=wget, for an
 option asking Wget to treat URLs case-insensitively.
 
 To have the effect that Allan seeks, I think the option would have to
 convert all URIs to lower case at an appropriate point in the
 process. I think you probably want to send the original case to the
 server (just in case it really does matter to the server). If you're
 going to treat different case URIs as matching then the lower-case
 version will have to be stored in the hash. The most important part
 (from the perspective that Allan voices) is that the versions written
 to disk use lower case characters.

Well, that really depends. If it's doing a straight recursive download,
without preexisting local files, then all that's really necessary is to
do lookups/stores in the blacklist in a case-normalized manner.

If preexisting files matter, then yes, your solution would fix it.
Another solution would be to scan directory contents for the first name
that matches case insensitively. That's obviously much less efficient,
but has the advantage that the file will match at least one of the
real cases from the server.
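
(Roughly what that scan could look like, assuming POSIX opendir/readdir and
strcasecmp; this is not Wget code, just an illustration of the per-lookup cost
and of how the server's own case would be preserved.)

  #include <dirent.h>
  #include <stdlib.h>
  #include <string.h>
  #include <strings.h>   /* strcasecmp */

  /* Illustration only: return a copy of the first entry in `dir` whose
   * name matches `wanted` case-insensitively, or NULL if none does.
   * Each lookup walks the whole directory, which is the inefficiency
   * mentioned above. */
  static char *find_case_insensitive(const char *dir, const char *wanted)
  {
      DIR *d = opendir(dir);
      struct dirent *ent;
      char *match = NULL;

      if (!d)
          return NULL;
      while ((ent = readdir(d)) != NULL) {
          if (strcasecmp(ent->d_name, wanted) == 0) {
              match = strdup(ent->d_name);   /* keep the server's real case */
              break;
          }
      }
      closedir(d);
      return match;
  }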

As Matthias points out, your lower-case normalization solution could be
achieved in a more general manner with a hook, which is something I was
planning on introducing, perhaps in 1.13, anyway (so you could, say, run
sed on the filenames before Wget uses them), so that's probably the
approach I'd take. But probably not before 1.13, even if someone
provides a patch for it in time for 1.12 (too many other things to focus
on, and I'd like to introduce the external command hooks as a suite,
if possible).
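
(One way such a hook could be wired up, purely as a sketch: pipe the candidate
file name through a user-supplied command, e.g. sed 'y/A-Z/a-z/', and use
whatever it prints back. Nothing here reflects an actual Wget design, and the
quoting is deliberately naive.)

  #include <stdio.h>
  #include <string.h>

  /* Sketch of a rename hook: run the candidate name through an external
   * command (hook_cmd) and copy its output into `out`.  Real code would
   * need to escape `name` properly before embedding it in a shell line. */
  static int apply_name_hook(const char *hook_cmd, const char *name,
                             char *out, size_t outlen)
  {
      char cmdline[4096];
      FILE *p;

      /* e.g. hook_cmd = "sed 'y/A-Z/a-z/'" */
      snprintf(cmdline, sizeof cmdline, "printf '%%s' '%s' | %s",
               name, hook_cmd);

      p = popen(cmdline, "r");
      if (!p)
          return -1;
      if (!fgets(out, (int) outlen, p)) {
          pclose(p);
          return -1;
      }
      pclose(p);
      out[strcspn(out, "\n")] = '\0';   /* strip any trailing newline */
      return 0;
  }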

OTOH, case normalization in the blacklists would still be useful, in
addition to that mechanism. Could make another good addition for 1.13
(because it'll be more useful in combination with the rename hooks).

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/


Re: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread Steven M. Schweda
   In the VMS world, where file name case may matter, but usually
doesn't, the normal scheme is to preserve case when creating files, but
to do case-insensitive comparisons on file names.
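
(A tiny sketch of that policy, using an illustrative lookup table rather than
anything VMS- or Wget-specific: names are stored exactly as first seen, so
files keep the server's spelling, but comparisons ignore case, so /Dir/File
and /dir/file resolve to the same entry.)

  #include <strings.h>   /* strcasecmp */

  /* Illustrative table: preserve the original case of stored names,
   * compare ignoring case. */
  struct name_table {
      const char *names[1024];
      int count;
  };

  /* Return the stored (original-case) name matching `name`
   * case-insensitively, adding it first if it is new. */
  static const char *intern_name(struct name_table *t, const char *name)
  {
      for (int i = 0; i < t->count; i++)
          if (strcasecmp(t->names[i], name) == 0)
              return t->names[i];          /* existing spelling wins */
      if (t->count < 1024)
          t->names[t->count++] = name;     /* keep the caller's case */
      return name;
  }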

From Tony Lewis:

 To have the effect that Allan seeks, I think the option would have to
 convert all URIs to lower case at an appropriate point in the process.

   I think that that's the wrong way to look at it.  (See above: preserve
case when creating files, but compare file names case-insensitively.)
Implementation details like name hashing may also need to be adjusted, but
this shouldn't be too hard.



   Steven M. Schweda   [EMAIL PROTECTED]
   382 South Warwick Street        (+1) 651-699-9818
   Saint Paul  MN  55105-2547


RE: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread Tony Lewis
mm w wrote:

 standard: URLs are case-insensitive

 you can adapt your software because some people don't respect the standard,
 but we are not in the '90s anymore; let people doing crappy things deal with
 their crappy world

You obviously missed the point of the original posting: how can one
conveniently mirror a site whose server uses case-insensitive names onto a
server that uses case-sensitive names?

If the original site is given the URI strings /dir/file, /dir/File, /Dir/file,
and /Dir/File, the same local file will be returned for each. However, wget
will treat those as unique directories and files, and you wind up with four
copies.

Allan asked if there is a way to have wget just create one copy and proposed 
one way that might accomplish that goal.

Tony



RE: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread Tony Lewis
Steven M. Schweda wrote:

 From Tony Lewis:
  To have the effect that Allan seeks, I think the option would have to
  convert all URIs to lower case at an appropriate point in the process.

   I think that that's the wrong way to look at it.  Implementation
 details like name hashing may also need to be adjusted, but this
 shouldn't be too hard.

OK. How would you normalize the names?

Tony



Re: Wget 1.11.3 - case sensitivity and URLs

2008-06-13 Thread mm w
Hi, after all it's only my point of view :D
anyway,

/dir/file,
/dir/File, non-standard
/Dir/file, non-standard
and /Dir/File, non-standard

that's it; if the server manages non-standard URLs, it's not my
concern, for me it doesn't exist


On Fri, Jun 13, 2008 at 3:12 PM, Tony Lewis [EMAIL PROTECTED] wrote:
 mm w wrote:

 standard: URLs are case-insensitive

 you can adapt your software because some people don't respect the standard,
 but we are not in the '90s anymore; let people doing crappy things deal with
 their crappy world

 You obviously missed the point of the original posting: how can one
 conveniently mirror a site whose server uses case-insensitive names onto a
 server that uses case-sensitive names?

 If the original site is given the URI strings /dir/file, /dir/File, /Dir/file,
 and /Dir/File, the same local file will be returned for each. However, wget
 will treat those as unique directories and files, and you wind up with four
 copies.

 Allan asked if there is a way to have wget just create one copy and proposed 
 one way that might accomplish that goal.

 Tony





-- 
-mmw