Re: wget url with hash # issue

2007-09-06 Thread Micah Cowan

Aram Wool wrote:
 Hi, I'm having trouble retrieving an mp3 file from a url of the form
 
 http://www.websitename.com/HTML/typo3conf/ext/naksci_synd/mod1/index.php?mode=LATEST&pid=13&recursive=255&feeduid=1&feed=Normal&user=8&hash=d84a36bbaa1906cc07007557c6b60395
 
 entering this url in a browser opens the 'save as' dialogue box for the
 mp3, but the file isn't found if wget is used instead.

Well, since the above URL doesn't point to any real resource, we can't
really track down what problems you may be having.

Also, the URL doesn't seem to have anything to do with the subject of
your message, which mentions a "hash #" (unless you mean "hash number",
the last parameter in the query string; that's ambiguous, because the
# itself is often called a "hash mark").

Since you haven't given us enough information to help you, I can only
hazard a wide guess, and wonder if the site might be explicitly blocking
wget, in which case you can use the --user-agent option to trick it (try
a value like 'Mozilla', or emulate whatever your browser sends).
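
To make the User-Agent suggestion concrete: on the wget command line it
would be something like --user-agent=Mozilla in front of the URL. The
same idea in a short Python sketch, using only the standard library, is
below. The function name build_request, the default agent string and
the example.com URL are placeholders chosen for illustration; whether
the site in question actually filters on User-Agent is only a guess.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL; nothing is sent over the network until urlopen() is called.
req = build_request("http://www.example.com/file.mp3")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))

# To actually fetch the resource:
#     with urllib.request.urlopen(req) as resp:
#         data = resp.read()
```

If the download works with a browser-like agent string but fails with
the default one, that would point to the server filtering on it.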

 Also, is it possible to add an asterisk to a url so as to indicate that
 wget should ignore the characters before or after it?

I really don't understand what you're asking for here. If you want Wget
to ignore the characters you've specified, why specify them in the first
place?

If you mean that you want Wget to find any file that matches that
wildcard, well no: Wget can do that for FTP, which supports directory
listings; it can't do that for HTTP, which has no means for listing
files in a directory (unless it has been extended, for example with
WebDAV, to do so).

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



RE: wget url with hash # issue

2007-09-06 Thread Tony Lewis
Micah Cowan wrote:

 If you mean that you want Wget to find any file that matches that
 wildcard, well no: Wget can do that for FTP, which supports directory
 listings; it can't do that for HTTP, which has no means for listing
 files in a directory (unless it has been extended, for example with
 WebDAV, to do so).

Seems to me that is a big "unless", because we've all seen lots of websites
that have http directory listings. Apache will do it out of the box (and by
default) if there is no index.htm[l] file in the directory.

Perhaps we could have a feature to grab all or some of the files in an HTTP
directory listing. Maybe something like this could be made to work:

wget http://www.exelana.com/images/mc*.gif

Perhaps we would need an option such as --http-directory (the first thing
that came to mind, but not necessarily the most intuitive name for the
option) to explicitly tell wget how it is expected to behave. Or perhaps it
can just strip the filename from the request whenever an http URL contains
wildcards.

At any rate (with or without the command line option), wget would retrieve
http://www.exelana.com/images/ and then retrieve any links where the target
matches mc*.gif.
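
A rough sketch of that behaviour, in Python rather than in wget's own C,
might look like the following. It is only an illustration of the idea:
the names LinkCollector, split_wildcard_url and matching_links are made
up here, the sample listing is hand-written, and it assumes the wildcard
appears only in the last path component and that the listing is plain
HTML with ordinary anchor tags.

```python
from fnmatch import fnmatchcase
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_wildcard_url(url):
    """Split 'http://host/dir/mc*.gif' into ('http://host/dir/', 'mc*.gif')."""
    directory, _, pattern = url.rpartition("/")
    return directory + "/", pattern


def matching_links(listing_html, url):
    """Return absolute URLs of listed files whose name matches the wildcard."""
    directory, pattern = split_wildcard_url(url)
    collector = LinkCollector()
    collector.feed(listing_html)
    matches = []
    for href in collector.hrefs:
        target = urljoin(directory, href)
        name = urlsplit(target).path.rpartition("/")[2]
        # Only accept links that stay inside the listed directory.
        if target.startswith(directory) and fnmatchcase(name, pattern):
            matches.append(target)
    return matches


# Hand-written stand-in for what the server might return for /images/.
SAMPLE_LISTING = """
<A HREF="?N=D">Name</A>
<A HREF="/">Parent Directory</A>
<A HREF="mc1.gif">mc1.gif</A>
<A HREF="mc2.gif">mc2.gif</A>
<A HREF="logo.png">logo.png</A>
"""

for link in matching_links(SAMPLE_LISTING, "http://www.exelana.com/images/mc*.gif"):
    print(link)
```

A real implementation would first fetch the directory URL returned by
split_wildcard_url and then download each match; both steps are left
out here so the sketch runs without network access.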

If wget is going to explicitly support http directory listings, it probably
needs to be intelligent enough to ignore the sorting options. In the case of
Apache, that would be things like <A HREF="?N=D">Name</A>.
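
One possible way to recognise such sorting links, sketched in Python: the
column headers in an Apache listing point back at the same directory and
differ only in the query string, so a link with a query and nothing else
can be skipped. The helper name is_sort_link is invented for this
sketch, and the rule is an assumption that may not hold for other
servers' listing formats.

```python
from urllib.parse import urlsplit


def is_sort_link(href):
    """True if href is a query-only reference such as '?N=D'."""
    parts = urlsplit(href)
    return (
        not parts.scheme
        and not parts.netloc
        and not parts.path
        and bool(parts.query)
    )


# Hand-written example hrefs, not taken from a real server.
hrefs = ["?N=D", "?M=A", "mc1.gif", "subdir/", "index.php?mode=LATEST"]
files = [h for h in hrefs if not is_sort_link(h)]
print(files)
```

Links that carry a path as well as a query, like the last one in the
list, are kept, since they may be real resources.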

Anyone have any idea how many different http directory listing formats are
out there?

Tony