Re: I can download with a browser, but not with wget

2007-08-23 Thread Micah Cowan

Rafael Anschau wrote:
> bash-3.2$ wget http://www.intel.com/design/processor/manuals/253667.pdf
> --20:01:01--  http://www.intel.com/design/processor/manuals/253667.pdf
>            => `253667.pdf'
> Resolving www.intel.com... 81.52.134.16, 81.52.134.24
> Connecting to www.intel.com|81.52.134.16|:80... connected.
> HTTP request sent, awaiting response... 403 Forbidden
> 20:01:02 ERROR 403: Forbidden.

--user-agent Mozilla does the trick. Apparently Intel's website does
not like wget. :)
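
For example (the exact user-agent string doesn't much matter; anything
browser-like will do):

  wget --user-agent=Mozilla \
      http://www.intel.com/design/processor/manuals/253667.pdf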

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget 1.10.2 (warnings)

2007-08-24 Thread Micah Cowan

Hi Esin,

Unfortunately, the patch you've provided really doesn't seem
appropriate, as it would try to convert an enumeration value into a
pointer, which, while it might shut up a warning on your compiler, has a
much higher chance of breaking than converting a pointer-to-enum to a
pointer-to-int. If I read it correctly, Wget would almost certainly
crash the moment someone specifies --prefer-family (did you test wget
with --prefer-family before submitting the patch?).

However, you'll be pleased to know that this issue (assuming an enum
type is compatible with int, and likewise for their pointers) appears to
have been fixed in the current trunk.
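
For the record, here's a sketch of the aliasing-safe shape of such a fix
(using the names from the init.c snippet below; the actual trunk change
may differ): decode into a plain int, then assign to the enum, instead
of casting the enum's address to (int *).

  /* Sketch only.  No pointer casts between incompatible types. */
  int family;
  int ok = decode_string (val, choices, countof (choices), &family);
  if (ok)
    opt.prefer_family = family;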

-Micah

Esin Andrey wrote:
> Hi!
> I have downloaded the wget-1.10.2 sources and tried to compile them,
> and I get this warning:
>
> init.c: In function 'cmd_spec_prefer_family':
> init.c:1193: warning: dereferencing type-punned pointer will break
> strict-aliasing rules
>
> I have written a patch which corrects this warning (attached):
>
> diff -urN wget-1.10.2.orig/src/init.c wget-1.10.2/src/init.c
> --- wget-1.10.2.orig/src/init.c   2005-08-09 02:54:16.0 +0400
> +++ wget-1.10.2/src/init.c        2007-08-10 00:16:33.0 +0400
> @@ -1190,7 +1190,7 @@
>      { "none", prefer_none },
>    };
>    int ok = decode_string (val, choices, countof (choices),
> -                          (int *) &opt.prefer_family);
> +                          (int *) opt.prefer_family);
>    if (!ok)
>      fprintf (stderr, _("%s: %s: Invalid value `%s'.\n"), exec_name, com,
>               val);
>    return ok;
> @@ -1455,7 +1455,7 @@
>    for (i = 0; i < itemcount; i++)
>      if (0 == strcasecmp (val, items[i].name))
>        {
> -        *place = items[i].code;
> +        place = (int *) items[i].code;
>          return 1;
>        }
>    return 0;


--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Spider.C

2007-08-27 Thread Micah Cowan

Sorry, hadn't noticed this was also sent to the list. Sending a copy there.

Christopher G. Lewis wrote:
> Micah -
>
> Your latest checkin for Spider.C breaks the Windows build:

<snip>

>    logprintf (LOG_NOTQUIET, ngettext ("Found %d broken link.\n\n",
>                                       "Found %d broken links.\n\n",
>                                       num_elems),
>               num_elems);

Hi Chris,

I didn't realize that lack of NLS gets worked-around by #defining _(s)
to return s... r2364 should hopefully fix your problem.
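
For the curious, the usual shape of that workaround looks something like
the following sketch (not the literal r2364 change; the macro names are
assumptions):

  /* With NLS disabled, stub out the gettext entry points.  ngettext
     needs its own stub that picks singular or plural itself. */
  #ifndef ENABLE_NLS
  # define _(s) (s)
  # define ngettext(sing, plur, n)  ((n) == 1 ? (sing) : (plur))
  #endif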

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: MSDOS/djgpp version

2007-08-28 Thread Micah Cowan

Gisle Vanem wrote:
> I've been porting Wget to MSDOS using the djgpp environment.
> The task was rather simple: adding a new directory ./msdos with
> a config.h and Makefile.dj. I intend to support other DOS compilers
> later.
> Please take a look at the attached patch and ./msdos files. Would there
> be any interest in adding this?

These look decent to me, Gisle; I've added a bug report to Savannah to
mark the task of merging this in. I will need to wait on these changes,
though, until the FSF has your copyright assignment on file.

Huh, there were a couple of constructs in there that I had been thinking
were C99: the use of a final comma after array element initializers,
and the preprocessor's defined() operator. Both of those seem to be in
my draft copy of C89, though, so I guess I'm wrong.

I know final commas in /enum/ declarations are a new addition, maybe
that's what confused me...

The addition of yet more #ifdefs bugs me a small amount; I'm hoping at
some point to convert the code to comply with GNU guidelines that such
things be avoided. But it won't really add more work at that point if I
let these in, and in the meantime they're consistent with the rest of
the code, so I won't ask you to change that.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: ftp-ls.c - patch [wget 1.10.2]

2007-08-29 Thread Micah Cowan

Jason Mancini wrote:
> Since I don't know why that code exists, here is a patch that should
> fix the issue, and remain compatible with whatever the heck that
> decrementing code tries to achieve on some system I have probably
> never seen:

Thanks Jason.

Actually, the code you speak of has already been removed in the current
development version of Wget.

In general, it's best to submit patches against the trunk version of
Wget, as it may be significantly different from the latest official
release. That version can be had via subversion; see
http://www.gnu.org/software/wget/wgetdev.html#development .

Actually, I'm hoping to set up automated nightly source distros and/or
builds (probably just for GNU/Linux systems, at least at first) in the
very near future, at which point you can just grab one of those :)

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Wget 1.11 Release Postponed

2007-08-29 Thread Micah Cowan

Hi folks,

Development for Wget has been proceeding about as planned, which is
pretty amazing considering that the few core developers we have are all
busy with their own day jobs. We are very nearly ready to release Wget
version 1.11, and are confident that we would have been able to do so
by the target date, 15 September 2007.

Unfortunately, however, it looks as if we are going to miss that date
despite being code-ready, as there are some licensing issues with the
new GPLv3, in relation to the licensing exception we have in place to
permit users to link with the OpenSSL library. Obviously, this sort of
thing is not exclusive to GNU Wget; other projects with the same or
similar exceptions are also waiting for this issue to be resolved.

Here's a snippet from an announcement by Brett Smith, FSF's licensing
guru:

> Unfortunately, updating the exceptions has proven to be more difficult than
> we first thought.  That's not because simply changing the words to line up
> with GPLv3's is hard; that part's still easy.  However, all of these
> exceptions were written at a time when GPLv2 was the only version of the
> license in serious use.  Now, for the exceptions that talk about
> relationships with software under other licenses, we have to figure out how
> to update the text so that it interacts with GPLv2 software properly.  And
> this part is not always so easy.
>
> Our lawyers at SFLC are working on all this, and making progress; it's just
> taking more time than we originally thought.

Hopefully, we will be able to release by some time in October, or
possibly late September. In the meantime, it gives us time to fix a
couple of extra bugs we weren't sure would be fixed in time for 1.11,
and to get some extra testing in.

Of course, we technically could revert the GPLv3 merges from Wget, and
release 1.11 on schedule under GPLv2 (or any later version). If the
licensing issues still aren't resolved by late October, I may
consider doing that; however, I would prefer (and I believe RMS would
prefer) that the next release of GNU Wget be under the GPLv3 (and up). I
know that RMS wants as many GNU projects as possible to be licensed
under the GPLv3 as of right now; and I'd rather avoid making a separate
release of Wget 1.11.1 (or whatever) just for the relicensing.

Thanks, everyone, for your patience and understanding. We look forward
to releasing GNU Wget 1.11 in October.

--
Micah J. Cowan
GNU Wget Maintainer


Re: Delayed Page D/L Issue

2007-08-29 Thread Micah Cowan

Jeff Holicky wrote:
> I am trying to download a page from a specific site.
> The page URL is: http://stockcharts.com/charts/adjusthist.html
>
> First, when you open a browser, say Firefox, and go to the site, the top
> part (above the dotted line) will appear. Within a few seconds the
> bottom will appear, which contains the data I need.

<snip>

> Somehow it would be nice to have wget call the page - pause - and then
> save it. I am seeing retries and timeouts and other such time-related
> options, but nothing that would just pause. Another thought would be
> issuing a POST command of some sort, and then using a SLEEP option before
> I issue the next wget call.

That really isn't the problem. Wget always waits until the page is
completely loaded before it saves it; that is, it always waits until the
server closes the connection, or otherwise indicates that the web page
has been completely sent.

The problem is that this particular page, after it has been completely
downloaded, then uses JavaScript to go and fetch further data, and then
display it in the page. There is no way for Wget to get this data and
save it as if it were part of the original page. Wget doesn't understand
JavaScript, though there has been talk about putting (probably limited)
support for it in. Even if it did speak JavaScript, though, it likely
would not be able to understand how to modify the web page so as to save
the version that you see after the JavaScript has completed its tasks.

In short, it is unlikely that Wget will ever really be able to do the
job you want it to do.

Careful analysis of the JavaScript code that page uses might reveal a way
to create your own script to generate a similar page; but this is likely
to be a moderate amount of work, even for a fairly technically savvy user.

Wish I could give you better news... :-/

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Myriad merges

2007-08-30 Thread Micah Cowan

I've just merged a bunch of things into the current trunk, including
Mauro's latest changes related to when HEAD is sent (concerning which he
recently sent an email). Please feel free to beat on it, and report any
bugs here!

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Bad html naming behaviour of wget ( windows version 1.10.2b + trunk:2369 )

2007-08-30 Thread Micah Cowan

Goul_duKat wrote:
> blanked some info with xxx
>
> file used:
> http://files.filefront.com/Oulton_ParkBumpFixrar/;6524403;;/fileinfo.html
>
> is it possible to fix this wrong behavior, so that instead of trying to
> use the url to build the filename, wget uses the right header field to
> make the filename?
>
> just now I got saved a file called X6 instead of the right one,
> Oulton_ParkBumpFix.rar, which is in:
>   Content-Disposition: attachment; filename=Oulton_ParkBumpFix.rar
>
> is it possible to fix this so that when the HTTP header returns the
> filename, wget uses it, and uses the url only when none is found? (is it
> possible to add some command-line switch too, to change this behavior,
> so people can decide what fits for them?)

The current trunk version of Wget will do this if you specify -e
contentdisposition=on.
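
For example (quoting the URL because of the semicolons; the same setting
can also go in .wgetrc as "contentdisposition = on"):

  wget -e contentdisposition=on \
      'http://files.filefront.com/Oulton_ParkBumpFixrar/;6524403;;/fileinfo.html'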

Note that the default behavior shouldn't necessarily be considered
wrong, since Content-Disposition is not a standard HTTP header, but a
common extension borrowed from MIME.

The Content-Disposition code is somewhat experimental, and may not work
perfectly, and also results in sending a lot of extra requests, which is
why it is currently disabled by default. We hope to enable it by default
in Wget 1.12.

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


--quiet and --background

2007-09-01 Thread Micah Cowan

A bug was submitted to Ubuntu's Malone bug tracker, complaining that
--background should not create a wget-log file when --quiet has been
specified:

  https://bugs.launchpad.net/ubuntu/+source/wget/+bug/135063
  https://savannah.gnu.org/bugs/index.php?20917

This seems reasonable to me, despite the fact that there are a few logs
that can still be issued even when --quiet is specified: the likelihood
of needing those is much lower than the likelihood that the user will be
irritated by an empty wget-log appearing whenever they use --background
with --quiet :)
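
That is, with the current behavior, an invocation like the following
leaves an empty wget-log behind even though it was asked to be silent
(example.com standing in for any URL):

  wget -b -q http://example.com/some/file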

Stefan Lendl [EMAIL PROTECTED] has been good enough to prepare a
patch to resolve this issue; however, we did run across a potential
issue. If --server-response is enabled, even if --quiet was specified,
FTP server responses will still be issued (however, HTTP server
responses will _not_ be).

First of all, this is obviously an inconsistency: we should pick one
behavior or the other for FTP and HTTP alike. Second, which is the
correct behavior? Is it reasonable to expect that someone issuing
--quiet -S actually wants the server responses (and nothing else), or
should --quiet override -S (or, even conflict with it)?

It seems to me that a person wanting little more output than -S gives
would still want -nv to say so, rather than -q. --quiet seems like it
should be, well, quiet. Unless the sky is falling.

If we take out the server responses from --quiet, then the only message
it would ever issue would be for xmalloc failure. I think we can live
with that not showing up in backgrounded wgets (after all, if they
wanted to be notified when something went wrong, well, they wouldn't
issue --quiet, now would they? :) )

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Myriad merges

2007-09-02 Thread Micah Cowan

Jochen Roderburg wrote:
> I did only a few first tests now, because the basic test already had a
> problem: with default options the local timestamps are not set at all.
>
> Options: no spider, no -O, no content-disposition:
>
> no timestamping, no local file   no HEAD   but: local timestamp not set to remote
> no timestamping,    local file   no HEAD   but: local timestamp not set to remote
>    timestamping, no local file   no HEAD   but: local timestamp not set to remote
>    timestamping,    local file      HEAD   local timestamp set to remote
>
> In these cases the HEAD is now used again only for the case where it is
> necessary, but the timestamp ...
> One could think that it is now taken only from the HEAD and not from GET.

Hm, that should not be. It should definitely set the timestamp if it
gets downloaded... I'll investigate.

Out of curiosity, was there a specific resource you tested against (just
in case I have difficulty reproducing)?

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget ignores --user and --password if you have a .netrc

2007-09-04 Thread Micah Cowan

Andreas Kohlbach wrote:
> Hi,
>
> though the man page of wget mentions .netrc, I assume this is a bug.
>
> To my understanding, if you provide a --user=user and --password=password
> on the command line, this should overwrite any setting elsewhere, such as
> in .netrc. It doesn't. And it took me quite some time and bothering
> other guys to realise that wget seems to be ignoring --user and
> --password on the command line if a .netrc exists with the matching
> content.

Here's what I see:

search_netrc has the code:

  if (*acc && *passwd)
    return;

This occurs before acc or passwd are set (but strangely, after the netrc
is parsed, which is silly), and means that if acc and passwd already
point at valid strings (even if empty), then we return without changing
them.

This is why I wasn't sure I'd be able to reproduce the problem (though I
hadn't gotten around to trying it yet). However, a closer inspection of
http.c shows I misread the call in gethttp:

  user = u->user;
  passwd = u->passwd;
  search_netrc (u->host, (const char **)&user, (const char **)&passwd, 0);
  user = user ? user : (opt.http_user ? opt.http_user : opt.user);
  passwd = passwd ? passwd : (opt.http_passwd ? opt.http_passwd :
                              opt.passwd);

search_netrc only gets the value of user and passwd, as obtained from
the URL. Therefore, http://foo:[EMAIL PROTECTED]/ will override netrc, but
opt.user and opt.http_user aren't checked until after netrc.

The problem could be solved by simply moving the search_netrc call down
below the other places where user/passwd get set.
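
A sketch of that reordering (just to illustrate the proposal; not a
committed fix; u and opt as in the snippet above):

  /* Consult the URL first, then --user/--password and .wgetrc, and
     only fall back to .netrc when something is still missing. */
  user = u->user ? u->user
                 : (opt.http_user ? opt.http_user : opt.user);
  passwd = u->passwd ? u->passwd
                     : (opt.http_passwd ? opt.http_passwd : opt.passwd);
  if (!user || !passwd)
    search_netrc (u->host, (const char **)&user, (const char **)&passwd, 0);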

As Josh points out, the question remains whether this should be our
behavior; I vote yes, as command-line arguments should always override
rc files, in general. Of course, these values could well have come from
.wgetrc and not the command-line; but even so,
more-specific-to-our-application should override general, so .wgetrc
still wins over .netrc.

While we're changing things, we should move the first snippet I
mentioned above to before where we bother parsing the .netrc.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget (centos build) page-requisites leaves requisites in a bad location

2007-09-04 Thread Micah Cowan

Ed wrote:
> Seen this twice now but unable to track down how it happens.
>
> I am crawling a list of websites which are being kept in a cache area.

<snip>

> A small number of files end up in the wrong location; evidence from
> the logs indicates that these
>
> - are page-requisite downloads, e.g. jpegs generally
> - have a "saved" line of the form 'file.jpg' saved -- i.e. no directory
>   prefix
> - are a small part of the overall crawl activity; most things get put
>   away properly
>
> the html for these pages shows references to the offending item as
> '../../../../../../ ... file.jpg' (in the one case where I counted,
> ../ is repeated 17 times)
>
> the wget logs show that after a spell of correctly saving requisites
> for a site we get a run of these errors until the current download
> finishes processing; each file erroneously saved is associated with a
> log line like
>   Server file no newer than local file `filename.jpg' -- not retrieving.
> However, this is the *first occurrence* of filename.jpg in the log.
>
> I am using CentOS and the CentOS build of wget. I have looked through
> bug trackers in vain; is this a known problem with wget? With CentOS?
> Repeating this particular event did not produce the same problem, and
> as my wget code has not changed, I am assuming it is intermittent in
> some fashion.

It's hard to track down what's wrong unless you can give us a specific
invocation that we can use to test with. It'd also be helpful if you
could provide evidence that these paths are indeed wrong. Comparisons
between the actual URL (both of the requisite and the referring page),
and the location would be helpful, as would a snippet from the debug
log. But even just a page we could try it against, or if it only happens
for a whole site, the main URL you're doing this against, would help.

But, before that, we need you to try to reproduce it with a canonical
version of Wget, please, which you can obtain from
ftp://ftp.gnu.org/gnu/wget/wget-1.10.2.tar.gz. CentOS is a RedHat
derivative, and it is known that RedHat has made some heavy
modifications to Wget, so that their version is not our version.

You might also see how our current development version holds up. You can
get it via Subversion, see
http://www.gnu.org/software/wget/wgetdev.html#development; you'll need
Subversion, GNU Autoconf and GNU Gettext.

Thanks very much for your help!

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget ignores --user and --password if you have a .netrc

2007-09-04 Thread Micah Cowan

Hrvoje Niksic wrote:
> Micah Cowan [EMAIL PROTECTED] writes:
>
>> As Josh points out, the question remains whether this should be our
>> behavior; I vote yes, as command-line arguments should always override
>> rc files, in general. Of course, these values could well have come from
>> .wgetrc and not the command-line; but even so,
>> more-specific-to-our-application should override general, so .wgetrc
>> still wins over .netrc.
>
> I think .netrc was considered more specific because it can provide
> passwords on a per-machine basis, which .wgetrc can't.

Okay, that seems pretty reasonable. Perhaps we should leave this as-is,
then, and revisit this when we've introduced host/path-specific
configuration.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget ignores --user and --password if you have a .netrc

2007-09-04 Thread Micah Cowan

Andreas Kohlbach wrote:
> On Tue, Sep 04, 2007 at 02:48:25PM -0700, Micah Cowan wrote:
>> Hrvoje Niksic wrote:
>>> Micah Cowan [EMAIL PROTECTED] writes:
>>>
>>>> As Josh points out, the question remains whether this should be our
>>>> behavior; I vote yes, as command-line arguments should always override
>>>> rc files, in general. Of course, these values could well have come from
>>>> .wgetrc and not the command-line; but even so,
>>>> more-specific-to-our-application should override general, so .wgetrc
>>>> still wins over .netrc.
>>> I think .netrc was considered more specific because it can provide
>>> passwords on a per-machine basis, which .wgetrc can't.
>> Okay, that seems pretty reasonable. Perhaps we should leave this as-is,
>> then, and revisit this when we've introduced host/path-specific
>> configuration.
>
> I'd say a command-line option shall always win against anything else.
> And usually it does, everywhere, as far as I know.
>
> If the current behavior stays as-is, maybe the method
>
> | wget ftp://user_name:[EMAIL PROTECTED]
>
> should be mentioned as a workaround in the wget man page?

(I'm assuming you meant to Cc this to the list? Apologies if this is not
the case)

Yes, but such usage won't apply recursively for such things as http (and
perhaps not even for ftp if reconnection occurs; I'm not sure).

You're right of course that command-line is more specific (and yet,
.netrc can still specify with more granularity), and more likely to
represent what the user wants for this invocation. However, if we fix
this now (which would be slightly complicated), we'll just have to
re-fix it when host/path-specific config is introduced, so I think I'd
prefer to punt for now until a real fix can be applied.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget url with hash # issue

2007-09-06 Thread Micah Cowan

Aram Wool wrote:
> Hi, I'm having trouble retrieving an mp3 file from a url of the form
>
> http://www.websitename.com/HTML/typo3conf/ext/naksci_synd/mod1/index.php?mode=LATEST&pid=13&recursive=255&feeduid=1&feed=Normal&user=8&hash=d84a36bbaa1906cc07007557c6b60395
>
> entering this url in a browser opens the 'save as' dialogue box for the
> mp3, but the file isn't found if wget is used instead.

Well, since the above URL doesn't point to any real resource, we can't
really track down what problems you may be having.

Also, the URL doesn't seem to have anything to do with the subject of
your message, which mentions a hash # (unless you mean hash number,
the last parameter in the query string; that's ambiguous, because the
# itself is often called a hash mark).

Since you haven't given us enough information to help you, I can only
hazard a wide guess, and wonder if the site might be explicitly blocking
wget, in which case you can use the --user-agent option to trick it (try
a value like 'Mozilla', or emulate whatever your browser sends).
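
For example (single-quoting the URL so the shell doesn't eat the &
characters, which is itself a common cause of "file isn't found"
symptoms):

  wget --user-agent=Mozilla \
      'http://www.websitename.com/HTML/typo3conf/ext/naksci_synd/mod1/index.php?mode=LATEST&pid=13&recursive=255&feeduid=1&feed=Normal&user=8&hash=d84a36bbaa1906cc07007557c6b60395'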

> Also, is it possible to add an asterisk to a url so as to indicate that
> wget should ignore the characters before or after it?

I really don't understand what you're asking for here. If you want Wget
to ignore the characters you've specified, why specify them in the first
place?

If you mean that you want Wget to find any file that matches that
wildcard, well no: Wget can do that for FTP, which supports directory
listings; it can't do that for HTTP, which has no means for listing
files in a directory (unless it has been extended, for example with
WebDAV, to do so).

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Myriad merges

2007-09-06 Thread Micah Cowan

Jochen Roderburg wrote:
> Zitat von Jochen Roderburg [EMAIL PROTECTED]:
>
>> So it looks now to me, that the new error (local timestamp not set to
>> remote) only occurs in the cases when no HEAD is used.
>
> This (new) piece of code in http.c (line 2666 ff.) looks very suspicious
> to me, especially the time_came_from_head bit:
>
>   /* Reparse time header, in case it's changed. */
>   if (time_came_from_head
>       && hstat.remote_time && hstat.remote_time[0])
>     {
>       newtmr = http_atotm (hstat.remote_time);
>       if (newtmr != -1)
>         tmr = newtmr;
>     }

The intent behind this code is to ensure that we parse the Last-Modified
date again, even if we already parsed Last-Modified, if the last one we
parsed came from the HEAD. This whole block of code that you've pasted
is new, not just the surrounding if clause; if we never sent a HEAD but
only a GET, the Last-Modified _should_ have been parsed in code that
appears before here.

...but, obviously, things aren't working quite as they should, so I need
to look into it more closely.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget syntax problem ?

2007-09-06 Thread Micah Cowan

Alan Thomas wrote:
> command.com
>
> By the way, Josh's and your messages are being put out to the list in
> duplicates (at least, that's what I'm seeing on my end).

Not really; we've been Cc'ing you. I don't think we knew whether you
were subscribed or not, and so Cc'd you in case you weren't. Also, many
of us just habitually hit Reply All to reply to a message, so we don't
accidentally send it to the message's author only. :)

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget syntax problem ?

2007-09-06 Thread Micah Cowan

Alan Thomas wrote:
> Please ignore. It was needing the \\, like Josh said.

Out of curiosity, what command interpreter were you using? Was this
command.com, or something else like rxvt/Cygwin?

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget syntax problem ?

2007-09-06 Thread Micah Cowan

Alan Thomas wrote:
> I know this is probably something simple I screwed up, but the
> following commands in a Windows batch file return the error "Bad command
> or file name" for the wget command

It sounds to me like you don't have wget in your PATH. Make sure that
wget is located somewhere where command.com (or whatever) can find it.
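
For instance, near the top of the batch file, something like this
(assuming wget.exe was unpacked to C:\wget; adjust the path to taste):

  set PATH=%PATH%;C:\wget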

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Files returned by ASP

2007-09-06 Thread Micah Cowan

Alan Thomas wrote:
> Is there a way to use wget to get files from links that result
> from Active Server Pages (ASPs) on a web page? For example, to get the
> files in the links on the page returned by the URL
> http://www.onr.navy.mil/about/conferences/rd_partner/2007/presentations_03.asp
>
> Thanks, Alan

Sure, check out what the Wget manual has to say about recursive fetching:

http://www.gnu.org/software/wget/manual/html_node/Recursive-Download.html#Recursive-Download
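
For instance, something along these lines should grab that page plus the
documents it links to, one level deep (the -A list is just a guess at
the file types involved):

  wget -r -l1 -np -A .pdf,.ppt,.doc \
      http://www.onr.navy.mil/about/conferences/rd_partner/2007/presentations_03.asp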

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Announcing... The Wget Wgiki!

2007-09-06 Thread Micah Cowan

The main informational site for GNU Wget is now at
http://wget.addictivecode.com/; the Wget Wgiki.

  --

The original motivation for starting a wiki for Wget was that I needed a
forum for collaboration on specifications and design for future features
in Wget, and particularly in what we've been calling Wget 2.0, the
next generation of Wget.

Features that have been (tentatively) suggested or planned for Wget
2.0 include:

  * Support for multiple connections simultaneously
  * Configuration options on a per-host and/or per-initial URI subpath
basis.
  * Accept/reject (and others) based on MIME type.
  * Support for the use of regular expressions.
  * A recursive-fetch metadatabase, to save download information such as
mappings between local filenames and originating URIs, MIME types,
HTTP entity identifiers, etc.
  * A plugin architecture.
  * Support for parsing of non-HTML files for links to follow.
  * Support for handing-off specific HTML elements to plugins for
special handling
  * Support for extending Wget with new protocols
  * Better encapsulation of the file-system, to hide local filename
restrictions and such from the download logic.
  * Support for Internationalized Resource Identifiers (IRIs).
  * Some level of JavaScript support **
  * Support for the Metalink format **

 ** For various reasons, JavaScript and Metalink support will probably
 not be part of canonical Wget, but would take advantage of the plugin
 architecture and be distributed separately from the core Wget source.
 Development for these features might be separate from core Wget
 development.

Some of these things necessitate a complete restructuring of Wget's
logic, very possibly a complete or near-rewrite. It is also possible
that the configuration and command-line interface syntaxes would need to
be reimagined, in which case a name change for the next generation
Wget might begin to show merit.

The feature specifications and design discussions for these elements
will live at http://wget.addictivecode.org/FeatureSpecifications. I have
started a few of them off, most still need to be started, and all need
help.

 --

An aside: I do not want to give the idea that Wget is going to go from a
Swiss Army Knife to a Combination
Hand-pistol/tank/aircraft-carrier/missile-launch-silo ;)
As I see it, Wget's major boons have been its relatively small
footprint, its speed and efficiency, and its ability to (usually) Do
What I Want. I do not wish to abandon these things. This was a major
factor in the decision to isolate features like Metalink and JavaScript
into plugins: with a plugin architecture, if the users /want/ the
Combination Hand-pistol/..., they can just load up the tank and
missile-silo modules! ;)

 --

At any rate, I felt that having a wiki for discussion of these things
would prove invaluable, so I started work on this last week. But while I
was working on these things, it became more and more obvious how much of
a benefit it could be in serving as the main repository for even
general, non-developer-oriented information for Wget. This is a somewhat
abrupt turn from my desire to make the gnu.org site the main source of
information about Wget, but I believe it'll be much easier in the long
run.

Please do check the site out, and help to improve it! Most of the
content from the old site should have moved to the wiki (the old site
has already been updated to direct readers there).

  - http://wget.addictivecode.org/FeatureSpecifications
  Home for various features that need sketching out (these are
  intended to be informal specifications, not particularly rigorous;
  just enough to know what we are doing).

  - http://wget.addictivecode.org/Faq
  The FAQ has been updated somewhat, probably worth looking over.

  - http://wget.addictivecode.org/TitleIndex
  We don't have that many pages yet; here's the full list. ;)


--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Announcing... The Wget Wgiki!

2007-09-07 Thread Micah Cowan
Josh Williams wrote:
> On 9/7/07, Micah Cowan [EMAIL PROTECTED] wrote:
>> Doh! Of course, it's .org. Fortunately all the other links, including
>> the ones from the site at gnu.org, seem to be correct.
>
> Unfortunately for you, your typo is now an official piece of free
> software history! :D
>
> Just poking. :-P

D:

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Myriad merges

2007-09-07 Thread Micah Cowan

Jochen Roderburg wrote:
> Zitat von Micah Cowan [EMAIL PROTECTED]:
>
>> Zitat von Jochen Roderburg [EMAIL PROTECTED]:
>>
>>> So it looks now to me, that the new error (local timestamp not set to
>>> remote) only occurs in the cases when no HEAD is used.
>>> This (new) piece of code in http.c (line 2666 ff.) looks very
>>> suspicious to me, especially the time_came_from_head bit:
>>>
>>>   /* Reparse time header, in case it's changed. */
>>>   if (time_came_from_head
>>>       && hstat.remote_time && hstat.remote_time[0])
>>>     {
>>>       newtmr = http_atotm (hstat.remote_time);
>>>       if (newtmr != -1)
>>>         tmr = newtmr;
>>>     }
>> The intent behind this code is to ensure that we parse the Last-Modified
>> date again, even if we already parsed Last-Modified, if the last one we
>> parsed came from the HEAD.
>
> Hmm, yes, but that is not what it does ;-)
>
> I mean, it does not parse the date again even if it was already parsed,
> but only if it was already parsed. So especially it does *not* parse it
> if there had been no HEAD at all before.

That's actually what I said it does (somewhat clumsily: "if the last one
we parsed came from the HEAD").

Yes, as I said, if there had been no HEAD before, it should already have
been parsed in earlier code, and no action should be necessary. That's
what time_came_from_head is for, to prevent us from parsing it twice
from GET.

> And the only other code I found which parses the remote date is in the
> part which handles the logic around the timestamping option. In older
> versions this was a conditional block starting with "if (!got_head) ...";
> now it starts with "if (send_head_first && !got_head) ...". Could this
> mean that this code is now only executed when a HEAD response is
> examined?

Hm... that change came from the Content-Disposition fixes. I'll investigate.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Announcing... The Wget Wgiki!

2007-09-07 Thread Micah Cowan

Senthil Kumaran S wrote:
> On 9/7/07, Micah Cowan [EMAIL PROTECTED] wrote:
>> The main informational site for GNU Wget is now at
>> http://wget.addictivecode.com/; the Wget Wgiki.
>
> Is it http://wget.addictivecode.com/ or http://wget.addictivecode.org/ ?
>
> I could not reach http://wget.addictivecode.com/

Doh! Of course, it's .org. Fortunately all the other links, including
the ones from the site at gnu.org, seem to be correct.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Myriad merges

2007-09-13 Thread Micah Cowan

Jochen Roderburg wrote:
> Zitat von Micah Cowan [EMAIL PROTECTED]:
>
>> Hm... that change came from the Content-Disposition fixes. I'll
>> investigate.
>
> OK, but I hope I am still allowed to help a little with the investigation ;-)

Oh, I'm always very, _very_ happy to get help. :D

> I made a few more tests and some debugging now, and I am convinced now
> that this "if send_head_first" is definitely the immediate cause of the
> new problem that the remote timestamp is not picked up on GET-only
> requests.

<snip>

> Btw, continued downloads (wget -c) are also
> broken now in this case (probably for the same reason).

Really? I've been using this Wget version for a bit, and haven't noticed
that. Could you give an invocation that produces the problem?

> I meanwhile also believe that the primary issue we are trying to repair
> (the first-found remote time-stamp is used for the local file, not the
> last-found) has always been there. Only a year ago, when the
> contentdisposition stuff was included and more HEAD requests were made,
> did I really notice it. I remember that it had always been more
> difficult to get a newer file downloaded through the proxy-cache when a
> local file was present, but as these cases were rare, I had never tried
> to investigate this before ;-)

I'm not surprised to hear this; it didn't look like it had ever been
working before... and it's not a common situation, so I'm not surprised
it wasn't caught earlier, either.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Services down last night

2007-09-13 Thread Micah Cowan

I haven't discovered why yet, but all of addictivecode.org's internet
services went down last night around 7:30 pm PDT (02:30 UTC). The web
and ssh services were brought back up in response to an email query,
around 2:30 am PDT (09:30 UTC), but it wasn't until I checked again this
morning, around 10:30 am, that I was able to log in and restore the
remaining services.

This means that the Wget Wgiki was down for about 7 hours, and the
Subversion repository and addictivecode.org-hosted mailing lists for
about 15 hours. Sorry for the interruption; I'll be working with the
provider to help ensure this doesn't happen again.

Addictivecode.org is hosted on a VPS; the VPS itself didn't go down, as
it was pingable, and the logs show cron firing normally. Somehow, all
internet-connected services (don't know whether any non-internet
services were affected) were apparently killed without producing logs... :/

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Services down last night

2007-09-13 Thread Micah Cowan

Micah Cowan wrote:
> I haven't discovered why yet, but all of addictivecode.org's internet
> services went down last night around 7:30 pm PDT (02:30 UTC).

Note that the addictivecode.org failure was completely unrelated to the
main Wget mailing list going down for about five days; just coincidental
(addictivecode.org apparently ran out of memory and OOM-killed almost
everything). I haven't discovered what the cause of that was, yet.

I discovered yesterday that I had failed to bring up mailman on
addictivecode.org, so wget-notify (which receives SVN commits and
Savannah bug changes) was down until I realized and brought it back up.

For information on the dotsrc.org (sunsite.dk) issues, see
http://www.dotsrc.org/news/. They seem not to have announced the servers
coming back up; perhaps not all of them have. Dotsrc's issues appear to
be the result of an upgrade gone wrong last weekend. :/

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Abort trap

2007-09-13 Thread Micah Cowan

Hex Star wrote:
> Oh and the configuration on which wget was running is: PowerBook G4
> 1.5GHz (PowerPC), 768MB RAM, Mac OS X 10.4.10

One crucial bit of information you've left out, is which version of Wget
you're running. :)

Sorry if it took a while to respond to your message; the mailing list
went down about five days ago... :/

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Different exit status for 404 error?

2007-09-13 Thread Micah Cowan

Alex Owen wrote:
> Hello,
>
> If I run:
>   wget http://server.domain/file
> how can I differentiate between a network problem that made wget fail
> and the server sending back an HTTP 404 error?
>
> (I have a use case described in Debian bug http://bugs.debian.org/422088)
>
> I think it would be nice if the exit code of wget could be inspected
> to determine whether wget failed because of a 404 error or some other
> reason.

Hi Alex,

We do plan to evaluate differentiation of exit statuses at some point in
the future; however, it's not one of our very highest priorities for the
moment, and it's currently targeted for Wget 1.13 (the bug report is at
https://savannah.gnu.org/bugs/index.php?20333, but there's really not
much description there). We are about to release Wget 1.11, hopefully
within a month.

It is possible that this item will be targeted for a sooner release, in
Wget 1.12; mostly it just needs a final agreement on how exit codes
should be divided, which means discussion. Actual implementation will be
trivial. But, in the meantime, I'm not sure I want to introduce
different exit codes on an individual basis.
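
In the meantime, a crude workaround is to scrape the log output, which
does include the HTTP status. A sketch (this relies on the wording of
the log messages, which is not guaranteed to stay stable):

  if wget http://server.domain/file 2>&1 | grep -q 'ERROR 404'; then
    echo "server said 404"
  fi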

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Timeout workaround?

2007-09-13 Thread Micah Cowan

Todd Plessel wrote:
> Q1. Is there a way that I can run wget that somehow avoids this
> timeout? For example, by sending an out-of-band ack to stderr every
> 30 seconds so httpd does not disconnect.
> By out-of-band, I mean it cannot be included in the result bytes
> streamed to stdout, since these are specific binary data formats,
> images, etc.

I don't think that changing Wget for this would be appropriate. It's not
Wget's responsibility to ensure the server doesn't time out; it's the
server's. Of course, you're welcome to make such a change yourself
(that's what Free Software is all about!), but I can't tell you how it
might be done, and it may be system-dependent.

> Q2. If not, then could the PERL-CGI script be modified to spawn a
> thread that writes an ack to stderr to keep the httpd from timing out?
> If so, can you point me to some sample code?

This would be the better solution; but I don't know how it's done. I
think some servers will automatically send an ack if you write
something, anything, to stderr, but I'm not sure. You'll have to check
in your server's documentation.

It seems to me, though, that the infrastructure should be rearchitected
a bit to avoid such extremely large waiting periods; it strikes me as
very inefficient.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Timeout workaround?

2007-09-13 Thread Micah Cowan

Micah Cowan wrote:
> Todd Plessel wrote:
>> Q2. If not, then could the PERL-CGI script be modified to spawn a
>> thread that writes an ack to stderr to keep the httpd from timing out?
>> If so, can you point me to some sample code?
>
> This would be the better solution; but I don't know how it's done. I
> think some servers will automatically send an ack if you write
> something, anything, to stderr, but I'm not sure. You'll have to check
> in your server's documentation.

You could possibly hack the server source (if you have access to it) to
set the SO_KEEPALIVE socket option.
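
A sketch of what that looks like at the C level, where fd stands for the
server's connected socket descriptor (needs <sys/socket.h>; whether it
actually helps depends on where the idle timeout is enforced):

  int on = 1;
  if (setsockopt (fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
    perror ("setsockopt");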

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Wget automatic download from RSS feeds

2007-09-13 Thread Micah Cowan

Josh Williams wrote:
> On 9/12/07, Erik Bolstad [EMAIL PROTECTED] wrote:
>> Hi!
>> I'm doing a master's thesis on online news at the University of Oslo,
>> and need software that can download html pages based on RSS feeds.
>>
>> I suspect that Wget could be modified to do this.
>>
>> - Do you know if there are any ways to get Wget to read RSS files and
>>   download new files every hour or so?
>> - If not: have you heard about software that can do this?
>>
>> I am very grateful for all help and tips.
>
> Wget does not do this. That would be a great feature, but I don't
> believe parsing the RSS feed is Wget's job. Wget just fetches the
> files.
>
> I recommend you look for a program that simply parses the RSS feed and
> dumps the URLs to a file for Wget to fetch. Piping... that's what UNIX
> is all about ;-)

Might make a very interesting plugin, though, once we've added that
functionality in Wget 2.0.

That won't be for quite some time, though.
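
Along the lines Josh suggested, here's a rough sketch, with a crude grep
standing in for a real feed parser and example.com as a placeholder:

  wget -q -O - http://example.com/news.rss \
    | grep -o 'http://[^<"]*' \
    | wget -N -i -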

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: wget -c problem with current svn version

2007-09-15 Thread Micah Cowan

Jochen Roderburg wrote:
> I see also a conflict between older changes by Mauro
> and the latest changes by Micah in this area.

Actually, I never made any changes to this area that I recall; just
merged in changes others made. :)

I'm not really sure of how all that works, either. The code was already
complicated, and the code from the b20323 branch hasn't helped much in
that regard. got_name, AFAICT, is a misnomer anyway, because it tracks
more than whether we've simply gotten a name.

I'd care a little more about that if I wasn't already planning to
rewrite http_loop in the near future. At any rate, though, it looks like
the new changes merit a closer look.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: [fwd] Wget Bug: recursive get from ftp with a port in the url fails

2007-09-17 Thread Micah Cowan
Hrvoje Niksic wrote:
> Subject: Re: Wget Bug: recursive get from ftp with a port in the url fails
> From: baalchina [EMAIL PROTECTED]
> Date: Mon, 17 Sep 2007 19:56:20 +0800
> To: [EMAIL PROTECTED]
>
> Hi, I am using wget 1.10.2 in Windows 2003, and have the same problem as
> Cantara. The file system is NTFS.
> Well, I find my problem is, I wrote the command in schedule tasks like
> this:
>
> wget  -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P
> d:\virus.update\kaspersky
>
> well, after wget, and before -N, I typed TWO spaces.
>
> After deleting one space, wget works well again.
>
> Hope this can help.
>
> :)

Hi baalchina,

Hrvoje forwarded your message to the Wget discussion mailing list, where
such questions are really more appropriate, especially since Hrvoje is
not maintaining Wget any longer, but has left that responsibility for
others.

What you're describing does not appear to be a bug in Wget; it's the
shell's (or task scheduler's, or whatever) responsibility to split
space-separated elements properly; the words are supposed to already be
split apart (properly) by the time Wget sees it.

Also, you didn't really describe what was going wrong with Wget, or what
message about its failure you were seeing (perhaps you'd need to specify
a log file with -o log, or use redirection, if the command interpreter
supports it). However, if the problem is that Wget was somehow seeing
the space, as a separate argument or as part of another one, then the
bug lies with your task scheduler (or whatever is interpreting the
command line).

-- 
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: changing url

2007-09-18 Thread Micah Cowan

Aram Wool wrote:
> hi, I'm using wget with this url:
>
> http://www.twis.org/audio/podpress_trac/web/147/0/TWIS_2007_09_11.mp3
>
> the directory named 147 increases by 1 each week, corresponding to an mp3
> with a new date. I can use macros to automatically deal with a changing
> date, but haven't been able to find out how to make wget go to directory
> 148 the following week, etc., without manually changing the url. I'd
> expect there's an easy solution to such a simple problem.

Well, Wget's not really meant to do this; this is really more of a
shell-scripting/batch-file problem. A relatively simple solution would
be to have a cronjob (Unix) or scheduled task (Windows?) update the
number in a file, and use that file to construct the URL and invoke Wget.
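
A sketch of the Unix flavor of that (the counter file, its location, and
the date format are all assumptions based on the URL above):

  #!/bin/sh
  # Read the current episode number, fetch, then bump the counter.
  n=$(cat "$HOME/.twis_episode")   # seed this file once with, e.g., 147
  wget "http://www.twis.org/audio/podpress_trac/web/$n/0/TWIS_$(date +%Y_%m_%d).mp3" \
    && echo $((n + 1)) > "$HOME/.twis_episode"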

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Wget NTLM IIS Authentication Failure

2007-09-18 Thread Micah Cowan

[EMAIL PROTECTED] wrote:
> Not sure if it's a bug, but NTLM authentication has just started to fail
> connecting to SharePoint running IIS, after it's been working for
> years. The URL is accessible using IE or Mozilla using the same login
> credentials. I've worked with our SharePoint team, and they are telling
> me that they have not changed any configurations on their side. I'm
> guessing that maybe Microsoft has recently updated/changed their
> proprietary protocol. Here's some debug info:

Well, without having debug information from when it was working, it's
hard to say what could be going on. A tcpdump of the successful
transaction with Mozilla or IE could be very informative.
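
(For example, something along these lines, assuming the SharePoint
host's name and capturing to a file of your choosing:

  tcpdump -s 0 -w ntlm-ok.pcap host sharepoint.example.com and port 80

run that while fetching the page with the browser, then compare against
a capture of the failing Wget run.)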

It looks to me, though, that the NTLM authentication could actually be a
red herring: if Wget is sending a bad authentication, the server would
respond with 401 Unauthorized. The fact that it responds with 500 could
mean something else is wrong.
http://wget.addictivecode.org/FrequentlyAskedQuestions#tool-x-not-wget
gives a couple of common situations and workarounds.

Failing that, you might try out newer versions of Wget, to see if they
solve your problem. The latest release is 1.10.2, and there is also the
current development version. For info on how to get these, check
http://wget.addictivecode.org/FrequentlyAskedQuestions#download

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8GCY7M8hyUobTrERCDU4AJ41ZDuhT6EI1U6xIt6Shwh3MnAKFwCdGL40
Ssq5S6ih5rCs71Ly1KFL8QQ=
=SQHp
-END PGP SIGNATURE-


Re: Wget NTLM IIS Authentication Failure

2007-09-18 Thread Micah Cowan
Well, I was actually directing you to a specific entry on the FAQ:
http://wget.addictivecode.org/FrequentlyAskedQuestions#tool-x-not-wget
It mentions a couple ways to get around sites that specifically block
programs like wget, or don't like jumping links.
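
In short, the usual tricks are to present a browser-like User-Agent and,
for servers that dislike jumping links, a plausible Referer; a sketch
(both option values are purely illustrative):

  wget --user-agent="Mozilla/5.0 (Windows; U)" \
       --referer="http://example.com/linking-page.html" \
       http://example.com/file.pdf
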
-Micah

[EMAIL PROTECTED] wrote:
 Thanks for the quick response.  I looked on the FAQ and didn't see
 anything relevant.  I installed the latest code, but I still get the
 same error.  Any other suggestions??? 

 -Original Message-
 From: Micah Cowan [mailto:[EMAIL PROTECTED] 
 
 [EMAIL PROTECTED] wrote:
 Not sure if it's a bug, but NTLM authentication has just started to 
 fail connecting to SharePoint running IIS, after it's been working for
 
 years.  The URL is accessible using IE or Mozilla using the same login
 
 credentials.  I've worked with our SharePoint team, and they are 
 telling me that they have not changed any configurations on their 
 side.  I'm guessing that maybe Microsoft has recently updated/changed 
 their proprietary protocol.  Here's some debug info:
 
 Well, without having debug information from when it was working, it's
 hard to say what could be going on. A tcpdump of the successful
 transaction with Mozilla or IE could be very informative.
 
 It looks to me, though, that the NTLM authentication could actually be a
 red herring: if Wget is sending a bad authentication, the server would
 respond with 401 Unauthorized. The fact that it responds with 500 could
 mean something else is wrong.
 http://wget.addictivecode.org/FrequentlyAskedQuestions#tool-x-not-wget
 gives a couple of common situations and workarounds.
 
 Failing that, you might try out newer versions of Wget, to see if they
 solve your problem. The latest release is 1.10.2, and there is also the
 current development version. For info on how to get these, check
 http://wget.addictivecode.org/FrequentlyAskedQuestions#download
 

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Wget on Mercurial!

2007-09-19 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Since I do virtually all my work on a laptop, which is usually but not
always connected to the Wired, I have begun experimenting with
distributed SCMs. I have recently been using Mercurial for work on Wget,
and then synching the work with Subversion, and am very happy with it so
far. I would like to consider moving Wget development from Subversion to
Mercurial at some point in the future, if it continues to work well.

For the time being, and for at least a good while, both repositories are
now hosted, and will be kept synchronized. Only the _trunk_ has been
converted to a Mercurial repository, and this is likely to remain the
case for a while; in fact, it is likely that the rest of the Subversion
repository (past release branches, bug branches, and tags) will never be
converted to Mercurial, unless active development needs to happen on them.

Note that, while a common idiom for working with Subversion is to create
branches within the same repository for working on and later merging
back into the trunk, in a distributed SCM the usual MO is to clone the
repository, check changes into the cloned repository, and then later
merge those changes back into the public, shared repository. Thus, we'll
probably never have bug branches like we've been using (lately) in
Subversion; instead, developers working on bugs will just have various
repository copies holding the changes they're working on.

The advantage to all of this is that we can all be working on stuff,
wholly separately from the Subversion server at addictivecode.org, so
that we don't need an internet connection, or for addictivecode.org to
be functioning properly ( D-= ). People who aren't core developers and
are just preparing patches, can still have the full functionality of
being able to save their progress as they work, and submit their
changesets back in patch form. And, it's that much easier to fork Wget
in the event that I refuse to entertain a popular/necessary improvement
(I'm hoping the addition of a decent plugin architecture will help avoid
that sort of thing)! ;)

One disadvantage I can see is that ongoing development could run the
risk of becoming less transparent than I'd like. I really want to be
able to see the work people are doing, as they are doing it. Regular
commits to a central repository, combined with a commit notifications
mailing list ([EMAIL PROTECTED]), ensure that everyone could
see what everyone else was doing. Switching to a distributed SCM tool
means that there's the potential for people to be doing large work on an
ongoing basis, and no one gets to see it/give feedback on it until close
to the end. This also has the potential for wasted work, if design flaws
can be discovered only very late in the game.

There is nothing I can do to prevent that, but I would encourage anyone
who is doing significant work to contact me about hosting their
repositories publicly, or at least host them publicly themselves and
announce it to the list, so we can all keep abreast of current
development work.

From this point on, development in Mercurial is preferred, but for the
time being it's fine to continue working in Subversion as well.


The current development trunk on Mercurial can be browsed and/or cloned
from http://hg.addictivecode.org/wget/trunk/. In the future, when I
expect to be tracking multiple repositories, you'll see the list of them
at http://hg.addictivecode.org/.

For information about Mercurial, see
http://www.selenic.com/mercurial/wiki/. Especially the download links
and QuickStart page.

Pushing (analogous to committing in a non-distributed system) is not
currently supported. People who currently have commit access to the
Subversion repository will receive information about committing to the
Mercurial system separately, when support has been enabled for that.

Developers who do not have commit access are encouraged to use Mercurial
to work on the Wget sources, and send patches (preferably, as generated
by hg export) to [EMAIL PROTECTED]
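
A minimal round trip looks something like this (the commit message and
patch filename are only examples):

  hg clone http://hg.addictivecode.org/wget/trunk/ wget-trunk
  cd wget-trunk
  # ...hack away, then record the change locally:
  hg commit -m "Fix foo handling in http.c"
  hg export tip > foo-fix.patch    # send this to the patches list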

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8as77M8hyUobTrERCLuOAJ9xux2VuKT35XcaiIWB9XbqI4auAgCgi4AI
9IJdOp4LCMCq/VPFu9iovBk=
=ynkZ
-END PGP SIGNATURE-


Re: Wget on Mercurial!

2007-09-19 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Matthew Woehlke wrote:
 Micah Cowan wrote:
 Since I do virtually all my work on a laptop, which is usually but not
 always connected to the Wired, I have begun experimenting with
 distributed SCMs. I have recently been using Mercurial for work on Wget,
 and then synching the work with Subversion, and am very happy with it so
 far. I would like to consider moving Wget development from Subversion to
 Mercurial at some point in the future, if it continues to work well.
 
 Have you already considered and rejected git? I'm not trying to start a
 VCS war, it's just that I've seen some other GNU projects (notably
 coreutils and gnulib) moving to git and want to know you didn't pick hg
 just because it's the first DVCS you liked.

I meant to say this: alternative suggestions for DVCS are still welcome.
And yes, hg is in fact the first DVCS I liked. :)

I considered git first, actually, but concluded that it's not
appropriate, because from what I've seen it just really isn't
multiplatform at all. It looks absolutely terrific... if you're a Unix
user. Bollocks to you if you're not. :)

Git also has a reputation for being more difficult to use. Linus says
this is no longer so, but I'm not sure I'd trust his opinion on that...
:) It is also rather short on documentation.

Bazaar was also a candidate, especially since I am an Ubuntu user and
(small) contributor (which has not yet actually involved my using
Bazaar), and while it appears to be extremely easy-to-use, it looks to
suffer severe performance drawbacks in comparison to git or Mercurial,
and since Mercurial seems to be quite efficient, while being (AFAICT)
just as easy-to-use as Bazaar, Mercurial seemed a better choice.

I did do some research, which did not actually involve trying any other
DVCSes (though I have used git to prepare a small kernel patch); it
mainly consisted of checking out Rick Moen's summary[1] of a huge variety
of SCMs, and reading up on the decision-making process that Mozilla[2]
went through to choose which SCM to switch to (they chose Mercurial).

This is still an _evaluation_ period, and I don't want to rush too
quickly towards just jumping to it. But if there doesn't seem to be a
strong reason to use something else, and we use it for a while, and
there don't seem to be any very serious complaints, then we'd probably
move in that direction.

Mercurial is a relatively young (but quickly popularized) project. It
seems to do everything we would need, but is likely to have some
shortcomings. I note, for instance, that support for symbolic links is a
very recent, and somewhat immature, addition. I also understand that it
operates under the assumption that files in the repository can be held
entirely in memory. Also, it cannot track directories, only paths,
which makes me wonder about how tracking permissions of directories
would work (probably doesn't).

All of these things look to me like (fairly minor) shortcomings for
Mercurial as an SCM; but I don't think they're likely to affect our
particular project significantly.

1. http://linuxmafia.com/faq/Apps/scm.html
2.
http://weblogs.mozillazine.org/preed/2007/04/version_control_system_shootou_1.html

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8bUb7M8hyUobTrERCCggAJ9M+V2OkvxzCoQWPpqtjf0O72g2AgCdEJob
OWKIRmJlcyonJzi51h9UmHo=
=3H4Z
-END PGP SIGNATURE-


Size comparisons between svn, hg

2007-09-19 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

The subversion repository (including trunk, tags, branches) takes up 27MB.

A working copy checked out from Subversion (just trunk) occupies about
12MB; about 5.1MB are the working files, about 6.2MB are .svn/
directories and their contents (that should add up to ~11MB by my count,
but anyway...).

A Mercurial repository clone (which includes all trunk history
information, plus the working copy), occupies 13MB. Again, 5.1MB of this
is the working copy's files; a clone of the repository without a working
copy occupies 7.9MB.

I don't know how the Subversion repository should compare to the
Mercurial repository; the Mercurial repository is the trunk only,
whereas the Subversion repository also has history for four release
branches, 16 bug branches, and reference tags for all of these. However,
tags are very, very cheap, and the additional information represented in
the branches are probably not terribly large, so it's difficult to
compare the "everything" Subversion repository to the trunk-only
Mercurial repository: it could be very impressive that Subversion holds
all that stuff in only 19MB more than the Mercurial repo, or it could be
that even a trunk-only Subversion repository would not be a whole lot
smaller than that 27MB.

However, the fact that the entire Mercurial repository compares
approximately equal to a Subversion working copy's metadata cruft is
pretty impressive. Of course, a large part of this is probably due to
the fact that Subversion's cruft includes pristine copies of all files
from the last checkout, so that a simple svn diff doesn't have to
involve network traffic. Since a Mercurial repository includes the
entire histories, it doesn't need to do this. :)

Note that this doesn't say anything about how Mercurial or Subversion
compare to other DVCSes; and, of course, efficient storage is far from
one of the most significant considerations for choosing an SCM, in
comparison with other things. Still, if you're wondering if allocating
space for an entire repository is going to be problematic in comparison
to storing just a working copy: worry not! :)

...of course, if using Mercurial to work with Wget is the only reason
you have to have Python installed on your system, well, that's another
thing... :D

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8b1l7M8hyUobTrERCCRYAJ4kYswo2lGAhpkkNgCIUOnNG5SIOgCeJcgj
4Hfk7DKe7R8EZOhG9imV+jo=
=LGHf
-END PGP SIGNATURE-


Re: Wget on Mercurial!

2007-09-20 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

(Whoops, of course I meant to send this to the list, rather than to Tony
alone. Resent. Sorry Tony!)

Tony Lewis wrote:
 Micah Cowan wrote:

 As I see it, the biggest concern over using git would be multiplatform
 support. AFAICT, git has a great developer community, but a rather
 Linux-focused one. And while, yes, there is Win32 support, the
 impression I have is that it significantly lags Unix/Linux support.
 Mozilla rejected it early on due to this conclusion

 The Mozilla community (with a large base of Win32 programmers) rejected an
 open-source package that met their needs better than other packages because
 it didn't have good enough Win32 support? Why didn't they just add in the
 Win32 support so that the rest of the world that cares about Win32 support
 could benefit from it?

How is good enough Win32 support not part of their needs? :)

Like everything, there is a balance--a trade-off. Whatever aspects git
possessed that other systems didn't apparently did not outweigh the
amount of work involved in adding proper Win32 support. They may
have just been minor niceties, who knows? They didn't say, or refer to
what specifically git did well that the other systems didn't. Asking any
development community to take time away from their core project to work
on something else is not a small thing, though it is of course how git
was born in the first place.

 git vs hg.

 It's a good thing I remember a little high-school chemistry or I'd have no
 idea what that meant. I'm assuming hg == Mercurial, but shouldn't it be
 hgial? :-)

Well, it would probably be more proper if the h were capitalized.

What you're seeing is an inconsistency in how we refer to these SCMs:
by their proper names, or by their program names; hg is to Mercurial as
svn is to Subversion, and as bzr is to (modern) Bazaar. Which is as git
is to Git. :)

But yeah, the command-line name is because of the chemical designation.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG8vrh7M8hyUobTrERCA5dAJ429wHdD2G/6yaQcZyDpblFtOpxqwCfbrj1
Cbj5VihxnXk02xdRTp/YNSM=
=A7Bj
-END PGP SIGNATURE-


Differentiated exit statuses [wget + dowbloading AV signature files]

2007-09-22 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

(Not Cc'ing Gerard, as I'm not sure he wants to be included in this
tangent.)

Tony Lewis wrote:
 Unfortunately, at least as far as I can tell, wget does not issue an
 exit code if it has downloaded a newer file.
 
 Better exit codes is on the wish list.
 
 It would really be nice though if wget simply issued an exit code if
 an updated file were downloaded.
 
 Yes, it would.

I don't think this is what the exit codes differentiation will handle:
Wget really ought to exit with zero status for all success cases. The
Unix idiom is that 0 is success, anything else is a failure of some
sort; so while it would certainly be handy to differentiate between
various types of success, there isn't really a way to do that appropriately.

The exception is that Unix tools will often issue a non-zero exit status
for "I didn't have to do anything"; so a -N on a file that didn't have
to be downloaded might result in non-zero, giving that differentiation
we needed. However, an important question would be, what should Wget do
for multi-file downloads, where either multiple files were specified on
the command-line, or it's running in recursive-retrieval mode? Say you
specified three files with -N, one needed downloading, one didn't, and
the other was a 404?
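
(In the meantime, the usual workaround is to parse Wget's log rather
than its exit status; a rough sketch, which assumes the English message
text:

  wget -N -o fetch.log http://example.com/file
  if grep -q ' saved \[' fetch.log; then
      echo "a newer file was downloaded"
  fi

though of course that breaks under localization.)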

I'm thinking that, perhaps more important than differentiated exit codes
(which will still be useful in some circumstances: single-file downloads,
spidering, serious failures, I/O and allocation problems), would be
strictly-defined, program-parseable output formats: something which, for
each file, specifies the results plainly.

This could become even more important when we support multiple
simultaneous download streams: we'll want to output parseable updates on
each chunk of downloaded file we retrieve, so we can communicate to
(say) a GUI wrapper as the different streams are running.

You'd probably want to specifically ask for logging in this format, as
it's liable to be less user-friendly than the current output formats
(who wants to read, during a multi-stream fetch, an endless series of
"file `index.html': got bytes 4096-5120" lines?).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG9WTL7M8hyUobTrERCNP1AJ9igD3zejm34VBlEIyIdx83Q0V9pgCfd0tW
ax1u6l9uaapCZREZHQljep8=
=08dv
-END PGP SIGNATURE-


Re: wget behavior

2007-09-28 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Stephen Schachter wrote:

 With some innocence, I tried changing weblogic server parameters, increasing 
 the number of threads, and the percentage of threads used to process 
 requests, but this did no good.  With four concurrent requesters sending 
 about one request per second, one requester hung and could not be interrupted.
 
 Does anyone have an idea what might be going on?  Is this a problem at the 
 requesting workstation end, or is the server running out of resources?  Any 
 idea?  Thanks.

It's hard to say what could be happening, but it sounds more like a
server issue than a client one to me.

When you say it hung and couldn't be interrupted: did it eventually
(after 5 minutes or something) time out the connection?

Using a packet-dumper of some sort might be illuminating, to see what
the traffic looks like for the hanging-client; whether it sends packets
and never gets appropriate responses, or receives but doesn't process
the packets.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/VeD7M8hyUobTrERCIU5AJ4wN5vA8Gt4jbG0PzYug9xh72rjOwCeIFcc
zufW+gVvKgiGXPm3NwpMExI=
=/ydY
-END PGP SIGNATURE-


Re: Option -nc causes crash

2007-09-28 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

www.mail wrote:
 wget -nc http://www.google.com/
 
 the index.html file is downloaded as expected.  However, running the
 same command again causes wget to crash.  The wget output is:

Confirmed, on GNU/Linux. Thanks for the report.

There seems to be a pause between when it declares it's not retrieving
it, and when it crashes.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/aVg7M8hyUobTrERCCV4AJ9c5RjGfdq9cakDf4zyLU6X+J5XIACfXTSt
ijGY8qljUIbszGGovDWFcQc=
=LZux
-END PGP SIGNATURE-


Re: Option -nc causes crash

2007-09-28 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Micah Cowan wrote:
 www.mail wrote:
 wget -nc http://www.google.com/
 
 the index.html file is downloaded as expected.  However, running the
 same command again causes wget to crash.  The wget output is:
 
 Confirmed, on GNU/Linux. Thanks for the report.
 
 There seems to be a pause between when it declares it's not retrieving
 it, and when it crashes.

Fixed; it was a simple assertion-style abort. Guess the pause was just
the core file, which I'm surprised at: since I had ulimit -c 0 set,
no core file is generated, so there shouldn't have been any effort
wasted constructing one! :p

gethttp was returning RETROK, which http_loop wasn't prepared to
receive. I altered gethttp to return RETRUNNEEDED in this case, instead.

Thanks again for the report!

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/auG7M8hyUobTrERCMKgAJ9O57nHhYYh6fnoKyf/mQtTauMaMgCcDlFp
JPOF4fkS6EDf5fZv1YTSiQk=
=Hbn+
-END PGP SIGNATURE-


Re: Myriad merges

2007-09-30 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 And now, for a change, a case, that works now (better)  ;-)
 
 This is an example where a HEAD request gets a 500 Error response.
 
 Wget default options again, but contentdisposition=yes to force a HEAD.
 
 
 wget.111-svn-0709 --debug -e contentdisposition = yes
 http://www.eudora.com/cgi-bin/export.cgi?productid=EUDORA_win_7109
 
 Setting contentdisposition (contentdisposition) to yes
 DEBUG output created by Wget 1.10+devel on linux-gnu.
 
 --15:26:54--  
 http://www.eudora.com/cgi-bin/export.cgi?productid=EUDORA_win_7109
 Resolving www.eudora.com... 199.106.114.30
 Caching www.eudora.com = 199.106.114.30
 Connecting to www.eudora.com|199.106.114.30|:80... connected.
 Created socket 3.
 Releasing 0x080888d8 (new refcount 1).
 
 ---request begin---
 HEAD /cgi-bin/export.cgi?productid=EUDORA_win_7109 HTTP/1.0
 User-Agent: Wget/1.10+devel
 Accept: */*
 Host: www.eudora.com
 Connection: Keep-Alive
 
 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 500 Server Error
 Server: Netscape-Enterprise/6.0
 Date: Mon, 03 Sep 2007 13:26:54 GMT
 Content-length: 305
 Content-type: text/html
 Connection: keep-alive
 
 ---response end---
 500 Server Error
 Registered socket 3 for persistent reuse.
 --15:26:56--  (try: 2) 
 http://www.eudora.com/cgi-bin/export.cgi?productid=EUDORA_win_7109
 Disabling further reuse of socket 3.
 Closed fd 3
 Found www.eudora.com in host_name_addresses_map (0x80888d8)
 Connecting to www.eudora.com|199.106.114.30|:80... connected.
 Created socket 3.
 Releasing 0x080888d8 (new refcount 1).
 
 ---request begin---
 GET /cgi-bin/export.cgi?productid=EUDORA_win_7109 HTTP/1.0
 User-Agent: Wget/1.10+devel
 Accept: */*
 Host: www.eudora.com
 Connection: Keep-Alive
 
 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 302 Moved Temporarily
 Server: Netscape-Enterprise/6.0
 Date: Mon, 03 Sep 2007 13:26:55 GMT
 Location: http://www.eudora.com/download/eudora/windows/7.1/Eudora_7.1.0.9.exe
 Content-length: 0
 Connection: keep-alive
 
 ---response end---
 302 Moved Temporarily
 Registered socket 3 for persistent reuse.
 Location: http://www.eudora.com/download/eudora/windows/7.1/Eudora_7.1.0.9.exe
 [following]
 Skipping 0 bytes of body: [] done.
 --15:26:56-- 
 http://www.eudora.com/download/eudora/windows/7.1/Eudora_7.1.0.9.exe
 Reusing existing connection to www.eudora.com:80.
 Reusing fd 3.
 
 ---request begin---
 HEAD /download/eudora/windows/7.1/Eudora_7.1.0.9.exe HTTP/1.0
 User-Agent: Wget/1.10+devel
 Accept: */*
 Host: www.eudora.com
 Connection: Keep-Alive
 
 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 200 OK
 Server: Netscape-Enterprise/6.0
 Date: Mon, 03 Sep 2007 13:26:56 GMT
 Content-type: application/octet-stream
 Last-modified: Thu, 05 Oct 2006 18:45:18 GMT
 Content-length: 17416184
 Accept-ranges: bytes
 Connection: keep-alive
 
 ---response end---
 200 OK
 Length: 17416184 (17M) [application/octet-stream]
 --15:26:56-- 
 http://www.eudora.com/download/eudora/windows/7.1/Eudora_7.1.0.9.exe
 Reusing existing connection to www.eudora.com:80.
 Reusing fd 3.
 
 ---request begin---
 GET /download/eudora/windows/7.1/Eudora_7.1.0.9.exe HTTP/1.0
 User-Agent: Wget/1.10+devel
 Accept: */*
 Host: www.eudora.com
 Connection: Keep-Alive
 
 ---request end---
 HTTP request sent, awaiting response...
 ---response begin---
 HTTP/1.1 200 OK
 Server: Netscape-Enterprise/6.0
 Date: Mon, 03 Sep 2007 13:26:56 GMT
 Content-type: application/octet-stream
 Last-modified: Thu, 05 Oct 2006 18:45:18 GMT
 Content-length: 17416184
 Accept-ranges: bytes
 Connection: keep-alive
 
 ---response end---
 200 OK
 Length: 17416184 (17M) [application/octet-stream]
 Saving to: `Eudora_7.1.0.9.exe'
 
 100%[=] 17,416,184   397K/s  
 in 44s
 
 15:27:40 (386 KB/s) - `Eudora_7.1.0.9.exe' saved [17416184/17416184]
 
 
 ls -l Eudora_7.1.0.9.exe
 -rw-r- 1 a0045 RRZK 17416184 05.10.2006 20:45 Eudora_7.1.0.9.exe
 
 
 This seems also to use the only available source for the timestamp, the 
 response
 to the GET request.

Sorry to reproduce that in full, but I thought it might be helpful to
see the full transcript again, since you sent this a while ago.

I was going back through this thread to refresh my memory on some
things. I noticed, and wanted to point out, that actually, the GET
request was _not_ the only available source for the timestamp; HEAD was
answered with a 500, but only the first one. The HEAD issued after the
redirect gives a timestamp.

The problem you pointed out that causes the failure to properly
timestamp when HEADs aren't issued seems, to my reading, to be simply
regressable for the fix. Mauro's fixes don't look as if they depend upon
that line being there, but I'm waiting for him to have a chance to look
over it before I commit to that as the fix (both he and I have been busy
lately).

Re: wget -o question

2007-09-30 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Steven M. Schweda wrote:
 From: Micah Cowan
 
 -  tms = time_str (NULL);
 +  tms = datetime_str (NULL);
 
 Does anyone think there's any general usefulness for this sort of
 thing?
 
I don't care much, but it seems like a fairly harmless change with
 some benefit.  Of course, I use an OS where a directory listing which
 shows date and time does so using a consistent and constant format,
 independent of the age of a file, so I may be biased.

:)

Though honestly, what this change buys you above simply doing "date;
wget", I don't know. I think maybe I won't bother, at least for now.

 Though if I were considering such a change, I'd probably just have wget
 mention the date at the start of its run, rather than repeat it for each
 transaction. Obviously wouldn't be a high-priority change... :)
 
That sounds reasonable, except for a job which begins shortly before
 midnight.

I considered this, along with the unlikely 24-hour wget run.

But, since any specific transaction is unlikely to take such a long
time, the spread of the run is easily deduced by the start and end
times, and, in the unlikely event of multiple days, counting time
regressions.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAIP67M8hyUobTrERCFFIAJ9Pltuwqr0FeOtlwuFPotKxoBa6TgCeKb2l
dtRfakFDQ47qcUJJFKXPVwY=
=t50d
-END PGP SIGNATURE-


Re: wget -o question

2007-10-01 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Steven M. Schweda wrote:

 But, since any specific transaction is unlikely to take such a long
 time, the spread of the run is easily deduced by the start and end
 times, and, in the unlikely event of multiple days, counting time
 regressions.
 
And if the pages in books were all numbered 1, 2, 3, 4, 5, 6, 7, 8,
 9, 0, 1, 2, 3, ..., the reader could easily deduce the actual number for
 any page, but most folks find it more convenient when all the necessary
 data are right there in one place.

To my mind, books are much more likely to cross 10-page boundaries
several times over than Wget is to cross more than just one
24-hour boundary. And, there's always "date; wget; date"...

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAJch7M8hyUobTrERCKMDAKCFxnnZrB0vIrquoMi5x/F+32DlCwCcDWdP
3U+0+vCH1tXGCJ3pk9KR3xM=
=ZDLY
-END PGP SIGNATURE-


Wget 1.11 branched

2007-10-01 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

There is little left to finish up for Wget 1.11. I've decided to leave
certain issues related to Content-Disposition until 1.12
(Content-Disposition support in Wget 1.11 will be considered
experimental). The few other remaining issues are awaiting feedback I've
solicited from various persons.

In order for work to begin on Wget 1.12, I have branched
svn://addictivecode.org/branches/1.11 from the trunk. Changes
appropriate for 1.11 will go both there and in the trunk. I'll also set
a tag at the release. A new 1.11 Mercurial repository has also been
cloned to http://hg.addictivecode.org/wget/1.11.

The Wget 1.11 release is still being held back at the moment by
discussions on the OpenSSL licensing exception (I pinged Brett Smith
about it again a few days ago).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAKDW7M8hyUobTrERCA18AJ93/+W6sk1lKvqcZC6G09DmwDxmKwCgjkeJ
d4K9Kg4rwaU0kl8CAZcNPdg=
=gRt4
-END PGP SIGNATURE-


Re: wget -o question

2007-10-01 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jim Wright wrote:
 My usage is counter to your assumptions below.[...]
 A change as proposed here is very simple, but
 would be VERY useful.

Okay. Guess I'm sold, then. :D

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAKcq7M8hyUobTrERCCxhAKCPbzNRHGkVbZTcaEBlI7xNqroJbACeKSYO
kdixUTJro4Pp3CszOYdjfHE=
=NaSh
-END PGP SIGNATURE-


Re: Myriad merges

2007-10-01 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 Zitat von Micah Cowan [EMAIL PROTECTED]:
 
 The problem you pointed out that causes the failure to properly
 timestamp when HEADs aren't issued seems, to my reading, to be simply
 regressable for the fix. Mauro's fixes don't look as if they depend upon
 that line being there, but I'm waiting for him to have a chance to look
 over it before I commit to that as the fix (both he and I have been busy
 lately).
 
 Yes, this one is still open, and the other one that wget -c always starts 
 at 0
 again.

Do you mean the (local 0) thing? That should have been fixed in
674cc935f7c8 [subversion r2382]. Can you re-check?

 On the other hand, with the combination of options that I usually use in my
 daily wget practice (timestamping and content-disposition on) everything works
 fine now  ;-)
 
 I've also got trying to deal with content-disposition issues for when
 HEAD fails, on my todo list.
 
 I have not done real-life tests with content-disposition cases, but I have 
 also
 some feeling that not all combination with other options (like timestamping 
 and
 continuation) work with these yet. These may be minor issues again, as usually
 content-disposition is used when the contents are generated somehow dynamically
 and there are no static timestamps and filelengths at all.

Yes. Currently Content-Disposition is not working when the HEAD fails or
doesn't include Content-Disposition, which is problematic since this is
a very frequent case. However, I think the necessary changes would be a
bit invasive, and I'm not prepared to make them in time for the 1.11
release; so in essence, Content-Disposition, for now, will sometimes
work and sometimes not.

It'll be nice to fix this in 1.12, along with implementing changes to
reduce the number of HEADs we issue (I'd prefer to skip HEAD completely
for just content-disposition, and assume we'll accept it, and terminate
the connection if we won't; at any rate, it will need some discussion,
most of which would probably be more appropriate at the Wgiki).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAT+27M8hyUobTrERCAvRAKCANM2nkxvZAN1CZYRmMKlo8FSDrQCeNwWj
aUA37hJ+EaZ/fI6pBNL7P68=
=u5FR
-END PGP SIGNATURE-


Re: Myriad merges

2007-10-01 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 Zitat von Micah Cowan [EMAIL PROTECTED]:
 
 Jochen Roderburg wrote:
 Yes, this one is still open, and the other one that wget -c always starts
 at 0
 again.
 Do you mean the (local 0) thing? That should have been fixed in
 674cc935f7c8 [subversion r2382]. Can you re-check?
 
 No, that is ok now.
 I saw my little patch for this included as of this weekend ;-)
 
 The one I mean is: wget -c continuation is not done in the HEADless cases.
 
 http://www.mail-archive.com/wget%40sunsite.dk/msg10265.html   ff.

Ah, thanks for the reminder. Apparently I'd forgotten to track that.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHAU/t7M8hyUobTrERCE3jAJ0TJrS+83Tv5qZK4TZqvyZBcEKwpACghJu8
gXkWE9BP42KMNXE55ce2v7o=
=k857
-END PGP SIGNATURE-


Re: bug in escaped filename calculation?

2007-10-04 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Josh Williams wrote:
 On 10/4/07, Brian Keck [EMAIL PROTECTED] wrote:
 I would have sent a fix too, but after finding my way through http.c 
 retr.c I got lost in url.c.
 
 You and me both. A lot of the code needs rewriting... there's a lot of
 spaghetti code in there. I hope Micah chooses to do a complete
 re-write for version 2 so I can get my hands dirty and understand the
 code better.

Currently, I'm planning on refactoring what exists, as needed, rather
than going for a complete rewrite. This will be driven by unit-tests, to
try to ensure that we do not lose functionality along the way. This
involves more work overall, but IMO has these key advantages:

 * as mentioned, it's easier to prevent functionality loss,
 * we will be able to use the work as its written, instead of waiting
many months for everything to be finished (especially with the current
number of developers), and
 * AIUI, the wording of employer copyright assignment releases may not
apply to new works that are not _preexisting_ as GPL works. This means
that, if a rewrite ended up using no code whatsoever from the original
work (not likely, but...), there could be legal issues.

After 1.11 is released (or possibly before), one of my top priorities is
to clean up the gethttp and http_loop functions to a degree where they
can be much more readily read and understood (and modified!). This is
important to me because so far (in my
probably-not-statistically-significant 3 months as maintainer) a
majority of the trickier fixes have been in those two functions. Some of
these fixes seem to frequently introduce bugs of their own, and I spend
more time than seems right in trying to understand the code there, which
is why these particular functions are prime targets for refactoring. :)

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHBR7E7M8hyUobTrERCCrbAJ9Jw7LB/YW4myDOyPiHvXLZ13rkNQCeOVbf
5INV0ApmUTuzxp8zO5haVCA=
=EeEd
-END PGP SIGNATURE-


Re: Software interface to wget

2007-10-04 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Alan Thomas wrote:
 Idea for future wget versions:  It would be nice if I could
 invoke wget programmatically and have options like returning data in
 buffers versus files (so data can be searched and/or manipulated in
 memory),

This can already be done by using wget's -O switch, which directs the
output to a specified file (including standard output). A wrapper
program could simply read wget's stdout directly into a buffer. However,
-O is only really useful for single downloads, as there is no
delineation between separate files. And, I'll admit that I'm not clear
how easy this is to do with 100% Pure Java; it's quite straightforward
on Unix systems in most languages.
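
For example, a wrapper can slurp a single download into memory like so
(a shell sketch; the URL is illustrative):

  page=`wget -q -O - http://example.com/index.html`
  # $page now holds the document body; nothing is written to disk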

 or at least getting notification of what files have been
 downloaded (progress).

...progress reports are already issued to standard error; parsing this
wouldn't be too terribly difficult (though it's not currently guaranteed
to be stable across releases). Several programs are already doing this,
AIUI.

 Then it could be more easily and seamlessly
 integrated into other software that needs this capability.  I would
 especially like to be able to invoke wget from Java code. 

It sounds to me like you're asking for a library version of Wget. There
aren't specific plans to support this at the moment, and I'm not sure
how much it'd really buy you: high level programming languages such as
Java, Python, Perl, etc, tend to ship with good HTTP and HTML-parsing
libraries, in which case rigging your own code to do a good chunk of
what Wget does, is probably less work than trying to adapt Wget into
library form. I'm not saying I'm ruling it out, but I'd need to hear
some good cases for it, in contrast to using what's already available on
those platforms.

However, some changes are in the works (early early planning stages) for
Wget to sport a plugin architecture, and if a bit of glue to call out to
higher-level languages is added, plugins written in languages such as
Java wouldn't be a big stretch. It may well be that restructuring Wget as
a library, instead of as a standalone app that runs plugins, is a
better solution; it bears discussion.

Also planned is a more flexible output system, allowing for arbitrary
formatting of downloaded resources (such as .mht's, or tarballs, or
whatever), making delineation in a single output stream possible; also,
a metadata system for preserving information about what files have been
completely downloaded and which were interrupted, what their original
URLs were, etc.

All of this, however, is a long way from even really being started,
especially given our current developer resources.

- --
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHBRsU7M8hyUobTrERCM5rAJ9dgnkDPZbqQMTL2xfsv25fNiZ8QwCaAwbY
AXmKyAsiKIV54fVhzsUzVeU=
=oWY6
-END PGP SIGNATURE-


Re: bug in escaped filename calculation?

2007-10-04 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Brian Keck wrote:
 Hello,
 
 I'm wondering if I've found a bug in the excellent wget.
 I'm not asking for help, because it turned out not to be the reason
 one of my scripts was failing.
 
 The possible bug is in the derivation of the filename from a URL which
 contains UTF-8.
 
 The case is:
 
   wget http://en.wikipedia.org/wiki/%C3%87atalh%C3%B6y%C3%BCk
 
 Of course these are all ascii characters, but underlying it are
 3 nonascii characters, whose UTF-8 encoding is:
 
   hex   octal    name
   ----  -------  ---------
   C387  303 207  C-cedilla
   C3B6  303 266  o-umlaut
   C3BC  303 274  u-umlaut
 
 The file created has a name that's almost, but not quite, a valid UTF-8
 bytestring ... 
 
   ls *y*k | od -tc
   000 303   %   8   7   a   t   a   l   h 303 266   y 303 274   k  \n
 
 Ie the o-umlaut  u-umlaut UTF-8 encodings occur in the bytestring,
 but the UTF-8 encoding of C-cedilla has its 2nd byte replaced by the
 3-byte string %87.

Using --restrict-file-names=nocontrol will do what you want it to, in
this instance.
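
That is:

  wget --restrict-file-names=nocontrol \
       http://en.wikipedia.org/wiki/%C3%87atalh%C3%B6y%C3%BCk

which lets the 0x87 byte through unescaped, yielding a valid UTF-8 name.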

 I'm guessing this is not intended.  

Actually, it is (more-or-less).

Realize that Wget really has no idea how to tell whether you're trying
to give it UTF-8, or one of the ISO latin charsets. It tends to assume
the latter. It also, by default, will not create filenames with control
characters in them. In ISO latin, characters in the range 0x80-0x9f are
control characters, which is why Wget left %87 escaped, which falls into
that range, but not the others, which don't.

It is actually illegal to specify byte values outside the range of ASCII
characters in a URL, but it has long been historical practice to do so
anyway. In most cases, the intended meaning was one of the latin
character sets (usually latin1), so Wget was right, at the time, to do
as it does.

There is now a standard for representing Unicode values in URLs, whose
result is then called IRIs (Internationalized Resource Identifiers).
Conforming correctly to this standard would require that Wget be
sensitive to the context and encoding of documents in which it finds
URLs; in the case of filenames and command arguments, it would probably
also require sensitivity to the current locale as determined by
environment variables. Wget is simply not equipped to handle IRIs or
encoding issues at the moment, so until it is, a proper fix will not be
in place. Addressing these are considered a Wget 2.0 (next-generation
Wget functionality) priority, and probably won't be done for a year or
two, given that the number of developers involved with Wget, if you add
up all the part-time helpers (including me), is probably still less than
one full-time dev. :)

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHBSHX7M8hyUobTrERCKRLAJwKiDOo0uO7x/k/iAEB/W0pPQmUJQCfUHaP
c6k2490strgy1Efy1DmiOhA=
=7lvZ
-END PGP SIGNATURE-


Windows Wget hangs when lookups time out?

2007-10-04 Thread Micah Cowan
Reynir (Cc'd) writes:

 Wget 1.10.2 on a Windows98 box will occasionally time out resolving a
 host (I've set all time-outs to 45s, being stuck with a modem) and 
 hang. The -d switch adds nothing useful.

and has also reproduced this on 1.11.

Is it possible there are issues with TerminateThread on threads running
gethostbyname? Can anyone running the Windows version of Wget give
insight into this?

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Myriad merges

2007-10-05 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 Zitat von Micah Cowan [EMAIL PROTECTED]:
 
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256

 Jochen Roderburg wrote:
 Yes, this one is still open, and the other one that wget -c always starts
 at 0
 again.
 Do you mean the (local 0) thing? That should have been fixed in
 674cc935f7c8 [subversion r2382]. Can you re-check?
 
 No, that is ok now.
 I saw my little patch for this included as of this weekend ;-)
 
 The one I mean is: wget -c continuation is not done in the HEADless cases.
 
 http://www.mail-archive.com/wget%40sunsite.dk/msg10265.html   ff.

This should be fixed now, along with the timestamping issues.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHBdtP7M8hyUobTrERCIg5AJ4oZ9Yy177t6XJ7P3XAugNVRZXjkwCcDoOu
HQ2j7vXqsh0HflkjhNkmASg=
=RKxE
-END PGP SIGNATURE-


Re: Myriad merges

2007-10-07 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 Unfortunately, however, a new regression crept in:
 In the case timestamping=on, content-disposition=off, no local file present it
 does now no HEAD (correctly), but two (!!) GETS and transfers the file two
 times.

Ha! Okay, gotta get that one fixed...

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHCQrf7M8hyUobTrERCGZYAJ4s/wKsoi7pnPjMYYuD5Xn1QZ1ttgCeIbV9
KbiJKfmK32Uil6/00SJaWcY=
=CViU
-END PGP SIGNATURE-


Re: automake-ify wget [0/N]

2007-10-08 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Ralf Wildenhues wrote:
 Hello Micah,
 
 * Micah Cowan wrote on Sat, Oct 06, 2007 at 12:04:53AM CEST:
 Ralf Wildenhues wrote:
 Remove all files that will be installed/governed by automake:
 (We keep deprecated mkinstalldirs for now, as po/Makefile.in.in still
 uses it.  We keep config.rpath until the switch to a newer gettext
 infrastructure.)

 svn remove config.guess config.sub install-sh doc/version.texi
 As to the first three of these; wouldn't it be better to update them
 _once_, and then not use --force as part of the autoreconf stuff? I
 thought --force was for the occasion when you want to refresh a few
 things, and not for habitual use?
 
 Sure.  But without --force, files installed by automake, such as
 install-sh, missing, config.guess etc., will not be updated.  The
 rationale for this behavior is to allow for local changes to these
 files to be preserved.  But yes, it is a free choice of yours to
 make, both approaches are viable.
 
 I'll probably remove autogen.sh altogether, and just replace its use in
 the instructions with autoreconf. And yes, configure.in will be renamed
 to configure.ac
 
 Fine with me.  Should I post updated patches for all of these
 modifications?

I've already applied most of your patches so far, some with my own
tweaks/modifications. I've opted not to apply the patch to
Makefile.in.in to make src writable, because I'd prefer to have a broken
make distcheck than a falsely working one. :)

I also removed the realclean target (may decide to add it back in at
some point), as Automake has, IMO, more than enough *clean targets, and
leaving it out keeps the Makefile.am's cleaner.

I moved sample.wgetrc.munged_for_texi_inclusion from EXTRA_DIST to
wget_TEXINFOS, as this allowed make doc to work (it accidentally
worked for you with make distcheck, because make distcheck built it
for the distribution; plain make wasn't working).

I also took some liberties with the ChangeLogs, reducing much of the
work done to, essentially, Makefile.am: converted from Makefile.in for
automake, or something similar.

You can see my current progress at
http://hg.addictivecode.org/wget/automakify (get a copy of Mercurial and
hg clone from there to get a repository). If you could produce further
patches from that foundation, I'd appreciate it.

If getting Mercurial is inconvenient, I've temporarily enabled
tarballing at that location, so you can just fetch the tree by clicking
a bz2 or gz link; however, Mercurial is recommended, as you'll be able
to inspect the history of changes, and, if you use Mercurial to track
your own modifications, it'll be that much easier for me to pull or
import them from you. :)

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHCoQr7M8hyUobTrERCGSTAJ9ODT5QbtBRFgE+jI+jr5JQzNQjggCfWTO9
HNwBEnWFtW2PGSjjzx851x4=
=2In3
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-08 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

A. P. Godshall wrote:
 Hi
 
 New to the list.

Welcome!

 Wrote a patch that Works For Me to limit to a percent of measured
 bandwidth.  This is useful, like --limit-rate, in cases where an
 upstream switch is poorly made and interactive users get locked out
 when a single box does a wget, but limit-pct is more automatic in
 the sense that you don't have to know ahead of time how big your
 downstream pipe is.
 
 I.e., what I used to do is (1) wget, look at the bandwidth I was
 getting, and then (2) Ctrl-C,  (3) Ctrl-P and edit the line to add -c
 --limit-rate nnK (where nn is a bit less than I was getting).
 
 Now I can wget --limit-pct 50 and it will go full-speed for a bit and
 then back off till till the average speed is 50% of what the we saw
 during that time.
 
 The heuristic I'm using is to download full-speed for 15 seconds and
 then back off- that seems to work on my connection (too much less and
 measured rate is erratic, too much more and the defective upstream
 switch locks interactive folks out long enough that they notice and
 complain).  Does that seem reasonable to folks or should it be
 parameterized.  I'm not sure I can spend much time on complex
 parameter handling etc. right now.
 
 Anyhow, does this seem like something others of you could use?  Should
 I submit the patch to the submit list or should I post it here for
 people to hash out any parameterization niceties etc first?

The best place to submit patches is to [EMAIL PROTECTED]
Discussions can also happen there, too. Please submit patches against
relatively recent development code, as opposed to the latest release
(which happened ~two years ago), as otherwise the work involved in
bringing it up-to-date may be a disincentive to include it. :)
Information on obtaining the latest development version of Wget is at
http://wget.addictivecode.org/RepositoryAccess.

As to whether or not it will be included in mainline Wget, that depends
on the answer to your question, does this seem like something others of
you could use? I, personally, wouldn't find it very useful (I rarely
use even --limit-rate), so I'd be interested in knowing who would. I
suspect that it may well be the case that most folks who have need of
--limit-rate will find your version handy, but I want to hear from
them. :)

Also, I'd like a little more understanding about what the use case is:
is it just to use N% less bandwidth than you seem to have available, so
that other connections you may open won't be taxed as much?

A couple notes off the bat: I'd prefer --limit-percent to --limit-pct,
as apparently more recent naming conventions seem to prefer
unabbreviated terms to abbreviated ones. And, I would desire a wgetrc
command to complement the long option version. Don't break your back
working these things into your patch, though: let's first see what
you've got and whether folks want it (lest you waste effort for nothing).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHCqBJ7M8hyUobTrERCFXFAJ9/BQrl9B+SajnuXgxCtZWRyPITUQCeKSl/
oQzSzJuGNDZhSP6GNKq9wkI=
=AgXg
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-08 Thread Micah Cowan
And here's another post that apparently got sent with an erroneous
signature. I think I may have figured out what was wrong; I specifically
remember that I was still holding shift down when I typed one of the
spaces in my passphrase... maybe that results in some screwiness...

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Myriad merges

2007-10-09 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Micah Cowan wrote:
 Jochen Roderburg wrote:
 Unfortunately, however, a new regression crept in:
 In the case timestamping=on, content-disposition=off, no local file present 
 it
 does now no HEAD (correctly), but two (!!) GETS and transfers the file two
 times.
 
 Ha! Okay, gotta get that one fixed...

That should now be fixed.

It's hard to be confident I'm not introducing more issues, with the
state of http.c being what it is. So please beat on it! :)

One issue I'm still aware of is that, if -c and -e
contentdisposition=yes are specified for a file already fully
downloaded, HEAD will be sent for the contentdisposition, and yet a GET
will still be sent to fetch the remainder of the -c (resulting in a 416
Requested Range Not Satisfiable). Ideally, Wget should be smart enough
to see from the HEAD that the Content-Length already matches the file's
size, even though -c no longer requires a HEAD (again). We _got_ one, we
should put it to good use.

However, I'm not worried about addressing this before 1.11 releases;
it's a minor complaint, and with content-disposition's current
implementation, users are already going to be expecting an extra HEAD
round-trip in the general case; what's a few extra?

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHCy5M7M8hyUobTrERCBAnAJ4kvG/5zlr23dr2aAwEpyQr+U1VmACeIvjn
nUIFmAfUpV0WqpzAZMxgu00=
=/XdC
-END PGP SIGNATURE-


Closing Subversion trunk, Automakification in Hg mainline

2007-10-09 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Several items to announce.

One, the Mercurial trunk repository has been renamed to "mainline",
which seems a better label, considering that we're not really talking
about a trunk and branches any more (in fact, the "mainline" repo
could conceivably include branches, so "trunk" is really not
appropriate). This means that the former trunk repository is no longer
available at http://hg.addictivecode.org/wget/trunk, but rather at
http://hg.addictivecode.org/wget/mainline.

Two, the automakification process is basically done (modulo inevitable
bugs), and has been pushed to the mainline. These changes will not
appear in 1.11, and are for 1.12+. Folks running on Windows platforms or
who tend to use --disable-nls should check that everything works as
expected for them.

Three, I have CLOSED THE TRUNK in subversion (svn rm $WGETROOT/trunk).
Changes to wget-1.11 will continue to be merged to
$WGETROOT/branches/1.11 until the release, after which point Subversion
will no longer be used for active development (if all goes according to
expectations).

The motivation for this is that I want to move ahead with using
Mercurial. This is still not set in stone, in the sense that with
sufficiently good reasons I would be willing to switch to another
DVCS--and in fact, we can even move active development back to
Subversion if appropriate, reopening the trunk (I believe something like
svn cp $WGETROOT/[EMAIL PROTECTED] $WGETROOT/trunk might do the trick).
However, it is looking likelier day-by-day.

However, in order to find out whether Mercurial is truly going to fit
the bill for us, I want to be aware of potential problems early on: and
this means encouraging everyone to move to it. This way, if there are
serious issues in using it, I'll hopefully discover them sooner, so we
can make smart decisions based on that, and migrate to something better
if need be. Part of the reason Mercurial may be looking like it fits our
needs, is that currently, pretty much all active development on Wget is
being done by... me. This doesn't give me much opportunity to hear about
alternative experiences with our development toolset! :D ...so the
obvious solution that presented itself to me was to make sure everyone
else is using the same tools too, to see whether that will work.

To make this a little bit easier, I've enabled archive-generation
(tar.gz, tar.bz2, and .zip) from the repository pages, so that if you
just plain can't get Mercurial working, you can use these instead
(however, if you just plain can't get Mercurial working, I want to
know about it--that's kind of the point!).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHC9lk7M8hyUobTrERCMTRAJ0YQYn/2m0ahdFYVOKnHmZEhDvHlQCfeRwo
I28XjpvnqNErvSrPAXGbj5k=
=ebr6
-END PGP SIGNATURE-


Re: Closing Subversion trunk, Automakification in Hg mainline

2007-10-09 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Micah Cowan wrote:

 Three, I have CLOSED THE TRUNK in subversion (svn rm $WGETROOT/trunk).
 Changes to wget-1.11 will continue to be merged to
 $WGETROOT/branches/1.11 until the release, after which point Subversion
 will no longer be used for active development (if all goes according to
 expectations).

This also means that pushes of code for Wget 1.12+ will no longer result
in notifications being sent to the wget-notify list. I'll probably set
up notifications from Mercurial fairly soon; in the meantime, you can
use the RSS feeds.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHC9t07M8hyUobTrERCCbfAJ9g+TNjyg5B4FEwr1hc2D6AhQvzIACdGgyh
LkP9Bd10MZRXZuRrWcVE2Hk=
=Acjz
-END PGP SIGNATURE-


Re: wget 1.10.2 doesn't compile on NetBSD/i386 3.1

2007-10-09 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Ray Phillips wrote:
 I thought I'd report my experiences trying to install wget 1.10.2 on
 NetBSD/i386 3.1.  I'll append the contents of config.log to the end of
 this email.

snip

 gcc -I. -I.   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"
 -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c
 http-ntlm.c
 http-ntlm.c:185: error: parse error before des_key_schedule
 http-ntlm.c: In function `setup_des_key':
 http-ntlm.c:187: error: `des_cblock' undeclared (first use in this
 function)

What version of OpenSSL do you have installed? DES_key_schedule and
DES_cblock are defined in openssl/des.h in recent versions; they are
uncapitalized in some old versions. It appears that your openssl/des.h
does not declare these.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDDHf7M8hyUobTrERCKccAJ949G3y17x6RM95VusdkRYoa2IIogCeK/kM
6LsbRrdExihUVZM0tOKM968=
=nF9y
-END PGP SIGNATURE-


Version tracking in Wget binaries

2007-10-09 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I've just pushed some changes to mainline that result in Wget including
its Mercurial revision when that information is available (a truncated
SHA-1 hash, plus a + sign if there are local modifications.)

Among other things, version.c is now generated rather than parsed, every
time make all is run; this also means that make all will always
relink the wget binary, even if there haven't been any changes.

This also pretty much guarantees that the Windows and MS-DOS builds are
now broken in mainline, since they still depend on version.c, but won't
generate it. Patches accepted here. :)

A quick fix would be to simply generate one with the basic version
information and no repository info, but best would be if it could
include Mercurial revision information if it's available.
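
Something like this would suffice as a stub (variable name from memory;
double-check it against a generated version.c):

  /* Hand-rolled version.c stub for builds that can't run the
     generation rule; no repository information included. */
  const char *version_string = "1.12-devel";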

For example:

$ wget --version
GNU Wget 1.12-devel (bf48b2652707)

Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic [EMAIL PROTECTED].
Currently maintained by Micah Cowan [EMAIL PROTECTED].

$ wget --version   # from a tarball, no repo
GNU Wget 1.12-devel

Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
...

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDFZX7M8hyUobTrERCFQOAJ9m3/i8v/iJbHFIGAS28X6MOAka5gCcCRBS
CN5l8ZPMWfTX8hdcJBOw780=
=XFw5
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jim Wright wrote:
 I think there is still a case for attempting percent limiting.  I agree
 with your point that we can not discover the full bandwidth of the
 link and adjust to that.  The approach discovers the current available
 bandwidth and adjusts to that.  The usefulness is in trying to be
 unobtrusive to other users.

Does it really fit that description, though? Given that it runs
full-bore for 15 seconds (not that that's very long)...
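
(For reference, the arithmetic under discussion, as I understand the
patch--a toy illustration with made-up numbers, not the patch's code:)

  #include <stdio.h>

  int main (void)
  {
    const double pct = 75.0;            /* hypothetical --limit-pct value */
    const int sample_secs = 15;         /* the unthrottled sampling window */
    long long sampled_bytes = 3000000;  /* made-up: bytes seen in window */

    double observed_bps = sampled_bytes / (double) sample_secs;
    double limit_bps = observed_bps * (pct / 100.0);

    /* thereafter the download would be throttled to limit_bps, the
       way --limit-rate already throttles to a fixed rate */
    printf ("observed %.0f B/s, limiting to %.0f B/s\n",
            observed_bps, limit_bps);
    return 0;
  }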

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDTRD7M8hyUobTrERCM44AJ9rwWoavX1aqHIw8i3MR4nNabbNQgCeLdKW
gA2VuaUGbBRhlaexlhOn+TE=
=t4lv
-END PGP SIGNATURE-


Re: wget 1.10.2 doesn't compile on NetBSD/i386 3.1

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

(We don't have reply-to's set; I've re-included the list on this.)

Ray Phillips wrote:
 Thanks for your reply Micah.
 
 
 Ray Phillips wrote:
  I thought I'd report my experiences trying to install wget 1.10.2 on
  NetBSD/i386 3.1.  I'll append the contents of config.log to the end of
  this email.

 snip

  gcc -I. -I.   -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\"
  -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c
  http-ntlm.c
  http-ntlm.c:185: error: parse error before des_key_schedule
  http-ntlm.c: In function `setup_des_key':
  http-ntlm.c:187: error: `des_cblock' undeclared (first use in this
  function)

 What version of OpenSSL do you have installed?
 
 % which openssl
 /usr/bin/openssl
 % openssl version
 OpenSSL 0.9.7d 17 Mar 2004
 %
 
 DES_key_schedule and DES_cblock are defined in openssl/des.h in
 recent versions; they are uncapitalized in some old versions. It
 appears that your openssl/des.h does not declare these.
 
 % sed -n '75,91p' /usr/include/openssl/des.h
 
 typedef unsigned char DES_cblock[8];
 typedef /* const */ unsigned char const_DES_cblock[8];
 /* With const, gcc 2.8.1 on Solaris thinks that DES_cblock *
  * and const_DES_cblock * are incompatible pointer types. */
 
 typedef struct DES_ks
 {
 union
 {
 DES_cblock cblock;
 /* make sure things are correct size on machines with
  * 8 byte longs */
 DES_LONG deslong[2];
 } ks[16];
 } DES_key_schedule;
 
 %
 
 Does that look OK to you?
 
 I've just installed openssl 0.9.8e to see if it helps:

snip

 so there's no difference in that part of the code.  However, when I tell
 wget to use version 0.9.8e it compiles without error, as shown at the
 end of this email.
 
 It would be nice if wget 1.10.2 would compile on NetBSD without having
 to install a second version of openssl.

Well, it's too late to change Wget 1.10.2; any new version would have a
new version number. Wget 1.11 will be releasing RSN, though, so
hopefully we'll be able to square this in time for that release (we're
mostly just waiting on resolution of some licensing issues, which should
be done shortly).

It appears from your description that Wget's check in http-ntlm.c:

  #if OPENSSL_VERSION_NUMBER < 0x00907001L

is wrong. Your copy of openssl seems to be issuing a number lower than
that, and yet has the newer, capitalized names. However, when I download
a copy of openssl-0.9.7d, I get a copy of opensslv.h that gives:

  #define OPENSSL_VERSION_NUMBER   0x0090704fL

The likeliest explanation, to me, is that OPENSSL_VERSION_NUMBER isn't
being defined by this point, which will cause the C preprocessor to
replace it in #if conditionals with 0 (guaranteeing a false positive).

I actually can't figure out how we're getting a definition of
OPENSSL_VERSION_NUMBER at all: AFAICT we're not directly #including
<openssl/opensslv.h>... it looks like we're depending on one of
<openssl/md4.h> or <openssl/des.h> to #include it indirectly.

It works for me because des.h includes des_old.h (for support for the
uncapitalized symbols), which includes ui_compat.h, which includes ui.h,
which includes crypto.h... which includes opensslv.h. But des_old.h is
only included if OPENSSL_DISABLE_OLD_DES_SUPPORT is not set. Probably it
isn't, on most systems, but happens to be on yours (perhaps in
openssl/opensslconf.h).

This obviously seems pretty precarious, and http-ntlm.c should be
directly #including <openssl/opensslv.h> rather than trusting to fate
that it will get #included by-and-by.
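
The trap in miniature (a standalone illustration, not Wget code):

  #include <stdio.h>

  /* opensslv.h deliberately NOT included, so OPENSSL_VERSION_NUMBER
     is undefined and the preprocessor evaluates it as 0 below. */
  #if OPENSSL_VERSION_NUMBER < 0x00907001L
  # define PICKED "old-openssl path (lowercase des_* names)"
  #else
  # define PICKED "new-openssl path (uppercase DES_* names)"
  #endif

  int main (void) { puts (PICKED); return 0; }

Compile and run that anywhere, and it prints the old-openssl line,
unconditionally.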

Ray, if you add the line

 #include <openssl/opensslv.h>

along with the other openssl #includes, does it fix your problem wrt
openssl-0.9.7d?

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDUch7M8hyUobTrERCOXLAJ0eblOOGQ5SMrFKSfT8Tc13dpuECACfcs+M
4m3LuGLNEoIzdxEJ61HDILQ=
=Hvld
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hrvoje Niksic wrote:
 Jim Wright [EMAIL PROTECTED] writes:
 
 I think there is still a case for attempting percent limiting.  I
 agree with your point that we can not discover the full bandwidth of
 the link and adjust to that.  The approach discovers the current
 available bandwidth and adjusts to that.  The usefullness is in
 trying to be unobtrusive to other users.
 
 The problem is that Wget simply doesn't have enough information to be
 unobtrusive.  Currently available bandwidth can and does change as new
 downloads are initiated and old ones are turned off.  Measuring
 initial bandwidth is simply insufficient to decide what bandwidth is
 really appropriate for Wget; only the user can know that, and that's
 what --limit-rate does.

So far, I'm inclined to agree.

For instance, if one just sticks limit_percent = 25 in their wgetrc,
then on some occasions, Wget will limit to far too _low_ a rate, when
most of the available bandwidth is already being consumed by other things.

Regardless of what we decide on this, though, I like Tony L's suggestion
of some summary data at completion. He had already suggested something
similar to this for a proposed interactive prompt at interrupt.

I'm thinking that there are a lot of other little nice-to-haves
related to such a feature, too: someone might want Wget to save previous
download rates to a file, and average them up across invocations, and
base a percentage on this when called upon to do so. Basically, it
smells like one of many possible hacks to perform on --limit-rate, which
makes it a good candidate for a plugin (once we have that infrastructure
in place), rather than Wget proper.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDUlH7M8hyUobTrERCLbrAJ0eKAHly5C46dGpJVuGsJLseYSU9gCfW9H7
C2e8R7BF9rcQ7UXHKHLV6bk=
=I/eA
-END PGP SIGNATURE-


Re: wget 1.10.2 doesn't compile on NetBSD/i386 3.1

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Daniel Stenberg wrote:
 On Wed, 10 Oct 2007, Micah Cowan wrote:
 
 It appears from your description that Wget's check in http-ntlm.c:

   #if OPENSSL_VERSION_NUMBER < 0x00907001L

 is wrong. Your copy of openssl seems to be issuing a number lower than
 that, and yet has the newer, capitalized names.
 
 I don't think that check is wrong. We have the exact same check in
 libcurl (no surprise) and it has worked fine since I wrote it, and I've
 tested it myself on numerous different openssl versions.
 
 I would rather suspect that the problem is related to multiple openssl
 installations or similar.

If you read further, you'll see that I actually believe that we're not
properly #including the header that will _define_ OPENSSL_VERSION_NUMBER
(meaning it will be replaced with 0 in things like #if).

Sorry, the style for that mail was mainly "logging what I'm
discovering"; I probably should have deleted the initial suggestion that
the number could be wrong, as the not-including-opensslv.h seems far
likelier (plus, the mail ended up being a bit lengthier than it really
needed to be to make that final point).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDU6c7M8hyUobTrERCEubAJ45HgmxMUhbptQmHUPD/PIn1TGjYQCeIIv7
7chLx5ySHC/J/4GRWx16yPQ=
=VGI/
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Tony Godshall wrote:
 The scenario I was picturing was where you'd want to make sure some
 bandwidth was left available so that unfair routers wouldn't screw
 your net-neighbors.  I really don't see this as an attempt to be
 unobtrusive at all.  This is not an attempt to hide one's traffic,
 it's an attempt to not overwhelm in the presence of unfair switching.
 If I say --limit-pct 75% and the network is congested, yes, what I
 want is to use no more than 75% of the available bandwidth, not the
 total bandwidth.  So, yes, if the network is more congested just now,
 then let this download get a lower bitrate, that's fine.

I'm pretty sure that's what Jim meant by being unobtrusive; it surely
had nothing to do with traffic-hiding.

My current impression is that this is a useful addition for some limited
scenarios, but not particularly more useful than --limit-rate already
is. That's part of what makes it a good candidate as a plugin.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDWs07M8hyUobTrERCEkbAJ9lbnva+Xtk8rv9S1AYOZ7yjZ2VuQCcDgOQ
hZKEjD4qZy/BwgDmchCDT1k=
=jN50
-END PGP SIGNATURE-


Re: wget 1.10.2 doesn't compile on NetBSD/i386 3.1

2007-10-10 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Ray Phillips wrote:
 Ray, if you add the line

  #include openssl/opensslv.h

 along with the other openssl #includes, does it fix your problem wrt
 openssl-9.7d?
 
 I made this change:
 
 % diff -u http-ntlm.c.orig http-ntlm.c
 --- http-ntlm.c.orig2007-10-11 10:07:18.0 +1000
 +++ http-ntlm.c 2007-10-11 09:58:26.0 +1000
 @@ -48,6 +48,7 @@
 
  #include <openssl/des.h>
  #include <openssl/md4.h>
 +#include <openssl/opensslv.h>
 
  #include "wget.h"
  #include "utils.h"
 %
 
 and wget 1.10.2 now compiles and installs without errors, as far as I
 can see.  Thanks.

Excellent! Looks like that was the issue, then. I'll apply that change
for 1.11, then.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDW697M8hyUobTrERCPQNAJ4mtC/6yccgdeKDk9FsSn4oRSb3/wCfTnUE
Qn2Ww8R3Kvz2qdk0gMUi1X0=
=A4xC
-END PGP SIGNATURE-


Re: wget does not compile with SSL support

2007-10-11 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Thomas Wolff wrote:
 Hi,
 as requested, I am sending you the output of configure and config.log 
 for checking the problem that my compiled wget does not retrieve 
  over https ("Unsupported scheme").

Thomas, I don't see that anything went wrong at all with the
configuration. This makes me think that you're not actually using the
wget that you built when you try to use https.

What does "command -v wget" give? How about "whereis wget"?

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDlrL7M8hyUobTrERCKQUAJ9QLVEWJ8eRddIVPcAT4WokgT/fqgCcDHQ8
tuCTf3/flU3ydmyWhG62RZg=
=W3hT
-END PGP SIGNATURE-


Re: working on patch to limit to percent of bandwidth

2007-10-11 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Tony Godshall wrote:
 On 10/10/07, Micah Cowan [EMAIL PROTECTED] wrote:
 My current impression is that this is a useful addition for some limited
 scenarios, but not particularly more useful than --limit-rate already
 is. That's part of what makes it a good candidate as a plugin.
 
 I guess I don't see how picking a reasonable rate automatically is
  less useful than having to know what the maximum upstream bandwidth is
 ahead of time.

I never claimed it was less useful. In fact, I said it was more useful.
My doubt is as to whether it is _significantly_ more useful.

I'm still open, just need more convincing.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDprc7M8hyUobTrERCHN6AJ4/+GhluN5a4ejkMZJN9dt+tHqM2ACfQ0I9
8LvCCfCaxdvH0PfHWHiN4MQ=
=m7+Y
-END PGP SIGNATURE-


Re: anyone look at the actual patch? anyone try it? [Re: working on patch to limit to percent of bandwidth]

2007-10-11 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Tony Godshall wrote:
 On 10/11/07, Micah Cowan [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256

 Tony Godshall wrote:
 On 10/10/07, Micah Cowan [EMAIL PROTECTED] wrote:
 My current impression is that this is a useful addition for some limited
 scenarios, but not particularly more useful than --limit-rate already
 is. That's part of what makes it a good candidate as a plugin.
 I guess I don't see how picking a reasonable rate automatically is
  less useful than having to know what the maximum upstream bandwidth is
 ahead of time.
 I never claimed it was less useful. In fact, I said it was more useful.
 My doubt is as to whether it is _significantly_ more useful.
 
 For me, yes.  For you, apparently not.  It's a small patch, really.
 Did you even look at it?

I have, yes. And yes, it's a very small patch. The issue isn't so much
about the extra code or code maintenance; it's more about extra
documentation, and avoiding too much clutter of documentation and lists
of options/rc-commands. I'm not very picky about adding little
improvements to Wget; I'm a little pickier about adding new options.

It's not really about this option, it's about a class of options. I'm in
the unenviable position of having to determine whether small patches
that add options are sufficiently useful to justify the addition of the
option. Adding one new option/rc command is not a problem. But when,
over time, fifty people suggest little patches that offer options with
small benefits, we've suddenly got fifty new options cluttering up the
documentation and --help output. If the benefits are such that only a
handful of people will ever use any of them, then they may not have been
worth the addition, and I'm probably not doing my job properly.

Particularly since a plugin architecture is planned, it seems ideal to
me to recommend that such things be implemented as plugins at that
point. In the meantime, people who find the feature sufficiently useful
can easily apply the patch to Wget themselves (that's part of what makes
Free Software great!), and even offer patched binaries up if there's
call for it.

If a number of people bother to download and install the patch, or fetch
 patched binaries in preference to the official binaries, that'd be a
good indicator that it's worth pulling in.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDsBj7M8hyUobTrERCIMPAJ9z936EGkfx7b/1sKAt3zw6OcPMIgCaAi2Y
qtNxSlmy09JSvtaWgZ42M7o=
=iRGw
-END PGP SIGNATURE-


Re: anyone look at the actual patch? anyone try it? [Re: working on patch to limit to percent of bandwidth]

2007-10-11 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Tony Godshall wrote:
 ...
 I have, yes. And yes, it's a very small patch. The issue isn't so much
 about the extra code or code maintenance; it's more about extra
 documentation, and avoiding too much clutter of documentation and lists
 of options/rc-commands. I'm not very picky about adding little
 improvements to Wget; I'm a little pickier about adding new options.

 It's not really about this option, it's about a class of options. I'm in
 the unenviable position of having to determine whether small patches
 that add options are sufficiently useful to justify the addition of the
 option. Adding one new option/rc command is not a problem. But when,
 over time, fifty people suggest little patches that offer options with
 small benefits, we've suddenly got fifty new options cluttering up the
 documentation and --help output.
 
 Would it be better, then, if I made it --limit-rate nn% instead of
 limit-percent nn?

The thought occurred to me; but then it's no longer such a small patch,
and we've just introduced a new argument parser function, so we still
get to "is it justified?".

 And made the descrip briefer?

It's not really the length of the description that I'm concerned about,
it's just the number of little options.

  If the benefits are such that only a
 handful of people will ever use any of them, then they may not have been
 worth the addition, and I'm probably not doing my job properly. ...
 
 I guess I'd like to see compile-time options so people could make a
 tiny version for their embedded system, with most options and all
 documentation stripped out, and a huge kitchen-sink all-the-bells
 version and complete documentation for the power user version.  I
 don't think you have to go to a totally new (plug in) architecture or
 make the hard choices.

Well, we need the plugin architecture anyway. There are some planned
features (JavaScript and MetaLink support being the main ones) that have
no business in Wget proper, as far as I'm concerned, but are inarguably
useful.

You have a good point regarding customized compilation, though I think
that most of the current features in Wget belong as core features. There
are some small exceptions (egd sockets).

 I know when I put an app into an embedded app, I'd rather not even
 have the overhead of the plug-in mechanism, I want it smaller than
 that.  And when I'm running the gnu version of something I expect it
 to have verbose man pages and lots of double-dash options, that's what
 tools like less and grep are for.

Well... many GNU tools actually lack verbose man pages, particularly
since info is the preferred documentation system for GNU software.

Despite the fact that many important GNU utilities are very
feature-packed, they also tend not to have options that are only useful
to a relatively small number of people--particularly when equivalent
effects are possible with preexisting options.

As to the overhead of the plugin mechanism, you're right, and I may well
decide to make that optionally compiled.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDs2U7M8hyUobTrERCOYoAJ9bIfGbztes0MEfKxAPwpQY/bjJAQCeOAXn
8M6Kj1vLploBN+qENpF2gu8=
=K9Sb
-END PGP SIGNATURE-


Re: anyone look at the actual patch? anyone try it? [Re: working on patch to limit to percent of bandwidth]

2007-10-11 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jim Wright wrote:
 On Thu, 11 Oct 2007, Micah Cowan wrote:
 
 It's not really about this option, it's about a class of options. I'm in
 the unenviable position of having to determine whether small patches
 that add options are sufficiently useful to justify the addition of the
 option. Adding one new option/rc command is not a problem. But when,
 over time, fifty people suggest little patches that offer options with
 small benefits, we've suddenly got fifty new options cluttering up the
 documentation and --help output.
 
 I would posit that the vast majority of wget options are used in some
 extremely small percentage of wget invocations.  Should they be removed?

Such as which ones?

I don't think we're talking about the same extremely small percentages.

Looking through the options listed with --help, I can find very few
options that I've never used or would not consider vital in some
situations I (or someone else) might encounter.

This doesn't look to me like a vital function, one that a large number
of users will find mildly useful, or one that a mild number of users
will find extremely useful. This looks like one that a mild number of
users will find mildly useful. Only slightly more useful, in fact, than
what is already done.

It's also one of those fuzzy features that address a scenario that
has no "right" solution (JavaScript support is in that domain). These
sorts of features tend to invite a gang of friends to help get a little
bit closer to the unreachable target. For instance, if we include this
option, then the same users will find another option to control the
period of time spent full-bore just as useful. A pulse feature might
be useful, but then you'll probably want an option to control the
spacing between those, too. And someone else may wish to introduce an
option that saves bandwidth information persistently, and uses this to
make a good estimate from the beginning.

And all of this would amount to a very mild improvement over what
already exists.

 In my view, wget is a useful and flexible tool largely because there
 are a lot of options.  The internet is a messy place, and wget can cope.

Sure. But what does --limit-percent allow wget to cope with that it
cannot currently?

 I have a handful of options I've added to wget which are mandatory for
 my use.  Mostly dealing with timeouts and retries.  Useful features which
 would not commonly be used.

But they mainly fall into the category of features that a large number
of users will use occasionally, and a small number of users will find
indispensable much of the time. Will you find this feature
indispensable, or can you pretty much use --limit-rate with a reasonable
value to do the same thing?

 Am I correct in reading in to this discussion
 that submitting new features is not encouraged?

To be honest, I'm shocked to get this sort of reaction over what is, as
far as I can tell, an extremely small improvement. If you really care
that much about it, I really don't mind putting it in. But if it's
really that useful to you, then I don't think your previous comments
really conveyed the degree to which that was the case. I repeatedly
asked for people to sell it to me, and got very little actual
case-making other than impressions that it was a very minor convenience
improvement. If it's more than a very minor improvement to you, then I
wish you'd have made that clearer from the start.

If, on the other hand, it is really, just a pretty minor improvement
that happens to be mildly useful to you, could we please drop using this
as a platform to predict what my future reactions to new features in
general are likely to be? :p

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDwIc7M8hyUobTrERCDJRAJ91SkMNlTc0ssUpejnyEuGp7MqvIwCgir9U
9t7oOJ8y40VerzlnhysFSXw=
=u5oK
-END PGP SIGNATURE-


Re: wget does not compile with SSL support

2007-10-12 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Thomas Wolff wrote:
 So I think it's clear the version thus produced was invoked.

Yeah, guess it couldn't be that easy! :)

Hm... well, can you verify that src/config.h has been correctly
generated to #define HAVE_SSL? If you go to src/ and type "rm url.o;
make url.o CFLAGS=-E | $PAGER", what does the definition of
supported_schemes look like?

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHD5yt7M8hyUobTrERCDmQAJ4+8Kf4Fb2bgRwQsdpViqnckEMKzgCghvzG
wM141l57ZZvpbgoYNgQQugk=
=1yT2
-END PGP SIGNATURE-


Re: Version tracking in Wget binaries

2007-10-12 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 Among other things, version.c is now generated rather than
 parsed. Every time make all is run, which also means that make
 all will always relink the wget binary, even if there haven't been
 any changes.
 
 I personally find that quite annoying.  :-(  I hope there's a very
 good reason for introducing that particular behavior.

Well, making version.c a generated file is necessary to get the
most-recent revision for the working directory. I'd like to avoid it,
obviously, but am not sure how without making version.c dependent on
every source file. But maybe that's the appropriate fix. It shouldn't be
too difficult to arrange; probably just
  version.c:  $(wget_SOURCES)
or similar.

It's not 100% effective; it relies on (1) this source directory being
managed as a repository, and (2) the user possessing a copy of Mercurial
(which seems likely if (1) is true). So, for instance, clicking the
bz2 link at http://hg.addictivecode.org/wget/mainline means you aren't
getting a repository, and won't get the revision stamp. :\

I'm currently looking into ways to deal with this. For instance, I can
add an extension to the repository on the server that ensures that
archives are modified to include their version information before
they're shipped out; that could help.

There is also the problem that, if it _is_ a repository clone, the local
user may have made local changes and committed them, in which case I'll
get a different revision id (which is a truncated SHA1 hash, and not a
linear number as with Subversion*), with no information about how it
relates to revision ids I know about from the official repos.

* Mercurial actually has linear numbers, but they're only meaningful to
that one specific repository instance, as the same hashes may have
different corresponding numbers on someone elses clone. They're
basically for making the local repo easier to work with, not for sharing
around.

I'm happy to field suggestions!

 BTW does that mean that, for example, running `make install', also
 attempts to relink Wget?

Yup.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHD5lY7M8hyUobTrERCDDsAJ9m4u7suve+sqID92ebcrq1VvrhawCgi+T4
nunTe5ve/E96fmi4EB7OYbI=
=ykJF
-END PGP SIGNATURE-


Re: anyone look at the actual patch? anyone try it? [Re: working on patch to limit to percent of bandwidth]

2007-10-12 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Tony Godshall wrote:
 [Jim]
 Well, we need the plugin architecture anyway. There are some planned
 features (JavaScript and MetaLink support being the main ones) that have
 no business in Wget proper, as far as I'm concerned, but are inarguably
 useful.
 
 I know when I put an app into an embedded app, I'd rather not even
 have the overhead of the plug-in mechanism, I want it smaller than
 that.
 
 You have a good point regarding customized compilation, though I think
 that most of the current features in Wget belong as core features. There
 are some small exceptions (egd sockets).
 
 Thanks.

(You've misattributed this: it's me talking here, not Jim.)

 OK, so so far there are three of us, I think, that find it potentially
 useful.

One of whom, you'll note, was happy to see it as a module, which is what
I had also been suggesting.

 This doesn't look to me like a vital function, one that a large number
 of users will find mildly useful, or one that a mild number of users
 will find extremely useful. This looks like one that a mild number of
 users will find mildly useful. Only slightly more useful, in fact, than
 what is already done.
 
 You keep saying that.  You seem to think unknown upstream bandwidth is
 a rare thing.  Or that wanting to be nice to other bandwidth users in
 such a circumstance is a rare thing.

I do not think it's a particularly rare thing. I think it's a fairly
easily-dealt-with thing.

 Like I said when I submitted the patch, this
 essentially automates what I do manually:
   wget somesite
   ctrl-c
   wget -c --limit-rate nnK somesite

What I've been trying to establish, is whether automating such a thing
(directly within Wget), is a useful-enough thing to justify the patch.

 But they mainly fall into the category of features that a large number
 of users will use occasionally, and a small number of users will find
 indispensable much of the time. Will you find this feature
 indispensable, or can you pretty much use --limit-rate with a reasonable
 value to do the same thing?
 
 Horse dead.  Parts rolling in the freeway.

Is it? I was talking to Jim, not you. He actually hadn't said very much
until this point.

 If, on the other hand, it is really, just a pretty minor improvement
 that happens to be mildly useful to you, could we please drop using this
 as a platform to predict what my future reactions to new features in
 general are likely to be? :p
 
 Well, when a guy first joins the list and submits his first patch and gets...

Gets what? One should not expect that all patches are automatically
accepted. Jim knows this, and has also seen other people come with
patches I've accepted, which is why it's just silly to accuse me of
something there's already ample proof I don't.

And what is it you got? Did I ever say, "no, it's not going in"? Did I
ever say I'm against it? What I repeatedly said was, I need convincing.

 Anyhow, perhaps I did the wrong thing in bringing it here- perhaps I
 should have provided it as a wishlist bug in debian and seen how many
 ordinary people find it useful before taking it to the source...
 perhaps I should have vetted it or whatever.

Sure, vetting it is entirely helpful. Getting feedback from a larger
community of users is very helpful. And, lamentably, the current
activity level of this list is not sufficient that I can gauge how
useful a feature is to the community as a whole from the five-or-so
people that participate on this list.

I cannot gauge how useful a feature is from how loudly the contributor
proclaims it's useful. I already _know_ you find it useful, as you cared
enough to bother writing a patch. What I was hoping to hear, but hadn't
heard much of until just now, was more support from the rest of the
community. Jim had spoken up, but not particularly strongly. Rather than
waiting for people to have the chance to speak up, though, you just got
louder.

What is most interesting to me, is your reaction to my statements, which
were never "I'm not putting it in", but "I think it should wait and live
as an accessory". And to this you get upset, and both defensive and
offensive. This does not make it likelier for me to include your changes.

In this specific case, there's probably a good chance it'll go in (not
for 1.11 though), as I'm clearer now on exactly how useful Jim finds it,
and we've also had another speak up. In the future, though, if you've
got something you'd like me to consider including, you might consider
just a bit more patience than you've exhibited this time around.

Hopefully this thread can go away now, unless someone has something
truly new to contribute.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHD/mZ7M8hyUobTrERCLCrAJ9wApBrS12uqTJ/5pNDLxNI9zbJkACgkdBr
gCwWMhPZ/kzzY1ynR8aof+g=
=t9s1
-END PGP SIGNATURE-

PATCHES file removed

2007-10-13 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

FYI, I've removed the PATCHES file. Not because I don't think it's
useful, but because the information needed updating (now that we're
using Mercurial rather than Subversion), I expect it to be updated again
from time to time, and the Wgiki seems to be the right place to keep
changing documentation (http://wget.addictivecode.org/PatchGuidelines).

It's still obviously useful to have patch-submission information
included as part of the Wget distribution itself; however, I don't
currently have a good way to generate the document from the Wgiki, and
until such a time that I do, I don't want to rewrite the information in
two places.

Speaking of which, I've replaced the MAILING-LISTS file, regenerating it
from the Mailing Lists section of the Texinfo manual. I suspect it had
previously been generated from source, but it's not clear to me from
what (perhaps the web page?), or what tool was used.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEId67M8hyUobTrERCAzXAJ4gBFqlm5jUharDtYT7kexb/i1HcQCfW4RP
vVBWnFknqJZb4+Q4mkpmU6k=
=7pzB
-END PGP SIGNATURE-


Re: Version tracking in Wget binaries

2007-10-13 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Micah Cowan wrote:
 Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 Among other things, version.c is now generated rather than
 parsed. Every time make all is run, which also means that make
 all will always relink the wget binary, even if there haven't been
 any changes.
 I personally find that quite annoying.  :-(  I hope there's a very
 good reason for introducing that particular behavior.
 
 Well, making version.c a generated file is necessary to get the
 most-recent revision for the working directory. I'd like to avoid it,
 obviously, but am not sure how without making version.c dependent on
 every source file. But maybe that's the appropriate fix. It shouldn't be
 too difficult to arrange; probably just
   version.c:  $(wget_SOURCES)
 or similar.

version.c is no longer unconditionally generated. The secondary file,
hg-id, which is generated to contain the revision id (and is used to
avoid using GNU's $(shell ...) extension, which autoreconf complains
about), depends on $(wget_SOURCES) and $(LDADD) (so that it properly
includes conditionally-used sources such as http-ntlm.c or gen-md5.c
when applicable).

This has the advantage that every make does not result in regenerating
version.c, recompiling version.c and relinking wget. It has the
potential disadvantage that, since $(wget_SOURCES) includes version.c
itself, there is the circular dependency: version.c -> hg-id ->
version.c. GNU Make is smart enough to catch that and throw that
dependency out.
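
In Makefile terms the arrangement is roughly this (a sketch, not the
literal Makefile.am rules):

  hg-id: $(wget_SOURCES) $(LDADD)
          -hg id | cut -d' ' -f1 > $@

  version.c: hg-id
          echo "const char *version_string = \"$(VERSION) (`cat hg-id`)\";" > $@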

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEJb07M8hyUobTrERCE4rAJ9gKXonGN9bRydErVkxtZF8g723CACeLbhD
VYUyd0MnjBdjcRXMSTge0ZE=
=cC2V
-END PGP SIGNATURE-


Re: PATCHES file removed

2007-10-13 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 FYI, I've removed the PATCHES file. Not because I don't think it's
 useful, but because the information needed updating (now that we're
 using Mercurial rather than Subversion), I expect it to be updated
 again from time to time, and the Wgiki seems to be the right place
 to keep changing documentation
 (http://wget.addictivecode.org/PatchGuidelines).

 It's still obviously useful to have patch-submission information
 included as part of the Wget distribution itself;
 
 It would be nice for the distribution to contain that URL on a
 prominent place, such as in the README, or even a stub PATCHES file.

It's in NEWS, but putting it in README can't hurt.

 Speaking of which, I've replaced the MAILING-LISTS file,
 regenerating it from the Mailing Lists section of the Texinfo
 manual. I suspect it had previously been generated from source, but
 it's not clear to me from what (perhaps the web page?), or what tool
 was used.
 
 It was simply hand-written.  :-)

Oh, yeah, I don't want to do that in three places then (MAILING-LISTS,
Wgiki, and manual)!

It had a right-aligned -*- text -*- thing at the top, so I was
thinking that was an indication of having been generated.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHERDH7M8hyUobTrERCKqOAKCGuapIPLSYLpDktbteDDYyU2I2AgCfRWs9
iznnPJ4ejopsaSgeY/APk78=
=GHTD
-END PGP SIGNATURE-


Re: wget default behavior [was Re: working on patch to limit to percent of bandwidth]

2007-10-13 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256


 On 10/13/07, Tony Godshall [EMAIL PROTECTED] wrote:
 OK, so let's go back to basics for a moment.

 wget's default behavior is to use all available bandwidth.

 Is this the right thing to do?

 Or is it better to back off a little after a bit?

Heh. Well, some people are saying that Wget should support accelerated
downloads; several connections to download a single resource, which can
sometimes give a speed increase at the expense of nice-ness.

So you could say we're at a happy medium between those options! :)

Actually, Wget probably will get support for multiple simultaneous
connections; but number of connections to one host will be limited to a
max of two.

It's impossible for Wget to know how much is appropriate to back off,
and in most situations I can think of, backing off isn't appropriate.

In general, though, I agree that Wget's policy should be nice by default.

Josh Williams wrote:
 That's one of the reasons I believe this
 should be a module instead, because it's more or less a hack to patch
 what the environment should be doing for wget, not vice versa.

At this point, since it seems to have some demand, I'll probably put it
in for 1.12.x; but I may very well move it to a module when we have
support for that.

Of course, Tony G indicated that he would prefer it to be
conditionally-compiled, for concerns that the plugin architecture will
add overhead to the wget binary. Wget is such a lightweight app, though,
I'm not thinking that the plugin architecture is going to be very
significant. It would be interesting to see if we can add support for
some modules to be linked in directly, rather than dynamically; however,
it'd still probably have to use the same mechanisms as the normal
modules in order to work. Anyway, I'm sure we'll think about those
things more when the time comes.

Or you could be proactive and start work on
http://wget.addictivecode.org/FeatureSpecifications/Plugins
(non-existent, but already linked to from FeatureSpecifications). :)

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHESNi7M8hyUobTrERCChSAJ90KmWelT0bH9qQMlArapEdn1ocSACfRHcK
JJmV8QaqcnKTRYam/v0/lwg=
=TPsw
-END PGP SIGNATURE-


Re: WGET Negative Counter Glitch

2007-10-13 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Joshua Szanto wrote:
 http://www.hlrse.net/Qwerty/wget_glitch.gif
  
 I have no idea how that happened. My theory is this...
  
 I start downloading files.tar as normal, it starts at 0 and counts up to
 ~2.5GB (so far this is true). (Here's the theory) After that it turned
 negative somehow, and decided it should go to 0 (which is only logical).

Hi Joshua,

There is a very strong likelihood that this has been fixed in the
current development version of Wget. Could you try with that?

If you're a Windows user, you can get a binary from
http://www.christopherlewis.com/WGet/WGetFiles.htm; otherwise, you'd
need to compile from the repository sources:
http://wget.addictivecode.org/RepositoryAccess
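
(For the curious: the symptom is consistent with the byte counter
living in a signed 32-bit integer. A toy illustration, not Wget's
actual code:)

  #include <stdio.h>

  int main (void)
  {
    int received = 0x7fffff00;  /* just shy of 2 GiB counted */
    received += 0x200;          /* the next chunk arrives... */
    /* formally undefined, but on common two's-complement systems
       this prints a large negative number, as in the screenshot */
    printf ("%d\n", received);
    return 0;
  }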

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEY/x7M8hyUobTrERCO1sAJ9sH9nYquVWh31rwrrpo/PKE+Q8ZACeLAVz
5UQ9+UD6tD8jSbvmPVVNIQw=
=9txB
-END PGP SIGNATURE-


Re: Myriad merges

2007-10-14 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Jochen Roderburg wrote:
 Zitat von Micah Cowan [EMAIL PROTECTED]:

 It's hard to be confident I'm not introducing more issues, with the
 state of http.c being what it is. So please beat on it! :)
 
 This time it survived the beating  ;-)

Yay!! :D

 One issue I'm still aware of is that, if -c and -e
 contentdisposition=yes are specified for a file already fully
 downloaded, HEAD will be sent for the Content-Disposition, and yet a GET
 will still be sent to fetch the remainder of the file for -c (resulting in a 416
 Requested Range Not Satisfiable). Ideally, Wget should be smart enough
 to see from the HEAD that the Content-Length already matches the file's
 size, even though -c no longer requires a HEAD (again). We _got_ one, we
 should put it to good use.

 However, I'm not worried about addressing this before 1.11 releases;
 it's a minor complaint, and with content-disposition's current
 implementation, users are already going to be expecting an extra HEAD
 round-trip in the general case; what's a few extra?
 
 Agreed. I can confirm this behaviour, too. And I would also consider this a
 minor issue, at least the result is correct.
 
 I have also not made many tests where content-disposition is really used
 for the filename. Those few real-life cases that I have at hand do not
 send any special headers like timestamps and file lengths with it. At
 least the local filename is set correctly and is correctly renamed if it
 exists.

And I expect there are probably several bugs lurking here (which is why
I've designated it as experimental). After the 1.11 release I want to
revisit that section, and look more closely at what happens if we get a
Content-Disposition at the last minute, especially if it specifies a
local file name that we are rejecting. I'd prefer that it not use HEAD
at all for that, as I expect Content-Disposition is rare enough that it
doesn't justify issuing HEAD just to see if it's present; and in any case
it probably frequently isn't sent with HEAD responses, but only for GET.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHElVE7M8hyUobTrERCOG5AJ9xsAPlFyhXXC28E5TeqnoKXWuLPACbBAFN
SfRAf4ZfMFwvYXDKlcDV3dA=
=ZHVD
-END PGP SIGNATURE-


Re: css @import parsing

2007-10-14 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Andreas Pettersson wrote:
 Andreas Pettersson wrote:
 Have there been any progress with this patch since this post?
 http://www.mail-archive.com/wget@sunsite.dk/msg09502.html
 *bump*
 
 Anyone knows the status of this?

Not yet installed... don't know what else to tell you, except that it's
slated to be included in Wget 1.12. Wget 1.11 is expected to be released
quite soon (just waiting for resolution of some licensing stuff), and
I'm afraid to say that CSS support won't be in in time for that.

However, I too am very interested to see CSS support included in Wget;
it'll be in when we have time to look at it more closely, and is one of
my higher priorities for Wget 1.12.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEldV7M8hyUobTrERCFsPAJ449TvEoo6IZVs5PP+fivSo4Hh6twCdEyjc
B8GWbP8CyVgV7GaY1n6qEx8=
=vYhq
-END PGP SIGNATURE-


Re: Version tracking in Wget binaries

2007-10-14 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Christopher G. Lewis wrote:
 OK, so I'm trying to be open minded and deal with yet another version
 control system.
 
 I've cloned the repository and built my mainline.  I do not
 autogenerate a version.c file in windows.  Build fails missing
 version.obj.  

Right; I think I mentioned that would happen.

 Note that in the windows world, we use Nmake from the MSVC install - no
 GNU tools required.

Right; and I don't expect that you'll be able to do it exactly as I've
done. However, the contents of src/Makefile.am should give good hints
about how it could be done in Nmake. AFAIK, the only thing Unix-specific
about the rules as I've done them, in fact, is the use of the Unix cut
command. If absolutely necessary, that part could be removed, with the
Nmake rules similar to:

hg-id: $(OBJS)
        -hg id > $@

It's just that only the first word is needed.

 An aside on Hg...
 
 Confirm for me that I basically need to do the following:
 
 Create a clone repository:
   hg clone http://hg.addictivecode.org/wget/mainline

Approximate equivalent to svn co.

 Get any changes from mainline into my clone 
   hg pull http://hg.addictivecode.org/wget/mainline

Equivalent to svn up.

 Make my src changes, create a changeset... And then I'm lost...

Alright, so you can make your changes, and issue an hg diff, and
you've basically got what you used to do with svn.

Or, if they're larger changes, you can run hg ci periodically as you
change, to save progress so to speak.
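
A full round trip might look like this (commit message hypothetical):

  $ hg clone http://hg.addictivecode.org/wget/mainline
  $ cd mainline
    ... edit away ...
  $ hg ci -m 'fix foo handling in http.c'
  $ hg export tip > my-change.patch

and then mail my-change.patch to the list--or push, if you have access.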

 And as a follow-up question - what does Hg get you above and beyond CVS
 or SVN?  I kind of get the non-centralized aspect of repositories and
 clones, but I don't understand how changesets and tips work.

Well, changesets are in all SCMs, as far as I know. A changeset is just
the set-of-changes that you check in when you do svn ci or hg ci.
Every revision id corresponds to and identifies a changeset.

tip is just the Mercurial equivalent of Subversion's HEAD. In
Mercurial, the tip is always the very last revision made, whereas
heads are the last revision made to each unclosed branch in a repository.
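
(Both are easy to inspect directly, by the way:

  $ hg tip
  $ hg heads

the first shows the newest changeset, the second one changeset per open
head.)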

 My thoughts are that there is *one* source of the code (with histories)
 regardless of SVN, Hg or whatever.

One official one, sure.

For me, the major advantages are that I can be working on several
things, each with history, without touching the official repository. I
can work on large changes while I'm in my car while my wife drives the
family out-of-town, without having to worry about screwing something up
that I can't back up to a good point (other than back to the last
official point in the repo, or whatever I had the foresight to cp
- -r). And, I can check in changes where each time it takes a hair of a
second, and then push it all over the net when I'm ready for it to be
sent, instead of taking several seconds for each commit. Believe me, you
begin to appreciate that after a few times.

Admittedly, these advantages are mainly advantages to pretty active
developers, which, at the moment, is pretty much just me. :) I've
definitely found use of a DVCS to be absolutely awesome for my purposes.

   Hg's concept of multiple clones and repositories is quite interesting,
 but doesn't feel right for the remote, non-connected group of developers
 that wget gathers input from.  If we were all behind a firewall or could
 share out each user's repository, it might make more sense, but I (for
 one) wouldn't be able to share my repository (NAT'd, firewalled,
 corporate desktop), so I just don't get it.

Sharing is a potentially useful aspect of DVCSes, to be sure, but it's
not all it's got going for it, and in fact isn't really the reason I
made the move.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEugL7M8hyUobTrERCCM/AJ47cwY0rm0FBsEKH6PhKLwFiyTrxgCfasIY
GJiUAR8s7rX09O2F9ZIt4uQ=
=COwb
-END PGP SIGNATURE-


Wget gnulib-ized

2007-10-14 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Mainline now has replaced a few of Wget's portability pieces with
corresponding gnulib modules. This has resulted in significant changes
to what needs to be built where, so non-Unix builds are probably further
broken (...sorry, Chris, Gisle... *'n'*). Various Unix builds may
possibly have been broken as well; hopefully it'll come out in testing.

The pieces replaced were, I think, old code culled from libiberty or
otherwise from the GNU collective pool: gnu-md5 (now md5), getopt,
safe-ctype (now c-ctype). stdint.h and stdbool.h detection/replacement
were pulled in automatically through importing those modules, but I
haven't altered the build setup to use those instead of our own builtin
stuff yet.
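
(For reference, the modules come in via gnulib-tool; the import amounts
to something like the following--invocation from memory, so check the
gnulib docs:

  $ gnulib-tool --import md5 getopt c-ctype

which drops the module sources under lib/ and wires up the m4 checks.)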

So, at the moment, I've just introduced tremendous instability to
mainline with the only benefit being mildly updated equivalents to about
three files from the GNU collective. ^_^

However, I expect the payoff in the long run to be worth it, as I can
now more easily take advantage of other modules gnulib offers. I expect
that the inline module could be handy for taking advantage of build
environments that offer inlined functions, and of course getpass will be
useful (though we may need to special-case our handling of that one);
the quote (for dealing with strange characters when quoting, say,
filenames) and regex (same thing Emacs uses, I believe--for the proposed
regex support in -A, -R and the like) modules are also possibilities.
And, especially, there are several ADTs that I expect that I will need
shortly, in applications where string-hashes may not fill the need.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHEvME7M8hyUobTrERCMj0AJ9aKGdqCrz9SCuK31kl3dupJAbY9QCcCsJC
FE9In1CKb6xs1xYD2qoRcAk=
=V1sa
-END PGP SIGNATURE-


Suffixes after unique-ing numbers [Re: Two wget patches: min-size/max-size and nc options]

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Just getting a chance to look a bit more closely at this again.

Christian Roche wrote:
 Hi there,
 
 please find attached two small patches that could be
 considered for wget (against revision 2276).
 
 patch-utils changes the file renaming mechanism when
 the -nc option is in effect.  Instead of trying to
 rename  a file to file.1, file.2 etc, it tries
 prefix-1.suffix, prefix-2.suffix etc, thus preserving
 the filename extension if any.  This is necessary to
 avoid a bug otherwise when the -A option is used:
 renamed files are rejected because they don't match
 the required suffix, although they should really be
 kept.

Is this true? I'm having some trouble reproducing this, either now or in
older versions (1.10.2, 1.9.1), Could you supply a command-line that
produces this problem? ...Because, it seems that Wget will always accept
any file that was explicitly given on the command-line or via
--input-file; and AFAICT .1, .2 suffixes are not generated in
--recursive mode.

Your suffixes patch accidentally includes some of the documentation from
--min-size/--max-size; I'd prefer this not to be part of this patch ;)

Also, the proposed code changes are fine, but IMO a little bit wasteful,
as it allocates strlen(s)-sized buffers for prefix and suffix. Actually,
it really shouldn't be necessary to allocate space for the prefix and
suffix at all, as you can just set markers for them, leaving just the
filename variable to be allocated.
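
To sketch what I mean (hypothetical helper, not the patch's code):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Build "prefix-N.suffix" from FILE without allocating separate
     prefix/suffix copies; the dot position is just a marker. */
  static char *
  numbered_name (const char *file, int count)
  {
    const char *dot = strrchr (file, '.');
    int prefix_len = dot ? (int) (dot - file) : (int) strlen (file);
    const char *suffix = dot ? dot : "";      /* includes the '.' */
    char *buf = malloc (strlen (file) + 32);  /* '-' + digits + NUL */
    if (buf)
      sprintf (buf, "%.*s-%d%s", prefix_len, file, count, suffix);
    return buf;
  }

numbered_name ("index.html", 2), for instance, would give
"index-2.html".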

If you can make those changes, I'll probably apply the patch right away;
otherwise, I'll probably wait until I have a chance to do that myself.
Even if the accept/reject thing doesn't apply, this patch has the
benefit of preserving interpretation of .html files, etc. But please let
me know how you were reproducing the accept/reject, as it would still
need a more robust fix.

- --
Thanks!
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHExDa7M8hyUobTrERCFcFAJwJRhLQTfqVeWCB/0ul+GMPW4PdDACdHyua
Sm3JnsEi6m6ZmRrCWzXcUbU=
=GNE5
-END PGP SIGNATURE-


--limit-percent N versus --limit-rate N% ?

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Okay, now that it's decided this thing will go in...

I'm kinda leaning toward the idea that we change the parser for
- --limit-rate to something that takes a percentage, instead of adding a
new option. While it probably means a little extra coding, it handily
deals with broken cases like people specifying both --limit-rate and
- --limit-percent, and helps consolidate the documentation. Anyone have
opinions about this?
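
To make the combined-option idea concrete, here's a rough, untested
sketch of a parser that accepts both forms (the struct and names are
invented for illustration; this isn't Wget's actual option code):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct rate_limit
{
  long bytes_per_sec;   /* absolute cap; 0 = unset */
  int percent;          /* percentage cap; 0 = unset */
};

static int
parse_limit_rate (const char *val, struct rate_limit *lim)
{
  char *end;
  double n = strtod (val, &end);

  if (end == val || n < 0)
    return 0;                        /* not a number */

  if (strcmp (end, "%") == 0)
    {
      if (n > 100)
        return 0;
      lim->percent = (int) n;        /* e.g. --limit-rate=50% */
      lim->bytes_per_sec = 0;        /* the later setting wins */
    }
  else
    {
      long mult = 1;
      if ((*end == 'k' || *end == 'K') && !end[1])
        mult = 1024;
      else if ((*end == 'm' || *end == 'M') && !end[1])
        mult = 1024L * 1024;
      else if (*end != '\0')
        return 0;                    /* trailing junk */
      lim->bytes_per_sec = (long) (n * mult);
      lim->percent = 0;              /* the later setting wins */
    }
  return 1;
}

int
main (void)
{
  struct rate_limit lim = { 0, 0 };
  if (parse_limit_rate ("50%", &lim))
    printf ("percent=%d, bytes/sec=%ld\n", lim.percent, lim.bytes_per_sec);
  return 0;
}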

Also: does the current proposed patch deal properly with situations such
as where the first 15 seconds haven't been taken up by part of a single
download, but rather several very small ones? I'm not very familiar yet
with the rate-limiting stuff, so I really have no idea.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHExRX7M8hyUobTrERCKOwAJ9QFdy3u9j1t9t2jjTBcfQ3n+uSRACfRhFM
sVDzTLk/NrW5g13sz+aGgCc=
=6Sig
-END PGP SIGNATURE-


Re: Version tracking in Wget binaries

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 Make my src changes, create a changeset... And then I'm lost...
 Alright, so you can make your changes, and issue an hg diff, and
 you've basically got what you used to do with svn.
 
 That is not quite true, because with svn you could also do svn
 commit to upload your changes on the global repository seen by
 everyone. It is my understanding that with the distributed VCs,
 the moral equivalent of svn commit is only to be done by the
 maintainer, by pulling (cherry-picking) the patches of various
 contributors.

Untrue. That is simply the model some (notably, Linus and friends)
choose to use.

As with the Subversion set up, active developers with permissions may
push to the central repositories; I actually mentioned this during the
initial announcement, but am using SSH key-based authentication to
accomplish this, and so require people's SSH keys in order to give them
push access. So far, the only one who has given their SSH key is
actually Ralf Wildenhues, whose key I requested so he could easily push
any necessary changes related to the Automake stuff. So: send your key,
and you'll have push access!

Note that I still prefer to review non-trivial changes, just as I did in
Subversion; but just as in Subversion, I decided against ACLs and
allowed everyone write permission to trunk, even though I preferred to
merge there myself. I want all core devs to have write permission to the
public repos, so they can push quick fixes.

 It is most likely the case that I simply didn't (yet) get the DVCS
 way of doing things.

It takes a little bit of getting used to, but you'll find that many
things directly translate from using a non-distributed SCM. The biggest
differences that arise (for me so far) tend to be related to the fact
that history is not a linear series of events, where each log entry is
the child of another; it is instead a directed acyclic graph, where
the history can branch off and re-merge later, leaving all history
intact. Subversion and friends, of course, support branching and
merging, but the history wouldn't merge along with the changes, and
you'd typically get the branch merge as one big slurp of the total
changes. Tools like svnmerge helped address this (by merging each change
individually, which could be problematic in some circumstances).

For my part, I found using Mercurial very easy to learn, and it is
purposely designed to use an interface very similar to Subversion. It
just takes a little getting used to, like any conversion to a new,
heavily-relied-upon development tool.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHE3177M8hyUobTrERCMpZAJ9MUGXaYa2r+SBmFEujdrnwvjNITQCfevJN
qh2Jicj1A9Iv8Po3E8EUGTA=
=gkeP
-END PGP SIGNATURE-


Re: --limit-percent N versus --limit-rate N% ?

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Matthias Vill wrote:
 I would appreciate having a --limit-rate N% option.
 
 So now, about those broken cases: you could do some "least of both"
 policy (which would of course still need time to measure, and could
 only cut afterwards).
 Or otherwise you could use a non-percent value as a minimum. This would
 be especially useful if you add it to your default options and stumble
 over some slow server only serving you 5KiB/s, where you most probably
 don't want to further lower the speed on your side.
 
 As a third approach, you would use only the last limiting option.
 
 Depending on how difficult the implementation is, I would vote for the
 second behavior, although the first or third option might be more
 intuitive to some of the users not reading the docs.

Third option should be more intuitive to the implementer, too. I vote
for that, as I really want to avoid putting too much sophistication into
this.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHE5wl7M8hyUobTrERCEYUAJ9q4Bgi0LNtxuzWBOqmw8taL0K8wgCdGsxQ
EIizwF8wxo1ksJURUGVT9VA=
=mZ/c
-END PGP SIGNATURE-


Re: subscribing from this list

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Josh Williams wrote:
 On 10/15/07, patrick robinson [EMAIL PROTECTED] wrote:
 Hello,

 I want to unsubscribe from this list but lost my registration e-mail.
 How is this performed?

I haven't seen this original message yet.

 You can find this (and other information) on the Wget wiki.
 http://wget.addictivecode.org/
 
 To unsubscribe from a list, send an email to
 [EMAIL PROTECTED] For more information on list
 commands, send an email to [EMAIL PROTECTED]

Note that this doesn't help him much if he's lost his registration e-mail.

Patrick, you'll probably have to go bug the staff at www.dotsrc.org, who
hosts this list; send an email to [EMAIL PROTECTED]

- --
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHE6c77M8hyUobTrERCKYNAKCMBnFh/+ONearE23Z90HnCcqFOBQCfVNDk
1UaUI4iYc6adbkLIrcVz6Qg=
=9XcZ
-END PGP SIGNATURE-


Re: subscribing from this list

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Josh Williams wrote:
 On 10/15/07, Micah Cowan [EMAIL PROTECTED] wrote:
 Note that this doesn't help him much if he's lost his registration e-mail.

 Patrick, you'll probably have to go bug the staff at www.dotsrc.org, who
 hosts this list; send an email to [EMAIL PROTECTED]
 
 E-mail *address* or just the e-mail? I don't see how having the e-mail
 is important.

Oh. ... Maybe I misread! :)

I meant/read it as "address". As in, the _registered_ email address. I
see now that an actual email seems more likely.

BTW, we also keep the unsubscription information in the Wget manual (not
the manpage), and in the file MAILING-LIST that comes with Wget
(typically installed to /usr/share/doc/wget/MAILING-LIST).

- --
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHE6jY7M8hyUobTrERCPSTAJ9LCvZKCchiBhKQ1XitYS/WHRS2jACfSTG4
Y7qtlmEiml8YtxYxKkGH99o=
=TVTr
-END PGP SIGNATURE-


version.c take two

2007-10-15 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I've improved the generation of version.c, removing the intermediate
generation of an hg-id file and using a more portable replacement for
hg id | cut -d ' ' -f 1 (can be used on Windows and MS-DOS).

The relevant lines in src/Makefile.am are now:

version.c:  $(wget_SOURCES) $(LDADD)
	printf '%s' 'const char *version_string = "@VERSION@' > $@
	-hg log -r tip --template=' ({node|short})' >> $@
	printf '%s\n' '";' >> $@

(The printf commands aren't portable to Windows AFAIK, but this should
be easier to adapt, at any rate. Note that echo -n is not a portable
method for suppressing newlines in echo's output.)
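
(So, with a hypothetical version number and changeset hash, the
generated version.c comes out as a single line, something like:

const char *version_string = "1.11-beta-1 (abc123de)";

assuming hg succeeds; if it fails, the hash portion is simply absent.)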

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHFAXf7M8hyUobTrERCFFmAJ9cR5Pg2wJb3SP/8c3lVXCuuLcyHACfZCmO
vG29YrdpWnm5csHE381L/Ug=
=HVHe
-END PGP SIGNATURE-


Re: version.c take two

2007-10-16 Thread Micah Cowan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 version.c:  $(wget_SOURCES) $(LDADD)
 printf '%s' 'const char *version_string = "@VERSION@' > $@
 -hg log -r tip --template=' ({node|short})' >> $@
 printf '%s\n' '";' >> $@
 
 printf is not portable to older systems, but that may not be a
 problem anymore.  What are the current goals regarding portability?

GNU, modern Unixen, and Windows systems (in that order) take priority,
but portability to other systems is desirable if it's not out of the way.

I may take liberties with the Make environment, and assume the presence
of a GNU toolset, though I'll try to avoid that where it's possible.

In cases like this, printf is much more portable (in behavior) than
echo, but not as dependable (on fairly old systems) for presence;
however, it's not a difficult tool to obtain, and I wouldn't mind making
it a prerequisite for Wget (on Unix systems, at any rate). In a pinch,
one could write an included tool (such as an echo command that does
precisely what we expect) to help with building. But basically, if it's
been in POSIX a good while, I'll probably expect it to be available for
the Unix build.
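
Such an included tool could be tiny. An untested sketch of a
predictable echo replacement -- print the arguments space-separated,
with no trailing newline and no escape processing:

#include <stdio.h>

int
main (int argc, char **argv)
{
  int i;
  for (i = 1; i < argc; i++)
    {
      if (i > 1)
        putchar (' ');
      fputs (argv[i], stdout);
    }
  return 0;
}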

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHFHFV7M8hyUobTrERCDG3AJ0WGcRcE9423lSXcasZ5uTxS2HXMACfQqe8
vI1aTSxAHqPrxQPZuTIzpjM=
=gXYM
-END PGP SIGNATURE-


Re: version.c take two

2007-10-16 Thread Micah Cowan
Micah Cowan wrote:
 I've improved the generation of version.c, removing the intermediate
 generation of an hg-id file and using a more portable replacement for
 hg id | cut -d ' ' -f 1 (can be used on Windows and MS-DOS).
 
 The relevant lines in src/Makefile.am are now:
 
 version.c:  $(wget_SOURCES) $(LDADD)
 printf '%s' 'const char *version_string = "@VERSION@' > $@
 -hg log -r tip --template=' ({node|short})' >> $@
 printf '%s\n' '";' >> $@
 
 (The printf commands aren't portable to Windows AFAIK, but this should
 be easier to adapt, at any rate. Note that echo -n is not a portable
 method for suppressing newlines in echo's output.)

Gisle and Chris, you should be able to write this rule in your
Makefiles. Something like:

version.c: $(SOURCES)
	echo 'const char *version_string = "@VERSION@"' > $@
	-hg log -r tip --template='" ({node|short})"\n' >> $@
	echo ';' >> $@

This particular usage should actually be portable across all varying
implementations of Unix echo, as well (so maybe I'll use it too--though
printf is still probably a reasonable expectation, and I may well
require it in the future). It takes advantage of C's string-literal
concatenation. The results will be an eyesore, but will work.
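
With a hypothetical version and hash, the generated file would read:

const char *version_string = "1.11-beta-1"
" (abc123de)"
;

which the compiler treats as one declaration.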

Note that if version.c is in SOURCES, there is still a recursive
dependency; if this is a problem for your build system, you may want to
remove version.obj from OBJS, and add it directly to the command to link
wget.exe.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: version.c take two

2007-10-16 Thread Micah Cowan
Hrvoje Niksic wrote:
 Micah Cowan [EMAIL PROTECTED] writes:
 
 I may take liberties with the Make environment, and assume the
 presence of a GNU toolset, though I'll try to avoid that where it's
 possible.
 
 Requiring the GNU toolset puts a large burden on the users of non-GNU
 systems (both free and non-free ones).  Please remember that for many
 Unix users and sysadmins Wget is one of the core utilities, to be
 compiled very soon after a system is set up.  Each added build
 dependency makes Wget that much harder to compile on a barebones
 system.

Alright; I'll make an extra effort to avoid non-portable Make
assumptions then. It's just... portable Make _sucks_ (not that
non-portable Make doesn't).

 In cases like this, printf is much more portable (in behavior) than
 echo, but not as dependable (on fairly old systems) for presence;
 however, it's not a difficult tool to obtain, and I wouldn't mind
 making it a prerequisite for Wget (on Unix systems, at any rate). In
 a pinch, one could write an included tool (such as an echo command
 that does precisely what we expect) to help with building. But
 basically, if it's been in POSIX a good while, I'll probably expect
 it to be available for the Unix build.
 
 Such well-intended reasoning tends to result in a bunch of reports
 about command/feature X not being present on the reporter's system, or
 about a bogus version that doesn't work being picked up, etc.  But
 maybe the times have changed -- we'll see.

Given that there is no portable way to avoid newlines with echo, or to
depend on the results of including a backslash in its argument, it may
be hard to avoid, depending on what we need it for (with my last
revision of the Make rule, however, I've avoided it for this specific
purpose). Any modern Unix had pretty dang well better include it.
Non-modern
Unixen won't generally be needing to bootstrap, I'd think (and probably
already include older versions of wget anyway).

Thanks for the input, Hrvoje.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/




