Re: Weird 302 problem with wget 1.7
John Levon [EMAIL PROTECTED] writes: Thanks very much (wouldn't it be good to refer to the clause in the RFC in the comments ?) Uh, I suppose so. But it doesn't matter that much -- someone looking for it will find it anyway. Besides, it's not clear which RFC Wget conforms to. Web standards are messy.
Re: Wget Patch for 1.8.1 witch IPv6
Thomas Lussnig [EMAIL PROTECTED] writes: 1. Now if IPv6 enabled it only fetch IPv6, IPv4 sites fail This is a problem, and part of the reason why the patch is so simple in its current form. A correct patch must modify struct address_list to hold a list of IP addresses, each of which can be either an IPv4 address or an IPv6 address. It could be something like:

    struct ip_address {
      enum { ADDR_IPV4, ADDR_IPV6 } type;
      union {
        ipv4_address ipv4;
        ipv6_address ipv6;
      } addr;
    };

with the appropriate #ifdefs for when IPv6 is not available. ipv6_address might also need to contain the scope information. (I don't know what that is, but I trust that you do. I've been told that IPv6 addresses were scoped.) The address_list_* functions should be modified to either return such a data structure, *or* (perhaps simpler) to provide a way for the caller to query which kind of address it's dealing with. Another possibility is to store struct sockaddr_in instead of the address. This was proposed by a Japanese developer, and I disliked that idea because it seemed cleaner to store and pass only the information we actually need. But perhaps this would be easier all around, I don't know. Also, you should get rid of the global variable as vaguely named as `family'. Also, for FTP we need to support the extended IPv6 commands. Your patch seems to introduce possibly non-portable functions such as inet_pton and gethostbyname2 without checking whether they exist. IPv6 support is not easy to add to an application heavily relying on IPv4, such as Wget. I wouldn't say that your patch is dirty or anything like it, but the fact is that in its current form it does not come close to the changes needed to fully support IPv6.
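The tagged-union layout sketched above could look like the following as compilable C. All names here are illustrative guesses (the typedefs, the helper, and the constructor are not Wget's actual API), and as noted, ipv6_address would also need room for scope information:

```c
#include <string.h>

/* Hypothetical sketch of the tagged-union address list entry
   discussed above.  Not Wget's actual API. */
typedef unsigned char ipv4_address[4];
typedef unsigned char ipv6_address[16];

struct ip_address {
  enum { ADDR_IPV4, ADDR_IPV6 } type;
  union {
    ipv4_address ipv4;
    ipv6_address ipv6;   /* would be #ifdef'ed away without IPv6 */
  } addr;
};

/* Callers query the kind of address instead of assuming IPv4. */
static int
address_is_ipv4 (const struct ip_address *ip)
{
  return ip->type == ADDR_IPV4;
}

/* Convenience constructor for an IPv4 entry. */
static struct ip_address
make_ipv4 (const unsigned char bytes[4])
{
  struct ip_address ip;
  ip.type = ADDR_IPV4;
  memcpy (ip.addr.ipv4, bytes, 4);
  return ip;
}
```

The point of the query helper is the second option mentioned above: the caller asks which kind of address it has, rather than receiving the raw union.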
Re: Wget Patch for 1.8.1 witch IPv6
Markus Buchhorn [EMAIL PROTECTED] writes: Reading back, that was itojun's proposal, and I suspect probably a good choice, even if it seems less clean. Itojun is one of the leading lights in IPv6 development, along with the whole WIDE group in Japan, and heavily involved in the v6 stacks for the *BSD family (Kame) and now moving into Linux (Usagi?). They're busy converting everything useful to support v6. I don't doubt that itojun is serious, but he might have different priorities. For example, he could choose to implement the easiest solution that would make it possible to patch a large number of programs in a realistic time frame. It doesn't mean that such a solution is necessarily the best one for each individual program.
Re: Wget Patch for 1.8.1 witch IPv6
Daniel Stenberg [EMAIL PROTECTED] writes: I'd suggest that you instead pass around a 'struct hostent *' on IPv4 only platforms Why? The rest of the code never needs anything from `struct hostent' except the list of addresses, and this is what my code extracts. By extension, the idea was for the IPv6 code to extract the list of addresses from the data returned by the IPv6 calls. with the appropriate #ifdefs for when IPv6 is not available. ipv6_address might also need to contain the scope information. (I don't know what that is, but I trust that you do. I've been told that IPv6 addresses were scoped.) IPv6 addresses are scoped, but that is nothing you have to care about as a mere application writer (unless you really want to, of course). If you just keep the list of addresses in the addrinfo struct and you try all of them when you connect, then it'll work transparently. `struct addrinfo' contains a `struct sockaddr', which carries the necessary scoping information (I think). The question at the time was whether I could extract only the address(es) and ignore everything else, as it was possible with IPv4. Itojun implied that scoping of addresses made this hard or impossible.
Re: Wget Patch for 1.8.1 witch IPv6
Daniel Stenberg [EMAIL PROTECTED] writes: On Tue, 15 Jan 2002, Hrvoje Niksic wrote: I'd suggest that you instead pass around a 'struct hostent *' on IPv4 only platforms Why? The rest of the code never needs anything from `struct hostent' except the list of addresses, and this is what my code extracts. Well, why extract the addresses when you can just leave them in the struct and pass a pointer to that? Because I'm caching the result of the lookup, and making a deep copy of `struct hostent' is not exactly easy. (Yes, I know libcurl does it, but the code is not exactly pretty, and I'd like to avoid doing that.) I am only suggesting this as it makes things a lot easier. No, that's fine, but I just don't see why things are any easier that way. One way or the other, the caller will want to deal with the address -- providing it through struct hostent or through an API call to `struct address_list' should not make a difference. connect()ing on machines that support getaddrinfo() should be a matter of running through the addrinfo list and performing something in this style:

    struct addrinfo *ai;
    sockfd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    rc = connect (sockfd, ai->ai_addr, ai->ai_addrlen);

Except the port number can be different for each connection. And it won't work in IPv4 where I don't have `struct addrinfo' handy.
Re: Wget Patch for 1.8.1 witch IPv6
Daniel Stenberg [EMAIL PROTECTED] writes: On Tue, 15 Jan 2002, Hrvoje Niksic wrote: Well, why extract the addresses when you can just leave them in the struct and pass a pointer to that? Because I'm caching the result of the lookup, and making a deep copy of `struct hostent' is not exactly easy. (Yes, I know libcurl does it, but the code is not exactly pretty, and I'd like to avoid doing that.) No, the code doing that copy is not pretty. Deep-copying a struct like the hostent one can hardly be made pretty. Agreed. That, and the fact that I don't *need* other data from hostent, made me decide that I don't want to keep struct hostent around. The easiness comes with the fact that you have one pointer to the complete host info. Be it hostent for IPv4 or addrinfo for IPv6. Then the connect code can take that pointer and walk through the list of addresses and attempt to connect. Yes, but that's exactly the abstraction I've built for 1.8. That pointer is called `struct address_list'.

    struct addrinfo *ai;
    sockfd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    rc = connect (sockfd, ai->ai_addr, ai->ai_addrlen);

Except the port number can be different for each connection. I think that's the intended beauty of this API. It sort of hides that fact. We don't even have to bother about which protocol it uses. It works the same. I don't think it's useful to hide the fact that one can connect to a different port of the same host. For example, `wget http://foo:80/' and `wget http://foo:81/' must connect to different ports, and I'd prefer to look up `foo' only once. But maybe I just don't see the beauty. :-) And it won't work in IPv4 where I don't have `struct addrinfo' handy. The getaddrinfo() could theoretically work just as well on IPv4-only machines, as it is IP version unbound. Sure, but older OSes don't have it implemented -- so I need to support the old API anyway.
I still think you can do it like this:

    #ifdef HAVE_GETADDRINFO
    typedef struct addrinfo *hostinformation;
    #else /* current system */
    typedef struct whateveryouhavetoday *hostinformation;
    #endif

That could of course work, but it'd defeat the very idea behind the struct whateverihavetoday (`struct address_list'), which is to allow the callers to use a clean API to access the underlying host information. But, I'm talking a lot more than what I have knowledge about with regard to how the wget code is designed. My discussion is generic and may not apply to wget internals. If you have time, please take a glance at Wget 1.8.1's `host.c' and `connect.c'. I believe it will make my POV much clearer.
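The "walk the addrinfo list and attempt to connect" idea discussed in this thread can be sketched as generic POSIX code. This is not Wget's connect.c -- the function name and structure are illustrative -- and note how getaddrinfo() folds the port into each sockaddr, which is exactly why caching only the bare addresses (as Wget's address_list does) requires handling the port separately:

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Sketch: connect to host:port by trying each address that
   getaddrinfo() returns, IPv4 or IPv6 alike.  Returns a connected
   socket descriptor, or -1 on failure. */
static int
connect_to_host (const char *host, const char *port)
{
  struct addrinfo hints, *head, *ai;
  int sockfd = -1;

  memset (&hints, 0, sizeof (hints));
  hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6, whichever works */
  hints.ai_socktype = SOCK_STREAM;

  if (getaddrinfo (host, port, &hints, &head) != 0)
    return -1;

  /* Try each address until one connect() succeeds. */
  for (ai = head; ai; ai = ai->ai_next)
    {
      sockfd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
      if (sockfd < 0)
        continue;
      if (connect (sockfd, ai->ai_addr, ai->ai_addrlen) == 0)
        break;                      /* connected */
      close (sockfd);
      sockfd = -1;
    }
  freeaddrinfo (head);
  return sockfd;
}
```

Because the port is a per-call argument here, connecting to `foo:80' and `foo:81' means two lookups unless the caller adds its own address cache -- the very trade-off debated above.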
Re: Wget Patch for 1.8.1 witch IPv6
Thomas Lussnig [EMAIL PROTECTED] writes: Ok first we don't need this difference. I think it's not so easy as it first seems. Because IPv6 is a superset of IPv4 there is a representation of IPv4 addresses. But is it desirable to use it in preference to native IPv4 calls? I apologize if I appear anal here -- it's just that if we do add IPv6 support to Wget, I'd like it to be done right, as far as that's possible. Your patch seems to introduce possibly non-portable functions such as inet_pton and gethostbyname2 without checking whether they exist. That's correct, I only knew they are available on Linux and BSD, but this is the reason I made the ifdefs. And I think that these two calls can be checked from the Makefile, or easily replaced with more compatible ones (if they exist). I would like to use only the tried and true calls when compiling for IPv4. The ones Wget 1.8.1 uses have been chosen for maximum portability in preference over elegance.
Re: Content-dispotion: filename=foo HTTP header
Rami Lehti [EMAIL PROTECTED] writes: Wget should try to honor the Content-disposition: filename=foobar HTTP-response header. It is really a pain to try to download a file that is created by a script. Usually the server gives the Content-disposition: header. You would have to save the server response somewhere and rename manually. Multiply this by a factor of 500 and you have a problem. Good suggestion. I'll put it on the TODO list and see how hard it is to implement.
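Extracting the suggested filename from such a header could start from something like the sketch below. This is a naive illustration, not Wget's code: real Content-Disposition values can carry escapes and parameters this helper ignores, and it only handles the plain and double-quoted `filename=' forms mentioned above:

```c
#include <stdlib.h>
#include <string.h>

/* Naive sketch: pull the filename out of a Content-Disposition
   header value such as "attachment; filename=foobar".  Returns a
   malloc'ed copy, or NULL if no filename parameter is present. */
static char *
disposition_filename (const char *header)
{
  const char *p = strstr (header, "filename=");
  size_t len;
  char *name;

  if (!p)
    return NULL;
  p += strlen ("filename=");
  if (*p == '"')                    /* optionally double-quoted */
    {
      const char *q = strchr (++p, '"');
      len = q ? (size_t) (q - p) : strlen (p);
    }
  else
    len = strcspn (p, "; \t");      /* stop at separator */

  name = malloc (len + 1);
  if (!name)
    return NULL;
  memcpy (name, p, len);
  name[len] = '\0';
  return name;
}
```

A real implementation would also have to sanitize the result (strip path components, for one) before trusting a server-supplied name as a local file name.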
Re: wget 1.8.1
Jonathan Davis [EMAIL PROTECTED] writes: I recently successfully compiled and installed wget 1.8.1 on my box. The new OS and architecture reads as follows: Mac OS X (powerpc-apple-darwin5.2) Thanks for the report; I've now updated MACHINES.
Re: using wget on local lan failed for only one website...
Boris [EMAIL PROTECTED] writes: As proposed by Hrvoje, I have tried with the retry option, but no change, every time I've got 'read error'. I also tested with the new release for windows (1.8.1), but same thing :( I have no idea what could be going on. Perhaps a Windows person might help? On Unix the error is usually accompanied by an error message slightly more informative than `unknown error'.
Re: WGET - OSX
Dan Lavie [EMAIL PROTECTED] writes: I have just downloaded and installed WGET on my OS-X. You didn't say where you downloaded it from or how you installed it, so I'll assume you're using the standard build process. 1- I can't find any documentation. The documentation is in Info format, installed in `/usr/local/info' by default. 2- How do I remove it from my OS-X system ? You pretty much need to remove the installed files manually. Wget also has an `uninstall' target, but I don't think it has been tested in a while.
Re: doubt
praveen sirivolu [EMAIL PROTECTED] writes: I have a doubt.when we use wget to recursively retrieve pages from internet its not bringing files with shtml and jhtml extensions.is this feature not implemented or if it is there ,could somebody explain me how to get those HTML pages. They should be downloaded. Can you give an example of how you're invoking Wget, preferably accompanied by a debug log?
Re: Wget Patch for 1.8.1 witch IPv6
Daniel Stenberg [EMAIL PROTECTED] writes: `struct addrinfo' contains a `struct sockaddr', which carries the necessary scoping information (I think). The question at the time was whether I could extract only the address(es) and ignore everything else, as it was possible with IPv4. Itojun implied that scoping of addresses made this hard or impossible. Right, you can't just extract a few things from that struct and go with them without very careful considerations. Which is, if I understand correctly, exactly what Thomas does. Thomas, can you follow up on this? I'm worried about the whole scoping business.
Re: -H suggestion
[EMAIL PROTECTED] writes: Funny you mention this. When I first heard about -p (1.7?) I thought exactly that it would default to [spanning hosts to retrieve page requisites]. I think it would be really useful if the page requisites could be wherever they want. I mean, -p is already ignoring -np (since 1.8?), which I think is also very useful. Since 1.8.1. I considered it a bit more dangerous to allow downloading from just any host if the user has not allowed it explicitly. For example, maybe the user doesn't want to load the banner ads? Or maybe he does? Either way, I was presented with a user interface problem. I couldn't quite figure out how to arrange the options to allow for three cases:

* -p gets stuff from this host only, including requisites.
* -p gets stuff from this host only, but requisites may span hosts.
* everything may span hosts.

Fred's suggestion raises the bar, because to implement it we'd need a set of options to juggle the different download depths depending on whether you're referring to the starting host or to the other hosts. The -i switch provides for a file listing the URLs to be downloaded. Please provide for a list file for URLs to be avoided when -H is enabled. URLs to be avoided? Given that a URL can be named in more than one way, this might be hard to do. Sorry, but does --reject-host (or similar, I don't have the docs here ATM) not exactly do this? The existing rejection switches reject on the basis of host name, and on the basis of file name. There is no switch to disallow downloading a specific URL.
Re: IPv6
Thomas Lussnig [EMAIL PROTECTED] writes: how the socket part should work fine. inet_pton and gethostbyname2 only get used if IPV6 is defined Please don't use gethostbyname2. It's apparently a GNU extension, and I don't think it will work anywhere except on Linux. Now it leaves Makefile,evtl I don't know what this is.
Re: WGET+IPv6
Thomas Lussnig [EMAIL PROTECTED] writes:

1. without IPv6 there are no longer used new syscalls (gethostbyname2, inet_ntop, inet_pton)
2. It can at runtime downgrade to IPv4
3. In IPv6 mode it can handle IPv4 addresses
4. Checked with following input: www.ix.de, 217.110.115.160, www.ipv6.euronet.be
   + Address is shown as expected
   + Connection works clean :-)
   + checked also with IPv4 only, where www.ipv6.euronet.be doesn't work (as expected, because it's IPv6 only)

Cool, good work. There are still things to work on, though:

* Autoconf support. Since I don't want to support broken IPv6 implementations, we don't need to get fancy here: checking for several IPv6-specific calls and defining IPv6 only if all of them are there should suffice. There should also be a flag to turn off IPv6 entirely at compile-time.

* FTP. You said you'd look for help there, but I'd at least like to make sure that IPv4 sites work with FTP, even in IPv6 mode. In fact, I dislike the idea of a mode, and I think it should be used only to downgrade to IPv4 and for debug purposes.

* You haven't answered my question about scopes. Is it really safe to just store the IP address and then use it later, without also storing the address scope? Please take a careful look at this.

* Style. Please take a look at Wget's coding standards (described in the PATCHES file), and the accompanying GNU coding standards. Please don't use C++ comments. Please use lower-case variable names (`hostent' or `hptr' instead of HOSTENT).
Re: A strange bit of HTML
Ian Abbott [EMAIL PROTECTED] writes: I came across this extract from a table on a website:

    <td ALIGN=CENTER VALIGN=CENTER WIDTH=120 HEIGHT=120><a href="66B27885.htm" "msover1('Pic1','thumbnails/MO66B27885.jpg');" onMouseOut="msout1('Pic1','thumbnails/66B27885.jpg');"><img SRC="thumbnails/66B27885.jpg" NAME=Pic1 BORDER=0></a></td>

Note the string beginning "msover1(", which seems to be an attribute value without a name, so that makes it illegal HTML. I think it's even worse than that. My limited knowledge of SGML taught me that `<foo bar>' is equivalent to `<foo bar=bar>', which means that given `<foo bar>', bar is the attribute *name*, not value. If I understand SGML correctly, attribute names cannot be quoted. This makes `<foo "bar">' illegal even though `<foo bar=10>' or `<foo bar>' are perfectly valid. I haven't traced what Wget is actually doing when it encounters this, but it doesn't treat 66B27885.htm as a URL to be downloaded. I can't call this a bug, but is Wget doing the right thing by ignoring the href altogether? According to Wget's notion of HTML, the A tag in question is simply not a well-formed tag. This means that Wget's parser will back out to the character `a' (the second char of `<a href=...') and continue parsing from there. Generally, when faced with a syntax error, it is extremely hard to just ignore it and extract a useful result from garbage. In some cases it's possible; in most, it's just too much work. Loosely, html-parse.c will recognize the following things as tags. (S stands for a strictly matched string -- only letters, numbers, hyphen and underscore allowed; L stands for a loosely matched string, i.e. everything except whitespace and separators, such as quote, `>', etc.)

    <S S1=L1 S2=L2 ...>      -- normal tag with attributes
    <S S1="L1" S2="L2" ...>  -- like the above, but quotation allows more leeway on values
    <S S1>                   -- the same as <S S1=S1>

Given the amount of broken HTML on the web, it's easy to imagine this parser being confused about what's what. That is why the attribute names are matched strictly.
Now, it would be fairly easy to change the parser to match the attribute names loosely like it does for values, but to parse the above piece of broken HTML, it would have to be extended to handle:

    <S L1>       (and, I assume)       <S L1=L2>

I wonder if that's worth it. On the one hand, it might be helpful to someone (e.g. you). On the other hand, there will always be one more piece of illegal HTML that Wget *could* handle if tweaked hard enough.
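The strict (S) versus loose (L) character classes described above can be expressed as two tiny predicates. These are illustrative helpers only, not the ones in html-parse.c:

```c
#include <ctype.h>

/* Strict (S): only letters, digits, hyphen and underscore -- the
   class Wget accepts for tag and attribute names. */
static int
name_char_p (int c)
{
  return isalnum (c) || c == '-' || c == '_';
}

/* Loose (L): anything except whitespace and separators such as
   quotes and '>' -- the class accepted for unquoted values. */
static int
value_char_p (int c)
{
  return !isspace (c) && c != '"' && c != '\'' && c != '>';
}
```

Matching names with name_char_p() is what makes `"msover1(...)"' fail as an attribute name: the quote character is neither a letter, a digit, a hyphen, nor an underscore.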
Re: A strange bit of HTML
[EMAIL PROTECTED] writes: That sounds like they wanted onMouseOver=msover1(...) Which Wget would, by the way, have handled perfectly.
Re: How does -P work?
Ian Abbott [EMAIL PROTECTED] writes: Here is a patch to deal with the -P C:\temp (and similar) problems on Windows. This looks good. I'll apply it as soon as CVS becomes operational again.
Re: WGET+IPv6
Daniel Stenberg [EMAIL PROTECTED] writes: On Wed, 16 Jan 2002, Hrvoje Niksic wrote: The so-called scope in IPv6 is embedded in the address, so you can't use IPv6 addresses without getting the scope too. Are you sure? Here is what itojun said in [EMAIL PROTECTED]: due to the IPv6 address architecture (scoped), 16 bytes does not identify a node. we need 4 byte more information to identify the outgoing scope zone. Am I misunderstanding him that Wget needs to keep 20 bytes of information to successfully connect? Well, if itojun said it then that's so. I have the greatest respect for his IPv6 abilities. That may be so, but I'm not entirely convinced that it is really important that Wget care about scopes in this context. I'm still discussing it with Thomas. For example, the gethostbyname2 and getipnodebyname calls don't return the scope at all. Does it mean that all applications that use them are broken? Somehow I doubt it. (And yes, I know that both seem to be obsolete, but I still find it strange that such a feature of such importance would be missing.) If someone here understands IPv6 enough to explain this, I'd be grateful to hear the clarification.
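For what it's worth, the "16 + 4 bytes" itojun describes maps directly onto struct sockaddr_in6 from the BSD sockets API: the 16-byte address proper plus a 32-bit scope zone identifier (sin6_scope_id). The helpers below just show where those bytes live; whether Wget must carry the scope around is the open question in this thread:

```c
#include <stddef.h>
#include <netinet/in.h>

/* The IPv6 address itself: 16 bytes. */
static size_t
ipv6_addr_bytes (void)
{
  return sizeof (struct in6_addr);
}

/* The scope zone id that accompanies it in sockaddr_in6: 4 bytes,
   giving itojun's 20 bytes total. */
static size_t
ipv6_scope_bytes (void)
{
  struct sockaddr_in6 sa;
  return sizeof sa.sin6_scope_id;
}
```

This is also why "just keep the addrinfo" works transparently: the sockaddr_in6 inside it already carries the scope, while extracting only the 16 address bytes discards it.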
Re: A strange bit of HTML
[EMAIL PROTECTED] writes: Until there's an ESP package that can guess what the author intended, I doubt wget has any choice but to ignore the defective tag. Seriously, I think you guys are too strict. Similar discussions have spawned numerous times. If the HTML code says

    <a href="URL" yaddayada my-Mother=Shopping%5 goingsupermarket>...</a>

why can't wget just ignore everything after ...URL? Because, as he said, Wget can parse text, not read minds. For example, you must know where a tag ends to be able to look for the next one, or to find comments. It is not enough to look for `>' to determine the tag's ending -- something like `<img alt="my > dog" src="foo">' is a perfectly legal tag. In other words, you have to destructure the tag, not only to retrieve the URLs, but to be able to continue parsing. If the tag is not syntactically valid, the parsing fails, on to other tags. Wget has never been able to pick apart every piece of broken HTML. As for us being strict, I can only respond with a mini-rant... Wget doesn't create web standards, but it tries to support them. Spanning the chasm between the standards as written and the actual crap generated by HTML generators feels a lot like shoveling shit. Some amount of shoveling is necessary and is performed by all small programs to protect their users, but there has to be a point where you draw the line. There is only so much shit Wget can shovel. I'm not saying Ian's example is where the line has to be drawn. (Your example is equivalent to Ian's -- Wget would only choke on the last (`goingsupermarket') part.) But I'm sure that the line exists and that it is not far from those two examples.
Re: wget does not parse .netrc properly
Alexey Aphanasyev [EMAIL PROTECTED] writes: I'm using wget compiled from the latest CVS sources (GNU Wget 1.8.1+cvs). I use it to mirror several ftp sites. I keep ftp accounts in .netrc file which looks like this: [...] Ah, I see. The macro definition (`macdef init') would fail to be terminated at empty lines. Thanks for the report. This patch should fix the problems; please let me know if it works for you.

2002-01-17  Hrvoje Niksic  [EMAIL PROTECTED]

	* netrc.c (parse_netrc): Skip leading whitespace before testing
	whether the line is empty.  Empty lines still contain the line
	terminator.

Index: src/netrc.c
===================================================================
RCS file: /pack/anoncvs/wget/src/netrc.c,v
retrieving revision 1.10
diff -u -r1.10 netrc.c
--- src/netrc.c	2001/11/30 09:33:22	1.10
+++ src/netrc.c	2002/01/17 00:56:12
@@ -280,6 +280,10 @@
       p = line;
       quote = 0;
 
+      /* Skip leading whitespace. */
+      while (*p && ISSPACE (*p))
+        ++p;
+
       /* If the line is empty, then end any macro definition. */
       if (last_token == tok_macdef && !*p)
 	/* End of macro if the line is empty. */
Re: How does -P work?
Hrvoje Niksic [EMAIL PROTECTED] writes: Ian Abbott [EMAIL PROTECTED] writes: Here is a patch to deal with the -P C:\temp (and similar) problems on Windows. This looks good. I'll apply it as soon as CVS becomes operational again. Applied now.
Re: wget-1.8
Tay Ngak San [EMAIL PROTECTED] writes: I have downloaded your source code for wget and tried to make it but failed due to va_list parameter conflict in stdarg.h and stdio.h. Please advice. What OS and compiler are you using to compile Wget?
Re: wget does not parse .netrc properly
Alexey Aphanasyev [EMAIL PROTECTED] writes: It works for me. I wish the patch included in the next release. Thanks for the confirmation. The patch is already in CVS.
Re: Bug report: 1) Small error 2) Improvement to Manual
Herold Heiko [EMAIL PROTECTED] writes: My personal idea is: As a matter of fact no *windows* text editor I know of, even the supplied windows ones (notepad, wordpad), AFAIK will add the ^Z at the end of file.txt. Wget is a *windows* program (although running in console mode), not a *Dos* program (except for the real dos port I know exists but never tried out). So personally I'd say it would not be really necessary adding support for the ^Z, even in the win32 port; That was my line of thinking too.
Re: Mapping URLs to filenames
Ian Abbott [EMAIL PROTECTED] writes: Most (all?) of the escape sequences within URLs should be decoded before transforming to an external file-name. All, I'd say. Even now u->file and u->dir are not URL-encoded. They get reencoded later, by url_filename. The point between the two is my internal or ideal file-name, but the two steps can be combined. I see what you mean. I guess u->file and u->dir constitute internal file names, and whatever url_filename() returns is external? I guess we'll have to check back in the mail archives to find out the original argument for the %-@ patch. The ChangeLog entry is dated 1997-01-18. I'll check my personal archives. At that time the patches list was not established, and many people were sending patches to me directly.
Re: Passwords and cookies
Ian Abbott [EMAIL PROTECTED] writes:

-	   asctime (localtime ((time_t *)&cookie->expiry_time)),
+	   (cookie->expiry_time != ~0UL
+	    ? asctime (localtime ((time_t *)&cookie->expiry_time))
+	    : "<UNKNOWN>"),
 	   cookie->attr, cookie->value));
 }

Yes, except for any other values of cookie->expiry_time that would cause localtime() to return a NULL pointer Good point. (in the case of Windows, anything before 1970). Perhaps the return value of localtime() should be checked before passing it to asctime() as in the modified version of your patch I have attached below. Yes, that's the way to go. Except I'll probably add a bit more complexity (sigh) to print something other than "<UNKNOWN>" when the expiry time is ~0UL. I'm also a little worried about the (time_t *)&cookie->expiry_time cast, as cookie->expiry_time is of type unsigned long. Is a time_t guaranteed to be the same size as an unsigned long? It's not, but I have a hard time imagining an architecture where time_t will be *larger* than unsigned long.
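The defensive pattern agreed on above -- never hand a NULL from localtime() to asctime() -- boils down to a guard like this. The helper name is hypothetical, not Wget's actual code:

```c
#include <time.h>

/* Sketch of the check discussed above: format a time value, falling
   back to a placeholder when localtime() cannot represent it (e.g.
   pre-1970 values on Windows).  `format_time' is a hypothetical
   helper.  Note asctime() returns a pointer to a static buffer. */
static const char *
format_time (time_t t)
{
  struct tm *tm = localtime (&t);
  if (!tm)
    return "<UNKNOWN>";   /* localtime() couldn't represent t */
  return asctime (tm);
}
```

Passing the NULL straight through would make asctime() dereference it, which is the crash the patch review is guarding against.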
Re: [PATCH] Failed linking on SCO Openserver
Jeff Bailey [EMAIL PROTECTED] writes: wget 1.8 fails to link on i686-pc-sco3.2v5.0.6 Does the compiler on that machine really not have alloca()? I'm usually wary of attempts to compile `alloca.c' because they usually point out a mistake in the configuration process.
Re: WGet 1.8.1
Lauri Mägi [EMAIL PROTECTED] writes: I'm using WGet 1.8.1 for downloading files over the FTP protocol. When a filename contains spaces the url is like ftp://server.name/file%20name and it saves files also with %20 in file names. Prior to this I was using WGet 1.7 and it saved spaces as they should be. My OS is RedHat 7.2. I tried the w32 version also, but it puts %20 into filenames. Is it a bug or just a feature? It's supposed to be a feature, but many users dislike that particular feature. Which means it is likely to go away in the next release. (Some dangerous characters will still be encoded to %hh, but space is likely not to be one of them.)
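Turning `%20' back into a space is plain %hh-escape decoding. A minimal in-place sketch (illustrative only -- Wget's own decoder additionally decides which characters are too dangerous to leave unencoded, as noted above):

```c
/* Return the value of a hex digit, or -1 if c is not one. */
static int
hexval (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decode %hh escapes in place: "file%20name" becomes "file name".
   Malformed escapes (e.g. "%zz") are left untouched. */
static void
url_unescape (char *s)
{
  char *w = s;
  for (; *s; s++)
    {
      int hi, lo;
      if (*s == '%'
          && (hi = hexval (s[1])) >= 0
          && (lo = hexval (s[2])) >= 0)
        {
          *w++ = (char) (hi * 16 + lo);
          s += 2;               /* skip the two hex digits */
        }
      else
        *w++ = *s;
    }
  *w = '\0';
}
```

Decoding always shortens or preserves the string, which is why the in-place write pointer `w' can never overrun the read pointer `s'.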
Re: WGet is a very useful program. Writing a program to make the documentation easy
Michael Jennings [EMAIL PROTECTED] writes: The issue centers on the documentation. Philosophically, in my opinion, a program should be written so the documentation is easy to read. In this case a hidden stripping of useless characters means that there is one less thing to explain in the manual. No, it's one *more* thing to explain in the manual. The only characters universally agreed to be useless in the context of parsing are the whitespace characters. *Everything* else is subject to serious considerations. For example, control characters for you might be UTF8-encoded characters for someone else. Not stripping them away without a very good reason to do so is for me a simple matter of correctness. The GNU coding standards seem to suggest the same. (...) Or go for generality. For example, Unix programs often have static tables or fixed-size strings, which make for arbitrary limits; use dynamic allocation instead. Make sure your program handles NULs and other funny characters in the input files. Add a programming language for extensibility and write part of the program in that language. and: Utilities reading files should not drop NUL characters, or any other nonprinting characters _including those with codes above 0177_. The only sensible exceptions would be utilities specifically intended for interface to certain types of terminals or printers that can't handle those characters. Whenever possible, try to make programs work properly with sequences of bytes that represent multibyte characters, using encodings such as UTF-8 and others. There is precedent for this. Microsoft Windows is in some places written to get around shortcomings in the processors on which it runs. Such accommodation puts quirkiness in the code, but it gets the job done. In many cases Wget tries to accommodate to its environment to ensure smoother operation. But with each such accommodation we are forced to weigh the added quirkiness (entropy) of the code against the benefit.
In this case, implementing correct support for ^Z is not exactly trivial, and the benefit is minimal -- the ^Z characters don't even appear in files normally created on platforms supported by Wget, which are Unix and Windows. You are trying to convince us otherwise by offering an easier implementation of ^Z, thereby reducing the costs. But unfortunately this easier implementation reduces correctness of the code, and is therefore not an option. Sorry.
Re: HELP: getaddrinfo
Thomas Lussnig [EMAIL PROTECTED] writes: i'm building an IPv6 patch for wget. And i'm worried about the point that i have to add 12 in the sockaddr. Perhaps it would help if you created a minimal test case for the problem you're witnessing. For example:

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netdb.h>

int
main (int argc, char **argv)
{
  char *host = argv[1];
  struct addrinfo *ai_head, *ai;
  int err = getaddrinfo (host, NULL, NULL, &ai_head);
  if (err != 0)
    return err;
  for (ai = ai_head; ai; ai = ai->ai_next)
    {
      char buf[128];
      struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *) ai->ai_addr;
      if (!inet_ntop (AF_INET6, &sa6->sin6_addr, buf, sizeof (buf)))
	{
	  perror ("inet_ntop");
	  continue;
	}
      puts (buf);
    }
  return 0;
}

Compile this. For me it seems to work correctly:

$ ./a.out www.ipv6.euronet.be
3ffe:8100:200:2::2
3ffe:8100:200:2::2
3ffe:8100:200:2::2
::57.0.0.0
::57.0.0.0
::9.11.0.0

The first three IP addresses seem to be correct (I remember them from your logs). So adding 12 does not appear to be necessary on my system. Does the same test work for you?
Re: -A -R Problems With List
Jan Hnila [EMAIL PROTECTED] writes: Hello, please try this (it should work): wget -r -l2 -A=htm,html,phtml http://www.tunedport.com (the change is the equals sign. The same for -R. If you take a look at the output of wget --help, you may notice the equality signs there (in the longer form: --accept=LIST ), but it really is easy to find it out.) This advice is simply wrong; have you actually tried it? The option syntax is one of:

    wget -x value
    wget -xvalue
    wget --xxx value
    wget --xxx=value

where x and xxx are short and long option names. I haven't yet had time to investigate Samuel's problem, but it's almost certainly not alleviated by prepending = to a one-letter option.
Re: Possible bugs when making https requests
Sacha Mallais [EMAIL PROTECTED] writes: Unable to establish SSL connection. -- Also note that it does _not_ appear to be retrying the connection. I have explicitly set --tries=5, and with a non-ssl connection, the above stuff appears 5 times when it cannot connect. But, for SSL stuff, one failure kills the process. I cannot say why the connection fails, but I can explain why it's not retried. It's because Wget (perhaps improperly) considers the failure non-transient. Such permanent failures cause Wget to give up on the URL. Other permanent failures include failure to perform a DNS lookup, connection refused on all known interfaces, etc. For example:

$ wget --tries=100 http://www.xemacs.org:1212
--00:11:08--  http://www.xemacs.org:1212/
           => `index.html'
Resolving www.xemacs.org... done.
Connecting to www.xemacs.org[207.96.122.9]:1212... failed: Connection refused.
$
Re: problems with char sequence %26
wget Admin [EMAIL PROTECTED] writes: I am using wget version 1.5.3 under Solaris and 1.5.2 under IRIX. Please upgrade. This problem is fixed in Wget 1.8.1. Do you have any ideas to solve the problem? (Possibly without having to recompile wget since I am not sysadmin.) You do not have to be a sysadmin to recompile Wget. Just install it into your home directory.
Re: Wget 1.6
Way, Trevor [EMAIL PROTECTED] writes: Using the -T, -t and -w parameters but cannot get it to timeout in less than 3 minutes. /usr/bin/wget --output-document=/tmp/performance.html -T5 --wait=2 --waitretry=2 --tries=2 Should this timeout after 5 secs, retry twice, waiting 2 secs between retries? BUT it always waits 3 minutes. Note that -T only sets the read timeout, not the connect timeout.
Re: PROXY + wget ftp://my.com/pub/my*.tar
Thanos Siaperas [EMAIL PROTECTED] writes: Shouldn't wget first get the .listing, find the files needed by the wildcard, and then request the files from the proxy? This looks like a bug. No, when using a proxy, you get HTTP behavior. So to do that, you have to do it the HTTP way: wget -rl1 ftp://my.com/pub/ -A 'my*.tar'
Re: stdout
Jens Röder [EMAIL PROTECTED] writes: for wget I would suggest a switch that allows sending the output directly to stdout. It would be easier to use it in pipes. Are you talking about the log output or the text of the documents Wget downloads? * Log output goes to stderr by default, and can be redirected by one of: wget ... 2>&1 wget -o /dev/stdout ... # for systems with /dev/stdout wget -o /dev/fd/1 ... # for systems with /dev/fd/1 * The contents of a downloaded document can be redirected to stdout using `-O -'. Since the log output is printed to stderr, it won't be mixed up with the download output. For instance, this works as you'd expect: wget -O - http://hrvoje.willfork.com/tst/Recurdos_De_La_Alhambra.mp3 | mpg123 -
Re: Noise ratio getting a bit high?
Andre Majorel [EMAIL PROTECTED] writes: I respectfully disagree. If we can spend the time to read and answer the poster's question, the poster can spend five minutes to subscribe/unsubscribe. For reference, see the netiquette item on posting to newsgroups and asking for replies by email. I am aware of newsgroup etiquette, but I consider a newsgroup to be different from a mailing list devoted to helping users. Besides, subscribing to and unsubscribing from an unknown mailing list are much more annoying processes than they are for newsgroups. I suppose we can only agree to disagree on this one. I am aware that in this matter, as well as in the infamous `Reply-To' debate, this list lies in the minority. But that is not a sufficient reason to back down and let the spammers win. Right now, [EMAIL PROTECTED] is providing free relaying for spammers to all its subscribers. So does any mailing list with open subscription. I find your choice of wording strange, sort of like saying that `sendmail' provides free transmission of spam. That may be so, but that was not its intention, and the fact that it's misused is no reason to cripple its intended use. If you have a spam-fighting suggestion that does *not* include disallowing non-subscriber postings, I am more than willing to listen. Mmm... What would you think of having the list software automatically add a special header (say X-Non-Subscriber) to every mail sent by a non-subscriber? I see what you're getting at, and I would have absolutely no objections to that.
Re: windows binary
Brent Morgan [EMAIL PROTECTED] writes: What's CVS and what is the significance of this version? CVS stands for Concurrent Versions System, and is the version control system where the master sources for Wget are kept. I would not advise the download of the CVS version because it is likely to be incomplete or unstable. It would be nice if the 1.8.1+cvs binary could be moved to a less visible location, or on a separate page dedicated to development. Or accompanied by an explanation, etc.
Re: bug when processing META tag.
An, Young Hun [EMAIL PROTECTED] writes: if HTML document contains code like this <meta http-equiv="Refresh"> wget may be crashed. It has 'refresh' but does not have 'content'. Of course this is incorrect HTML. But I found some pages on the web :) simply add check routine at 'tag_handle_meta' function. Thanks for the report; this patch should fix the bug: 2002-02-01 Hrvoje Niksic [EMAIL PROTECTED] * html-url.c (tag_handle_meta): Don't crash on <meta http-equiv="refresh"> where content is missing. Index: src/html-url.c === RCS file: /pack/anoncvs/wget/src/html-url.c,v retrieving revision 1.23 diff -u -r1.23 html-url.c --- src/html-url.c 2001/12/19 01:15:34 1.23 +++ src/html-url.c 2002/02/01 03:32:55 @@ -521,10 +521,13 @@ get to the URL. */ struct urlpos *entry; - int attrind; - char *p, *refresh = find_attr (tag, "content", &attrind); int timeout = 0; + char *p; + + char *refresh = find_attr (tag, "content", &attrind); + if (!refresh) + return; for (p = refresh; ISDIGIT (*p); p++) timeout = 10 * timeout + *p - '0';
Re: wget: malloc: Not enough memory.
Michael Dodwell [EMAIL PROTECTED] writes: Just noticed that wget 1.7 errors with the subject line if you pass it a protocol, port and username but not a password. Please upgrade to Wget 1.8.1. I believe this problem has gone away.
Re: KB or kB
Ian Abbott [EMAIL PROTECTED] writes: I'd suggest either leaving them alone or adopting the IEC standards that Henrik referred to, i.e. KiB = kibibyte = 2^10 bytes Ugh! Never! Let them keep their kibibytes to themselves. :-)
Re: wildcard(?) use inf wget 1.8
cagri coltekin [EMAIL PROTECTED] writes: Apologies if this is a known issue. However, it seems that as of wget 1.8, the `?' char is treated as a separator in URLs. But this feature breaks the ftp downloads using wild-card `?'. It would be nice to disable this in url_parse() if the url is not an http* url. Thanks for the report. The problem you described should be fixed by this patch: 2002-02-19 Hrvoje Niksic [EMAIL PROTECTED] * url.c (url_parse): Don't treat '?' as query string separator when parsing FTP URLs. Index: src/url.c === RCS file: /pack/anoncvs/wget/src/url.c,v retrieving revision 1.71 diff -u -r1.71 url.c --- src/url.c 2002/01/26 20:43:17 1.71 +++ src/url.c 2002/02/19 05:07:24 @@ -802,6 +802,15 @@ query_b = p; p = strpbrk_or_eos (p, "#"); query_e = p; + + /* Hack that allows users to use '?' (a wildcard character) in +FTP URLs without it being interpreted as a query string +delimiter. */ + if (scheme == SCHEME_FTP) + { + query_b = query_e = NULL; + path_e = p; + } } if (*p == '#') {
Re: bug
Peteris Krumins [EMAIL PROTECTED] writes: GNU Wget 1.8 get: progress.c:673: create_image: Assertion `p - bp->buffer <= bp->width' failed. This problem has been fixed in Wget 1.8.1. Please upgrade.
Re: FTP passwords?
John A Ogren [EMAIL PROTECTED] writes: I'd like to use 'wget' to mirror a remote ftp directory, but it requires a username and password to access the server. I don't see any mention of command-line options for supplying this information for an FTP server, only for an HTTP server. Is this a bug, or a feature, or am I just missing something obvious? There is a `.wgetrc' command for setting the password, which you can use on the command line with `-e' (`--execute'). For example: wget -e 'login=foo' -e 'passwd=bar' ftp://server/dir/file The same can be achieved with: wget ftp://foo:bar@server/dir/file Or you can store the username/password in `.netrc'.
Re: -N option gives proxy error
It's a known problem. Timestamping doesn't work with FTP URLs over proxy because the HEAD request is not honored by the proxy for FTP. Note that your Wget is very old and you should upgrade -- though not because of this particular problem, which remains in current versions.
Re: ftp download with -p
Currently this is a known problem. Wget doesn't span hosts or schemes with -p, although it probably should.
Re: Using wildcards through a proxy server
It's a known issue. Wget's wildcard magic only works when using the FTP protocol. HTTP is used for communication with proxies, so wildcarding doesn't work. But you should be able to simulate it using: wget -nd -rl1 -A 'foo*bar' ftp://server/dir/ It's not elegant, but it works for me.
Re: End of IPv6 Scope_Id discussion
Again, thanks for taking the time to research this. Next time someone asks this question, we'll forward him this email.
Re: wget core dump with recursive file transfer
Thanks for the report, Paul. This patch, which I'm about to apply to CVS, should fix it. 2002-02-19 Hrvoje Niksic [EMAIL PROTECTED] * recur.c (retrieve_tree): Handle the case when start_url doesn't parse. Index: src/recur.c === RCS file: /pack/anoncvs/wget/src/recur.c,v retrieving revision 1.42 diff -u -r1.42 recur.c --- src/recur.c 2002/02/19 05:23:35 1.42 +++ src/recur.c 2002/02/19 06:07:39 @@ -186,15 +186,24 @@ uerr_t status = RETROK; /* The queue of URLs we need to load. */ - struct url_queue *queue = url_queue_new (); + struct url_queue *queue; /* The URLs we do not wish to enqueue, because they are already in the queue, but haven't been downloaded yet. */ - struct hash_table *blacklist = make_string_hash_table (0); + struct hash_table *blacklist; - /* We'll need various components of this, so better get it over with - now. */ - struct url *start_url_parsed = url_parse (start_url, NULL); + int up_error_code; + struct url *start_url_parsed = url_parse (start_url, &up_error_code); + + if (!start_url_parsed) +{ + logprintf (LOG_NOTQUIET, "%s: %s.\n", start_url, +url_error (up_error_code)); + return URLERROR; +} + + queue = url_queue_new (); + blacklist = make_string_hash_table (0); /* Enqueue the starting URL. Use start_url_parsed->url rather than just URL so we enqueue the canonical form of the URL. */
Re: wget core dump with recursive file transfer
Thanks for looking into this. I've written a slightly different fix before I saw the one from you. Your patch was *almost* correct -- one minor detail is that you don't take care to free QUEUE and BLACKLIST before exiting, therefore technically creating a (small) memory leak. My patch avoids the leak simply by making sure the return is placed before calls to url_queue_new() and make_string_hash_table().
Re: using -nd ?
Samuel Hargis [EMAIL PROTECTED] writes: I've read through the documentation and it says that (if a name shows up more than once, the filenames will get extensions '.n') Would that be like index.html duplicate would be named index.n.html or index.html.n? The latter. Also, how does it handle multiple duplicates, like say 5? It will create index.html, then index.html.1, then index.html.2, etc. wget -P ~MyUserDirectory/WincraftFolder -nd -r -l2 -p -np -t3 -T 30 -nv -A.asp,.cfm,.phtml,.shtml,.htm,.html www.wincraftusa.com There are 6 html files in this domain that kill each other out. I'm trying to get that data, all files with same names without them canceling each other out. I don't care if the names are modified when downloaded as long as I get all 6 files. Can someone please assist? Ouch. I think I understand what the problem is here. Wget deletes the index.html.n files because it thinks they're rogue HTML downloads. You can work around this bug by including '*.[0-9]' in your accept list. For instance: wget -P ~MyUserDirectory/WincraftFolder -nd -r -l2 -p -np -t3 -T 30 -nv -A.asp,.cfm,.phtml,.shtml,.htm,.html,'*[0-9]' www.wincraftusa.com Thanks for the report.
Re: wget timeout option useless
Jamie Zawinski [EMAIL PROTECTED] writes: Please also set an exit alarm around your calls to connect() based on the -T option. This is requested frequently. I'll include it in the next release. The reason why it's not already there is simply that I was lucky never to be bitten by that problem. For some reason, the systems I've worked on have always either connected or timed out in reasonable time.
Re: wget info page
Noel Koethe [EMAIL PROTECTED] writes: wget 1.8.1 is shipped with the files in doc/ wget.info wget.info-1 wget.info-2 wget.info-3 wget.info-4 Yes. As Ian said, this is so that people without `makeinfo' installed can still read the documentation. (In fact, Info pages can even be read without an Info reader.) I believe this is mandated by the GNU standards. `make distclean' doesn't remove those Info pages precisely so that they can be shipped with the release.
Re: wget info page
Noel Koethe [EMAIL PROTECTED] writes: OK. No problem for me. I just wrote this because the more interesting doc, the manpage, is not shipped with the source. I don't know how the man page is more interesting since it's a mere subset of the Info documentation. All the GNU programs are shipped with preformatted Info, and Wget is no exception there. The current man page is a compromise to appease the people who insist on having a Unix-style man page as well.
Re: RFC1806 Content-disposition patch (Take 2)
[ Adding the development list to Cc, to facilitate discussion. ] David F. Newman [EMAIL PROTECTED] writes: First of all, I think this new behaviour needs an option to enable it, rather than be on by default. The option could be called rfc1806, or rather, rfc2183 now, unless anyone can suggest a friendlier name such as --obey-content-disposition-filename! Even better, I would entirely avoid the rfc numbers in naming either command-line options *or* functions. Mentioning rfc-s in comments or in the manual is fine, of course. OK, I'll take these things into account. My concern with the valid characters was that someone doesn't specify an absolute path and pass something like /etc/passwd. So you think that stripping out the leading path is enough? That shouldn't be too tough. I'm not sure what you mean by stripping out the leading part, but what I suggest is to leave only the trailing part. So if someone specifies /etc/passwd, it's exactly the same as if he specified just passwd. And how about simply --honor-content-disposition Ian, why do you think this should not be allowed by default? A command-line option is easy to miss, and honoring this looks like a neat idea. Am I missing something?
Re: Problem with the way that wget handles %26 == '&' in URLs
Robert Lupton the Good [EMAIL PROTECTED] writes: This appears to be an over-enthusiastic interpretation of %26 == '&' in wget. I submit a URL (which is in fact a SQL query) with some embedded &s (logical ORs). These are encoded as %26, and the URL works just fine with netscape and lynx. It fails with wget. Note that wget rewrites Where+((A.status+%26+0x4)+=+0&format=csv as Where ((A.status & 0x4) = 0 which is a problem. wi:wget-1.8.1> src/wget --version GNU Wget 1.8.1 Odd. Earlier versions of Wget did this, but 1.8.1 shouldn't. For me that doesn't seem to happen: $ wget 'http://fly.srk.fer.hr/Where+((A.status+%26+0x4)+=+0&format=csv' -d DEBUG output created by Wget 1.8.1 on linux-gnu. --04:10:12-- http://fly.srk.fer.hr/Where+((A.status+%26+0x4)+=+0&format=csv => `Where+((A.status+&+0x4)+=+0&format=csv' Resolving fly.srk.fer.hr... done. Caching fly.srk.fer.hr => 161.53.70.130 Connecting to fly.srk.fer.hr[161.53.70.130]:80... connected. Created socket 3. Releasing 0x807b9e8 (new refcount 1). ---request begin--- GET /Where+((A.status+%26+0x4)+=+0&format=csv HTTP/1.0 ... Looks ok to me. 
In your example, I also don't quite see the problem; the URL specified on the command line is identical to the one rewritten by Wget: wi:wget-1.8.1> src/wget -O - 'http://skyserver.sdss.org/en/tools/search/x_sql.asp?cmd=+select+top+10+A.run,+A.camCol,+A.field,+str(A.rowc,7,2)+as+rowc,+str(A.colc,7,2)+as+colc,+str(dbo.fObjFromObjID(A.ObjId),+4)+as+id,+B.run,+B.camCol,+B.field,+str(B.rowc,7,2)+as+rowc,+str(B.colc,7,2)+as+colc,+str(dbo.fObjFromObjID(B.ObjId),+4)+as+id,+str(A.u,+5,3)+as+Au,+str(A.g,+5,3)+as+Ag,+str(A.r,+5,3)+as+Ar,+str(A.i,+5,3)+as+Ai,+str(A.u+-+B.u,+5,3)+as+du,+str(A.g+-+B.g,+5,3)+as+dg,+str(A.r+-+B.r,+5,3)+as+dr,+str(A.i+-+B.i,+5,3)+as+di+from+photoObj+as+A,+photoObj+as+B,+Neighbors+as+ObjN+Where+((A.status+%26+0x4)+=+0&format=csv' --16:24:15-- http://skyserver.sdss.org/en/tools/search/x_sql.asp?cmd=+select+top+10+A.run,+A.camCol,+A.field,+str(A.rowc,7,2)+as+rowc,+str(A.colc,7,2)+as+colc,+str(dbo.fObjFromObjID(A.ObjId),+4)+as+id,+B.run,+B.camCol,+B.field,+str(B.rowc,7,2)+as+rowc,+str(B.colc,7,2)+as+colc,+str(dbo.fObjFromObjID(B.ObjId),+4)+as+id,+str(A.u,+5,3)+as+Au,+str(A.g,+5,3)+as+Ag,+str(A.r,+5,3)+as+Ar,+str(A.i,+5,3)+as+Ai,+str(A.u+-+B.u,+5,3)+as+du,+str(A.g+-+B.g,+5,3)+as+dg,+str(A.r+-+B.r,+5,3)+as+dr,+str(A.i+-+B.i,+5,3)+as+di+from+photoObj+as+A,+photoObj+as+B,+Neighbors+as+ObjN+Where+((A.status+%26+0x4)+=+0&format=csv What am I missing?
Re: OK, time to moderate this list
Doug Kearns [EMAIL PROTECTED] writes: On Fri, Mar 22, 2002 at 04:08:36AM +0100, Hrvoje Niksic wrote: snip I think I agree with this. The amount of spam is staggering. I have no explanation as to why this happens on this list, and not on other lists which are *also* open to non-subscribers. I guess you are not subscribed to [EMAIL PROTECTED]? It is not just this list :-( Good to hear, for a certain deranged value of good. :-( However, I'm also subscribed to [EMAIL PROTECTED] and to [EMAIL PROTECTED] Both of them allow non-subscriber posts, and there is very little spam on either. Go figure.
Re: Can wget handle this scenario?
Tomislav Goles [EMAIL PROTECTED] writes: Now I need to add the twist where username account info resides on another machine (i.e. machine2 which by the way is on the same network as machine1) So I need to do something like the following: $ wget ftp://username:[EMAIL PROTECTED]@machine1.com/file.txt which is of course not the syntax wget understands. Use: $ wget ftp://username:[EMAIL PROTECTED]/file.txt However, this will not work on Wget 1.8.1 due to a bug in handling URL-encoded passwords. You can remedy that with this patch: --- wget-1.8.1.orig/src/url.c +++ wget-1.8.1/src/url.c @@ -528,6 +528,11 @@ memcpy (*user, str, len); (*user)[len] = '\0'; + if (*user) +decode_string (*user); + if (*passwd) +decode_string (*passwd); + return 1; } I plan to implement this behavior automagically when you set the FTP proxy to ftp://machine1.com/.
Re: wget reject lists
David McCabe [EMAIL PROTECTED] writes: I am having a hell of a time to get the reg-ex stuff to work with the -A or -R options. If I supply this option to my wget command: -R 1* Everything works as expected. Same with this: -R 2* Now, if I do this: -R 1*,2* I get all the files beginning with 1. if I do this: -R 2*,1* I get all the files beginning with 2. I've now tried to repeat this, but I am unable to. This will sound incredible, but based on some reports I got, I believe what you see is a result of a Gcc bug. Specifically, gcc 2.95.something can miscompile sepstring() in utils.c. Please try recompiling Wget with `cc' or without optimization and see if it works then.
Re: Debian bug 32353 - opens a new connection for each ftpdocument.
Guillaume Morin [EMAIL PROTECTED] writes: if I use 'wget ftp://site.com/file1.txt ftp://site.com/file2.txt', wget will not reuse the ftp connection, but will open one for each document downloaded from the same site... Yes, that's how Wget currently behaves. But that's not a bug, or at least not an obvious one -- the files do get downloaded. Handling this correctly is, I believe, on the TODO list, and should be classified as a wishlist item.
Re: debian bug 32712 - wget -m sets atimet to remote mtime.
Good point there. I wonder... is there a legitimate reason to require atime to be set to the mtime? If not, we could just make the change without the new option. In general I'm careful not to add new options unless they're really necessary.
Re: Debian bug 55145 - wget gets confused by redirects
Guillaume Morin [EMAIL PROTECTED] writes: If wget fetches a url which redirects to another host, wget retrieves the file, and there's nothing that can be done to turn that off. So, if you do wget -r on a machine that happens to have a redirect to www.yahoo.com you'll wind up trying to pull down a big chunk of yahoo. Hmm. Are you sure? Wget 1.8.1 is trying hard to restrict following redirections by applying the same rules normally used for following links. Downloading half of Yahoo! because someone redirects to www.yahoo.com is not intended to happen. I tried to reproduce it by creating a page that redirects to www.yahoo.com, but Wget behaved correctly: $ wget -r -l0 http://muc.arsdigita.com:2005/test.tcl --19:13:53-- http://muc.arsdigita.com:2005/test.tcl => `muc.arsdigita.com:2005/test.tcl' Resolving muc.arsdigita.com... done. Connecting to muc.arsdigita.com[212.84.246.68]:2005... connected. HTTP request sent, awaiting response... 302 Found Location: http://www.yahoo.com [following] --19:13:53-- http://www.yahoo.com/ => `www.yahoo.com/index.html' Resolving www.yahoo.com... done. Connecting to www.yahoo.com[64.58.76.223]:80... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/html] [ <=> ] 16,829 22.39K/s 19:13:55 (22.39 KB/s) - `www.yahoo.com/index.html' saved [16829] FINISHED --19:13:55-- Downloaded: 16,829 bytes in 1 files Guillaume, exactly how have you reproduced the problem?
Re: New suggestion.
Ivan Buttinoni [EMAIL PROTECTED] writes: Again I send a suggestion, this time quite easy. I hope it's not already implemented, else I'm sorry in advance. It would be nice if wget could use a regexp to evaluate what to accept/refuse to download. The regexp has to work on the whole URL and/or filename and/or hostname and/or CGI argument. Sometimes I find the apache directory sorting links useless, e.g.: .../?N=A .../?M=D Here follows a hypothetical invocation for the above example: wget -r -l0 --reg-exclude '[A-Z]=[AD]$' http:// The problem with regexps is that their use would make Wget dependent on a regexp library. To make matters worse, regexp libraries come in all shapes and sizes, with incompatible APIs and implementing incompatible dialects of regexps. I'm staying away from regexps as long as I possibly can.
Re: Debian bug 65791 - when converting links no effort is made tohandle the '?' character
Guillaume Morin [EMAIL PROTECTED] writes: For example if a link to the URL /foo?bar is seen then the correct file is downloaded and saved with the name foo?bar. When viewing the pages with Netscape the '?' character is seen to separate the URL and the arguments. This makes the link fail. That's a known problem. The easy fix of changing ? to %3f didn't work because some browsers still fail to load the file. This problem will be fixed in a future release by changing ? to another, different character.
Re: CSS @import, NetBSD 1.5.2 ok
Martin Tsachev [EMAIL PROTECTED] writes: it compiles on i386-unknown-netbsdelf1.5.2 without any modifications I think that wget isn't parsing the @import CSS declaration, it should save those files when run with -p and convert the links if set so That is true. Parsing @import would require an (easy) change to the HTML parser and a CSS parser. No one has stepped up to implement those yet.
Re: OK, time to moderate this list
Maciej W. Rozycki [EMAIL PROTECTED] writes: On Mon, 8 Apr 2002, Hrvoje Niksic wrote: I was also thinking about checking for `Wget' in the body, and things like that. That might be annoying (although it is certainly an option to consider anyway) as someone sending a mail legitimately may assume the matter being obvious from the list's name and definition and not repeat the program's name anywhere in the headers or the body (only a version number for example, or current when referring to a CVS snapshot). Just like I did here. ;-) Don't get me wrong, a message detected as spam wouldn't get discarded; it would simply need to be approved by an editor. So even the false positives would make it to the list, only a tad later.
Re: -nv option; printing out infos via stderr[http://bugs.debian.org/141323]
Ian Abbott [EMAIL PROTECTED] writes: On 5 Apr 2002 at 18:17, Noel Koethe wrote: Will this be changed so the user could use -nv with /dev/null and get only errors or warnings displayed? So what I think you want is for any log message tagged as LOG_VERBOSE (verbose information) or LOG_NONVERBOSE (basic information) in the source to go to stdout when no log file has been specified and the `-O -' option has not been used and for everything else to go to stderr? That change sounds dangerous. Current Wget output doesn't really have a concept of errors that would be really separate from other output; it only operates on the level of verbosity. This was, of course, a bad design decision, and I agree that steps need to be taken to change it. I'm just not sure that this is the right step. For one, I don't know of any utility that splits its output this way. It is true that many programs print their output on stdout and errors to stderr, but Wget's log output is hardly the actual, programmatic, output of the program. That can only be the result of `-O -'. Suddenly `wget -o X' is no longer equivalent to `wget 2>X', which violates the Principle of Least Surprise.
Re: Satellite [NetGain 2000] [Corruption]
Justin Piszcz [EMAIL PROTECTED] writes: --12:12:21-- ftp://war:*password*@0.0.0.0:21//iso/file.iso = `iso/file.iso' == CWD not required. == PASV ... done.== RETR file.iso ... done. Length: 737,402,880 24% [] 180,231,952 37.40K/s ETA 4:02:27 13:31:51 (37.40 KB/s) - Data connection: Connection timed out; Data transfer aborted. Retrying. This causes corruption in the file. I need to try a client which supports rollback I guess. It seems so. But I really find it strange that there would be an FTP server that fails in this scenario. I've only heard of such proxy failures, and that's not applicable to your case.
Re: Malformed status line error
Torsten Fellhauer -iXpoint- #429 [EMAIL PROTECTED] writes: when connecting to a FTP-Server using a TrendMicro Viruswall Proxy, we get the error Malformed status line, Unfortunately, Wget is right; that status line is quite different from what HTTP mandates. The status line should be something like: HTTP/1.0 200 Ok Or, more generally: HTTP/1.x <status> <message> Instead, the TrendMicro Viruswall Proxy returns: 220 InterScan Version 3.6-Build_1166 $Date: 04/24/2001 22:13:0052$ (mucint01, dynamic, get: N, put: N): Ready That is so far from HTTP that even if Wget's parser were lenient it still wouldn't make sense out of it. Is 220 an HTTP status code? If so, which one? What version of HTTP is the proxy speaking? Someone should write to the makers of TrendMicro Viruswall Proxy and ask them to fix this bug.
Re: GNU Wget 1.5.3
Matthias Jim Knopf [EMAIL PROTECTED] writes: there is a bug (or a feature...) in the version 1.5.3 Note that the latest version of Wget is 1.8.1. I suggest you upgrade because the new version handles URLs much better. I discovered that every doubled slash (//) is converted to a single slash (/) which might make sense for real file-paths, but which does not allow me to retrieve an url like This bug still exists in the latest version, but I plan to fix it before the next release. http://my.server/some/file?get_url=http://foo.bar/ ^^ will be converted to http:/foo.bar which is not a valid url this also does not work if the value for 'get_url' is url-encoded as it should be. 1.8.1 handles this correctly when quoted. For example: $ wget -d 'http://fly.srk.fer.hr/?foo=http%2f%2fbar/baz' DEBUG output created by Wget 1.8.1 on linux-gnu. [...] ---request begin--- GET /?foo=http%2f%2fbar/baz HTTP/1.0
Current download speed in progress bar
Since I implemented the progress bar, I've progressively become more and more annoyed by the fact that the download speed it reports is the average download speed. What I'm usually much more interested in is the current download speed. This patch implements this change; the current download speed is calculated as the speed of the most recent 30 network reads. I think this makes sense -- for very slow downloads, you'll get the average spanning several seconds; for the fast ones, you'll get the average in this fraction of a second. This is what I want -- I think. The one remaining problem is the ETA. Based on the current speed, it changes value wildly. Of course, over time it is generally decreasing, but one can hardly follow it. I removed the flashing by making sure that it's not shown more than once per second, but this didn't fix the problem of unreliable values. Should we revert to the average speed for ETA, or is there a smarter way to handle it? What are other downloaders doing? 2002-04-09 Hrvoje Niksic [EMAIL PROTECTED] * progress.c (bar_update): Maintain an array of the time it took to perform previous 30 network reads. (create_image): Calculate the download speed and ETA based on the last 30 reads, not the entire download. (create_image): Make sure that the ETA is not changed more than once per second. Index: src/progress.c === RCS file: /pack/anoncvs/wget/src/progress.c,v retrieving revision 1.23 diff -u -r1.23 progress.c --- src/progress.c 2001/12/10 05:31:45 1.23 +++ src/progress.c 2002/04/09 18:49:45 @@ -401,6 +401,9 @@ create_image will overflow the buffer. */ #define MINIMUM_SCREEN_WIDTH 45 +/* Number of recent packets we keep the stats for. */ +#define RECENT_ARRAY_SIZE 30 + static int screen_width = DEFAULT_SCREEN_WIDTH; struct bar_progress { @@ -410,7 +413,7 @@ download finishes */ long count; /* bytes downloaded so far */ - long last_update; /* time of the last screen update. */ + long last_screen_update; /* time of the last screen update. 
*/ int width; /* screen width we're using at the time the progress gauge was @@ -420,7 +423,27 @@ signal. */ char *buffer;/* buffer where the bar image is stored. */ - int tick; + int tick;/* counter used for drawing the + progress bar where the total size + is not known. */ + + /* The following variables (kept in a struct for namespace reasons) + keep track of how long it took to read recent packets. See + bar_update() for explanation. */ + struct { +long previous_time; +long times[RECENT_ARRAY_SIZE]; +long bytes[RECENT_ARRAY_SIZE]; +int count; +long summed_times; +long summed_bytes; + } recent; + + /* create_image() uses these to make sure that ETA information + doesn't flash. */ + long last_eta_time; /* time of the last update to download + speed and ETA. */ + long last_eta_value; }; static void create_image PARAMS ((struct bar_progress *, long)); @@ -453,7 +476,8 @@ bar_update (void *progress, long howmuch, long dltime) { struct bar_progress *bp = progress; - int force_update = 0; + int force_screen_update = 0; + int rec_index; bp->count += howmuch; if (bp->total_length > 0 @@ -465,21 +489,75 @@ equal to the expected size doesn't abort. */ bp->total_length = bp->count + bp->initial_length; + /* The progress bar is supposed to display the current download + speed. The first version of the progress bar calculated it by + dividing the total amount of data with the total time needed to + download it. The problem with this was that a stalled or suspended + download could unduly influence the current time. Taking just + the time needed to download the current packet would not work + either because packets arrive too fast and the variations would be + too jerky. + + It would be preferable to show the speed that pertains to a + recent period, say over the past several seconds. But to do this + accurately, we would have to record all the packets received + during the last five seconds. + + What we do instead is maintain a history of a fixed number of + packets. 
It actually makes sense if you think about it -- faster + downloads will have a faster response to speed changes. */ + + rec_index = bp->recent.count % RECENT_ARRAY_SIZE; + ++bp->recent.count; + + /* Instead of calculating the sum of times[] and bytes[], we + maintain the summed quantities. To maintain each sum, we must + make sure that it gets increased
Re: Current download speed in progress bar
Maurice Cinquini [EMAIL PROTECTED] writes: I don't think using only a fraction of a second is a reliable method for estimating current bandwidth. Here are some factors that can make for wildly varying ETAs when just looking at the last fraction of a second. - TCP slow start. - Kernel level buffering - Other network traffic That's beside the point; this was never intended to be a scientific method of determining bandwidth. All I aimed for was something more useful than dividing total bytes with total time. And for bandwidth, I'm confident that my current method is better than what was previously in place. I'm not so sure about ETA, though. I don't like apt's method of calculating the CPS only every N seconds, because -- if I'm reading it right -- it means that you see the same value for 6 seconds, and then have to wait another 6 seconds for a refresh. That sucks. `links', for example, offers both average and current speed, and the latter seems to be updated pretty swiftly. Still, thanks for the suggestions. Unless I find a really cool different suggestion, I'll fall back to the previous method for ETA.
Re: Current download speed in progress bar
Daniel Stenberg [EMAIL PROTECTED] writes: On Tue, 9 Apr 2002, Hrvoje Niksic wrote: Should we revert to the average speed for ETA, or is there a smarter way to handle it? What are other downloaders doing? I'll grab the other part and explain what curl does. It shows a current speed based on the past five seconds, Does it mean that the speed doesn't change for five seconds, or that you always show the *current* speed, but relative to the last five seconds? I may be missing something, but I don't see how to efficiently implement the latter.
Re: Current download speed in progress bar
Tony Lewis [EMAIL PROTECTED] writes: I'm often annoyed by ETA estimates that make no sense. How about showing two values -- something like:
  ETA at average speed: 1:05:17
  ETA at current speed: 15:05
The problem is that Wget is limited by what fits on one line. I'd like to keep enough space for the progress bar, which leaves no room for additional information.
Re: Current download speed in progress bar
Tony Lewis [EMAIL PROTECTED] writes: Could you keep an array of speeds that is updated once a second, such that the value from six seconds ago is discarded and the value for the second that just ended is recorded? Right now I'm doing that kind of trick, but for the last N reads from the network. This translates to a larger interval for slower downloads, and the other way around, which is, I think, what one would want.
Re: Current download speed in progress bar
Andre Majorel [EMAIL PROTECTED] writes: I find it very annoying when a downloader plays yoyo with the remaining time. IMHO, remaining time is by nature a long-term thing, and short-term jitter should not cause it to go up and down. Agreed wholeheartedly, but how would you *implement* a non-jittering ETA? Do you think it makes sense the way 1.8.1 does it, i.e. to calculate the ETA from the average speed?
Re: Current download speed in progress bar
Daniel Stenberg [EMAIL PROTECTED] writes: The meter is updated maximum once per second, I don't think it makes sense to update the screen faster than that. Maybe not, but I sort of like it. Wget's progress bar refreshes the screen (not more than) five times per second, and I like the idea of refreshing the download speed along with the amount. However, I've added code to limit the ETA change to once per second. I've come up with a scheme similar to the one you are describing, except I use smaller subintervals. In other words, at compile time you can independently choose how far into the past you're looking, and into how many chunks that span is divided. I've defaulted these to 3 seconds and 30 intervals, respectively. This basically explains what curl does, not saying it is any particularly scientific way or anything; I've just found this info interesting. Thanks for the info; I appreciate it.
Re: Current download speed in progress bar
Roger L. Beeman [EMAIL PROTECTED] writes: On Wed, 10 Apr 2002, Hrvoje Niksic wrote: Agreed wholeheartedly, but how would you *implement* a non-jittering ETA? Do you think it makes sense the way 1.8.1 does it, i.e. to calculate the ETA from the average speed? One common programming technique is the exponential decay model. Sounds cool. Do you have pseudocode or, failing that, a reference easy enough that even a programmer of Unix command-line utilities can follow it? :-) (I must admit that your email address adds a certain weight to whatever you have to say about measuring bandwidth.) I believe that the method is chosen for its simplicity and that justifications of its validity are completely after the fact. The simplicity is that one keeps a previously calculated value, averages that value with the current measurement, and saves the result for the next iteration, i.e. add and shift right. I thought about calculating the average of the average and the current speed, and using that for ETA, but it sounded too arbitrary and I didn't have time to gather empirical evidence that it was any better than just using the average. Again, I'd be grateful if you could provide some code. You must choose how to normalize the measurement based on irregularity in the measurement interval, however. I'm afraid I can't parse this without understanding the algorithm.
Re: Debian bug 88176 - timestamping is wrong with -O
Unfortunately, this bug is not easy to fix. The problem is that `-O' was originally invented for streaming, i.e. for `-O -'. As a result, many places in Wget's code assume they can freely operate on the file names, and `-O' seems more like an afterthought. On the other hand, many people (reasonably) expect `-O x' to simply override the file name from whatever was specified in the URL to `x'. But the code doesn't work that way. I plan to change the handling of file names to make this work, but that will take some time. Unless someone takes the time to fix this in the existing code base, the bug will remain open until said reorganization. Until then, the workaround is to avoid the `-O -N' combination.
Re: Debian bug 106391 - documentation doesn't warn about passwordsin urls
[ Cc'ing to [EMAIL PROTECTED], as requested by Guillaume. ] Guillaume Morin [EMAIL PROTECTED] writes: this is from the advanced usage section of examples (info docs): * If you want to encode your own username and password to HTTP or FTP, use the appropriate URL syntax (*note URL Format::). wget ftp://hniksic:[EMAIL PROTECTED]/.emacs this would let other users on the system see your password using ps. it should have a big disclaimer. You're right. I'll apply this patch, which I think should add enough warnings to educate the unwary.

2002-04-10  Hrvoje Niksic  [EMAIL PROTECTED]

	* wget.texi: Warn about the dangers of specifying passwords on the
	command line and in unencrypted files.

Index: doc/wget.texi
===================================================================
RCS file: /pack/anoncvs/wget/doc/wget.texi,v
retrieving revision 1.62
diff -u -r1.62 wget.texi
--- doc/wget.texi	2001/12/16 18:05:34	1.62
+++ doc/wget.texi	2002/04/10 21:40:32
@@ -285,6 +285,13 @@
 @file{.netrc} file in your home directory, password will also be
 searched for there.}

+@strong{Important Note}: if you specify a password-containing @sc{url}
+on the command line, the username and password will be plainly visible
+to all users on the system, by way of @code{ps}.  On multi-user systems,
+this is a big security risk.  To work around it, use @code{wget -i -}
+and feed the @sc{url}s to Wget's standard input, each on a separate
+line, terminated by @kbd{C-d}.
+
 You can encode unsafe characters in a @sc{url} as @samp{%xy}, @code{xy}
 being the hexadecimal representation of the character's @sc{ascii}
 value.  Some common unsafe characters include @samp{%} (quoted as
@@ -849,8 +856,15 @@
 @code{digest} authentication scheme.

 Another way to specify username and password is in the @sc{url} itself
-(@pxref{URL Format}).  For more information about security issues with
-Wget, @xref{Security Considerations}.
+(@pxref{URL Format}).  Either method reveals your password to anyone who
+bothers to run @code{ps}.  To prevent the passwords from being seen,
+store them in @file{.wgetrc} or @file{.netrc}, and make sure to protect
+those files from other users with @code{chmod}.  If the passwords are
+really important, do not leave them lying in those files either---edit
+the files and delete them after Wget has started the download.
+
+For more information about security issues with Wget, @xref{Security
+Considerations}.

 @cindex proxy
 @cindex cache
@@ -975,6 +989,9 @@
 authentication on a proxy server.  Wget will encode them using the
 @code{basic} authentication scheme.

+Security considerations similar to those with @samp{--http-passwd}
+pertain here as well.
+
 @cindex http referer
 @cindex referer, http
 @item --referer=@var{url}
@@ -2409,6 +2426,10 @@
 wget ftp://hniksic:mypassword@@unix.server.com/.emacs
 @end example

+Note, however, that this usage is not advisable on multi-user systems
+because it reveals your password to anyone who looks at the output of
+@code{ps}.
+
 @cindex redirecting output
 @item
 You would like the output documents to go to standard output instead of
@@ -2773,10 +2794,12 @@
 main issues, and some solutions.

 @enumerate
-@item
-The passwords on the command line are visible using @code{ps}.  If this
-is a problem, avoid putting passwords from the command line---e.g. you
-can use @file{.netrc} for this.
+@item The passwords on the command line are visible using @code{ps}.
+The best way around it is to use @code{wget -i -} and feed the @sc{url}s
+to Wget's standard input, each on a separate line, terminated by
+@kbd{C-d}.  Another workaround is to use @file{.netrc} to store
+passwords; however, storing unencrypted passwords is also considered a
+security risk.

 @item
 Using the insecure @dfn{basic} authentication scheme, unencrypted
Re: Debian bug 131851 - cwd during ftp causes download to fail
Guillaume Morin [EMAIL PROTECTED] writes: When getting a file in a non-root directory from FTP with wget, wget always tries to CWD to that directory before getting the file. Unfortunately, sometimes you're not allowed to CWD to a directory, but you're still allowed to list or download files from it (given that you know the filename). I believe this breaks RFC 959. I think this is quite rare, so I don't plan to add this to Wget in the near future. If someone implements it cleanly, the functionality can go in.
Re: Debian wishlist bug 21148 - wget doesn't allow selectivitybased on mime type
I believe this is already on the todo list. However, this is made harder by the fact that, to implement this kind of rejection, you have to start downloading the file. This is very different from filename-based rejection, where the decision can be made at a very early point in the download process.
Re: spaces and other special caracters in directories
Loic Le Loarer [EMAIL PROTECTED] writes: When I fetch a whole subtree with wget and the directories contain a space or some other special character, these characters are url-encoded in the local version, while this is not the case for files. For example, if I mirror with `wget -m' the directory `to to', which contains the file `to to', I locally get the directory `to%20to' and the file `to to'. Is there an option to get the directory `to to'? The inconsistency you're seeing is a bug, but the intended behavior goes in rather the opposite direction. The code was supposed to url-encode *both* the file and the directory, without an option to suppress it. I will try to fix this for the next release, preferably by uncoupling the url-encoding from the protection of file names from invalid characters. Ideally, the latter would be configurable.
Re: feature wish: switch to disable robots.txt usage
Noel Koethe [EMAIL PROTECTED] writes: Ok, got it. But is it possible to use this option as a switch on the command line? Yes, like this: wget -erobots=off ...
Re: ftp passwords
Antonis Sidiropoulos [EMAIL PROTECTED] writes: But when the password contains characters such as '^' or space, these chars are converted to the form %{hex code}; e.g. a passwd like `^12 34' is translated to `%5E12%2034', so the login fails. Is this a bug? Thanks for the report. It is indeed a bug, and this patch should fix it:

Index: src/url.c
===================================================================
RCS file: /pack/anoncvs/wget/src/url.c,v
retrieving revision 1.68
retrieving revision 1.69
diff -u -r1.68 -r1.69
--- src/url.c	2002/01/14 01:56:40	1.68
+++ src/url.c	2002/01/14 13:26:16	1.69
@@ -528,6 +528,11 @@
   memcpy (*user, str, len);
   (*user)[len] = '\0';

+  if (*user)
+    decode_string (*user);
+  if (*passwd)
+    decode_string (*passwd);
+
   return 1;
 }
Re: wget-1.8.1: build problems, and some patches
Nelson H. F. Beebe [EMAIL PROTECTED] writes: The wget-1.8.1 release is evidently intended to be buildable with old-style K&R compilers, since it automatically detects this and filters the source code with ansi2knr. Unfortunately, there are some syntactical things in the wget source code that ansi2knr cannot recognize, and they prevent a successful build with such a compiler (in my case, cc on HP-UX 10.01, with no c89 or gcc available on that system). Thanks a lot for looking into this. I will apply your patch with one modification: you don't need to conditionalize the use of parameters in declarations -- just wrap them in the PARAMS macro. This is important because it decreases the number of ifdefs in the code. It is good to have someone test the ansi2knr feature, since I no longer have access to systems with pre-ANSI compilers. Here are patches that I applied to get a successful build [I dealt with the log.c problem by manually inserting #undef HAVE_STDARG_H in config.h, to force it to use the old-style varargs interface.] I wonder why this was needed? Does your system have stdarg.h and yet not support the ANSI interface? The patch I am about to apply looks like this:

Index: src/ChangeLog
===================================================================
RCS file: /pack/anoncvs/wget/src/ChangeLog,v
retrieving revision 1.373
diff -u -r1.373 ChangeLog
--- src/ChangeLog	2002/04/11 15:25:50	1.373
+++ src/ChangeLog	2002/04/11 17:06:02
@@ -1,5 +1,29 @@
 2002-04-11  Hrvoje Niksic  [EMAIL PROTECTED]

+	* progress.c (struct progress_implementation): Use PARAMS when
+	declaring the parameters of *create, *update, *finish, and
+	*set_params.
+
+	* netrc.c: Ditto.
+
+	* http.c: Reformat some function definitions so that ansi2knr can
+	read them.
+
+	* hash.c (struct hash_table): Use the PARAMS macro around
+	parameters in the declaration of hash_function and test_function.
+	(prime_size): Spell 2580823717UL and 3355070839UL as (unsigned
+	long)0x99d43ea5 and (unsigned long)0xc7fa5177 respectively, so
+	that pre-ANSI compilers can read them.
+	(find_mapping): Use PARAMS when declaring EQUALS.
+	(hash_table_put): Ditto.
+
+	* ftp.h: Wrap the parameters of ftp_index declaration in PARAMS.
+
+	* cookies.c (cookie_new): Use (unsigned long)0 instead of 0UL,
+	which was unsupported by pre-ANSI compilers.
+
+2002-04-11  Hrvoje Niksic  [EMAIL PROTECTED]
+
 	* url.c (url_filename): Use compose_file_name regardless of
 	whether opt.dirstruct is set.
 	(mkstruct): Don't handle the query and the reencoding of DIR; that
Index: src/cookies.c
===================================================================
RCS file: /pack/anoncvs/wget/src/cookies.c,v
retrieving revision 1.18
diff -u -r1.18 cookies.c
--- src/cookies.c	2001/12/10 02:29:11	1.18
+++ src/cookies.c	2002/04/11 17:06:02
@@ -84,7 +84,7 @@

   /* If we don't know better, assume cookie is non-permanent and valid
      for the entire session.  */
-  cookie->expiry_time = ~0UL;
+  cookie->expiry_time = ~(unsigned long)0;

   /* Assume default port.  */
   cookie->port = 80;
Index: src/ftp.h
===================================================================
RCS file: /pack/anoncvs/wget/src/ftp.h,v
retrieving revision 1.13
diff -u -r1.13 ftp.h
--- src/ftp.h	2002/01/25 03:34:23	1.13
+++ src/ftp.h	2002/04/11 17:06:02
@@ -107,7 +107,7 @@
 struct fileinfo *ftp_parse_ls PARAMS ((const char *, const enum stype));
 uerr_t ftp_loop PARAMS ((struct url *, int *));

-uerr_t ftp_index (const char *, struct url *, struct fileinfo *);
+uerr_t ftp_index PARAMS ((const char *, struct url *, struct fileinfo *));

 char ftp_process_type PARAMS ((const char *));
Index: src/hash.c
===================================================================
RCS file: /pack/anoncvs/wget/src/hash.c,v
retrieving revision 1.14
diff -u -r1.14 hash.c
--- src/hash.c	2001/11/17 18:03:57	1.14
+++ src/hash.c	2002/04/11 17:06:02
@@ -136,8 +136,8 @@
 };

 struct hash_table {
-  unsigned long (*hash_function) (const void *);
-  int (*test_function) (const void *, const void *);
+  unsigned long (*hash_function) PARAMS ((const void *));
+  int (*test_function) PARAMS ((const void *, const void *));

   int size;			/* size of the array */
   int count;			/* number of non-empty, non-deleted
@@ -177,7 +177,8 @@
     10445899, 13579681, 17653589, 22949669, 29834603, 38784989,
     50420551, 65546729, 85210757, 110774011, 144006217, 187208107,
     243370577, 316381771, 411296309, 534685237, 695090819, 903618083,
-    1174703521, 1527114613, 1985248999, 2580823717UL, 3355070839UL
+    1174703521, 1527114613, 1985248999,
+    (unsigned long)0x99d43ea5, (unsigned long)0xc7fa5177
   };
   int i;
   for (i = 0; i < ARRAY_SIZE (primes); i++)
@@ -236,7 +237,7 @@
   struct mapping *mappings = ht
Re: wget-1.8.1: build failure on SGI IRIX 6.5 with c89
Nelson H. F. Beebe [EMAIL PROTECTED] writes:

  c89 -I. -I. -I/opt/include -DHAVE_CONFIG_H \
      -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" \
      -DLOCALEDIR=\"/usr/local/share/locale\" -O -c connect.c
  cc-1164 c89: ERROR File = connect.c, Line = 94
    Argument of type int is incompatible with parameter of type const char *.
      logprintf (LOG_VERBOSE, _("Connecting to %s[%s]:%hu... "),
      ^
  cc-1164 c89: ERROR File = connect.c, Line = 97
    Argument of type int is incompatible with parameter of type const char *.

The argument of type int is probably an indication that the `_' macro is either undefined or expands to an undeclared function. The compiler rightfully assumes the function to return int and complains about the type mismatch. If you check why the macro is misdeclared, you'll likely discover the source of the problem. Inasmuch as this compiler has been excellent in diagnosing violations of the 1989 ISO C Standard, and catching many portability problems, I suspect the error lies in wget. Agreed. But in this case the error is one of configuration, not programming.
Re: -k does not convert form actions
[EMAIL PROTECTED] writes: From the specification, the form "action=" field is a URI and it can be an absolute URL. So it seems it should be fixed up with the -k option, just like hrefs and img srcs are. A good idea, thanks. I've attached a patch, which will be part of the next release, that implements this. Overall it would be very nice if -k were to grab any http://something it sees and convert it if it is on that server, since you also get URLs in JavaScript code that it would be nice to have converted. That's a problem because `-k' sees only the data in tags that are defined to contain URLs. When Wget is taught to rummage through JavaScript looking for URLs, `-k' will become aware of them as well. Here is the patch:

2002-04-11  Hrvoje Niksic  [EMAIL PROTECTED]

	* html-url.c (tag_handle_form): New function.  Pick up form
	actions and mark them for conversion only.

Index: src/html-url.c
===================================================================
RCS file: /pack/anoncvs/wget/src/html-url.c,v
retrieving revision 1.24
diff -u -r1.24 html-url.c
--- src/html-url.c	2002/02/01 03:34:31	1.24
+++ src/html-url.c	2002/04/11 17:46:52
@@ -48,6 +48,7 @@
 DECLARE_TAG_HANDLER (tag_find_urls);
 DECLARE_TAG_HANDLER (tag_handle_base);
+DECLARE_TAG_HANDLER (tag_handle_form);
 DECLARE_TAG_HANDLER (tag_handle_link);
 DECLARE_TAG_HANDLER (tag_handle_meta);
@@ -73,29 +74,31 @@
   { "embed",	tag_find_urls },
 #define TAG_FIG 7
   { "fig",	tag_find_urls },
-#define TAG_FRAME 8
+#define TAG_FORM 8
+  { "form",	tag_handle_form },
+#define TAG_FRAME 9
   { "frame",	tag_find_urls },
-#define TAG_IFRAME 9
+#define TAG_IFRAME 10
   { "iframe",	tag_find_urls },
-#define TAG_IMG 10
+#define TAG_IMG 11
   { "img",	tag_find_urls },
-#define TAG_INPUT 11
+#define TAG_INPUT 12
   { "input",	tag_find_urls },
-#define TAG_LAYER 12
+#define TAG_LAYER 13
   { "layer",	tag_find_urls },
-#define TAG_LINK 13
+#define TAG_LINK 14
   { "link",	tag_handle_link },
-#define TAG_META 14
+#define TAG_META 15
   { "meta",	tag_handle_meta },
-#define TAG_OVERLAY 15
+#define TAG_OVERLAY 16
   { "overlay",	tag_find_urls },
-#define TAG_SCRIPT 16
+#define TAG_SCRIPT 17
   { "script",	tag_find_urls },
-#define TAG_TABLE 17
+#define TAG_TABLE 18
   { "table",	tag_find_urls },
-#define TAG_TD 18
+#define TAG_TD 19
   { "td",	tag_find_urls },
-#define TAG_TH 19
+#define TAG_TH 20
   { "th",	tag_find_urls }
 };
@@ -141,10 +144,11 @@
    from the information above.  However, some places in the code refer
    to the attributes not mentioned here.  We add them manually.  */
 static const char *additional_attributes[] = {
-  "rel",			/* for TAG_LINK */
-  "http-equiv",			/* for TAG_META */
-  "name",			/* for TAG_META */
-  "content"			/* for TAG_META */
+  "rel",			/* used by tag_handle_link */
+  "http-equiv",			/* used by tag_handle_meta */
+  "name",			/* used by tag_handle_meta */
+  "content",			/* used by tag_handle_meta */
+  "action"			/* used by tag_handle_form */
 };

 static const char **interesting_tags;
@@ -473,6 +477,22 @@
     ctx->base = uri_merge (ctx->parent_base, newbase);
   else
     ctx->base = xstrdup (newbase);
+}
+
+/* Mark the URL found in <form action=...> for conversion.  */
+
+static void
+tag_handle_form (int tagid, struct taginfo *tag, struct map_context *ctx)
+{
+  int attrind;
+  char *action = find_attr (tag, "action", &attrind);
+  if (action)
+    {
+      struct urlpos *action_urlpos = append_one_url (action, 0, tag,
+						     attrind, ctx);
+      if (action_urlpos)
+	action_urlpos->ignore_when_downloading = 1;
+    }
 }

 /* Handle the LINK tag.  It requires special handling because how its
Re: Using wildcards through proxy server
John Poltorak [EMAIL PROTECTED] writes: Can anyone confirm that WGET allows the use of wildcards through a proxy server? It doesn't. Use a substitute: wget -rl1 -A wildcard URL...
Re: wget timeout
Warwick Poole [EMAIL PROTECTED] writes: I want to set a timeout of 5 seconds on a wget http fetch. I have tried -T, --timeout, etc., on the command line and in a .wgetrc file; wget does not seem to obey these directives. You have probably run into the problem that Wget's timeout applies only to reads, not to connection attempts. We plan to fix that for the next release.
Re: Problem with URL
Marcus - Videomoviehouse.com [EMAIL PROTECTED] writes: I am trying to get wget to work with a URL containing characters that it doesn't seem to like. I tried putting the URL in quotes, and it still gave me similar results. It works fine with a simple URL like wget www.something.com/index.html. Any help appreciated. I am running Red Hat Linux. I'm afraid you will need to specify exactly which URL you are having problems with, and what happens, preferably accompanied by a log produced by running Wget with the `-d' flag. Also, please let us know which version of Wget you are using (wget --version).
Re: wget crash
Hack Kampbjørn [EMAIL PROTECTED] writes:

  assertion percentage <= 100 failed: file progress.c, line 552
  zsh: abort (core dumped)  wget -m -c --tries=0 ftp://ftp.scene.org/pub/music/artists/nutcase/mp3/timeofourlives.mp3

progress.c:

  int percentage = (int)(100.0 * size / bp->total_length);
  assert (percentage <= 100);

Of course the assert will fail; size is bigger than total_length! [...] To reproduce with wget-1.8.1:

  $ wget ftp://sunsite.dk/disk1/gnu/wget/wget-1.8{,.1}.tar.gz
  $ cat wget-1.8.tar.gz >> wget-1.8.1.tar.gz
  $ wget -d -c ftp://sunsite.dk/disk1/gnu/wget/wget-1.8.1.tar.gz

Thanks for looking into this. There are two problems here, and most likely two separate bugs. First, I cannot repeat your test case. Maybe sunsite.dk changed their FTP server since Feb 15; anyway, what I get is:

  --> REST 2185627
  350 Restarting at 2185627
  --> RETR wget-1.8.1.tar.gz
  451-Restart offset 2185627 is too large for file size 1097780.
  451 Restart offset reset to 0

Wget (bogusly) considers the 451 response an error in the server response and retries. That's bug number one, but it also means that I cannot repeat your test case. Bug number two is the one the reporter saw. At first I didn't quite understand how it could happen, since bar_update() explicitly guards against such a condition:

  if (bp->total_length > 0
      && bp->count + bp->initial_length > bp->total_length)
    /* We could be downloading more than total_length, e.g. when the
       server sends an incorrect Content-Length header.  In that case,
       adjust bp->total_length to the new reality, so that the code in
       create_image() that depends on total size being smaller or
       equal to the expected size doesn't abort.  */
    bp->total_length = bp->count + bp->initial_length;

The problem is that the same guard is not implemented in bar_create() and bar_finish(), which also call create_image(). In the FTP case, the crash comes from bar_create(). This patch should fix it:

2002-04-11  Hrvoje Niksic  [EMAIL PROTECTED]

	* progress.c (bar_create): If INITIAL is larger than TOTAL, fix
	TOTAL.
	(bar_finish): Likewise.

Index: src/progress.c
===================================================================
RCS file: /pack/anoncvs/wget/src/progress.c,v
retrieving revision 1.27
diff -u -r1.27 progress.c
--- src/progress.c	2002/04/11 17:49:32	1.27
+++ src/progress.c	2002/04/11 18:49:08
@@ -461,6 +461,11 @@

   memset (bp, 0, sizeof (*bp));

+  /* In theory, our callers should take care of this pathological
+     case, but it can sometimes happen. */
+  if (initial > total)
+    total = initial;
+
   bp->initial_length = initial;
   bp->total_length = total;
@@ -493,7 +498,7 @@
       adjust bp->total_length to the new reality, so that the code in
       create_image() that depends on total size being smaller or
       equal to the expected size doesn't abort.  */
-    bp->total_length = bp->count + bp->initial_length;
+    bp->total_length = bp->initial_length + bp->count;

   /* This code attempts to determine the current download speed.  We
      measure the speed over the interval of approximately three
@@ -564,6 +569,11 @@
 bar_finish (void *progress, long dltime)
 {
   struct bar_progress *bp = progress;
+
+  if (bp->total_length > 0
+      && bp->count + bp->initial_length > bp->total_length)
+    /* See bar_update() for explanation. */
+    bp->total_length = bp->initial_length + bp->count;

   create_image (bp, dltime);
   display_image (bp->buffer);
Re: wget 1.8.1 crashes on Solaris (i386 and sparc) v7 and v8, butworks on WinNT
Christopher Scott [EMAIL PROTECTED] writes: The attached file contains a link which causes wget 1.8.1 to crash on Solaris i386 and sparc, on both Solaris 7 and 8 on both platforms. However, I downloaded the latest version for Windows, and it ran correctly!?! I'm afraid I cannot get Wget to dump core downloading this link, either from the command line or from a `lnk.txt' file. Try recompiling Wget with debugging information (make clean; make CFLAGS=-g). When it crashes, run `gdb wget core' and type `where'. Mail the output here. Thanks for the report.
Re: No clobber and .shtml files
This change is fine with me. I vaguely remember that this test is performed in two places; you might want to create a function.
Re: ETA on wget timeout option
Christopher H. Taylor [EMAIL PROTECTED] writes: Any ETA on when you're going to add a timeout alarm to the connect() function? I'm running 1.8.1 and still have the same problem. Many of my applications that utilize wget are time critical and I'm anxiously awaiting this fix. Thanks for your reply. I've just implemented this, and it's been passing my initial tests. I'll apply it to CVS shortly. I can provide the patch, but it's against the latest CVS and is likely not to apply to the 1.8.1 sources. However, if it's critical for you, you can grab the latest CVS sources and use that. I believe the current CVS is at least as stable as 1.8.1.
Re: No clobber and .shtml files
Ian Abbott [EMAIL PROTECTED] writes: On 11 Apr 2002 at 21:00, Hrvoje Niksic wrote: This change is fine with me. I vaguely remember that this test is performed in two places; you might want to create a function. Certainly. Where's the best place for it? utils.c? As good a place as any.
Re: /usr/include/stdio.h:120: previous declaration of `va_list'
Kevin Rodgers [EMAIL PROTECTED] writes: 1. Don't #define _XOPEN_SOURCE 500 (by commenting it out). 2. Do #define _VA_ALIST. I can confirm that (1) works. I didn't try (2). Could you please try (2) and see if it works out? I'm reluctant to withdraw the _XOPEN_SOURCE definition because it's supposed to create the kind of environment that we want -- standards-compliant with useful extensions. Without it, some functions we use just don't get declared. (I think strptime is one of them, but there are probably more.) I'm keeping that option as a last resort. Thanks for the report and the analysis.
Re: Goodbye and good riddance
James C. McMaster (Jim) [EMAIL PROTECTED] writes: This could be a great resource, but (I hate to say this) it has been rendered more trouble than it is worth by the stubbornness and stupidity of the owner. He has turned a deaf ear to all pleas to do something, ANYTHING, to stop the flood of spam, viruses and annoyances posted to the list. Actually, I was planning to work on the spam problem this weekend. (Don't for a moment think I'm not annoyed by it.) It *will* be resolved, hopefully to everyone's satisfaction. But if several spams are enough to deter you from a useful resource and drive you to name-calling targeted at the very person who created it, I cannot honestly feel dismayed by your choice. This is the one and only mailing list that still maintains this policy, This is a factually incorrect statement. I will continue to use it without support, because getting support is more trouble than it is worth. Don't forget that you can always post to the mailing list *without* being subscribed. :-) Who knows, maybe one day you'll reap the benefits of what you are badmouthing right now. I respectfully ask the other participants to extend their patience for some more days. I apologize for not having provided a better solution already. Despite the insults, I do not deny my part of the blame -- it is just your method (of dealing with spam) I disagree with.