On 4 Jul 2001, at 23:20, Jacob Burckhardt wrote:

> I run wget on this file:
> 
> <! ------------------------------------------------------ >
> <A HREF="a.html">a</a>
> <! ------------------------------------------------------ >
> <A HREF="b.html">b</a>
> 
> It downloads b.html, but it does not download a.html.

This is neither HTML nor valid SGML, so you shouldn't be too surprised
at the behavior. What wget is doing is skipping over SGML
declarations and the comments inside those declarations. One of those
comments is started by the last two hyphens on line 1 and
terminated by the first two hyphens on line 3, so the whole of line 2
is commented out.
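
To make the pairing concrete, here is a rough Python sketch. It is not
wget's code: the function name is made up, and the run length of 54 is
only an assumed example of a run that leaves a comment open. Inside a
declaration, each "--" toggles between "in a comment" and "not in a
comment", so a run of hyphens leaves a comment open when it contains an
odd number of "--" delimiters.

```python
def comment_left_open(hyphen_count: int) -> bool:
    """Return True if a run of hyphens inside an SGML declaration
    ends with a comment still open.

    Hypothetical helper for illustration only. Each "--" toggles the
    comment state; a single leftover hyphen does not change it.
    """
    delimiters = hyphen_count // 2   # number of complete "--" pairs
    return delimiters % 2 == 1       # odd count: the last "--" opened a comment


# Assumed run lengths, chosen only to show both outcomes.
print(comment_left_open(54))  # True: 13 empty comments, then one left open
print(comment_left_open(52))  # False: 13 empty comments, nothing left open
```

With a run like the first one, the declaration does not end at the ">"
on the same line, because that ">" is inside the open comment.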

> However, if the following file is used, then it does download a.html:
> 
> <! ------------------------------------------------------ >
> <A HREF="a.html">a</a>

The last two hyphens on line 1 start a comment which is not 
terminated. Because it is not terminated, wget backs out and 
continues parsing the document anyway. Perhaps it shouldn't.
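
Both cases can be reproduced with a small Python sketch of the behavior
described above. This is only an approximation written for this message,
not wget's actual parser: the function names are invented, the link
extraction is a crude regex, and the hyphen run of 54 is an assumption
picked so that the run leaves a comment open.

```python
import re


def strip_declarations(text: str) -> str:
    """Drop SGML declarations (<! ... >), honoring -- ... -- comments.

    If a comment or declaration is never terminated, back out and keep
    the text as ordinary content (the behavior described in this reply).
    """
    out = []
    i, n = 0, len(text)
    while i < n:
        if text.startswith("<!", i):
            j = i + 2
            terminated = True
            while j < n and text[j] != ">":
                if text.startswith("--", j):
                    end = text.find("--", j + 2)  # look for the closing "--"
                    if end == -1:
                        terminated = False        # comment never closed
                        break
                    j = end + 2
                else:
                    j += 1
            if terminated and j < n:
                i = j + 1                         # skip the whole declaration
                continue
        out.append(text[i])                       # ordinary text, or backed out
        i += 1
    return "".join(out)


def visible_links(text: str) -> list:
    """Return HREF values that survive declaration stripping."""
    return re.findall(r'HREF="([^"]*)"', strip_declarations(text))


RUN = "-" * 54  # assumed length: an odd number of "--" delimiters
two_rules = (
    f'<! {RUN} >\n<A HREF="a.html">a</a>\n'
    f'<! {RUN} >\n<A HREF="b.html">b</a>\n'
)
one_rule = f'<! {RUN} >\n<A HREF="a.html">a</a>\n'

print(visible_links(two_rules))  # ['b.html']
print(visible_links(one_rule))   # ['a.html']
```

In the first document the open comment swallows the a.html link; in the
second there is no closing "--", so the scan backs out and the link is
seen after all.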
