Re: Comment handling

2003-06-05 Thread Tony Lewis
Aaron S. Hawley wrote: > i'm just saying what's going to happen when someone posts to this list: > "My Web Pages have [insert obscure comment format] for comments and Wget > is considering them to (not) be comments. Can you change the [insert > Wget comment mode] comment mode to (not) recognize m

Re: Comment handling

2003-06-05 Thread Aaron S. Hawley
On Wed, 4 Jun 2003, George Prekas wrote: > > > > i think the idea of quirky comments modes are cool, but is it the better > > solution? > > Do you think that the current algorithm shouldn't be improved? Even, a > little bit to handle the common mistakes? i think Wget's default behavior should be

Re: Comment handling

2003-06-05 Thread George Prekas
[...] > i suppose my proposal should have been called --disobey-comments (comments > are already "ignored" by default). I suppose that this is a good idea, since it won't be enabled by default and someone could enable it if the page he wants to download is very buggy concerning the comments. > >

Re: Comment handling

2003-06-05 Thread Aaron S. Hawley
i suppose my proposal should have been called --disobey-comments (comments are already "ignored" by default). i'm just saying what's going to happen when someone posts to this list: "My Web Pages have [insert obscure comment format] for comments and Wget is considering them to (not) be comments.

Re: Comment handling

2003-06-05 Thread George Prekas
> Tony Lewis writes: > > > > The issue we've been discussing is what to do about things that almost > > follow the rules for HTML comments, but don't quite get it right. By > > default, wget ignores legitimate HTML comments. > > I think the point of the suggested option was to not even try to > id

Re: Comment handling

2003-06-05 Thread Larry Jones
Tony Lewis writes: > > The issue we've been discussing is what to do about things that almost > follow the rules for HTML comments, but don't quite get it right. By > default, wget ignores legitimate HTML comments. I think the point of the suggested option was to not even try to identify HTML com

Re: Comment handling

2003-06-05 Thread Tony Lewis
Aaron S. Hawley wrote: > why not just have the default wget behavior follow comments explicitly > (i've lost track whether wget does that or needs to be amended) /and/ > have an option that goes /beyond/ quirky comments and is just > --ignore-comments ? :) The issue we've been discussing is what

Re: Comment handling

2003-06-05 Thread Aaron S. Hawley
On Wed, 4 Jun 2003, Tony Lewis wrote: > Adding this function to wget seems reasonable to me, but I'd suggest that it > be off by default and enabled from the command line with something > like --quirky_comments. why not just have the default wget behavior follow comments explicitly (i've lost tra

Re: Comment handling

2003-06-05 Thread Tony Lewis
s" rules. The ones from Mozilla sound as good as any to me. > That's all for now. Please give me some feedback with your thoughts and tell me > if you would like the comment handling mechanism of Wget to change. By the > way, who wrote the current one? Maybe he can help us

Re: Comment handling

2003-06-04 Thread George Prekas
some feedback with your thoughts and tell me if you would like the comment handling mechanism of Wget to change. By the way, who wrote the current one? Maybe he can help us with his experience. Regards, George Prekas.

Re: Comment handling

2003-06-03 Thread Georg Bauhaus
> Georg, I think we're talking about apples and oranges here. I'm talking > about what is legitimate in a comment in an SGML document. I think you're > talking about what is legitimate as a comment in an SGML declaration. Ah, yes, o.K., I was reacting to "valid SGML comments", where legitimate is

Re: Comment handling

2003-06-03 Thread Georg Bauhaus
> > This is what I have tried, leaving out EOF. Basically the algorithm is quite > tolerant and, after "' or for the next > "--[[:space:]]*>". This will include some very invalid comments, but so what? I > thought it might blend well with typical wget use. It doesn't handle . And, darn, I have
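
The preview above is cut off in the archive, so the poster's full algorithm is not recoverable. The one rule that survives intact is the terminator pattern: two hyphens, optional whitespace, then `>`. A minimal Python sketch of only that rule follows; the function name `lenient_comment_end` and the use of Python's `\s` in place of the POSIX `[[:space:]]` class are assumptions made for illustration, not the poster's code and not Wget's.

```python
import re

# Terminator taken from the message: "--", optional whitespace, then ">".
# \s stands in for the POSIX [[:space:]] class (an assumption).
_LENIENT_END = re.compile(r"--\s*>")


def lenient_comment_end(text, start=0):
    """Return the index just past the first lenient terminator after an
    opening "<!--" at text[start], or None if there is no opening or no
    terminator. Hypothetical helper, for illustration only."""
    if not text.startswith("<!--", start):
        return None
    match = _LENIENT_END.search(text, start + 4)
    return match.end() if match else None


if __name__ == "__main__":
    for sample in ("<!-- ok -->x", "<!-- spaced --  >x", "<!----->x", "<!-- open"):
        print(repr(sample), lenient_comment_end(sample))
```

As the message itself notes, a scan this tolerant will also accept some invalid comments, such as a run of five hyphens between `<!` and `>`.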

Re: Comment handling

2003-06-03 Thread Georg Bauhaus
> > So in the example there are 5 hyphens, the first two > > of which can be interpreted as a comment delimiter, as can > > the second two. But then there is something else following the > > second two, namely a '-'. So this piece of text is as invalid > > as (valid, 1) (It doesn't stop at the f
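
The pairing argument in this message can be sketched as a strict scanner: after `<!`, each `--` opens a comment that runs to the next `--`, whitespace may follow each comment, and only `>` may end the declaration. The Python below is an illustrative sketch under that reading of the SGML rules; `strict_comment_end` is a hypothetical name, and this is not the code Wget uses.

```python
def strict_comment_end(text, start=0):
    """Return the index just past a well-formed SGML comment declaration
    starting at text[start], or None if it is malformed.
    Hypothetical helper, for illustration only."""
    if not text.startswith("<!", start):
        return None
    i = start + 2
    while True:
        if text.startswith("--", i):
            # A "--" opens a comment; it must be closed by another "--".
            close = text.find("--", i + 2)
            if close == -1:
                return None
            i = close + 2
            # Whitespace is allowed after each closed comment.
            while i < len(text) and text[i].isspace():
                i += 1
        elif i < len(text) and text[i] == ">":
            return i + 1
        else:
            # Anything else here (for example a stray "-") is invalid.
            return None


if __name__ == "__main__":
    for sample in ("<!-- a -->", "<!---->", "<!----->", "<!-- a -- -- b -->"):
        print(repr(sample), strict_comment_end(sample))
```

With five hyphens, the first two open a comment and the second two close it, leaving a single `-` where only `--` or `>` is allowed, so the scanner rejects the declaration.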

Re: Comment handling

2003-06-03 Thread Tony Lewis
Georg Bauhaus wrote: > I don't think so. Actually the rules for SGML "comments" are > somewhat different. Georg, I think we're talking about apples and oranges here. I'm talking about what is legitimate in a comment in an SGML document. I think you're talking about what is legitimate as a commen

Re: Comment handling

2003-06-02 Thread George Prekas
[ ... ] > So in the example there are 5 hyphens, th

Re: Comment handling

2003-06-02 Thread Georg Bauhaus
> After reading http://www.w3c.org/MarkUp/SGML/sgml-lex/sgml-lex I am > convinced that is a valid SGML (and therefore HTML) comment. > Therefore, I believe it is a bug if wget does not recognize such a comment. I don't think so. Actually the rules for SGML "comments" are somewhat different. First

Re: Comment handling

2003-06-01 Thread Tony Lewis
George Prekas wrote: > You are probably right. I pointed this out because I have seen pages that > use as a separator with lots of dashes and although > Internet Explorer shows the page, wget can not download it correctly. What > do you think about finishing the comment at the >? After reading htt

Re: Comment handling

2003-06-01 Thread George Prekas
> George Prekas wrote: > > > > I have found a

Re: Comment handling

2003-05-31 Thread Tony Lewis
George Prekas wrote: > I have found a bug in Wget version 1.8.2 concerning comment handling ( ). Take a look at the following illegal HTML code: > > > test1.html > > > > > Now, save the above snippet as test.html and try wget -Fi test.html. You > will notice

Re: Comment handling

2003-05-31 Thread Aaron S. Hawley
On Fri, 30 May 2003, George Prekas wrote: > I have found a bug in Wget version 1.8.2 concerning comment handling ( ). Take a look at the following illegal HTML code: > > > test1.html > > > > > Now, save the above snippet as test.html and try wget -Fi test.html

Comment handling

2003-05-31 Thread George Prekas
I have found a bug in Wget version 1.8.2 concerning comment handling ( ). Take a look at the following illegal HTML code: test1.html Now, save the above snippet as test.html and try wget -Fi test.html. You will notice that it doesn't recognise the second link. I have found a solution t

Comment handling

2003-05-30 Thread George Prekas
I have found a bug in Wget version 1.8.2 concerning comment handling ( ). Take a look at the following illegal HTML code: test1.html Now, save the above snippet as test.html and try wget -Fi test.html. You will notice that it doesn't recognise the second link. I have found a solution t
