Re: xhtml? Re: Standalone html parser

2001-07-02 Thread Ian Abbott
On 29 Jun 2001, at 17:40, Anees Shaikh wrote: Henrik says that xhtml probably doesn't require a space before the / that closes the tag. But just for anecdotal evidence, all of the sites I've had this problem with do in fact put a space before the /. I guess the requirement of

Re: Standalone html parser

2001-06-29 Thread Hrvoje Niksic
Anees Shaikh writes:
> I'm trying to use the code in html-parse.c (v1.7) in standalone mode
Excellent!
> For some reason, <img src=...> tags are recognized but then skipped almost every time they are encountered. When using the full program and recursive retrieve, the images

Re: Standalone html parser

2001-06-29 Thread Anees Shaikh
So I think the problem is with malformed img tags. The parser fails if the tag is of this form: <img src="/library/homepage/images/curve.gif" alt="" border="0" /> Note the end of the tag is closed with /> instead of just > as in the spec. When the parser finds the / it thinks it sets
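
As an illustration of the tag form discussed in this thread, here is a minimal Python sketch of a start-tag reader that tolerates a trailing / before the closing >, with or without a space in front of it. It is not wget's html-parse.c code or the patch mentioned in the thread; the helper name parse_tag and the regular expression are assumptions made up for this example.

```python
import re

# Hypothetical helper for illustration only; not taken from html-parse.c.
# Matches one attribute: a name, optionally followed by = and a
# double-quoted, single-quoted, or unquoted value.
ATTR_RE = re.compile(
    r"""([A-Za-z_:][-\w:.]*)
        (?:\s*=\s*
            (?:"([^"]*)"|'([^']*)'|([^\s"'>]*))
        )?""",
    re.VERBOSE,
)


def parse_tag(tag):
    """Return (name, attrs, self_closing) for one start tag.

    A trailing "/" before ">" (XHTML-style empty element) is stripped
    and reported through self_closing instead of being treated as part
    of the attribute list.
    """
    if not (tag.startswith("<") and tag.endswith(">")):
        raise ValueError("not a tag: %r" % tag)
    body = tag[1:-1].rstrip()
    self_closing = body.endswith("/")
    if self_closing:
        # Drop the slash and any whitespace that preceded it.
        body = body[:-1].rstrip()
    parts = body.split(None, 1)
    if not parts:
        raise ValueError("empty tag")
    name = parts[0].lower()
    rest = parts[1] if len(parts) > 1 else ""
    attrs = {}
    for m in ATTR_RE.finditer(rest):
        # Exactly one of the three value groups is set when a value exists.
        value = next((g for g in m.group(2, 3, 4) if g is not None), "")
        attrs[m.group(1).lower()] = value
    return name, attrs, self_closing


if __name__ == "__main__":
    print(parse_tag('<img src="/library/homepage/images/curve.gif" alt="" border="0" />'))
```

One known limitation of this sketch: an unquoted value that itself ends in a slash directly before > (for example href=/dir/>) is ambiguous, and here the final slash is always read as the empty-element marker.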

Re: Standalone html parser

2001-06-29 Thread Hrvoje Niksic
Anees Shaikh writes: So I think the problem is with malformed img tags. The parser fails if the tag is of this form: <img src="/library/homepage/images/curve.gif" alt="" border="0" /> [...] This problem with img tags seems to be quite common (redhat.com, ibm.com,

Re: xhtml? Re: Standalone html parser

2001-06-29 Thread Hrvoje Niksic
Anees Shaikh writes:
> Hrvoje, you mentioned that you planned to modify the parser to handle these tags. Any ideas on timetable?
How about now? :-) I have created a simple patch that deals with this. However, preliminary testing indicated a problem with the semantics.