Hrvoje Niksic <[EMAIL PROTECTED]> writes:

> If so, we can make it so that the following all differ:
> 
>     <img src=foo/>     # <img src="foo/">
> 
>     <img src=foo />    # <img src="foo"></img>
> 
>     <img src=foo/ />   # <img src="foo/"></img>
> 
> Any opinions?  Advice?  Standards-lawyer-speak?

In any case, here is the patch that implements this, so that Anees can
test it if he likes the idea.  My preliminary testing shows that the
patch covers all the reasonable cases.

Anees, are you aware that `html-parse.c' comes with a `main' that
allows you to test the parser by feeding it HTML on stdin?  Just
checking.
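
To make the intended rule concrete, here is a small standalone sketch.
It is not the code from the patch below: `scan_tag', `slash_closes_tag'
and `struct tag_info' are names made up for illustration, and the
scanner only understands unquoted attribute values.  It assumes that an
unquoted value runs up to whitespace or `>', so a slash glued to the
value stays part of the value, while a slash found after whitespace must
be followed by `>' or the tag is backed out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, not part of html-parse.c.  */
struct tag_info
{
  char value[64];    /* unquoted value of the last attribute seen */
  int self_closed;   /* nonzero if the tag ended with "/>" */
};

/* Return nonzero if the slash at P closes the tag, i.e. it is
   followed by optional whitespace and then `>'.  */
static int
slash_closes_tag (const char *p)
{
  assert (*p == '/');
  ++p;
  while (isspace ((unsigned char) *p))
    ++p;
  return *p == '>';
}

/* Scan TAG, which should look like <name attr=value ... [/]>.
   Return 1 on success, 0 if the tag would have to be backed out.  */
static int
scan_tag (const char *tag, struct tag_info *out)
{
  const char *p = tag;

  out->value[0] = '\0';
  out->self_closed = 0;

  if (*p++ != '<')
    return 0;
  while (isalnum ((unsigned char) *p))   /* tag name */
    ++p;

  for (;;)
    {
      size_t n = 0;

      while (isspace ((unsigned char) *p))
        ++p;

      if (*p == '/')
        {
          /* <foo a=b c=d />  -- slash must be followed by `>'.  */
          if (!slash_closes_tag (p))
            return 0;
          out->self_closed = 1;
          return 1;
        }
      if (*p == '>')
        return 1;
      if (!isalnum ((unsigned char) *p))
        return 0;                        /* includes end of string */

      while (isalnum ((unsigned char) *p))   /* attribute name */
        ++p;
      if (*p != '=')
        continue;                        /* minimized attribute */
      ++p;

      /* Unquoted value: everything up to whitespace or `>'.  A slash
         here is simply part of the value.  */
      while (*p && !isspace ((unsigned char) *p) && *p != '>')
        {
          if (n + 1 < sizeof out->value)
            out->value[n++] = *p;
          ++p;
        }
      out->value[n] = '\0';
    }
}
```

With this sketch, `<img src=foo/>' yields the value `foo/' and is not
self-closed, `<img src=foo />' yields `foo' and is self-closed, and
`<img src=foo/ />' yields `foo/' and is self-closed.  Something like
`<img src=foo / bar>' makes `scan_tag' return 0, which corresponds to
the backout case.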

Index: src/html-parse.c
===================================================================
RCS file: /pack/anoncvs/wget/src/html-parse.c,v
retrieving revision 1.7
diff -u -r1.7 html-parse.c
--- src/html-parse.c    2001/05/27 19:35:01     1.7
+++ src/html-parse.c    2001/06/29 20:16:39
@@ -638,6 +638,19 @@
 
        SKIP_WS (p);
 
+       if (*p == '/')
+         {
+           /* A slash at this point means the tag is about to be
+              closed.  This is legal in XML and has been popularized
+              in HTML via XHTML.  */
+           /* <foo a=b c=d /> */
+           /*              ^  */
+           ADVANCE (p);
+           SKIP_WS (p);
+           if (*p != '>')
+             goto backout_tag;
+         }
+
        /* Check for end of tag definition. */
        if (*p == '>')
          break;
@@ -654,7 +667,7 @@
 
        /* Establish bounds of attribute value. */
        SKIP_WS (p);
-       if (NAME_CHAR_P (*p) || *p == '>')
+       if (NAME_CHAR_P (*p) || *p == '/' || *p == '>')
          {
            /* Minimized attribute syntax allows `=' to be omitted.
                For example, <UL COMPACT> is a valid shorthand for <UL
@@ -735,7 +748,7 @@
            /* We skipped the whitespace and found something that is
               neither `=' nor the beginning of the next attribute's
               name.  Back out.  */
-           goto backout_tag;   /* <foo bar /... */
+           goto backout_tag;   /* <foo bar [... */
                                /*          ^    */
          }
 
