"Stephen Pair" <[EMAIL PROTECTED]> wrote:
> > Ooo, ooo, I was thinking about doing that. Do it, do it!
> >
> > But what are you parsing? DTDs? DTDs, themselves, specify grammars. What
> > would be cool is to have a T-Gen that understood DTDs :)
> >
> > Of course, that's only if you want validation.
>
> Hey, Hey! That would be great! I'm translating the XML EBNF into the TGen
> format (a pretty easy mapping). From there, I'll specify additional TGen
> stuff to create a more workable AST. The XML spec doesn't describe XML
> using a DTD.
>
> It would be great for Scamper (and Squeak in general) to have it understand
> DTDs...I believe all of the HTML versions are described as DTDs. Do you
> know where the DTD spec is located?
>
Smallwalker is a Smalltalk web browser that uses DTDs. A web search
should turn it up. But I disagree that teaching Scamper about DTDs is a
good idea. Most web pages don't follow the DTDs. They do things like:
- use non-standard tags and attributes
- close tags out of order, e.g. <B><I>text</B></I>
- embed tags in the wrong places, e.g. <LI> elements outside of a list
- never close some tags at all
Skipping the DTDs was an early but deliberate decision in Scamper's
development. The goal was to make a system that can understand any
cruddy HTML that might get thrown at it.
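
To make the tolerant approach concrete, here is a rough sketch in Python
built on the standard library's html.parser. It is not Scamper's actual
code (Scamper is Smalltalk), and the names Node, TolerantBuilder, parse
and outline are made up for illustration. The idea is only that unknown
tags are accepted, a close tag with no matching open tag is ignored, and
anything left open is implicitly closed at the end of input.

```python
from html.parser import HTMLParser


class Node:
    """One element in the recovered tree (hypothetical helper type)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Nodes and plain text strings


class TolerantBuilder(HTMLParser):
    """Builds a tree from HTML without consulting any DTD."""

    # Elements that never have content, so we never descend into them.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Any tag name is accepted, standard or not, wherever it appears.
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Find the nearest open ancestor with this tag name.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is self.root:
            return  # stray close tag: nothing to close, so ignore it
        # Closing it implicitly closes everything opened inside it.
        self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(text)


def parse(html):
    builder = TolerantBuilder()
    builder.feed(html)
    builder.close()  # flush buffered text; open tags simply stay as parsed
    return builder.root


def outline(node):
    """Render the tree as a nested expression for quick inspection."""
    if isinstance(node, str):
        return '"' + node + '"'
    inner = " ".join(outline(child) for child in node.children)
    return f"({node.tag} {inner})" if inner else f"({node.tag})"
```

For example, outline(parse("<b><i>hi</b></i>")) yields a tree with the
<i> nested inside the <b>, and the leftover </i> is silently dropped. A
real browser needs more recovery rules than this (implied closing of
<LI>, for one), but none of them require a DTD.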
Lex