Hi:
We are working on the HTML parser and need a working DTD. The current
implementation of DTD.read(), which is based on serialization, has some
problems, and I think we should have a well-defined binary format. I suggest
the following ASN.1 format; if there is consensus on it, we could contribute
the code to read and write it.
I would like to hear the opinions of Stepan and anyone else who has worked
with ASN.1 before.

BDTD ::= SEQUENCE {
      Name UTF8String,
      Entity SET OF HTMLEntity,
      Element SET OF HTMLElement
}

HTMLEntity ::= SEQUENCE {
      Name UTF8String,
      Value INTEGER,
      General BOOLEAN DEFAULT FALSE,
      Parameter BOOLEAN DEFAULT FALSE,
      Data UTF8String
}

HTMLElement ::= SEQUENCE {
      Index INTEGER,
      Name UTF8String,
      Type INTEGER,
      OStart BOOLEAN,
      OEnd BOOLEAN,
      Exclusions SET OF INTEGER,
      Inclusions SET OF INTEGER,
      Attributes SET OF HTMLElementAttributes OPTIONAL,
      ContentModel HTMLContentModel
}

HTMLContentModel ::= SEQUENCE OF SEQUENCE {
      Type INTEGER,
      Index INTEGER
}

HTMLElementAttributes ::= SEQUENCE {
      Name UTF8String,
      Type INTEGER,
      Modifier INTEGER,
      DefaultValue UTF8String OPTIONAL,
      PossibleValues SET OF UTF8String OPTIONAL
}
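
To give an idea of what the writing side could look like, here is a minimal
sketch that encodes one HTMLContentModel entry (Type, Index). It assumes DER
as the encoding rules, which the proposal above does not fix, and the class
and method names (ContentModelDerSketch, tlv, derInteger, derSequence,
contentModelEntry) are hypothetical, not existing code.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigInteger;

// Hypothetical sketch: DER-encodes SEQUENCE { Type INTEGER, Index INTEGER }.
class ContentModelDerSketch {

    // Writes one tag-length-value element using a definite length.
    static byte[] tlv(int tag, byte[] content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        int len = content.length;
        if (len < 0x80) {
            out.write(len); // short form: a single length octet
        } else {
            // long form: 0x80 | number of length octets, then the length itself
            byte[] lenBytes = BigInteger.valueOf(len).toByteArray();
            int start = (lenBytes[0] == 0) ? 1 : 0; // drop the sign octet
            out.write(0x80 | (lenBytes.length - start));
            out.write(lenBytes, start, lenBytes.length - start);
        }
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    // INTEGER (tag 0x02): BigInteger.toByteArray() already yields the minimal
    // big-endian two's-complement form that DER requires.
    static byte[] derInteger(long value) {
        return tlv(0x02, BigInteger.valueOf(value).toByteArray());
    }

    // SEQUENCE (tag 0x30): concatenation of the already encoded components.
    static byte[] derSequence(byte[]... elements) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] e : elements) {
            body.write(e, 0, e.length);
        }
        return tlv(0x30, body.toByteArray());
    }

    // One entry of HTMLContentModel.
    static byte[] contentModelEntry(int type, int index) {
        return derSequence(derInteger(type), derInteger(index));
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : contentModelEntry(1, 200)) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // 30 07 02 01 01 02 02 00 c8
    }
}
```

The whole HTMLContentModel would then be an outer SEQUENCE wrapping such
entries, and the other types could be built from the same helpers.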
--
Miguel Montes
