[Changing the subject to something more meaningful.]

>> > Name classes should be patterns
>>
>> This is something I thought about when I was doing TREX, but I didn't
>> find a way to make it work that I was satisfied with.
>
> I think the important point is not so much that name classes be patterns,
> but that there be a way to assign (local) names to name classes so that
> common elements can be factored out.

Quite so.  I got hung up on finding an XML syntax for it. But let's
try and work this through.

Looking at the compact syntax first, our choices are quite constrained.  Since

  element foo { }

always means an element named foo, for any name foo whatsoever,
there's going to have to be some special character to distinguish a
reference to a name class.  For example, you might have

  element %foo { }

to reference the name class defined as foo (the mnemonic is that % is
like a reference to a parameter entity in a DTD). Given this, the
natural syntax for a definition would be

  %foo = x | y | z

With this syntax, a definition of a pattern foo should not conflict
with a definition of a name class foo. In other words, there's a
separate symbol space for definitions of patterns and name classes.
Obviously, you can have

  %foo |= x

just as with definitions of patterns.
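
Putting these pieces together, a small schema might look like the
sketch below.  The % syntax is only the proposal under discussion, not
existing compact syntax, and the names inline, em, strong, code and doc
are made up for illustration.

```rnc
# Name class "inline" (proposed % syntax, hypothetical)
%inline = em | strong
%inline |= code

# Pattern "inline": a separate symbol space, so it does not
# clash with the name class of the same name
inline = element %inline { text }

start = element doc { inline* }
```

Here the bare inline inside doc refers to the pattern, while %inline
after the element keyword refers to the name class.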

For regularity and equivalence with the XML syntax, we need to make
parent references work for name classes.  I guess the syntax for that
would have to be

  element parent %foo { }

which is a bit ugly and tricky to parse, but I don't suppose it will be used much.

Now let's look at the XML syntax. Given that choices of name class use
<choice>, just like choices of patterns, I think references to name
classes need to use <ref>/<parentRef>.  And if references use <ref>
then I think for symmetry definitions need to use <define>.  On the
other hand, definitions of name classes need to be syntactically
distinguished from definitions of patterns, for three reasons: they are
in different symbol spaces (because of compact syntax design
considerations), an implementation needs to know what kind of thing is
being defined, and a human reader/author needs to know too.  In a case
like

<define name="foo">
  <choice>
    <ref name="bar"/>
    <ref name="baz"/>
  </choice>
</define>

you could be defining either a name class or a pattern.  (This is
where I got stuck thinking about it before.)

I think the clean solution is to add an attribute to define.  For
example, you might have an "as" attribute, with possible values of
"pattern" and "nameClass", defaulting to "pattern", so you would say:

<define name="foo" as="nameClass">
  <choice>
    <ref name="bar"/>
    <ref name="baz"/>
  </choice>
</define>

to define a name class.  Note that

<define name="foo" as="nameClass">
  <name>x</name>
  <name>y</name>
</define>

would be equivalent to

<define name="foo" as="nameClass">
  <choice>
    <name>x</name>
    <name>y</name>
  </choice>
</define>
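
To make this concrete, here is a sketch of how a complete grammar might
look.  The as attribute is the proposal above, not existing syntax.  I
am also assuming that a <ref> appearing where <element> expects its
name class resolves against the name class definitions.  The names
inline, em, strong and doc are made up for illustration.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="doc">
      <zeroOrMore>
        <!-- pattern position: refers to the pattern "inline" -->
        <ref name="inline"/>
      </zeroOrMore>
    </element>
  </start>

  <!-- name class "inline" (hypothetical as attribute) -->
  <define name="inline" as="nameClass">
    <choice>
      <name>em</name>
      <name>strong</name>
    </choice>
  </define>

  <!-- pattern "inline": different symbol space, so no conflict -->
  <define name="inline">
    <element>
      <!-- name class position: assumed to refer to the name class -->
      <ref name="inline"/>
      <text/>
    </element>
  </define>
</grammar>
```

The two definitions of inline do not collide because the as attribute
puts them in different symbol spaces, which mirrors the inline versus
%inline distinction in the compact syntax.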

James
