ambiguity of grammar for font shorthand?

Jonathan Levinson Mon, 21 Sep 2009 15:30:53 -0700

Hi Vincent,


As I read it, the grammar for the font shorthand is ambiguous, though not
fatally so, as long as one excludes the value "inherit" from the
individual properties in the font shorthand.

 

For instance, the first optional group consists of font-style,
font-variant, and font-weight, each of which is optional and which can
occur in any order. All three can have the value normal. So if the value
of the font shorthand is "normal 10pt Arial", we do not know which of
these three is being set to normal. This is harmless, though, because
the omitted properties are set to normal anyway, since that is their
initial value.

 

If inherit is allowed to be a value, then the grammar becomes truly
ambiguous, since each of these properties can have the value inherit and
we don't know which ones are omitted and must take the value normal.
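
To make the difference between the two cases concrete, here is a minimal
sketch in Java. The class and method names are hypothetical and are not
taken from the FOP code base. It assigns a single leading keyword to each
of the three properties in turn, gives the other two their initial value
of normal, and checks whether the choice changes the outcome.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not FOP code.
final class LeadingKeywordAmbiguity {
    // The three properties of the first optional group of the font shorthand.
    static final String[] LEADING = {"font-style", "font-variant", "font-weight"};

    // Assign a single leading keyword to one property; the other two are
    // treated as omitted and receive their initial value, "normal".
    static Map<String, String> expand(String keyword, String claimedBy) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String property : LEADING) {
            result.put(property, property.equals(claimedBy) ? keyword : "normal");
        }
        return result;
    }

    // True if every possible choice of property yields the same result.
    static boolean isHarmless(String keyword) {
        Map<String, String> first = expand(keyword, LEADING[0]);
        for (String property : LEADING) {
            if (!expand(keyword, property).equals(first)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("normal harmless?  " + isHarmless("normal"));
        System.out.println("inherit harmless? " + isHarmless("inherit"));
    }
}
```

With the keyword normal all three readings produce the same property
values, so the check succeeds. With inherit the three readings differ, so
it fails.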

 

I think it is probably the case that, in the context of the font
shorthand, the individual font properties cannot take the value inherit,
since this renders the grammar irreducibly ambiguous.  While such an
exclusion is not mentioned in the spec, it makes sense that inherit must
be excluded for the reason I've just given.

 

Prima facie, the grammar (with inherit eliminated) looks LL(1), since
parsing from left to right one can always tell which property one is
parsing, except when one of the first three is assigned normal and no
further values unique to those three properties follow. In this case,
one needs a special rule (outside the grammar) that arbitrarily picks
one of the optional properties in the first optional group as the bearer
of normal, while the rest receive their initial value of normal.
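
A rough sketch of how such a rule could look in Java follows. All names
are hypothetical, and the keyword sets are my reading of the value lists
for font-style, font-variant and font-weight, so please treat them as
assumptions. Instead of picking a bearer for normal up front, the sketch
assigns the unambiguous keywords first and lets every property that is
still unset fall back to normal, which gives the same result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not FOP code.
final class LeadingGroupSketch {
    // Keywords assumed to be unique to one property (excluding "normal" and "inherit").
    static final Set<String> STYLE = new HashSet<>(Arrays.asList("italic", "oblique", "backslant"));
    static final Set<String> VARIANT = new HashSet<>(Arrays.asList("small-caps"));
    static final Set<String> WEIGHT = new HashSet<>(Arrays.asList("bold", "bolder", "lighter"));
    static final List<String> LEADING = Arrays.asList("font-style", "font-variant", "font-weight");

    // Returns the property a token belongs to, "normal" for the shared
    // keyword, or null if the token is not part of the leading group.
    static String classify(String token) {
        if (token.equals("normal")) return "normal";
        if (STYLE.contains(token)) return "font-style";
        if (VARIANT.contains(token)) return "font-variant";
        if (WEIGHT.contains(token) || token.matches("[1-9]00")) return "font-weight";
        return null;
    }

    // Consumes at most three leading tokens with one token of lookahead,
    // fills 'out', and returns the number of tokens consumed.
    static int parseLeading(List<String> tokens, Map<String, String> out) {
        int consumed = 0;
        while (consumed < tokens.size() && consumed < LEADING.size()) {
            String token = tokens.get(consumed);
            String property = classify(token);
            if (property == null) {
                break; // font-size (or an error) starts here
            }
            if (!property.equals("normal")) {
                if (out.containsKey(property)) {
                    throw new IllegalArgumentException("duplicate value for " + property);
                }
                out.put(property, token);
            }
            consumed++;
        }
        // The rule outside the grammar: "normal" tokens and omitted
        // properties both end up as the initial value, normal.
        for (String property : LEADING) {
            out.putIfAbsent(property, "normal");
        }
        return consumed;
    }

    public static void main(String[] args) {
        Map<String, String> out = new LinkedHashMap<>();
        int consumed = parseLeading(Arrays.asList("normal", "10pt", "Arial"), out);
        System.out.println(consumed + " token(s) consumed: " + out);
    }
}
```

The sketch stops at the first token it cannot classify and leaves
font-size, line-height and font-family to the caller.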

 

There is a special case where the value of font is inherit, and that
works fine.  Since we are only testing whether the single token is
inherit, we can handle that special case in a recursive descent parser:
we create a tokenizer which breaks on spaces and check whether the one
token returned is inherit.
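
As a minimal sketch of that check, assuming plain java.util.StringTokenizer
and a hypothetical helper name (this is not the existing FOP code):

```java
import java.util.StringTokenizer;

// Hypothetical illustration, not FOP code.
final class InheritCheckSketch {
    // True only if the shorthand value consists of the single token "inherit".
    static boolean isInheritShorthand(String value) {
        StringTokenizer tokenizer = new StringTokenizer(value, " ");
        // countTokens() does not advance the tokenizer, so nextToken()
        // still returns the first (and only) token.
        return tokenizer.countTokens() == 1
                && "inherit".equals(tokenizer.nextToken());
    }

    public static void main(String[] args) {
        System.out.println(isInheritShorthand("inherit"));
        System.out.println(isInheritShorthand("inherit 10pt Arial"));
    }
}
```

The comparison is case-sensitive here, which is an assumption on my part.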

 

Also, in your message you said we could ignore a value for font of
caption, icon, etc., as the standard tells us to do, but the standard
discusses these values and their relation to system fonts.  Was this an
oversight on your part, or am I misreading the spec? [1]

 

[1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font

 

I'm not sure we have to go to the complexity of parsing the font
shorthand in a recursive descent manner.  I've updated the open issue
(47709) to give my reasons why, along with a solution to the problem of
more than two fonts separated by commas.  The overly complex code I
analyzed looks to me like a tokenizer, not a parser, and while it could
be better written (and more understandable), it seems to be doing an
adequate job of tokenizing, unless I'm still missing something.

 

Best Regards,

Jonathan S. Levinson

Senior Software Developer

Object Group

InterSystems

617-621-0600
