Re: ambiguity of grammar for font shorthand?

2009-09-23 Thread Vincent Hennebert
Hi Tony,

Tony Graham wrote:
 On Mon, Sep 21 2009 23:30:17 +0100, jonathan.levin...@intersystems.com wrote:
 ...
 If inherit is allowed to be a value then the grammar truly becomes ambiguous
 since each of these can have the value inherit and we don't know which ones
 are omitted and must take the value normal.
 
 'inherit' doesn't mix with other values [1].  AFAIK, this is true even
 for shorthands taken from CSS2.

Well the point you’re referring to says that ‘inherit’ can’t be mixed
with other operations in an expression. Technically speaking the
shorthand is not an expression. And, anyway, the point also says that
the ‘from-parent()’ function can be used instead, which leads to the
same issue.

That said, your point made me look at the introduction of section 7.31,
“Shorthand Properties”:
http://www.w3.org/TR/2006/REC-xsl11-20061205/#d0e33965
which says that “One cannot mix ‘inherit’ with other subproperty values
as it would not be possible to specify the subproperty to which
‘inherit’ applied”.

While this is not always true as we found out, that avoids the
problem...

... Except when the ‘normal’ keyword is used, which applies to all three
style/variant/weight properties, and may also lead to ambiguous values.


 If the value is 'inherit', the individual properties for which the
 shorthand is a shorthand individually inherit [2].
 
 Regards,
 
 
 Tony Graham tony.gra...@menteithconsulting.com
 Director  W3C XSL FO SG Invited Expert
 Menteith Consulting Ltd   XML Guild member
 XML, XSL and XSLT consulting, programming and training
 Registered Office: 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
 Registered in Ireland - No. 428599   http://www.menteithconsulting.com
   --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
 xmlroff XSL Formatter   http://xmlroff.org
 xslide Emacs mode  http://www.menteith.com/wiki/xslide
 Unicode: A Primer   urn:isbn:0-7645-4625-2
 
 
 [1] http://www.w3.org/TR/xsl11/#d0e5479
 [2] http://www.w3.org/TR/xsl11/#shortexpan


Vincent


Re: ambiguity of grammar for font shorthand?

2009-09-23 Thread Peter B. West


On 23/09/2009, at 8:18 PM, Vincent Hennebert wrote:


Hi Tony,

Tony Graham wrote:
On Mon, Sep 21 2009 23:30:17 +0100, jonathan.levin...@intersystems.com wrote:
...
If inherit is allowed to be a value then the grammar truly becomes ambiguous
since each of these can have the value inherit and we don't know which ones
are omitted and must take the value normal.


'inherit' doesn't mix with other values [1].  AFAIK, this is true even
for shorthands taken from CSS2.


Well the point you’re referring to says that ‘inherit’ can’t be mixed
with other operations in an expression. Technically speaking the
shorthand is not an expression. And, anyway, the point also says that
the ‘from-parent()’ function can be used instead, which leads to the
same issue.

That said, your point made me look at the introduction of section 7.31,
“Shorthand Properties”:
http://www.w3.org/TR/2006/REC-xsl11-20061205/#d0e33965
which says that “One cannot mix ‘inherit’ with other subproperty values
as it would not be possible to specify the subproperty to which
‘inherit’ applied”.

While this is not always true as we found out, that avoids the
problem...


When is it not true?



... Except when the ‘normal’ keyword is used, which applies to all three
style/variant/weight properties, and may also lead to ambiguous values.




Font shorthand implicitly sets _all_ of these values to normal, doesn't it?





If the value is 'inherit', the individual properties for which the
shorthand is a shorthand individually inherit [2].

Regards,


Tony Graham tony.gra...@menteithconsulting.com
Director  W3C XSL FO SG Invited Expert
Menteith Consulting Ltd   XML Guild member
XML, XSL and XSLT consulting, programming and training
Registered Office: 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
Registered in Ireland - No. 428599   http://www.menteithconsulting.com
  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
xmlroff XSL Formatter   http://xmlroff.org
xslide Emacs mode  http://www.menteith.com/wiki/xslide
Unicode: A Primer   urn:isbn:0-7645-4625-2



[1] http://www.w3.org/TR/xsl11/#d0e5479
[2] http://www.w3.org/TR/xsl11/#shortexpan



Vincent




Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Alexander Kiel
Hi,

 Also, in your message you said we could ignore a value for font of
 caption, icon, etc., as the standard tells us to do, but the standard
 discusses these values and their relation to system fonts.  Was this
 an oversight on your part or am I mis-reading the spec? [1]

 [1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font

The spec says:

XSL modifications to the CSS definition:

In XSL the font property is a pure shorthand property. System font
characteristics, such as font-family, and font-size, may be obtained by
the use of the system-font function in the expression language.

If I read this correctly the system font shorthands namely: caption,
icon, menu, message-box, small-caption, status-bar are not allowed in
XSL.


Best Regards
Alex





Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Alexander Kiel
Hi,

 I think it is probably the case that in the context of the font short
 hand – the font properties cannot take the value of inherit, since
 this renders the grammar irreducibly ambiguous.  While such an
 exclusion is not mentioned in the spec,  it makes sense that inherit
 must be excluded for the reason I’ve just given.

Once, I've written a CSS Minifier (shrinks CSS files). There I also
didn't allow inherit for individual properties inside the font
shorthand. So I would give the user a good error message here. I don't
think that there are many documents out there, which actually use
inherit for individual properties inside the font shorthand.


Best Regards
Alex





Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
 Hi Vincent,
 
  
 
 As I read the grammar for the font shorthand it is ambiguous, though not
 fatally so as long as one excludes the value of inherit from
 individual properties in the font short hand.
 
  
 
  For instance the first optional argument is font-style, font-weight,
 and font-variant, each of which is optional and can occur in any order.
 All can have the value normal.   So if the value for the font shorthand
 is normal 10pt Arial we do not know which of these three is being set
 to normal even though it is harmless and the omitted values will be set
 to normal since that is their initial value.

Actually not: the default value is inherited. If somewhere up in the
hierarchy the font-weight was set to bold, then we don’t know if that
‘normal’ in the font property means that font-weight must be reset to
normal or if it applies to another property. This example you’re
mentioning is truly ambiguous.


 If inherit is allowed to be a value then the grammar truly becomes
 ambiguous since each of these can have the value inherit and we don't
 know which ones are omitted and must take the value normal.
 
  
 
 I think it is probably the case that in the context of the font short
 hand - the font properties cannot take the value of inherit, since this
 renders the grammar irreducibly ambiguous.  While such an exclusion is
 not mentioned in the spec,  it makes sense that inherit must be excluded
 for the reason I've just given.

Excluding inherit for good is a bit too restrictive IMO. I think we
should try to resolve all non-ambiguous cases, like:
normal normal bold
inherit bold italic
inherit inherit inherit
inherit
etc.

Some truly ambiguous values:
normal normal (which one is inherited?)
normal bold inherit (which one is normal, which one inherited?)
normal (which one is normal, which one inherited?)
etc.

A good “exercise” would be to identify all cases that are ambiguous. In
which case an error would be thrown with a “the value is ambiguous”-like
message.
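
One way to approach that exercise is brute force: try every assignment of the leading keywords to font-style, font-variant and font-weight, and call the value ambiguous when more than one distinct result survives. The sketch below is only an illustration. The class name, the keyword sets and the `omitted` marker are assumptions, not FOP code. Passing a marker such as "<omitted>" models the reading in which omitted properties keep their inherited value; passing "normal" models the reading in which they are reset to their initial value. A lone 'inherit' as the whole shorthand value is assumed to be handled before this check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FontShorthandAmbiguity {
    // Keyword sets are assumptions based on the XSL/CSS2 property definitions.
    static final Set<String> STYLE = Set.of("normal", "italic", "oblique", "backslant", "inherit");
    static final Set<String> VARIANT = Set.of("normal", "small-caps", "inherit");
    static final Set<String> WEIGHT = Set.of("normal", "bold", "bolder", "lighter",
            "100", "200", "300", "400", "500", "600", "700", "800", "900", "inherit");
    // Index 0 = font-style, 1 = font-variant, 2 = font-weight.
    static final List<Set<String>> ALLOWED = List.of(STYLE, VARIANT, WEIGHT);

    /** Returns every distinct (style, variant, weight) triple the tokens can denote. */
    static Set<List<String>> interpretations(List<String> tokens, String omitted) {
        Set<List<String>> results = new HashSet<>();
        String[] slots = {omitted, omitted, omitted};
        assign(tokens, 0, slots, new boolean[3], results);
        return results;
    }

    // Backtracking: give token i to each still-free property that accepts it.
    private static void assign(List<String> tokens, int i, String[] slots,
                               boolean[] used, Set<List<String>> results) {
        if (i == tokens.size()) {
            results.add(List.of(slots[0], slots[1], slots[2]));
            return;
        }
        for (int p = 0; p < 3; p++) {
            if (!used[p] && ALLOWED.get(p).contains(tokens.get(i))) {
                String saved = slots[p];
                used[p] = true;
                slots[p] = tokens.get(i);
                assign(tokens, i + 1, slots, used, results);
                slots[p] = saved;
                used[p] = false;
            }
        }
    }

    /** True when the keywords admit more than one distinct interpretation. */
    static boolean isAmbiguous(String value, String omitted) {
        List<String> tokens = Arrays.asList(value.trim().split("\\s+"));
        return interpretations(tokens, omitted).size() > 1;
    }

    public static void main(String[] args) {
        String[] samples = {"normal normal bold", "inherit bold italic",
                "normal normal", "normal bold inherit", "normal"};
        for (String v : samples) {
            System.out.println(v + " -> "
                    + (isAmbiguous(v, "<omitted>") ? "ambiguous" : "unambiguous"));
        }
    }
}
```

An invalid combination such as two weight keywords yields an empty set of interpretations, so it would have to be reported separately from the ambiguous case.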


 Prima facie, the grammar (eliminating inherit) looks LL(1) since parsing
 from left to right one can always tell what property one is parsing
 except for the case when one of the first three is assigned normal and
 there are no further values unique to the properties of the first three.
 In this case, one has a special rule (outside the grammar) to
 arbitrarily pick one of the optional properties in the first optional
 argument as the bearer of normal, while the rest receive their initial
 values of normal.

Actually, a “simple” regular expression might be enough. The
java.util.regex package can do wonders. See attached Java file: there
will always be 6 matching groups, some of them possibly being null. The
first three are for style/variant/weight, then font-size, then
line-height, then font families. Some magic would have to be implemented
to identify the first 3 groups. Also, the regex for the individual
properties would have to be refined: “\\w+” is actually wrong for
font-weight. One could imagine re-using a regex defined for each
sub-property.
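
For readers without the attachment, here is a minimal sketch of what such a six-group expression could look like. It is not the attached file: the class name, the keyword alternation and the list of units are assumptions chosen for illustration. The first three groups are purely positional, so deciding which of them is style, variant or weight is still left to later code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FontShorthandRegex {
    // Assumed keyword list covering font-style, font-variant and font-weight.
    private static final String KEYWORD =
            "normal|italic|oblique|backslant|small-caps|bolder|bold|lighter|[1-9]00|inherit";
    // Assumed, deliberately incomplete set of units.
    private static final String LENGTH =
            "(?:\\d+(?:\\.\\d+)?|\\.\\d+)(?:pt|px|em|mm|cm|in|pc|%)";
    private static final String SIZE = LENGTH
            + "|xx-small|x-small|small|medium|large|x-large|xx-large|larger|smaller";
    private static final String LINE_HEIGHT = "normal|" + LENGTH + "|\\d+(?:\\.\\d+)?";

    static final Pattern FONT = Pattern.compile(
            "^\\s*"
            + "(?:(" + KEYWORD + ")\\s+)?"               // group 1: first keyword
            + "(?:(" + KEYWORD + ")\\s+)?"               // group 2: second keyword
            + "(?:(" + KEYWORD + ")\\s+)?"               // group 3: third keyword
            + "(" + SIZE + ")"                           // group 4: font-size
            + "(?:\\s*/\\s*(" + LINE_HEIGHT + "))?"      // group 5: line-height
            + "\\s+(\\S.*?)\\s*$");                      // group 6: font families

    /** Returns the six groups (some possibly null), or null if there is no match. */
    static String[] split(String value) {
        Matcher m = FONT.matcher(value);
        if (!m.matches()) {
            return null;
        }
        String[] groups = new String[6];
        for (int i = 0; i < 6; i++) {
            groups[i] = m.group(i + 1);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] g = split("italic bold 12pt / 14pt Arial, Helvetica, sans-serif");
        System.out.println(java.util.Arrays.toString(g));
    }
}
```

Allowing optional whitespace on both sides of the slash is a deliberate choice here, since that is one of the inputs a whitespace-only tokenizer trips over.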

However, an LL parser would probably be superior in error handling. The
regular expression would just fail to match, and there’s not much that
can be said about why it fails. An LL parser would probably be able to
tell, say, that the error lies in the declaration of the font-size
property.
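
To make the contrast concrete, the following is a rough hand-written left-to-right parse in the spirit of the LL approach. It is only a sketch under assumptions: the class name, the keyword and unit lists and the message wording are all invented for illustration. The point is that each failure names the sub-property at which parsing stopped. It does not decide which keyword belongs to which of style/variant/weight.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class FontShorthandParser {
    // Assumed keyword and unit lists, for illustration only.
    static final Set<String> KEYWORDS = Set.of("normal", "italic", "oblique", "backslant",
            "small-caps", "bold", "bolder", "lighter", "inherit",
            "100", "200", "300", "400", "500", "600", "700", "800", "900");
    static final String LENGTH = "(\\d+(\\.\\d+)?|\\.\\d+)(pt|px|em|mm|cm|in|pc|%)";
    static final String SIZE = LENGTH
            + "|xx-small|x-small|small|medium|large|x-large|xx-large|larger|smaller";
    static final String LINE_HEIGHT = "normal|" + LENGTH + "|\\d+(\\.\\d+)?";

    static Map<String, String> parse(String value) {
        // Normalise "12pt / 14pt" to "12pt/14pt" so splitting on whitespace is safe.
        String[] tokens = value.trim().replaceAll("\\s*/\\s*", "/").split("\\s+");
        int i = 0;
        // Up to three leading keywords for style/variant/weight, in any order.
        while (i < tokens.length && i < 3 && KEYWORDS.contains(tokens[i])) {
            i++;
        }
        String keywords = String.join(" ", Arrays.copyOfRange(tokens, 0, i));
        if (i == tokens.length) {
            throw new IllegalArgumentException(
                    "font-size: missing after '" + keywords + "'");
        }
        String[] sizeAndHeight = tokens[i++].split("/", 2);
        if (!sizeAndHeight[0].matches(SIZE)) {
            throw new IllegalArgumentException("font-size: '" + sizeAndHeight[0]
                    + "' is not a length, percentage or size keyword");
        }
        String lineHeight = sizeAndHeight.length == 2 ? sizeAndHeight[1] : null;
        if (lineHeight != null && !lineHeight.matches(LINE_HEIGHT)) {
            throw new IllegalArgumentException(
                    "line-height: '" + lineHeight + "' is not a valid value");
        }
        if (i == tokens.length) {
            throw new IllegalArgumentException("font-family: missing after font-size");
        }
        Map<String, String> result = new LinkedHashMap<>();
        result.put("keywords", keywords);
        result.put("font-size", sizeAndHeight[0]);
        result.put("line-height", lineHeight);
        result.put("font-family",
                String.join(" ", Arrays.copyOfRange(tokens, i, tokens.length)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("bold 12pt / 14pt Arial, sans-serif"));
    }
}
```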

I think good error handling is important, especially to beginners.
I’ve found myself ranting against such meaningless error messages that
don’t tell you at all what your error could be.


 There is a special case where the value of font is inherit and that
 works fine.  Since we are testing if the single token is inherit, we can
 handle that special case in a recursive descent parser.   We create a
 tokenizer which breaks on space and see if the one token returned is
 inherit.
 
  
 
 Also, in your message you said we could ignore a value for font of
 caption, icon, etc., as the standard tells us to do, but the standard
 discusses these values and their relation to system fonts.  Was this an
 oversight on your part or am I mis-reading the spec? [1]

See Alexander’s answer about that.


 I'm not sure we have to go to the complexity of parsing the font short
 hand in a recursive descent manner.  I've updated the open issue (47709)
 to give my reasons why and a solution to the problem of more than two
 fonts separated by commas.  The overly complex code I analyzed looks to
 me like a tokenizer not a parser, and while it could be better written
 (and more understandable) it seems to be doing an adequate job of
 tokenizing, unless I'm still missing something.

There are still cases that it doesn’t handle: spaces around the slash,
different order in style/variant/weight.


Vincent

RE: ambiguity of grammar for font shorthand?

2009-09-22 Thread Jonathan Levinson
Hi Vincent,

You make excellent points, however for font-style, font-variant and font-weight 
the initial value (the default value) is normal, not inherit.

http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-style

http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-variant

http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-weight

This is a minor detail, but important if our discussion is used as the basis 
for building a recursive descent parser.

Best Regards,
Jonathan S. Levinson
Senior Software Developer
Object Group
InterSystems
617-621-0600


-----Original Message-----
From: Vincent Hennebert [mailto:vhenneb...@gmail.com] 
Sent: Tuesday, September 22, 2009 7:20 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: ambiguity of grammar for font shorthand?

Hi Jonathan,

Jonathan Levinson wrote:
 Hi Vincent,
 
  
 
 As I read the grammar for the font shorthand it is ambiguous, though 
 not fatally so as long as one excludes the value of inherit from 
 individual properties in the font short hand.
 
  
 
  For instance the first optional argument is font-style, font-weight, 
 and font-variant, each of which is optional and can occur in any order.
 All can have the value normal.   So if the value for the font shorthand
 is normal 10pt Arial we do not know which of these three is being 
 set to normal even though it is harmless and the omitted values will 
 be set to normal since that is their initial value.

Actually not: the default value is inherited. If somewhere up in the hierarchy 
the font-weight was set to bold, then we don’t know if that ‘normal’ in the 
font property means that font-weight must be reset to normal or if it applies 
to another property. This example you’re mentioning is truly ambiguous.


 If inherit is allowed to be a value then the grammar truly becomes 
 ambiguous since each of these can have the value inherit and we don't 
 know which ones are omitted and must take the value normal.
 
  
 
 I think it is probably the case that in the context of the font short 
 hand - the font properties cannot take the value of inherit, since 
 this renders the grammar irreducibly ambiguous.  While such an 
 exclusion is not mentioned in the spec,  it makes sense that inherit 
 must be excluded for the reason I've just given.

Excluding inherit for good is a bit too restrictive IMO. I think we should try 
to resolve all non-ambiguous cases, like:
normal normal bold
inherit bold italic
inherit inherit inherit
inherit
etc.

Some truly ambiguous values:
normal normal (which one is inherited?)
normal bold inherit (which one is normal, which one inherited?)
normal (which one is normal, which one inherited?)
etc.

A good “exercise” would be to identify all cases that are ambiguous. In which 
case an error would be thrown with a “the value is ambiguous”-like message.


 Prima facie, the grammar (eliminating inherit) looks LL(1) since 
 parsing from left to right one can always tell what property one is 
 parsing except for the case when one of the first three is assigned 
 normal and there are no further values unique to the properties of the 
 first three.
 In this case, one has a special rule (outside the grammar) to 
 arbitrarily pick one of the optional properties in the first optional 
 argument as the bearer of normal, while the rest receive their initial 
 values of normal.

Actually, a “simple” regular expression might be enough. The java.util.regex 
package can do wonders. See attached Java file: there will always be 6 matching 
groups, some of them possibly being null. The first three are for 
style/variant/weight, then font-size, then line-height, then font families. 
Some magic would have to be implemented to identify the first 3 groups. Also, 
the regex for the individual properties would have to be refined: “\\w+” is 
actually wrong for font-weight. One could imagine re-using a regex defined for 
each sub-property.

However, an LL parser would probably be superior in error handling. The regular 
expression would just fail to match, and there’s not much that can be said 
about why it fails. An LL parser would probably be able to tell, say, that the 
error lies in the declaration of the font-size property.

I think good error handling is important, especially to beginners.
I’ve found myself ranting against such meaningless error messages that don’t 
tell you at all what your error could be.


 There is a special case where the value of font is inherit and that 
 works fine.  Since we are testing if the single token is inherit, we can
 handle that special case in a recursive descent parser.   We create a
 tokenizer which breaks on space and see if the one token returned is 
 inherit.
 
  
 
 Also, in your message you said we could ignore a value for font of 
 caption, icon, etc., as the standard tells us to do, but the standard 
 discusses these values and their relation to system fonts.  Was this 
 an oversight on your part or am I mis-reading the spec? [1

Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Peter B. West


On 23/09/2009, at 12:13 AM, Jonathan Levinson wrote:


Hi Vincent,

You make excellent points, however for font-style, font-variant and  
font-weight the initial value (the default value) is normal, not  
inherit.


http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-style

http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-variant

http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-weight

This is a minor detail, but important if our discussion is used as  
the basis for building a recursive descent parser.




An important detail. When the font shorthand is encountered, all font  
properties (including those that cannot be defined in the shorthand)  
are set to their initial values.
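
A small sketch of that reset-then-overlay behaviour, under assumptions: the class and method names are invented, and the initial values listed are the ones given by the CSS2/XSL property definitions as far as I recall them. font-family is left out of the defaults because its initial value depends on the user agent and the shorthand always supplies it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FontShorthandExpansion {
    /** Initial values assumed for the font properties reset by the shorthand. */
    static Map<String, String> initialValues() {
        Map<String, String> initial = new LinkedHashMap<>();
        initial.put("font-style", "normal");
        initial.put("font-variant", "normal");
        initial.put("font-weight", "normal");
        initial.put("font-size", "medium");
        initial.put("line-height", "normal");
        // Not expressible in the shorthand, but reset by it all the same.
        initial.put("font-stretch", "normal");
        initial.put("font-size-adjust", "none");
        return initial;
    }

    /** Starts from the initial values, then overlays what the shorthand specified. */
    static Map<String, String> expand(Map<String, String> fromShorthand) {
        Map<String, String> result = initialValues();
        result.putAll(fromShorthand);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand(Map.of("font-size", "10pt", "font-family", "Arial")));
    }
}
```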




Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
 Hi Vincent,
 
 You make excellent points, however for font-style, font-variant and 
 font-weight the initial value (the default value) is normal, not inherit.
 
 http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-style
 
 http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-variant
 
 http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font-weight

Sure but the properties are inheritable. The initial value is used for
the root element and propagated down the FO tree. See
http://www.w3.org/TR/2006/REC-xsl11-20061205/#speccomact
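
As a rough model of what that means for an inheritable property with no specified value: the lookup walks up the tree and only falls back to the initial value at the root. The class below is hypothetical (it is not FOP's FO tree) and ignores everything except specified values and 'inherit'.

```java
import java.util.HashMap;
import java.util.Map;

class SketchNode {
    private final SketchNode parent;
    private final Map<String, String> specified = new HashMap<>();

    SketchNode(SketchNode parent) {
        this.parent = parent;
    }

    SketchNode set(String property, String value) {
        specified.put(property, value);
        return this;
    }

    /** Resolves an inheritable property: own value, else the parent's, else the initial value. */
    String computed(String property, String initialValue) {
        String value = specified.get(property);
        if (value != null && !value.equals("inherit")) {
            return value;
        }
        return parent == null ? initialValue : parent.computed(property, initialValue);
    }

    public static void main(String[] args) {
        SketchNode root = new SketchNode(null);
        SketchNode block = new SketchNode(root).set("font-weight", "bold");
        SketchNode inline = new SketchNode(block);
        // The inline specifies nothing, so it picks up the ancestor's value.
        System.out.println(inline.computed("font-weight", "normal"));
    }
}
```

So whether an omitted font-weight ends up bold or normal depends on whether the shorthand is taken to leave the property unspecified or to reset it explicitly.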


 This is a minor detail, but important if our discussion is used as the basis 
 for building a recursive descent parser.

Moreover we can easily get things wrong with the XSL-FO Recommendation,
so double-checking is beneficial and welcome.

<snip/>

Thanks,
Vincent


Re: ambiguity of grammar for font shorthand?

2009-09-22 Thread Tony Graham
On Mon, Sep 21 2009 23:30:17 +0100, jonathan.levin...@intersystems.com wrote:
...
 If inherit is allowed to be a value then the grammar truly becomes ambiguous
 since each of these can have the value inherit and we don't know which ones
 are omitted and must take the value normal.

'inherit' doesn't mix with other values [1].  AFAIK, this is true even
for shorthands taken from CSS2.

If the value is 'inherit', the individual properties for which the
shorthand is a shorthand individually inherit [2].

Regards,


Tony Graham tony.gra...@menteithconsulting.com
Director  W3C XSL FO SG Invited Expert
Menteith Consulting Ltd   XML Guild member
XML, XSL and XSLT consulting, programming and training
Registered Office: 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
Registered in Ireland - No. 428599   http://www.menteithconsulting.com
  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
xmlroff XSL Formatter   http://xmlroff.org
xslide Emacs mode  http://www.menteith.com/wiki/xslide
Unicode: A Primer   urn:isbn:0-7645-4625-2


[1] http://www.w3.org/TR/xsl11/#d0e5479
[2] http://www.w3.org/TR/xsl11/#shortexpan


ambiguity of grammar for font shorthand?

2009-09-21 Thread Jonathan Levinson
Hi Vincent,

 

As I read the grammar for the font shorthand it is ambiguous, though not
fatally so as long as one excludes the value of inherit from
individual properties in the font short hand.

 

 For instance the first optional argument is font-style, font-weight,
and font-variant, each of which is optional and can occur in any order.
All can have the value normal.   So if the value for the font shorthand
is normal 10pt Arial we do not know which of these three is being set
to normal even though it is harmless and the omitted values will be set
to normal since that is their initial value.

 

If inherit is allowed to be a value then the grammar truly becomes
ambiguous since each of these can have the value inherit and we don't
know which ones are omitted and must take the value normal.

 

I think it is probably the case that in the context of the font short
hand - the font properties cannot take the value of inherit, since this
renders the grammar irreducibly ambiguous.  While such an exclusion is
not mentioned in the spec,  it makes sense that inherit must be excluded
for the reason I've just given.

 

Prima facie, the grammar (eliminating inherit) looks LL(1) since parsing
from left to right one can always tell what property one is parsing
except for the case when one of the first three is assigned normal and
there are no further values unique to the properties of the first three.
In this case, one has a special rule (outside the grammar) to
arbitrarily pick one of the optional properties in the first optional
argument as the bearer of normal, while the rest receive their initial
values of normal.

 

There is a special case where the value of font is inherit and that
works fine.  Since we are testing if the single token is inherit, we can
handle that special case in a recursive descent parser.   We create a
tokenizer which breaks on space and see if the one token returned is
inherit.

 

Also, in your message you said we could ignore a value for font of
caption, icon, etc., as the standard tells us to do, but the standard
discusses these values and their relation to system fonts.  Was this an
oversight on your part or am I mis-reading the spec? [1]

 

[1] http://www.w3.org/TR/2001/REC-xsl-20011015/slice7.html#font

 

I'm not sure we have to go to the complexity of parsing the font short
hand in a recursive descent manner.  I've updated the open issue (47709)
to give my reasons why and a solution to the problem of more than two
fonts separated by commas.  The overly complex code I analyzed looks to
me like a tokenizer not a parser, and while it could be better written
(and more understandable) it seems to be doing an adequate job of
tokenizing, unless I'm still missing something.

 

Best Regards,

Jonathan S. Levinson

Senior Software Developer

Object Group

InterSystems

617-621-0600