date:20090929

Re: Checkstyle RedundantThrowsCheck

2009-09-29 Thread Vincent Hennebert

Hi Max,

Max Berger wrote:
> Vincent,
> 
> 
> 2009/9/29 Vincent Hennebert:
>> I started to write my own checkstyle configuration from scratch some
>> time ago, enabling everything that looked important to me. But I’d like
>> to test it a bit more before submitting it.
> 
> Same here. See the checkstyle file for JEuclid as an example.
> 
> http://jeuclid.hg.sourceforge.net/hgweb/jeuclid/jeuclid/file/tip/support/build-tools/src/main/resources/jeuclid/checkstyle.xml
> 
>> Speaking of that, there’s a rule that I would suggest disabling: the
>> HiddenFieldCheck. I don’t really see its benefit. It forces one to find
>> somewhat artificial names for variables, where the field name is exactly
>> what I want. Sometimes a method doesn’t have a name following the
>> setField pattern, yet still acts as a setter for Field. This rule would
>> make sense if we were using a Hungarian-like notation for variables
>> (mMember, pParam, etc.), but that’s not the case in FOP.
>> WDYT?
> 
> I like the rule, BUT I am ok with an exception for setters and
> constructors (this is IMO a new option in checkstyle 5):
> http://checkstyle.sourceforge.net/config_coding.html#HiddenField

(Actually this option is available in checkstyle 4.)

But what is the benefit of that rule? I find it annoying, so unless I am
convinced of its usefulness I’d rather disable it.
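
To make the disagreement concrete, here is a minimal sketch of the three situations discussed above. The class and its members are hypothetical (not FOP code); `ignoreSetter` and `ignoreConstructorParameter` are the HiddenField options Max refers to.

```java
/** Hypothetical class, used only to illustrate the HiddenField check. */
class FontInfo {
    private String family;
    private int size;

    // Parameters hide the fields of the same name. HiddenField reports this
    // unless ignoreConstructorParameter is enabled.
    FontInfo(String family, int size) {
        this.family = family;
        this.size = size;
    }

    // Follows the setXxx pattern, so it is exempt when ignoreSetter is enabled.
    void setFamily(String family) {
        this.family = family;
    }

    // Acts as a setter for 'size' but is not named setSize, so the
    // ignoreSetter exemption does not cover it and the check still fires.
    void resizeTo(int size) {
        this.size = size;
    }

    String getFamily() {
        return family;
    }

    int getSize() {
        return size;
    }
}
```

The last method is the case where the rule forces a renamed parameter (for example `newSize`), even though the field name is the natural choice.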


Vincent

[PDF] Entries in number tree not specified as indirect references

2009-09-29 Thread Vincent Hennebert

Hi,

The StructTreeRoot dictionary must have a ParentTree entry whose type is
a number tree. As explained in Section 3.8.5, “Number Trees” of the PDF
Reference, Third Edition, the Nums entry of a number tree node must be
an array of key-value pairs where value is an indirect reference to the
object associated with the key.

This is not what is done in the current implementation of Logical
Structure in FOP (Temp_Accessibility branch). The value (an array) is
directly stored in the array of key-value pairs instead of being
referenced. So technically the PDF produced is invalid. Acrobat doesn’t
seem to complain, though.
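
As an illustration only, the shape the PDF Reference asks for could be produced like this. This is a sketch, not the code in the Temp_Accessibility branch: the class name, the object numbers and the generation number 0 are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; not part of FOP. */
class NumberTreeSketch {
    /**
     * Renders a /Nums array in which every value is an indirect reference
     * ("objectNumber 0 R") rather than the array object itself.
     */
    static String numsWithIndirectRefs(Map<Integer, Integer> keyToObjectNumber) {
        // Keys in a number tree must appear in ascending order; TreeMap sorts them.
        Map<Integer, Integer> sorted = new TreeMap<>(keyToObjectNumber);
        StringBuilder sb = new StringBuilder("/Nums [");
        for (Map.Entry<Integer, Integer> e : sorted.entrySet()) {
            sb.append(' ').append(e.getKey());
            sb.append(' ').append(e.getValue()).append(" 0 R");
        }
        return sb.append(" ]").toString();
    }
}
```

With keys 0 and 1 pointing at objects 12 and 15, this yields `/Nums [ 0 12 0 R 1 15 0 R ]`, whereas the current output embeds the arrays directly in place of the `12 0 R` and `15 0 R` references.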

Did I miss anything?
Thanks,
Vincent

RE: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Jonathan Levinson

Hi Vincent,

Excellent ideas!  

The diagram you drew is extremely useful!

If the font shorthand sub-language has a grammar that is regular then it also 
has a grammar that is LL(1).  So recursive descent parsing will work, if there 
is a regular grammar.

I think the best way of getting font shorthand to work would proceed in stages:

1) First get the current code to properly parse and accept valid font shorthand 
expressions.  This should be very easy.  The one remaining problem (AFAIK) is 
the parsing of font-size/line-height where /line-height is optional.   
Currently spaces are not allowed around the slash "/" and they should be.  I'm 
going to try to get to this problem as soon as I have time, probably in a day 
or so.
2) Evaluate which parser or automaton approach is the simplest and produces 
better error states than the current approach.  
3) Implement the approach one has chosen in (2).
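
For step (1), one possible sketch is to normalize the whitespace around the slash before the existing tokenizing runs. This assumes the current code expects font-size/line-height as a single whitespace-free token, and that no quoted font-family name contains a slash; the class and method names are made up for illustration.

```java
/** Hypothetical pre-processing step; not the actual FOP code. */
class FontShorthandSlash {
    /**
     * Removes whitespace around "/" so that "12pt / 14pt" becomes "12pt/14pt".
     * Assumption: no quoted font-family name in the input contains a slash.
     */
    static String normalizeSlash(String shorthand) {
        return shorthand.trim().replaceAll("\\s*/\\s*", "/");
    }
}
```

For example, `normalizeSlash("italic 12pt / 14pt serif")` gives `italic 12pt/14pt serif`.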

Best Regards,
Jonathan S. Levinson
Senior Software Developer
Object Group
InterSystems
617-621-0600

-----Original Message-----
From: Vincent Hennebert [mailto:vhenneb...@gmail.com] 
Sent: Monday, September 28, 2009 8:13 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Questionable whether font-shorthand grammar LL(1)

Hi Jonathan,

Interesting stuff!

Jonathan Levinson wrote:
> Hi Vincent,
> 

> 
> Because font-variant font-style and font-weight can occur in any 
> order, I could not (currently) come up with a grammar in which the 
> directing sets were disjoint for each non-terminal.  So I was unable 
> to come up with an LL(1) grammar.
> 
> For instance, here are two productions of my attempt at a grammar: 
> 
>  -> 
> 
>  -> 
> 
> In each case, the first set of  shares a common 
> element in two different productions, the literal values for variant.
> One needs to look ahead one more token to see if one has a 
>  or a .

(I’ll call “modifier” any of the three style, variant, weight
properties.)
Taking the ‘normal’ case apart, and since ‘inherit’ is not allowed in the 
shorthand, I think the values for all modifiers are distinct:
‘italic’, ‘oblique’, ‘backslant’ for font-style, ‘small-caps’ for font-variant, 
and the various weight values for font-weight.

Since all modifiers are set to their initial values prior to the shorthand 
parsing, which is ‘normal’ for all three of them, I think we can simply ignore 
any ‘normal’ value found in the string. That is, accept it as a legal terminal 
but not do anything.

So I don’t think there is any ambiguity any more. What remains to be done is to 
check that the same modifier is not specified more than once (that includes 
checking that ‘normal’ is not specified more than
3 times). And it’s probably easier to check that at the semantic level instead 
of crafting special grammar rules.
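
A minimal sketch of that idea, assuming the input has already been split into tokens. The class and method names are hypothetical; the keyword sets are the XSL-FO values for the three properties. One token of lookahead is enough because the sets are disjoint, and the "at most once" rule is enforced as a semantic check rather than in the grammar.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; class and method names are not from FOP. */
class FontModifierSketch {
    private static final Set<String> STYLES =
            new HashSet<>(Arrays.asList("italic", "oblique", "backslant"));
    private static final Set<String> VARIANTS =
            new HashSet<>(Arrays.asList("small-caps"));
    private static final Set<String> WEIGHTS = new HashSet<>(Arrays.asList(
            "bold", "bolder", "lighter",
            "100", "200", "300", "400", "500", "600", "700", "800", "900"));

    // Initial values, set before the shorthand is parsed.
    String style = "normal";
    String variant = "normal";
    String weight = "normal";

    /** Consumes leading modifier tokens; returns the index of the first other token. */
    int parseModifiers(List<String> tokens) {
        boolean styleSeen = false;
        boolean variantSeen = false;
        boolean weightSeen = false;
        int normals = 0;
        int i = 0;
        while (i < tokens.size()) {
            String t = tokens.get(i);
            if (t.equals("normal")) {
                // Legal terminal, but nothing to do: the initial value already applies.
                if (++normals > 3) {
                    throw new IllegalArgumentException("'normal' given more than 3 times");
                }
            } else if (STYLES.contains(t)) {
                if (styleSeen) {
                    throw new IllegalArgumentException("font-style given twice");
                }
                styleSeen = true;
                style = t;
            } else if (VARIANTS.contains(t)) {
                if (variantSeen) {
                    throw new IllegalArgumentException("font-variant given twice");
                }
                variantSeen = true;
                variant = t;
            } else if (WEIGHTS.contains(t)) {
                if (weightSeen) {
                    throw new IllegalArgumentException("font-weight given twice");
                }
                weightSeen = true;
                weight = t;
            } else {
                break; // first token that is not a modifier, e.g. the font size
            }
            i++;
        }
        return i;
    }
}
```

A stricter variant could also reject more than three modifier tokens in total; the sketch only implements the two checks described above.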

> The books and web articles I read only discussed using recursive 
> descent when the grammar is LL(1).  I have the feeling that despite 
> the ambiguities in the grammar it is almost LL(k) because font-variant and
> font-style and font-weight almost have disjoint values.   It is at least
> LL(3) and I suspect it is LL(6).

The font-size property has the good idea of not allowing ‘normal’ as a value. 
The ‘normal’ case for modifiers can be ignored as explained above. So I think 
the grammar still is LL(1)

> I'm not as convinced as you are that recursive descent parsing or a 
> formal bottom-up-parser will make the code simpler rather than more
> complex because of the complexities of a formal grammar.   Of course,
> however complex the grammar, a table-generating tool - like ANTLR - 
> will generate code, however complex, which will faithfully reflect the 
> inputted grammar.  However, none of the other properties in FOP use a 
> table-generating tool like ANTLR - and I'm not sure what the 
> consequences would be to FOP of introducing such a tool.  Given the 
> complexities of the grammar, I'm sure that a recursive descent parser 
> will be quite complex, and if we are going to use a grammar driven 
> approach we would be better off with a tool that generates parsers 
> from grammars rather than the recursive descent approach.  Also an 
> advantage of parser generators is that one doesn't have to rewrite so 
> much code to correct a mistake in one's grammar, if one makes a 
> mistake, or if the grammar changes.  Recursive descent parsing can 
> pose its own maintenance nightmares.

Using a grammar tool like ANTLR is probably overkill to parse just a shorthand 
property. Moreover the grammar is not likely to change, so that reduces its 
usefulness even more. That said, most properties can accept expressions, where 
such a tool might actually be interesting.
I don’t know how far FOP goes to supporting expressions in other properties.

> The current approach in FOP for font-shorthand is obscurely written 
> but strikes me as basically sound.
> 
> 1)  One parses from right-to-left using the fact that spaces divide
> tokens

The problem is that font families can be specified

RE: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Laurent Caillette

Hi all,

I've never used SableCC or JavaCC so I cannot compare, but I'm using ANTLR a 
lot. ANTLR is highly customizable and has a very strong community. Its 
integrated development environment offers a debugger and visualization of 
grammar ambiguities. It's not only simple to set up and use, it also offers all 
the comfort you can reasonably dream of when developing grammars.

Maybe a tool like JarJar could reduce the pain of depending on one more 
library (with all the possible conflicts that could affect FOP users).

Because code generation has some drawbacks (at least in terms of build 
complexity) you may be interested in JParsec, which creates parsers dynamically 
from pure Java code. Disclaimer: I have never used it.
http://jparsec.codehaus.org

I hope this helps you make a reasonable choice.

c.


-----Original Message-----
From: berger@gmail.com [mailto:berger@gmail.com] On behalf of Max Berger
Sent: Tuesday, 29 September 2009 13:00
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Questionable whether font-shorthand grammar LL(1)

Hi Vincent,


2009/9/29 Vincent Hennebert :
>> How about specifying the grammar and using a tool such as JavaCC to
>> generate the actual parser? This way you could focus more on a complete
>> grammar and spend less time writing the parser.
> That would be the same as using ANTLR. I feel that this is a bit
> overkill for just parsing the font shorthand property, although that may
> prove to be useful for other properties that can accept complex
> expressions.
> That said, JavaCC is an interesting suggestion, I didn’t think of it. If
> a choice had to be made between ANTLR and JavaCC, which one would win?

ANTLR:
- easy to use
- requires runtime linking of jar [1] (a *huge* disadvantage imo)

JavaCC:
- very sparse documentation
- generates standalone java classes

SableCC:
- better documentation
- LGPL (And therefore maybe not feasible, although it would only be
used at compile time and not runtime)

[1] http://beust.com/weblog/archives/000145.html


Max



Re: Checkstyle RedundantThrowsCheck

2009-09-29 Thread Max Berger

Vincent,


2009/9/29 Vincent Hennebert :
> I started to write my own checkstyle configuration from scratch some
> time ago, enabling everything that looked important to me. But I’d like
> to test it a bit more before submitting it.

Same here. See the checkstyle file for JEuclid as an example.

http://jeuclid.hg.sourceforge.net/hgweb/jeuclid/jeuclid/file/tip/support/build-tools/src/main/resources/jeuclid/checkstyle.xml

> Speaking of that, there’s a rule that I would suggest disabling: the
> HiddenFieldCheck. I don’t really see its benefit. It forces one to find
> somewhat artificial names for variables, where the field name is exactly
> what I want. Sometimes a method doesn’t have a name following the
> setField pattern, yet still acts as a setter for Field. This rule would
> make sense if we were using a Hungarian-like notation for variables
> (mMember, pParam, etc.), but that’s not the case in FOP.
> WDYT?

I like the rule, BUT I am ok with an exception for setters and
constructors (this is IMO a new option in checkstyle 5):
http://checkstyle.sourceforge.net/config_coding.html#HiddenField

Max

Re: Checkstyle RedundantThrowsCheck

2009-09-29 Thread Vincent Hennebert

Hi Max,

Max Berger wrote:
> Alex,
> 
> The checkstyle checks have grown historically, and are therefore
> incomplete. I personally would turn on many more checks for certain
> style issues I like. IMO every option set helps decide a certain
> factor. So the more checks the better :)

If you think that the current checkstyle could be improved, then by all
means, do suggest changes.

I started to write my own checkstyle configuration from scratch some
time ago, enabling everything that looked important to me. But I’d like
to test it a bit more before submitting it.

Speaking of that, there’s a rule that I would suggest disabling: the
HiddenFieldCheck. I don’t really see its benefit. It forces one to find
somewhat artificial names for variables, where the field name is exactly
what I want. Sometimes a method doesn’t have a name following the
setField pattern, yet still acts as a setter for Field. This rule would
make sense if we were using a Hungarian-like notation for variables
(mMember, pParam, etc.), but that’s not the case in FOP.

WDYT?


> (in short: +1 to your changes).
> 
> Right now we have 3 checkstyle files: 3.5, 4.0, and 5.0, which also
> means the checks would need to be added in all of them (if possible).
> Can we remove any of them? I'd volunteer to modify the ant buildfile
> to support 5.0.
> 
> I'd also vote for dropping 3.5 support, and potentially dropping checkstyle 4.

+1. Let’s avoid redundancy. Checkstyle 5.0 still looks a bit on the
bleeding edge to me, but I’m happy to update my checkstyle plug-in
accordingly.


Vincent


> Max
> 
> 
> 
> 2009/9/26 Alexander Kiel :
>> Hi,
>>
>> why didn't our code style allow unchecked exceptions or subclasses of
>> thrown exceptions in Javadoc?
>>
>> From checkstyle-5.0.xml:
>>
>> 
>>
>>
>>
>> 
>>
>> From "J. Bloch: Effective Java, Second Edition" [1] page 252:
>>
>>> Use the Javadoc @throws tag to document each unchecked exception
>>> that a method can throw, but do not use the throws keyword to
>>> include unchecked exceptions in the method declaration.
>> All good code I know documents unchecked exceptions. Take the Java
>> Collections API. Every possible ClassCastException or
>> NullPointerException is documented.
>>
>> Another quote from J. Bloch:
>>
>>> A well-documented list of unchecked exceptions that a method
>>> can throw effectively describes the preconditions for its
>>> successful execution. It is essential that each method's
>>> documentation describe its preconditions [...]
>> I think that everyone can agree with the statements J. Bloch made. So I
>> would strongly vote to allow documenting unchecked exceptions.
>>
>>
>> The second point is not allowing subclasses of exceptions in Javadoc. I
>> don't use this very often, but I have just one example in my mind where
>> this makes sense. If you have a look into
>> java.io.DataInputStream#readByte(), there are both IOException and
>> EOFException documented. EOFException is a subclass of IOException. As
>> you know a normal InputStream.read() returns -1 at EOF but readByte()
>> doesn't. So it's worth documenting that readByte() throws an
>> EOFException instead.
>>
>> So I would also vote allowing subclasses.
>>
>>
>> Best Regards
>> Alex
>>
>> [1]: 
>>
>> --
>> e-mail: alexanderk...@gmx.net
>> web:www.alexanderkiel.net

Re: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Max Berger

Hi Vincent,


2009/9/29 Vincent Hennebert :
>> How about specifying the grammar and using a tool such as JavaCC to
>> generate the actual parser? This way you could focus more on a complete
>> grammar and spend less time writing the parser.
> That would be the same as using ANTLR. I feel that this is a bit
> overkill for just parsing the font shorthand property, although that may
> prove to be useful for other properties that can accept complex
> expressions.
> That said, JavaCC is an interesting suggestion, I didn’t think of it. If
> a choice had to be made between ANTLR and JavaCC, which one would win?

ANTLR:
- easy to use
- requires runtime linking of jar [1] (a *huge* disadvantage imo)

JavaCC:
- very sparse documentation
- generates standalone java classes

SableCC:
- better documentation
- LGPL (And therefore maybe not feasible, although it would only be
used at compile time and not runtime)

[1] http://beust.com/weblog/archives/000145.html



Max

Re: Best Interface for reading OpenType Files

2009-09-29 Thread Vincent Hennebert

Hi Alexander,

Alexander Kiel wrote:
> Hi Vincent,
> 
>>>> Here are my two cents: if you make use of classes in javax.imageio at
>>>> only one place in your font library, then there’s no need to worry about
>>>> creating a more neutral layer. If OTOH you need to use those classes
>>>> everywhere, then it makes sense to use a simplified abstraction layer.
>>>> That abstraction layer could be shipped as a separate module and evolve
>>>> separately. An implementation could be based on imageIO, Apache Commons
>>>> IO (?), your own implementation based on byte arrays for testing
>>>> purposes, etc.
>>> Thanks for that. I think I will write an OpenTypeDataInputStream which
>>> is not a FilterInputStream, but takes an ImageInputStream as constructor
>>> argument like a FilterInputStream would take an InputStream. This
>>> OpenTypeDataInputStream will be the API for all the Streams on top of
>>> it. So I would have only one point which depends on ImageInputStream.
>> You may want to use a factory a la SAXParserFactory. Although that may
>> go a bit far.
> 
> Hmmm. I don't see the benefit of such a factory here. The
> OpenTypeDataInputStream would look like this:
> 
> public class OpenTypeDataInputStream {
> <snip/>
> }
> 
> This is the common FilterInputStream pattern. OpenTypeDataInputStream
> only depends on ImageInputStream which is an interface.
> OpenTypeDataInputStream is really simple and straightforward, so that I
> can't imagine different implementations. Except implementations on top
> of other things than ImageInputStream. But then we are at the question of
> whether we want ImageInputStream to be the common interface for different
> implementations (on top of files, streams, byte arrays) or if we want
> OpenTypeDataInputStream to do that. I think that ImageInputStream is the
> right place, because it abstracts from getting bytes and be able to
> seek. OpenTypeDataInputStream on the other hand implements the semantics
> of the common OpenType data types, which are well defined in the
> specification.

I see. I had in mind to use OpenTypeDataInputStream as the common
interface. It actually makes sense to use ImageInputStream instead.
Simpler and just as flexible. That will add a direct dependency on
a class in the javax.imageio package, but this is not a problem as it is
part of the standard library. That ImageInputStream interface is
unfortunately named really.
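
A rough sketch of that layering, for illustration only. Alexander's actual class body was snipped above, so everything below is an assumption: the method names are invented, and only the constructor taking an ImageInputStream comes from the discussion. ImageInputStream reads big-endian by default, which matches the OpenType byte order.

```java
import java.io.IOException;
import javax.imageio.stream.ImageInputStream;

/** Hypothetical sketch; method names are illustrative, not the real class. */
class OpenTypeDataInputStream {
    // The only dependency: an interface that abstracts reading bytes and seeking.
    private final ImageInputStream in;

    OpenTypeDataInputStream(ImageInputStream in) {
        this.in = in;
    }

    /** OpenType USHORT: 16-bit unsigned integer. */
    int readUShort() throws IOException {
        return in.readUnsignedShort();
    }

    /** OpenType ULONG: 32-bit unsigned integer, returned as a long. */
    long readULong() throws IOException {
        return in.readUnsignedInt();
    }

    /** OpenType Fixed: 32-bit signed fixed-point number (16.16). */
    double readFixed() throws IOException {
        return in.readInt() / 65536.0;
    }

    /** Moves to an absolute offset, e.g. the start of a table. */
    void seek(long offset) throws IOException {
        in.seek(offset);
    }
}
```

For tests, the stream can be backed by a byte array via `new MemoryCacheImageInputStream(new ByteArrayInputStream(bytes))`, so no font file is needed.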


>> There’s no such thing as IoC container in FOP. I’m not sure how easy it
>> would be to introduce one. Although that would probably be A Good Thing.
>> So do design your font library with IoC in mind.
> 
> Yes, I will. We can use IoC even without a container. And if we want to
> choose one, I have plenty of experience with Spring.

Good!


> So if I had to vote, I would probably vote for Spring.

Well I’m not sure I like the abundance of XML in spring actually. POJOs
powaaa! Also, spring may be overkill to just deploy FOP. Anyway, this is
probably a bit early to discuss that. (What do you think of the
following though: http://code.google.com/p/google-guice/ ?)


>>>> - does the use of serializable objects make sense? What would be more
>>>>   efficient: re-parsing font data all the time or re-loading
>>>>   serializable object representation of them?
>>> You mean the font metrics XML files? I've always wondered what
>>> purpose they serve. No, I don't think we need this. I really don't
>>> want to serialize the Advanced OpenType Features! It took me already a
>>> good amount of code to parse just a bit of it.
>> What I meant was to use the java.io.Serializable interface. Indeed, I
>> don’t think XML representations are of any use, apart maybe for
>> debugging purposes or to have a more human-readable version of the font
>> file.
>> IIUC there would be next to nothing to do to cache Serializable objects
>> on the hard drive and retrieve them?
> 
> Hmmm. Ok. But if we want to use Serializable for that, your classes have
> to be very stable. Versioning the Serializable stuff is a real burden in
> my opinion. So we will need a cache which detects version changes and
> invalidate the objects if so. Do you know such a lib?

I was thinking that just catching the InvalidClassException when reading
the object would be enough to conclude that the cache is no longer valid
and must be re-created. Maybe I’m wrong? I must confess that I have no
experience with serialization.
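
A minimal sketch of that idea, with hypothetical class and method names. It assumes that an incompatible class change shows up as an InvalidClassException (serialVersionUID mismatch), which holds for the default computed UID or when the UID is bumped on every incompatible change; any other read failure is also treated as "cache invalid".

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical cache helper; not part of FOP. */
class FontCacheSketch {
    /** Returns the cached object, or null if the cache must be re-created. */
    static Object load(File cacheFile) {
        try (ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(cacheFile))) {
            return in.readObject();
        } catch (InvalidClassException | ClassNotFoundException e) {
            // The class changed or disappeared since the cache was written.
            return null;
        } catch (IOException e) {
            // Missing, truncated or corrupt file: treat it as a stale cache too.
            return null;
        }
    }

    /** Writes the object to the cache file, replacing any previous content. */
    static void store(File cacheFile, Serializable value) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(cacheFile))) {
            out.writeObject(value);
        }
    }
}
```

A caller would re-parse the font and call `store` again whenever `load` returns null.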


HTH,
Vincent

Re: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Vincent Hennebert

Hi Max,

Max Berger wrote:
> Hi *,
> 
> I just want to throw in a different idea (you may ignore it if you like):
> 
> How about specifying the grammar and using a tool such as JavaCC to
> generate the actual parser? This way you could focus more on a complete
> grammar and spend less time writing the parser.

That would be the same as using ANTLR. I feel that this is a bit
overkill for just parsing the font shorthand property, although that may
prove to be useful for other properties that can accept complex
expressions.
That said, JavaCC is an interesting suggestion, I didn’t think of it. If
a choice had to be made between ANTLR and JavaCC, which one would win?


> JavaCC is BSD license, so we could easily integrate it in the fop build.
> 
> Max

Thanks,
Vincent



> 2009/9/28 Vincent Hennebert:
>> Hi Jonathan,
>>
>> Interesting stuff!
>>
>> Jonathan Levinson wrote:
>>> Hi Vincent,
>>>
>> 
>>> Because font-variant font-style and font-weight can occur in any order,
>>> I could not (currently) come up with a grammar in which the directing
>>> sets were disjoint for each non-terminal.  So I was unable to come up
>>> with an LL(1) grammar.
>>>
>>> For instance, here are two productions of my attempt at a grammar:
>>>
>>>  -> 
>>>
>>>  -> 
>>>
>>> In each case, the first set of  shares a common
>>> element in two different productions, the literal values for variant.
>>> One needs to look ahead one more token to see if one has a
>>>  or a .
>> (I’ll call “modifier” any of the three style, variant, weight
>> properties.)
>> Taking the ‘normal’ case apart, and since ‘inherit’ is not allowed in
>> the shorthand, I think the values for all modifiers are distinct:
>> ‘italic’, ‘oblique’, ‘backslant’ for font-style, ‘small-caps’ for
>> font-variant, and the various weight values for font-weight.
>>
>> Since all modifiers are set to their initial values prior to the
>> shorthand parsing, which is ‘normal’ for all three of them, I think we
>> can simply ignore any ‘normal’ value found in the string. That is,
>> accept it as a legal terminal but not do anything.
>>
>> So I don’t think there is any ambiguity any more. What remains to be
>> done is to check that the same modifier is not specified more than once
>> (that includes checking that ‘normal’ is not specified more than
>> 3 times). And it’s probably easier to check that at the semantic level
>> instead of crafting special grammar rules.
>>
>>
>> 
>>> The books and web articles I read only discussed using recursive descent
>>> when the grammar is LL(1).  I have the feeling that despite the
>>> ambiguities in the grammar it is almost LL(k) because font-variant and
>>> font-style and font-weight almost have disjoint values.   It is at least
>>> LL(3) and I suspect it is LL(6).
>> The font-size property has the good idea of not allowing ‘normal’ as
>> a value. The ‘normal’ case for modifiers can be ignored as explained
>> above. So I think the grammar still is LL(1)
>>
>>
>> 
>>> I'm not as convinced as you are that recursive descent parsing or a
>>> formal bottom-up-parser will make the code simpler rather than more
>>> complex because of the complexities of a formal grammar.   Of course,
>>> however complex the grammar, a table-generating tool - like ANTLR - will
>>> generate code, however complex, which will faithfully reflect the
>>> inputted grammar.  However, none of the other properties in FOP use a
>>> table-generating tool like ANTLR - and I'm not sure what the
>>> consequences would be to FOP of introducing such a tool.  Given the
>>> complexities of the grammar, I'm sure that a recursive descent parser
>>> will be quite complex, and if we are going to use a grammar driven
>>> approach we would be better off with a tool that generates parsers from
>>> grammars rather than the recursive descent approach.  Also an advantage
>>> of parser generators is that one doesn't have to rewrite so much code to
>>> correct a mistake in one's grammar, if one makes a mistake, or if the
>>> grammar changes.  Recursive descent parsing can pose its own maintenance
>>> nightmares.
>> Using a grammar tool like ANTLR is probably overkill to parse just
>> a shorthand property. Moreover the grammar is not likely to change, so
>> that reduces its usefulness even more. That said, most properties can
>> accept expressions, where such a tool might actually be interesting.
>> I don’t know how far FOP goes to supporting expressions in other
>> properties.
>>
>>
>>> The current approach in FOP for font-shorthand is obscurely written but
>>> strikes me as basically sound.
>>>
>>> 1)  One parses from right-to-left using the fact that spaces divide
>>> tokens
>> The problem is that font families can be specified with strings
>> containing whitespace, that must be handled in a specific manner and not
>> as a terminal delimitation. Otherwise parsing from right to left would
>> indeed probably be relatively easy.
>>
>>
>>> 2)  One lets property makers determine whether they apply to

Re: Questionable whether font-shorthand grammar LL(1)

2009-09-29 Thread Max Berger

Hi *,

I just want to throw in a different idea (you may ignore it if you like):

How about specifying the grammar and using a tool such as JavaCC to
generate the actual parser? This way you could focus more on a complete
grammar and spend less time writing the parser.

JavaCC is BSD license, so we could easily integrate it in the fop build.

Max

2009/9/28 Vincent Hennebert :
> Hi Jonathan,
>
> Interesting stuff!
>
> Jonathan Levinson wrote:
>> Hi Vincent,
>>
> 
>>
>> Because font-variant font-style and font-weight can occur in any order,
>> I could not (currently) come up with a grammar in which the directing
>> sets were disjoint for each non-terminal.  So I was unable to come up
>> with an LL(1) grammar.
>>
>> For instance, here are two productions of my attempt at a grammar:
>>
>>  -> 
>>
>>  -> 
>>
>> In each case, the first set of  shares a common
>> element in two different productions, the literal values for variant.
>> One needs to look ahead one more token to see if one has a
>>  or a .
>
> (I’ll call “modifier” any of the three style, variant, weight
> properties.)
> Taking the ‘normal’ case apart, and since ‘inherit’ is not allowed in
> the shorthand, I think the values for all modifiers are distinct:
> ‘italic’, ‘oblique’, ‘backslant’ for font-style, ‘small-caps’ for
> font-variant, and the various weight values for font-weight.
>
> Since all modifiers are set to their initial values prior to the
> shorthand parsing, which is ‘normal’ for all three of them, I think we
> can simply ignore any ‘normal’ value found in the string. That is,
> accept it as a legal terminal but not do anything.
>
> So I don’t think there is any ambiguity any more. What remains to be
> done is to check that the same modifier is not specified more than once
> (that includes checking that ‘normal’ is not specified more than
> 3 times). And it’s probably easier to check that at the semantic level
> instead of crafting special grammar rules.
>
>
> 
>> The books and web articles I read only discussed using recursive descent
>> when the grammar is LL(1).  I have the feeling that despite the
>> ambiguities in the grammar it is almost LL(k) because font-variant and
>> font-style and font-weight almost have disjoint values.   It is at least
>> LL(3) and I suspect it is LL(6).
>
> The font-size property has the good idea of not allowing ‘normal’ as
> a value. The ‘normal’ case for modifiers can be ignored as explained
> above. So I think the grammar still is LL(1)
>
>
> 
>> I'm not as convinced as you are that recursive descent parsing or a
>> formal bottom-up-parser will make the code simpler rather than more
>> complex because of the complexities of a formal grammar.   Of course,
>> however complex the grammar, a table-generating tool - like ANTLR - will
>> generate code, however complex, which will faithfully reflect the
>> inputted grammar.  However, none of the other properties in FOP use a
>> table-generating tool like ANTLR - and I'm not sure what the
>> consequences would be to FOP of introducing such a tool.  Given the
>> complexities of the grammar, I'm sure that a recursive descent parser
>> will be quite complex, and if we are going to use a grammar driven
>> approach we would be better off with a tool that generates parsers from
>> grammars rather than the recursive descent approach.  Also an advantage
>> of parser generators is that one doesn't have to rewrite so much code to
>> correct a mistake in one's grammar, if one makes a mistake, or if the
>> grammar changes.  Recursive descent parsing can pose its own maintenance
>> nightmares.
>
> Using a grammar tool like ANTLR is probably overkill to parse just
> a shorthand property. Moreover the grammar is not likely to change, so
> that reduces its usefulness even more. That said, most properties can
> accept expressions, where such a tool might actually be interesting.
> I don’t know how far FOP goes to supporting expressions in other
> properties.
>
>
>> The current approach in FOP for font-shorthand is obscurely written but
>> strikes me as basically sound.
>>
>> 1)      One parses from right-to-left using the fact that spaces divide
>> tokens
>
> The problem is that font families can be specified with strings
> containing whitespace, that must be handled in a specific manner and not
> as a terminal delimitation. Otherwise parsing from right to left would
> indeed probably be relatively easy.
>
>
>> 2)      One lets property makers determine whether they apply to a
>> token.  Each property maker is a little parser of the token one feeds
>> it.  Because the property makers determine whether they apply to a
>> token, one can handle the fact that variant, weight and style can occur
>> in any order by feeding the current token to each of the property makers
>> for font-variant, font-weight, and font-style in turn.  Whatever they
>> accept is ipso-facto a font-variant or a font-weight or font-style.
>>
>> Just want to let you know I take the problem seriously, and
