Re: Questionable whether font-shorthand grammar LL(1)

2009-09-30 Thread Vincent Hennebert
Thanks everyone for your parser suggestions. I believe we should be able
to do without one for the font shorthand, but this is definitely
something to keep in mind if we want to improve the parsing of other
properties.

I’m starting to realise that the most difficult part is probably not so
much the grammar parsing as the lexical analysis. To be continued,
I guess...

Vincent


Laurent Caillette wrote:
 Hi all,
 
 I've never used SableCC or JavaCC so I cannot compare, but I'm using ANTLR a 
 lot. ANTLR is highly customizable and has a very strong community. Its 
 integrated development environment offers a debugger and visualization of 
 grammar ambiguities. It's not only simple to set up and use, it also offers 
 all the comfort you can reasonably dream of when developing grammars.
 
 Maybe a tool like JarJar could reduce the pain of depending on one more 
 library (with all the possible conflicts that could happen to FOP users).
 
 Because code generation has some drawbacks (at least in terms of build 
 complexity) you may be interested in JParsec, which creates parsers 
 dynamically from pure Java code. Disclaimer: never used it.
 http://jparsec.codehaus.org
 
 Hope this will help you make a reasonable choice.
 
 c.
 
 
 -Original Message-
 From: berger@gmail.com [mailto:berger@gmail.com] On behalf of Max 
 Berger
 Sent: Tuesday, 29 September 2009 13:00
 To: fop-dev@xmlgraphics.apache.org
 Subject: Re: Questionable whether font-shorthand grammar LL(1)
 
 Hi Vincent,
 
 
 2009/9/29 Vincent Hennebert vhenneb...@gmail.com:
 How about specifying the grammar and using a tool such as JavaCC to
 generate the actual parser? This way you could focus more on a complete
 grammar and spend less time writing the parser.
 That would be the same as using ANTLR. I feel that this is a bit
 overkill for just parsing the font shorthand property, although that may
 prove to be useful for other properties that can accept complex
 expressions.
 That said, JavaCC is an interesting suggestion, I didn’t think of it. If
 a choice had to be made between ANTLR and JavaCC, which one would win?
 
 ANTLR:
 - easy to use
 - requires runtime linking of jar [1] (a *huge* disadvantage imo)
 
 JavaCC:
 - very sparse documentation
 - generates standalone java classes
 
 SableCC:
 - better documentation
 - LGPL (And therefore maybe not feasible, although it would only be
 used at compile time and not runtime)
 
 [1] http://beust.com/weblog/archives/000145.html
 
 
 Max


Re: Questionable whether font-shorthand grammar LL(1)

2009-09-30 Thread Vincent Hennebert
Hi Jonathan,

Jonathan Levinson wrote:
 Hi Vincent,
 
 Excellent ideas!  
 
 The diagram you drew is extremely useful!
 
 If the font shorthand sub-language has a grammar that is regular then it also 
 has a grammar that is LL(1).  So recursive descent parsing will work, if 
 there is a regular grammar.
 
 I think the best way of getting font shorthand to work would proceed in 
 stages:
 
 1) First get the current code to properly parse and accept valid font 
 shorthand expressions.  This should be very easy.  The one remaining problem 
 (AFAIK) is the parsing of font-size/line-height where /line-height is 
 optional.   Currently spaces are not allowed around the slash / and they 
 should be.  I'm going to try to get to this problem as soon as I have time, 
 probably in a day or so.

The current code predates the switch to Java 1.4 as a minimum
requirement, so it couldn’t use the java.util.regex package. Feel free to
make use of regular expressions if you think that will make the job
easier.
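
As an illustration of the regular-expression route for the font-size/line-height fragment, here is a minimal sketch. It is not FOP code: the class name SizeAndLineHeight and its fields are hypothetical, and it assumes that neither value contains whitespace or a slash, which will not hold for every legal expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not FOP code): splits "font-size[/line-height]". */
final class SizeAndLineHeight {

    // Group 1 is the font-size; group 2 is the optional line-height.
    // The optional whitespace on either side of the slash accepts both
    // "12pt/14pt" and "12pt / 14pt".
    private static final Pattern PATTERN =
            Pattern.compile("^\\s*([^\\s/]+)(?:\\s*/\\s*([^\\s/]+))?\\s*$");

    final String fontSize;
    /** null when the "/line-height" part is absent. */
    final String lineHeight;

    private SizeAndLineHeight(String fontSize, String lineHeight) {
        this.fontSize = fontSize;
        this.lineHeight = lineHeight;
    }

    /** Parses the fragment or throws IllegalArgumentException if it does not match. */
    static SizeAndLineHeight parse(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a size/line-height pair: " + input);
        }
        return new SizeAndLineHeight(m.group(1), m.group(2));
    }
}
```

Calling SizeAndLineHeight.parse("12pt / 14pt") yields the same two values as the form without spaces, while a dangling slash such as "12pt /" is rejected. The rest of the shorthand (style, weight, family list) is outside the scope of this sketch.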


 2) Evaluate which parser or automaton approach is the simplest and produces 
 better error states than the current approach.  
 3) Implement the approach one has chosen in (2).

Good luck!

<snip/>

Vincent


Re: Best Interface for reading OpenType Files

2009-09-30 Thread Alexander Kiel
Hi Vincent,

 I see. I had in mind to use OpenTypeDataInputStream as the common
 interface. It actually makes sense to use ImageInputStream instead.
 Simpler and just as flexible. That will add a direct dependency on
 a class in the javax.imageio package, but this is not a problem as it is
 part of the standard library. That ImageInputStream interface is
 unfortunately named really.

What did you mean by your last sentence? That ImageInputStream isn't
well named?

  So if I had to vote, I would probably vote for Spring.
 
 Well I’m not sure I like the abundance of XML in spring actually. POJOs
 powaaa! Also, spring may be overkill to just deploy FOP. Anyway, this is
 probably a bit early to discuss that. (What do you think of the
 following though: http://code.google.com/p/google-guice/ ?)

I had heard of it before, but hadn't looked into it. So I took your
pointer as motivation to have a look. I watched the "Google I/O -
Big Modular Java with Guice" [1] talk on YouTube. It looks very
promising. I'm not against this XML config stuff, but if I can get the
same with annotations and standard Java code, why not? Of course I like
this whole type safety stuff, but with IntelliJ I get this in Spring XML
too.

[1]: http://www.youtube.com/watch?v=hBVJbzAagfs

  - does the use of serializable objects make sense? What would be more
efficient: re-parsing font data all the time or re-loading
serializable object representation of them?
  You mean the font metrics XML files? I've always asked myself what
  purpose they serve. No, I don't think we need this. I really don't
  want to serialize the Advanced OpenType Features! It already took me a
  good amount of code to parse just a bit of it.
  What I meant was to use the java.io.Serializable interface. Indeed, I
  don’t think XML representations are of any use, apart maybe for
  debugging purposes or to have a more human-readable version of the font
  file.
  IIC there would be next to nothing to do to cache Serializable objects
  on the hard drive and retrieve them?
  
  Hmmm. OK. But if we want to use Serializable for that, your classes have
  to be very stable. Versioning the Serializable stuff is a real burden in
  my opinion. So we will need a cache which detects version changes and
  invalidates the objects if so. Do you know of such a lib?
 
 I was thinking that just catching the InvalidClassException when reading
 the object would be enough to conclude that the cache is no longer valid
 and must be re-created. Maybe I’m wrong? I must confess that I have no
 experience with serialization.

Yes, this could work. But I always find it difficult and time-consuming
to design classes for serialization. And reading the serialized version
is most likely not much faster than reading the actual OpenType file. So
I would really prefer to wait until we have a real performance problem.
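
For reference, the "catch InvalidClassException and rebuild" idea discussed above could look roughly like the following. This is only a sketch under assumptions: SerializedCache, load and store are hypothetical names, nothing here exists in FOP, and it says nothing about whether such a cache would actually be faster than re-parsing the font.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical on-disk cache for a single Serializable object. */
final class SerializedCache {

    private SerializedCache() { }

    /**
     * Returns the cached object, or null if there is no cache file or the
     * cached classes no longer match the current ones. A null result means
     * the caller should re-parse the original data and call store again.
     * Other I/O errors (for example a truncated file) are propagated.
     */
    static Object load(File cacheFile) throws IOException {
        if (!cacheFile.isFile()) {
            return null;
        }
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(cacheFile));
        try {
            return in.readObject();
        } catch (InvalidClassException e) {
            // Typically a serialVersionUID mismatch: the class changed since
            // the cache was written, so the cache is treated as stale.
            return null;
        } catch (ClassNotFoundException e) {
            // The cached class no longer exists at all: also stale.
            return null;
        } finally {
            in.close();
        }
    }

    /** Writes the object to the cache file, replacing any previous content. */
    static void store(File cacheFile, Serializable value) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(cacheFile));
        try {
            out.writeObject(value);
        } finally {
            out.close();
        }
    }
}
```

Note that this detects incompatible changes only as far as the serialization mechanism itself does. If a class declares a fixed serialVersionUID and its meaning changes without the UID being updated, the stale cache would be read without complaint.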

Best Regards
Alex



signature.asc
Description: This is a digitally signed message part


Re: Confused about checkstyle use

2009-09-30 Thread Alexander Kiel
Hi Max,

First, I will respect whatever code style FOP uses. It's just a matter of
discussion.

  Really? That means commenting every public method even simple Getters
  and Setters?
 
 Yes. Simple Getter and Setters are the only place where you can
 publicly document private variables. (in most cases, comment in the
 getter and link from the setter)

Yes, that's right. But is this Javadoc better than no Javadoc?

public class Person {

    private String firstName;

    /**
     * Returns the first name of this person.
     *
     * @return the first name of this person.
     */
    public String getFirstName() {
        return firstName;
    }
}

  Commenting equals(), hashCode() and toString()? I think,
  this would be only clutter.
 
 /** {@inheritDoc} */

In my eyes even this is clutter. I saw classes in FOP with maybe 10
methods using this /** {@inheritDoc} */. It just distracts the eye from
reading the actual method name. And it adds absolutely no information for
the source code reader.

 would do the trick on those,  UNLESS they implement something which is
 unexpected (such as the equals methods I recently renamed which did
 not implement equals) or special (a toString which creates a
 guaranteed parsable result for example)

Hmmm. An equals method shouldn't do anything unexpected. But your
toString() example is a good one. If such standard methods do something
more than the comment in Object says, then a comment is useful.

I think it's the same as on simple public methods like the getter from
above. If your comment doesn't say anything more than the method name
says already, I don't want to read it.

Best Regards
Alex





RE: Support for Arabic in FOP

2009-09-30 Thread Prakash sen

Hi,

   I am not sure about the licensing part, as Sebastian made some changes to
the FOP code and provided me with the jars. As far as I have checked, those
jars print Arabic correctly.
Possibly only he will be able to answer, and I am not sure whether the change
was made in keeping with FOP standards. He was planning to do the bidi
algorithm; I have no idea whether he worked on it later or whether he
contributed the change below to FOP.

Below were his comments - 
If I set the writing-mode to rl-tb my text is flipped vertically. This happens 
because the CTM class rotates the transformation matrix for rendering 
according to the writing mode. Writing right-to-left has nothing to do with 
mirroring, of course, so I disabled it, because I want to 
print Arabic text. So what is the purpose of mirroring in rl-tb 
writing-mode? What errors will appear if I disable the CTM.getWMctm() 
function that does the mirroring according to the writing-mode?
I achieved printing (PDF) Arabic text after some weeks of work, ignoring any 
XSL-FO recommendations. I did most of the work in the TextLayoutManager. Now 
I'm thinking about implementing it according to the recommendations and the 
BIDI algorithm.

Hi Prakash,
you can download the version of FOP that I use to print Arabic script from
www.anneundsebp.de/fop/fop.html

I hope it works for you. Unfortunately I don't understand Arabic but I know
that there are still some problems with the typesetting. Maybe you can
inform me about any bugs you find.
I'll add some explanations and the source code in a few days.

Regards
Sebastian











-- 
View this message in context: 
http://www.nabble.com/Volunteering-to-work-on-FOP-development-tp25442059p25680065.html
Sent from the FOP - Dev mailing list archive at Nabble.com.



Re: Checkstyle RedundantThrowsCheck

2009-09-30 Thread Alexander Kiel
Hi Vincent,

 Speaking of that, there’s a rule that I would suggest to disable: the
 HiddenFieldCheck. I don’t really see its benefit. It forces to find
 somewhat artificial names for variables, where the field name is exactly
 what I want. Sometimes a method doesn’t have a name following the
 setField pattern, yet still acts as a setter for Field. This rule would
 make sense if we were using a Hungarian-like notation for variables
 (mMember, pParam, etc.), but that’s not the case in FOP.
 
 WDYT?

Yes, I would vote for it. In modern IDEs one clearly sees the difference
between an instance field and a local variable. This is also the reason
why this Hungarian-like scope notation is largely gone in Java.


Best Regards
Alex




Re: Checkstyle RedundantThrowsCheck

2009-09-30 Thread Alexander Kiel
Hi Max,

  Speaking of that, there’s a rule that I would suggest to disable: the
  HiddenFieldCheck. I don’t really see its benefit. It forces to find
  somewhat artificial names for variables, where the field name is exactly
  what I want. Sometimes a method doesn’t have a name following the
  setField pattern, yet still acts as a setter for Field. This rule would
  make sense if we were using a Hungarian-like notation for variables
  (mMember, pParam, etc.), but that’s not the case in FOP.
  WDYT?
 
 I like the rule, BUT I am ok with an exception for setters and
 constructors (this is IMO a new option in checkstyle 5):
 http://checkstyle.sourceforge.net/config_coding.html#HiddenField

The exclusion of constructors and setters is important. Otherwise we
would be forced to use some Hungarian-like scope notation.

But why do you think that this rule is useful at all?
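
To make the discussion concrete, the sketch below shows the three situations mentioned in this thread. FontDescriptor and its members are hypothetical names chosen for illustration. The Checkstyle documentation linked above lists ignoreConstructorParameter and ignoreSetter properties for HiddenField; the comments describe what those exemptions would be expected to cover, which should be checked against the Checkstyle version actually in use.

```java
/** Hypothetical class, used only to illustrate what HiddenField reports. */
final class FontDescriptor {

    private String family;
    private int weight;

    // Constructor parameters hide the fields of the same name.
    // Expected to be accepted when ignoreConstructorParameter is enabled.
    FontDescriptor(String family, int weight) {
        this.family = family;
        this.weight = weight;
    }

    // A setter following the setXxx pattern.
    // Expected to be accepted when ignoreSetter is enabled.
    void setFamily(String family) {
        this.family = family;
    }

    // Acts as a setter for weight but is not named setWeight, so a
    // setter exemption would presumably not apply and the parameter
    // would still be reported as hiding the field.
    void useWeight(int weight) {
        this.weight = weight;
    }

    String getFamily() {
        return family;
    }

    int getWeight() {
        return weight;
    }
}
```

The last method is the case Vincent describes: without disabling the rule, the parameter would need an artificial name such as newWeight.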

Best Regards
Alex




Re: Best Interface for reading OpenType Files

2009-09-30 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
 Hi Vincent,
 
 I see. I had in mind to use OpenTypeDataInputStream as the common
 interface. It actually makes sense to use ImageInputStream instead.
 Simpler and just as flexible. That will add a direct dependency on
 a class in the javax.imageio package, but this is not a problem as it is
 part of the standard library. That ImageInputStream interface is
 unfortunately named really.
 
 What did you mean by your last sentence? That ImageInputStream isn't
 well named?

Yes. AFAICT its methods have nothing to do with images. This interface
should probably have been given a more neutral name.
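
As a small illustration of that point, the following sketch reads the first two fields of an OpenType/TrueType file (the 32-bit sfnt version and the 16-bit table count) through ImageInputStream without touching any image functionality. SfntHeader and its members are hypothetical names, not the interface being designed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteOrder;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

/** Hypothetical holder for the first two fields of an sfnt font file. */
final class SfntHeader {

    final int sfntVersion;
    final int numTables;

    private SfntHeader(int sfntVersion, int numTables) {
        this.sfntVersion = sfntVersion;
        this.numTables = numTables;
    }

    /** Reads the header from the start of the given stream. */
    static SfntHeader read(ImageInputStream in) throws IOException {
        // OpenType data is big-endian; set it explicitly for clarity.
        in.setByteOrder(ByteOrder.BIG_ENDIAN);
        in.seek(0);
        int version = in.readInt();          // uint32 sfnt version
        int tables = in.readUnsignedShort(); // uint16 number of tables
        return new SfntHeader(version, tables);
    }

    /** Convenience for in-memory data, e.g. in unit tests. */
    static SfntHeader fromBytes(byte[] data) throws IOException {
        ImageInputStream in =
                new MemoryCacheImageInputStream(new ByteArrayInputStream(data));
        try {
            return read(in);
        } finally {
            in.close();
        }
    }
}
```

Random access via seek() and the unsigned read methods are what make the interface convenient for font tables, even though its name suggests otherwise.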

<snip/>
 - does the use of serializable objects make sense? What would be more
   efficient: re-parsing font data all the time or re-loading
   serializable object representation of them?
 You mean the font metrics XML files? I've always asked myself what
 purpose they serve. No, I don't think we need this. I really don't
 want to serialize the Advanced OpenType Features! It already took me a
 good amount of code to parse just a bit of it.
 What I meant was to use the java.io.Serializable interface. Indeed, I
 don’t think XML representations are of any use, apart maybe for
 debugging purposes or to have a more human-readable version of the font
 file.
 IIC there would be next to nothing to do to cache Serializable objects
 on the hard drive and retrieve them?
 Hmmm. OK. But if we want to use Serializable for that, your classes have
 to be very stable. Versioning the Serializable stuff is a real burden in
 my opinion. So we will need a cache which detects version changes and
 invalidates the objects if so. Do you know of such a lib?
 I was thinking that just catching the InvalidClassException when reading
 the object would be enough to conclude that the cache is no longer valid
 and must be re-created. Maybe I’m wrong? I must confess that I have no
 experience with serialization.
 
 Yes, this could work. But I always find it difficult and time-consuming
 to design classes for serialization. And reading the serialized version
 is most likely not much faster than reading the actual OpenType file. So
 I would really prefer to wait until we have a real performance problem.

Sure. Nothing wrong with that.


Thanks,
Vincent


Re: Confused about checkstyle use

2009-09-30 Thread Vincent Hennebert
Hi Alexander,

Alexander Kiel wrote:
 Hi Max,
 
 First, I will respect whatever code style FOP uses. It's just a matter of
 discussion.
 
 Really? That means commenting every public method even simple Getters
 and Setters?
 Yes. Simple Getter and Setters are the only place where you can
 publicly document private variables. (in most cases, comment in the
 getter and link from the setter)
 
 Yes, that's right. But is this Javadoc better than no Javadoc?
 
 public class Person {
 
     private String firstName;
 
     /**
      * Returns the first name of this person.
      *
      * @return the first name of this person.
      */
     public String getFirstName() {
         return firstName;
     }
 }

Except in the simplest cases like that one, there is always a bit of
additional information that can be added about the variable or its
usage.


 Commenting equals(), hashCode() and toString()? I think,
 this would be only clutter.
 /** {@inheritDoc} */
 
 In my eyes even this is clutter. I saw classes in FOP with maybe 10
 methods using this /** {@inheritDoc} */. It just distracts the eye from
 reading the actual method name. And it adds absolutely no information for
 the source code reader.

That one is indeed there only to make Checkstyle happy. The Javadoc tool
is able to retrieve by itself the javadoc from the redefined method
(Eclipse as well). I wish Checkstyle could do that too. We will be able
to partially solve that when switching to Java 1.5, by using the
@Override annotation.
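
A sketch of how that might look once Java 1.5 is available: methods that merely honour the Object contract carry only @Override, while a toString() with a special guarantee keeps a real comment. PointLength is a hypothetical class, and whether a given Checkstyle version accepts @Override in place of a Javadoc comment is an assumption to be verified.

```java
/** Hypothetical value class holding a length in millipoints. */
final class PointLength {

    private final int millipoints;

    PointLength(int millipoints) {
        this.millipoints = millipoints;
    }

    /**
     * Returns this length as the number of millipoints followed by
     * {@code mpt}, for example {@code 12000mpt}. The result is
     * guaranteed to be accepted by {@link #parse(String)}.
     */
    @Override
    public String toString() {
        return millipoints + "mpt";
    }

    /** Parses the format produced by {@link #toString()}. */
    static PointLength parse(String text) {
        if (!text.endsWith("mpt")) {
            throw new IllegalArgumentException("Expected a value ending in mpt: " + text);
        }
        return new PointLength(Integer.parseInt(text.substring(0, text.length() - 3)));
    }

    // No comment: nothing to add beyond what Object.equals documents.
    @Override
    public boolean equals(Object other) {
        return other instanceof PointLength
                && ((PointLength) other).millipoints == millipoints;
    }

    // No comment: consistent with equals, as Object.hashCode requires.
    @Override
    public int hashCode() {
        return millipoints;
    }
}
```

Here only toString() says something the reader could not infer from the method name, which is the distinction being argued for in this thread.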

Should the rule be disabled because of that? Having proper javadoc on at
least public methods is very important. OTOH, this is actually not
something Checkstyle can verify. How many methods in the code base have
totally useless comments that are there just to avoid a Checkstyle
warning...

I think I’d prefer to keep the rule, but wouldn’t veto its removal.


 would do the trick on those,  UNLESS they implement something which is
 unexpected (such as the equals methods I recently renamed which did
 not implement equals) or special (a toString which creates a
 guaranteed parsable result for example)
 
 Hmmm. An equals method shouldn't do anything unexpected. But your
 toString() example is a good one. If such standard methods do something
 more than the comment in Object says, then a comment is useful.
 
 I think it's the same as on simple public methods like the getter from
 above. If your comment doesn't say anything more than the method name
 says already, I don't want to read it.
 
 Best Regards
 Alex

Vincent


Re: Confused about checkstyle use

2009-09-30 Thread Alexander Kiel
Hi Vincent,

 Should the rule be disabled because of that? Having proper javadoc on at
 least public methods is very important. OTOH, this is actually not
 something Checkstyle can verify. How many methods in the code base have
 totally useless comments that are there just to avoid a Checkstyle
 warning...
 
 I think I’d prefer to keep the rule, but wouldn’t veto its removal.

I'm not voting for removal either; I only vote for the right to violate it in
cases where one can't add any useful information in the comment.


Best Regards
Alex




RE: Questionable whether font-shorthand grammar LL(1)

2009-09-30 Thread Jonathan Levinson
I agree - in this case - tokenizing - lexical analysis - is more difficult than 
parsing.

Best Regards,
Jonathan 

-Original Message-
From: Vincent Hennebert [mailto:vhenneb...@gmail.com] 
Sent: Wednesday, September 30, 2009 6:25 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Re: Questionable whether font-shorthand grammar LL(1)

Thanks everyone for your parser suggestions. I believe we should be able
to do without one for the font shorthand, but this is definitely
something to keep in mind if we want to improve the parsing of other
properties.

I’m starting to realise that the most difficult part is probably not so
much the grammar parsing as the lexical analysis. To be continued,
I guess...

Vincent




Re: LZW embedding experiment

2009-09-30 Thread MatthiasR

Hello,


Jeremias Maerki-2 wrote:
 
 I've written some code that can embed a single-stripe CMYK TIFF in PDF
 as a proof of concept. I've done it for PDF because that was the easiest
 to implement. I don't want to commit that right now since it
 would need a lot of testing first. So in case I don't pursue this (due
 to other priorities) and someone else wants that code, it's available.
 

I'd be interested in testing this for PDF output. Could you please send me
the patch?

Regards,
Matthias Reischenbacher
-- 
View this message in context: 
http://www.nabble.com/LZW-embedding-experiment-tp25635491p25685400.html
Sent from the FOP - Dev mailing list archive at Nabble.com.



RE: Support for Arabic in FOP

2009-09-30 Thread spepping

Quoting Prakash sen prakash@gmail.com:



Hi,

   I am not sure about the licensing part, as Sebastian made some changes to
the FOP code and provided me with the jars. As far as I have checked, those
jars print Arabic correctly.
Possibly only he will be able to answer, and I am not sure whether the change
was made in keeping with FOP standards. He was planning to do the bidi
algorithm; I have no idea whether he worked on it later or whether he
contributed the change below to FOP.


He did not commit any change to FOP.

Simon


