Re: svn commit: r1034094 - in /xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop: fonts/truetype/TTFSubSetFile.java render/ps/PSFontUtils.java

2010-11-12 Thread Jeremias Maerki
Hi Mehdi

On 12.11.2010 16:32:45 mehdi houshmand wrote:
> Hi Jeremias,
> 
> This code fails the build, you need to add a ";" (a semi-colon) to the
> last parameter in the enumerated type in
> o.a.f.fonts.truetype.TTFSubSetFile.

I don't see that. Eclipse/ECJ is happy with it and the Sun JDK 1.5.0_22
also doesn't have a problem when running the Ant build. Checking the JLS
3.0, the semicolon is optional if there's no body content after the
entries. An example from the JLS:

public class Example1 {

  public enum Season { WINTER, SPRING, SUMMER, FALL }

  public static void main(String[] args) {
    for (Season s : Season.values()) System.out.println(s);
  }
}
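
For contrast, here is a minimal sketch of the case where the semicolon does become mandatory: as soon as the enum declares any members after its constants, the constant list has to be terminated with ";". The class name Example2 and the isCold() method are made up for illustration and are not part of the JLS example.

```java
// Hypothetical companion to the JLS example above; all names are illustrative.
class Example2 {

  enum Season {
    WINTER, SPRING, SUMMER, FALL; // ';' is required here because a body follows

    boolean isCold() {
      return this == WINTER;
    }
  }

  public static void main(String[] args) {
    for (Season s : Season.values()) {
      System.out.println(s + " cold=" + s.isCold());
    }
  }
}
```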

What environment are you working with?

> I was also curious why you made
> TTFSubSetFile.GlyphHandler? Why do you make it an interface, and why
> do you use an anonymous class in PSFontUtils, only to pass it back to
> the same class? If there's only one implementation and if it only
> contains a single method, I wouldn't have thought an interface was
> necessary.

It's a normal callback interface from PSFontUtils back into
TTFSubSetFile, called for each glyph when building the subset.
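
To illustrate the pattern only, here is a rough sketch of such a callback. The method name addGlyph, its signature and the two helper methods are assumptions made for this example; the real TTFSubSetFile.GlyphHandler may look different.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: not the actual FOP classes or signatures.
class CallbackSketch {

  /** Hypothetical callback, invoked once per glyph while the subset is built. */
  interface GlyphHandler {
    void addGlyph(byte[] glyphData);
  }

  /** Stand-in for the subset builder: hands each glyph to the handler in turn. */
  static void buildSubset(byte[][] glyphs, GlyphHandler handler) {
    for (byte[] glyph : glyphs) {
      handler.addGlyph(glyph); // passed on immediately, nothing is collected here
    }
  }

  /** Stand-in for the caller: an anonymous handler writes each glyph straight out. */
  static int writeGlyphs(byte[][] glyphs, final ByteArrayOutputStream out) {
    final int[] count = new int[1];
    buildSubset(glyphs, new GlyphHandler() {
      public void addGlyph(byte[] glyphData) {
        out.write(glyphData, 0, glyphData.length);
        count[0]++;
      }
    });
    return count[0];
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[][] glyphs = { {1, 2}, {3}, {4, 5, 6} };
    int n = writeGlyphs(glyphs, out);
    System.out.println(n + " glyphs, " + out.size() + " bytes");
  }
}
```

The point of the indirection is that the builder never needs to know where the glyph bytes end up, and the caller never needs to hold more than one glyph at a time.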

> TTFSubSetFile already contains various methods that perform
> similar functions (i.e. take an input, convert it to the necessary
> format and write to file), why couldn't this be implemented in the
> handleGlyphSubset(...) method?

My main problem with the way TTFSubSetFile is currently written is that
writing the records is mixed with building the table index. If that were
not so, it would have been easier to go with the approach you expected.
But my approach actually has the advantage that there's less memory
build-up, since the whole subset, including glyphs, doesn't have to be
buffered in memory. After all, TTF loading is known to take a LOT of
memory.

> Is there another implementation you're
> making this flexible for?

No. The context: my client (your employer) asked for urgent help to
resolve the problem with my first attempt at TTF subsets when printed on
HP printers. I needed a quick resolution after I found out what could be
wrong. I didn't know if I would turn out to be right until after I
committed the changes and Chris/Vincent could run tests. So I didn't
care too much about code beauty. There's actually quite a bit of
copy/paste/change in TTFSubSetFile as a result, which I'm not
particularly proud of. I'm still waiting for feedback on whether my
change really fixed the problem, although preliminary results show that
the problem is now solved. I expect that some refactoring would do
TTFSubSetFile some good.

> Also, from a design point, why have you made each glyph a single
> string?

That was no design decision. It's a requirement found in the PS langref
third edition, page 356, describing the contents of /GlyphDirectory.
Each glyph is looked up by its index when an array is used.
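
A rough sketch of what "one string per glyph" means on the Java side, assuming an array-style directory written as hex strings. The class and method names are made up for illustration and this is not the code from PSFontUtils; the output is only a fragment, not a complete Type 42 font.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: shows the "one string per glyph" idea, nothing more.
class GlyphDirectorySketch {

  /** Encodes one glyph description as a PostScript hex string, e.g. <0aff>. */
  static String toHexString(byte[] glyphData) {
    StringBuilder sb = new StringBuilder("<");
    for (byte b : glyphData) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.append('>').toString();
  }

  /** Builds an array-style directory: the element at position i is glyph i. */
  static String toGlyphDirectory(List<byte[]> glyphs) {
    StringBuilder sb = new StringBuilder("/GlyphDirectory [\n");
    for (byte[] glyph : glyphs) {
      sb.append("  ").append(toHexString(glyph)).append('\n');
    }
    return sb.append("] def\n").toString();
  }

  public static void main(String[] args) {
    List<byte[]> glyphs = new ArrayList<byte[]>();
    glyphs.add(new byte[] {0x01});
    glyphs.add(new byte[] {0x02, 0x03});
    System.out.print(toGlyphDirectory(glyphs));
  }
}
```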

> Surely if the string must finish at a glyph boundary, then we
> could pack in several glyphs into the string and make it intelligent
> enough not to write partial glyphs?

That would be useful if we were to keep putting the glyphs in the /sfnts
entry, but not with /GlyphDirectory.

> Will this method have any performance benefits/disadvantages?

The GlyphDirectory lets us keep memory consumption down in the Java VM.
Otherwise, I see no implications.

> The spec says 65535 is the array limit, will this be hit?

I think that's unlikely. We will hardly have any font with more than
65535 glyphs and no single glyph is likely to be larger than 64KB to
describe its outline. We might still run into problems with the /sfnts
entry, though. If we can improve TTFSubSetFile it should be much easier
to stop strings at record boundaries.
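
Stopping strings at record boundaries could look roughly like the sketch below: records are packed greedily, and a string is closed just before the next record would push it over the limit. RecordChunker and chunkSizes are hypothetical names; nothing like this exists in TTFSubSetFile today.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch under assumptions: record lengths are known up front.
class RecordChunker {

  /** Maximum length of a PostScript string, as discussed above. */
  static final int MAX_STRING_LENGTH = 65535;

  /** Returns the byte size of each string when records are never split. */
  static List<Integer> chunkSizes(int[] recordLengths, int limit) {
    List<Integer> chunks = new ArrayList<Integer>();
    int current = 0;
    for (int len : recordLengths) {
      if (len > limit) {
        throw new IllegalArgumentException(
            "record of " + len + " bytes exceeds limit " + limit);
      }
      if (current > 0 && current + len > limit) {
        chunks.add(current); // close the string before it would overflow
        current = 0;
      }
      current += len;
    }
    if (current > 0) {
      chunks.add(current);
    }
    return chunks;
  }

  public static void main(String[] args) {
    int[] records = {40000, 30000, 20000};
    System.out.println(chunkSizes(records, MAX_STRING_LENGTH));
  }
}
```

A single record larger than the limit cannot be handled this way, which is why the sketch rejects it instead of splitting it.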



Jeremias Maerki



Re: svn commit: r1034094 - in /xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop: fonts/truetype/TTFSubSetFile.java render/ps/PSFontUtils.java

2010-11-12 Thread mehdi houshmand
Hi Jeremias,

This code fails the build, you need to add a ";" (a semi-colon) to the
last parameter in the enumerated type in
o.a.f.fonts.truetype.TTFSubSetFile. I was also curious why you made
TTFSubSetFile.GlyphHandler? Why do you make it an interface, and why
do you use an anonymous class in PSFontUtils, only to pass it back to
the same class? If there's only one implementation and if it only
contains a single method, I wouldn't have thought an interface was
necessary. TTFSubSetFile already contains various methods that perform
similar functions (i.e. take an input, convert it to the necessary
format and write to file), why couldn't this be implemented in the
handleGlyphSubset(...) method? Is there another implementation you're
making this flexible for?

Also, from a design point, why have you made each glyph a single
string? Surely if the string must finish at a glyph boundary, then we
could pack in several glyphs into the string and make it intelligent
enough not to write partial glyphs? Will this method have any
performance benefits/disadvantages? The spec says 65535 is the array
limit, will this be hit?

Thanks

Mehdi

On 11 November 2010 20:03,   wrote:
> Author: jeremias
> Date: Thu Nov 11 20:03:43 2010
> New Revision: 1034094
>
> URL: http://svn.apache.org/viewvc?rev=1034094&view=rev
> Log:
> PostScript Output: Bugfix for the occasional badly rendered glyph on HP 
> Laserjets.
> Reason: the /sfnts entry should split strings at glyph boundaries which 
> currently doesn't happen.
> Solution: Switch to the /GlyphDirectory approach described in the section 
> "Incremental Definition of Type 42 Fonts" in the PS language reference. This 
> way all glyphs are separated into single strings which seems to solve the 
> problem. It is also much closer to the approach taken by the various 
> PostScript printer drivers on Windows.
>
> Modified:
>    
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
>    
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/render/ps/PSFontUtils.java
>
> Modified: 
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
> URL: 
> http://svn.apache.org/viewvc/xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java?rev=1034094&r1=1034093&r2=1034094&view=diff
> ==
> --- 
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
>  (original)
> +++ 
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
>  Thu Nov 11 20:03:43 2010
> @@ -20,7 +20,6 @@
>  package org.apache.fop.fonts.truetype;
>
>  import java.io.IOException;
> -import java.util.Iterator;
>  import java.util.List;
>  import java.util.Map;
>
> @@ -35,6 +34,10 @@ import java.util.Map;
>  */
>  public class TTFSubSetFile extends TTFFile {
>
> +    private static enum OperatingMode {
> +        PDF, POSTSCRIPT_GLYPH_DIRECTORY
> +    }
> +
>     private byte[] output = null;
>     private int realSize = 0;
>     private int currentPos = 0;
> @@ -43,37 +46,27 @@ public class TTFSubSetFile extends TTFFi
>      * Offsets in name table to be filled out by table.
>      * The offsets are to the checkSum field
>      */
> -    private int cvtDirOffset = 0;
> -    private int fpgmDirOffset = 0;
> +    private Map offsets = new java.util.HashMap Integer>();
>     private int glyfDirOffset = 0;
>     private int headDirOffset = 0;
> -    private int hheaDirOffset = 0;
>     private int hmtxDirOffset = 0;
>     private int locaDirOffset = 0;
>     private int maxpDirOffset = 0;
> -    private int prepDirOffset = 0;
>
>     private int checkSumAdjustmentOffset = 0;
>     private int locaOffset = 0;
>
> -    /**
> -     * Initalize the output array
> -     */
> -    private void init(int size) {
> -        output = new byte[size];
> -        realSize = 0;
> -        currentPos = 0;
> -
> -        // createDirectory()
> -    }
> -
> -    private int determineTableCount() {
> +    private int determineTableCount(OperatingMode operatingMode) {
>         int numTables = 4; //4 req'd tables: head,hhea,hmtx,maxp
>         if (isCFF()) {
>             throw new UnsupportedOperationException(
>                     "OpenType fonts with CFF glyphs are not supported");
>         } else {
> -            numTables += 2; //1 req'd table: glyf,loca
> +            if (operatingMode == OperatingMode.POSTSCRIPT_GLYPH_DIRECTORY) {
> +                numTables++; //1 table: gdir
> +            } else {
> +                numTables += 2; //2 req'd tables: glyf,loca
> +            }
>             if (hasCvt()) {
>                 numTables++;
>             }
> @@ -90,8 +83,8 @@ public class TTFSubSetFile extends TTFFi
>     /**
>      * Create the directory table
>      */
> -    pr

Printing FOP generated PDF using PCL6 drivers

2010-11-12 Thread Peter Hancock
Dear FOP devs,

I am working on rounded corner support in FOP (see branch
Temp_RoundedCorners for work in progress) and I have hit a problem
whilst trying to print PDF to a printer using a PCL6 driver.

Borders in PDF are created using graphical streams of primitive
drawing commands, and the rounded variant makes use of cubic Bezier
curves. I am not able to print rounded borders consistently, and I am
hoping a snippet of the graphical stream of two border sections may
provide a FOP developer with enough info to debug the problem.

The first snippet is part of a PDF that is successfully transformed to
printable PCL:
q
1 0 0 1 -10 0 cm
4.393 4.393 m
7.205 1.581 11.023 0 14.999 0 c
383.720001 0 l
387.696014 0 391.514008 1.581 394.325989 4.393 c
387.255005 11.464 l
386.317993 10.527 385.045013 10 383.720001 10 c
15 10 l
13.674 10 12.401 10.527 11.464 11.464 c
h
W
n
0 G
[] 0 d 15 w
0 7.5 m 398.720001 7.5 l S
Q

The next snippet does not work:

q
1 0 0 1 51.022999 785.195007 cm
-0 -1 1 -0 0 0 cm
8.302 8.302 m
13.616 2.988 20.830999 0 28.344999 0 c
700.156982 0 l
707.671021 0 714.885986 2.988 720.200012 8.302 c
716.192017 12.31 l
711.940002 8.059 706.169006 5.668 700.156982 5.669 c
28.346001 5.669 l
22.333 5.669 16.562 8.059 12.31 12.31 c
h
W
n
0.85098 0.14902 0.254902 RG
[] 0 d 28.346001 w
0 14.173 m 728.502991 14.173 l S
Q

I am aware that the problem may be in the print driver (outside the
scope of this list), or due to a wider context in the PDF,
but I am consistently able to print embedded SVGs that FOP maps to
equivalent graphical streams, and this leads me to conclude there may
be a problem with the border generation code.

Whilst debugging this issue I did notice that the coordinates are
formatted to 6 decimal places in the border painting yet to 8 dps in
org.apache.fop.svg.PDFGraphics2D (the SVG to PDF bridge).
Changing PDFBorderPainter to use 8 dps did not solve my problem,
however I am wondering why the discrepancy exists.
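
To make the difference concrete, here is a small sketch that formats the same value with 6 and with 8 decimal places. It is not the code from PDFBorderPainter or PDFGraphics2D; the class name and the format() helper are made up, and feeding it a single-precision 383.72 is only my guess at how values such as 383.720001 in the snippets above might come about.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Sketch only: hypothetical helper, not FOP's own number formatting.
class CoordinateFormatSketch {

  /** Formats a coordinate with at most the given number of decimal places. */
  static String format(double value, int decimals) {
    StringBuilder pattern = new StringBuilder("0");
    if (decimals > 0) {
      pattern.append('.');
      for (int i = 0; i < decimals; i++) {
        pattern.append('#'); // optional digit, trailing zeros are dropped
      }
    }
    // Fixed locale so the decimal separator is always '.'
    DecimalFormat df = new DecimalFormat(
        pattern.toString(), new DecimalFormatSymbols(Locale.US));
    return df.format(value);
  }

  public static void main(String[] args) {
    float x = 383.72f; // assumption: the coordinate started life as a float
    System.out.println(format(x, 6));
    System.out.println(format(x, 8));
  }
}
```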

Please prompt me for more details if you are able to offer any help.

Thanks in advance,

Pete


DO NOT REPLY [Bug 50245] [PATCH] Upgrade to Java 1.5 - Added type-safe parameters to collections in Fonts

2010-11-12 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=50245

Mehdi Houshmand changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26280|0                           |1
        is obsolete|                            |

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


DO NOT REPLY [Bug 50245] [PATCH] Upgrade to Java 1.5 - Added type-safe parameters to collections in Fonts

2010-11-12 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=50245

--- Comment #5 from Mehdi Houshmand 2010-11-12 05:43:06 EST ---
Created an attachment (id=26285)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=26285)
Upgraded collections in o.a.f.fonts with amendments

Ok, I've made the requested changes and a few more; it now passes all the JUnit
tests.

I should mention that this URL/URI/File issue is still present. I have ensured
it will pass tests by making it similar to how it was before; however, if we're
dealing with local/network files we should be using URI or File objects
respectively. I'd suggest using URI since it allows greater flexibility, though
I haven't dealt with it. Dealing with that would require knowledge of the
commons-io source and also the batik source. This is something I'll look into
if I get the time, but the basic upgrade is there.



DO NOT REPLY [Bug 49827] integration of barcode in fop 1.0

2010-11-12 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=49827

--- Comment #6 from Jeremias Maerki 2010-11-12 03:43:36 EST ---
I get side-tracked constantly lately. I've made some progress preparing
Barcode4J for release but there are still a few things I need to finish first.
Paying clients first, open source second. Sorry.



Re: TrueType Font Embedding

2010-11-12 Thread Jeremias Maerki
On 12.11.2010 09:17:44 Chris Bowditch wrote:

> Thanks for the detailed explanation. I think I follow what you mean. 
> IIUC what you say above then when we fully embedded the CID TTF it would 
> not have been extractable? In the same way a subsetted font is 
> meaningless when extracted.

I think so, but I'm not 100% sure. Theoretically, if the Unicode cmap
tables are preserved (or even generated for the subset fonts), that
information is retained. But with a separate CMap resource that is
detached from the actual cidfont resource, it's difficult. Of course,
there's also the font resource that combines the CMap with the cidfont
again. So to make this work, all three resources have to be kept
together somehow. Example:

%%BeginResource: font EAAACC+HYb1gj
/EAAACC+HYb1gj /Identity-H [/EAAACC+HYb1gj] composefont pop
%%EndResource

> If this is true then clearly there is little 
> value in making this configurable without also adding the extra tables 
> you mention above, which I am guessing is a lot of work and probably not 
> worth it.
> 
> What about Type1 fonts? Do we always embed the font fully and can they 
> be extracted for re-use?

The good thing about Type1 fonts is that they are PostScript programs
which can be embedded with almost no changes. And you've also always
got each glyph referenced by its Adobe glyph name. But then we're also
not talking about CID Type1 fonts where the same problem probably
applies.

> 
> 
> Thanks,
> 
> Chris




Jeremias Maerki



DO NOT REPLY [Bug 49827] integration of barcode in fop 1.0

2010-11-12 Thread bugzilla
https://issues.apache.org/bugzilla/show_bug.cgi?id=49827

--- Comment #5 from Karsten 2010-11-12 03:38:14 EST ---
Hi folks, 

I'm just having the very same issues. Any news about a build that works well
with fop 1.0?

Cheers,
Karsten



Re: TrueType Font Embedding

2010-11-12 Thread Jeremias Maerki
On 11.11.2010 22:10:57 Eric Douglas wrote:
> If using installed fonts is an option to save space in the file / data
> stream, using embedded fonts still needs to be an option.

Eric, we're not talking about removing anything. We're talking about
adding TrueType support to PostScript output and handling referenced
TrueType fonts with possibly full Unicode support.

> I am assigning specific fonts from specific files to get consistent
> output so everything must be embedded.  I don't want to have to care
> what is installed where.  I am glad to fix this headache I've had with
> Windows 98 trying to use Courier New fonts and different PCs with the
> same OS had a different font file, and trying to render on the server
> versus the client having one not installed or different fonts installed
> with the same name.
> 
> The problem I'm currently having with output is rendering special
> unicode glyphs.  I sent one unicode as a 25AB with the font file
> LTYPE.TTF which came installed with Windows XP.  In FOP 0.95 it produced
> a square which is what I want.  That character is supposed to be a
> square.  If I'm wrong and that character is not in the font then the
> square was the default print for character not found.  I'd like to be
> able to run a routine through FOP to get out a list of all unicodes and
> what characters they go with for a particular font.  When I tried FOP
> 1.0, that same code produced a pound #.

Hmm, sounds like a regression. I guess we'll have to look into that then.
And such a glyph dump utility is definitely something FOP could profit
from. Has anybody already written something like that? We could
integrate it into org.apache.fop.tools.fontlist maybe.
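
As a starting point, something along these lines might do. It is only a sketch: it uses java.awt.Font rather than FOP's own TrueType parsing, so the answers reflect what Java2D thinks the font can display, and the class name, method names and the scanned range are all made up for this example.

```java
import java.awt.Font;
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch only: relies on AWT's view of the font, not on FOP's TTFFile.
class GlyphDumpSketch {

  /** Formats a code point in the usual U+XXXX notation. */
  static String label(int codePoint) {
    return String.format("U+%04X", codePoint);
  }

  /** Collects the code points in [from, to] for which the font has a glyph. */
  static List<Integer> displayable(Font font, int from, int to) {
    List<Integer> result = new ArrayList<Integer>();
    for (int cp = from; cp <= to; cp++) {
      if (font.canDisplay(cp)) {
        result.add(cp);
      }
    }
    return result;
  }

  public static void main(String[] args) throws Exception {
    // args[0] is expected to be the path to a TrueType file
    Font font = Font.createFont(Font.TRUETYPE_FONT, new File(args[0]));
    // Geometric Shapes block, which contains U+25AB
    for (int cp : displayable(font, 0x25A0, 0x25FF)) {
      System.out.println(label(cp) + " " + new String(Character.toChars(cp)));
    }
  }
}
```

Running it against the font file in question would at least show whether U+25AB is present in the font at all.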

> The biggest problem I'm having running FOP 0.95 is the threading.  I've
> tried calling it from a Java SwingWorker and it's not resolving the
> issue.  I'm running a javax.swing.JProgressBar as indeterminate and it
> freezes while I'm transforming FOP output, so the users think the
> program is just stuck and I have to explain to them it's supposed to do
> that the first time.  If they run it twice in a row the second one is
> much smoother.

I've never used FOP in a way that it interacts with a Swing GUI. Maybe
there's some interaction with AWT/Java2D since FOP uses Java2D
extensively depending on the output format. But it absolutely makes
sense to run FOP in a different thread than AWT's event loop.
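
As a rough illustration (not something I have tried with FOP itself), the work can be handed to a worker thread so that the caller returns immediately. The class name, runInBackground() and the placeholder job are assumptions for this sketch; the actual FOP transformation would go where the comment indicates.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: the job below is a placeholder, not a real FOP call.
class BackgroundRenderSketch {

  /** Submits the job to the worker and returns without waiting for it. */
  static <T> Future<T> runInBackground(ExecutorService worker, Callable<T> job) {
    return worker.submit(job);
  }

  public static void main(String[] args) throws Exception {
    ExecutorService worker = Executors.newSingleThreadExecutor();
    try {
      Future<String> result = runInBackground(worker, new Callable<String>() {
        public String call() {
          // The FOP transformation would run here, off the calling thread.
          return Thread.currentThread().getName();
        }
      });
      // Blocking here is fine for a demo; a GUI would not wait like this.
      System.out.println("rendered on: " + result.get());
    } finally {
      worker.shutdown();
    }
  }
}
```

In a Swing application the result would have to be handed back to the event thread (for example with SwingUtilities.invokeLater) instead of calling get() there, otherwise the progress bar freezes just the same.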

> Getting smaller results is nice but not necessarily a priority.
> Reducing a 2 MB file to 35 K is high priority.  Reducing a 46 K file to
> 35 K is not a big deal.  Getting consistent output is top priority.
> 
> 
> -----Original Message-----
> From: Jeremias Maerki [mailto:d...@jeremias-maerki.ch] 
> Sent: Thursday, November 11, 2010 3:35 PM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: TrueType Font Embedding
> 
> Hi Chris
> 
> I fully understand the desire to install the font on a PostScript
> printer to keep the PS files smaller. To answer your question: I did not
> ask for the business use case. The problem I'm struggling with in this
> context is how to know about the CID meaning of the font, i.e. the
> multi-byte encoding of the font.
> 
> When we do subsets in FOP, we re-index the glyphs starting with index 1
> (or 3) by occurrence in the document. Only FOP knows which Unicode
> character is represented by which CID. That's why we need the ToUnicode
> CMap in PDF. Otherwise, text extraction would not be so easy.
> 
> In single-byte mode, the whole font is embedded (right now probably with
> the same problems I've just fixed with rev1034094 for the TTF subset).
> In this mode the Adobe character names map into the font, so 8-bit
> encodings can be built to properly address the right characters even if
> the font is not embedded. That's also how we currently do referenced TTF
> fonts for PDF output.
> 
> If we fully embed the font as a CID font, we currently lose the
> knowledge about which index represents which Unicode character.
> Combining the font with a suitable CMap resolves the problem but at the
> moment we only use Identity-H which is a 1:1 mapping. One solution would
> be to turn the Unicode "cmap" table in the TrueType font into a custom
> PS CMap and then use 16-bit Unicode characters directly. FOP currently
> doesn't support that.
> 
> Also, if some PS platform allows to upload naked TrueType fonts, how
> will they be represented in the PS VM? Are they CID fonts then or
> single-byte fonts? If they are CID fonts, which CID system are they
> following? I have no idea. The only way to be sure about this is by
> installing a CID font plus CMap that is generated by FOP (which can be
> done by extracting these resources from one of the PS streams). After
> that, the font can be referenced, but it may not be portable to other
> PS-generating applications.
> 
> And then, as Glen mentioned we have to have a strategy to deal with
> glyphs with no representation in Unicode. I think I get where he goes
> with that and it se

Re: TrueType Font Embedding

2010-11-12 Thread Chris Bowditch

On 11/11/2010 20:35, Jeremias Maerki wrote:

> Hi Chris

Hi Jeremias,

> I fully understand the desire to install the font on a PostScript
> printer to keep the PS files smaller. To answer your question: I did not
> ask for the business use case. The problem I'm struggling with in this
> context is how to know about the CID meaning of the font, i.e. the
> multi-byte encoding of the font.
>
> When we do subsets in FOP, we re-index the glyphs starting with index 1
> (or 3) by occurrence in the document. Only FOP knows which Unicode
> character is represented by which CID. That's why we need the ToUnicode
> CMap in PDF. Otherwise, text extraction would not be so easy.
>
> In single-byte mode, the whole font is embedded (right now probably with
> the same problems I've just fixed with rev1034094 for the TTF subset).
> In this mode the Adobe character names map into the font, so 8-bit
> encodings can be built to properly address the right characters even if
> the font is not embedded. That's also how we currently do referenced TTF
> fonts for PDF output.
>
> If we fully embed the font as a CID font, we currently lose the
> knowledge about which index represents which Unicode character.
> Combining the font with a suitable CMap resolves the problem but at the
> moment we only use Identity-H which is a 1:1 mapping. One solution would
> be to turn the Unicode "cmap" table in the TrueType font into a custom PS
> CMap and then use 16-bit Unicode characters directly. FOP currently
> doesn't support that.

Thanks for the detailed explanation. I think I follow what you mean.
IIUC what you say above then when we fully embedded the CID TTF it would 
not have been extractable? In the same way a subsetted font is 
meaningless when extracted. If this is true then clearly there is little 
value in making this configurable without also adding the extra tables 
you mention above, which I am guessing is a lot of work and probably not 
worth it.


What about Type1 fonts? Do we always embed the font fully and can they 
be extracted for re-use?




Thanks,

Chris