Re: svn commit: r1034094 - in /xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop: fonts/truetype/TTFSubSetFile.java render/ps/PSFontUtils.java
Hi Mehdi

On 12.11.2010 16:32:45 mehdi houshmand wrote:
> Hi Jeremias,
>
> This code fails the build, you need to add a ";" (a semi-colon) to the
> last parameter in the enumerated type in
> o.a.f.fonts.truetype.TTFSubSetFile.

I don't see that. Eclipse/ECJ is happy with it and the Sun JDK 1.5.0_22 also doesn't have a problem when running the Ant build. Checking the JLS 3.0, the semicolon is optional if there's no body content after the entries. An example from the JLS:

    public class Example1 {
        public enum Season { WINTER, SPRING, SUMMER, FALL }

        public static void main(String[] args) {
            for (Season s : Season.values())
                System.out.println(s);
        }
    }

What environment are you working with?

> I was also curious why you made
> TTFSubSetFile.GlyphHandler? Why do you make it an interface, and why
> do you use an anonymous class in PSFontUtils, only to pass it back to
> the same class? If there's only one implementation and if it only
> contains a single method, I wouldn't have thought an interface was
> necessary.

It's a normal callback interface from PSFontUtils back into TTFSubSetFile, called for each glyph when building the subset.

> TTFSubSetFile already contains various methods that perform
> similar functions (i.e. take an input, convert it to the necessary
> format and write to file), why couldn't this be implemented in the
> handleGlyphSubset(...) method?

My main problem with the way TTFSubSetFile is currently written is that writing the records is mixed with building the table index. If that were not so, it would have been easier to go with the approach you would have expected. But my approach actually has the advantage that there's less memory build-up, since the whole subset, including glyphs, does not have to be buffered in memory. After all, TTF loading is known to take a LOT of memory.

> Is there another implementation you're
> making this flexible for?

No.

The context: my client (your employer) asked for urgent help to resolve the problem with my first attempt at TTF subsets when printed on HP printers. I needed a quick resolution after I found out what could be wrong. I didn't know if I would turn out to be right until after I committed the changes and Chris/Vincent could run tests. So I didn't care too much about code beauty. There's actually quite a bit of copy/paste/change in TTFSubSetFile as a result, which I'm not particularly proud of. I'm still waiting for feedback on whether my change really fixed the problem, although preliminary results show that the problem is now solved. I expect that some refactoring would do TTFSubSetFile some good.

> Also, from a design point, why have you made each glyph a single
> string?

That was no design decision. It's a requirement found in the PS langref third edition, page 356, describing the contents of /GlyphDirectory. Each glyph is looked up by its index when an array is used.

> Surely if the string must finish at a glyph boundary, then we
> could pack in several glyphs into the string and make it intelligent
> enough not to write partial glyphs?

That would be useful if we were to keep putting the glyphs in the /sfnts entry, but not with /GlyphDirectory.

> Will this method have any performance benefits/disadvantages?

The GlyphDirectory allows us to keep memory consumption down in the Java VM. Otherwise, I see no implications.

> The spec says 65535 is the array limit, will this be hit?

I think that's unlikely. We will hardly have any font with more than 65535 glyphs, and no single glyph is likely to need more than 64KB to describe its outline. We might still run into problems with the /sfnts entry, though. If we can improve TTFSubSetFile it should be much easier to stop strings at record boundaries.

Jeremias Maerki
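Editorial sketch: the per-glyph callback described in this message can be illustrated roughly as below. This is not the code from r1034094. The class GlyphCallbackSketch, the methods buildSubset and toHexString, and the signature of GlyphHandler.addGlyph are all hypothetical, and the surrounding PostScript for /GlyphDirectory is left out. The sketch only shows the two points made above: an enum body without a trailing semicolon compiles, and handing each glyph to a callback means the caller can write it out immediately instead of buffering the whole subset.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; names and signatures are assumptions, not FOP's API. */
class GlyphCallbackSketch {

    // No trailing semicolon is needed: nothing follows the enum constants.
    enum OperatingMode {
        PDF, POSTSCRIPT_GLYPH_DIRECTORY
    }

    /** Callback invoked once per glyph while the subset is being built. */
    interface GlyphHandler {
        void addGlyph(int index, byte[] glyphData);
    }

    /** Stand-in for the subset builder: passes each glyph on instead of collecting them. */
    static void buildSubset(List<byte[]> glyphs, GlyphHandler handler) {
        for (int i = 0; i < glyphs.size(); i++) {
            handler.addGlyph(i, glyphs.get(i));
        }
    }

    /** Encodes one glyph as a single hex string, so every string ends at a glyph boundary. */
    static String toHexString(byte[] glyphData) {
        StringBuilder sb = new StringBuilder("<");
        for (byte b : glyphData) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        List<byte[]> glyphs = Arrays.asList(new byte[] {1, 2}, new byte[] {(byte) 0xFF});
        // Anonymous class on the caller's side, as discussed in the thread.
        buildSubset(glyphs, new GlyphHandler() {
            public void addGlyph(int index, byte[] glyphData) {
                System.out.println(index + " " + toHexString(glyphData));
            }
        });
    }
}
```

With a single implementation the interface is not strictly required, as the question in the thread points out; what it buys is that the builder never needs to know where the glyph data ends up.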
Re: svn commit: r1034094 - in /xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop: fonts/truetype/TTFSubSetFile.java render/ps/PSFontUtils.java
Hi Jeremias,

This code fails the build, you need to add a ";" (a semi-colon) to the last parameter in the enumerated type in o.a.f.fonts.truetype.TTFSubSetFile.

I was also curious why you made TTFSubSetFile.GlyphHandler? Why do you make it an interface, and why do you use an anonymous class in PSFontUtils, only to pass it back to the same class? If there's only one implementation and if it only contains a single method, I wouldn't have thought an interface was necessary. TTFSubSetFile already contains various methods that perform similar functions (i.e. take an input, convert it to the necessary format and write to file), why couldn't this be implemented in the handleGlyphSubset(...) method? Is there another implementation you're making this flexible for?

Also, from a design point, why have you made each glyph a single string? Surely if the string must finish at a glyph boundary, then we could pack in several glyphs into the string and make it intelligent enough not to write partial glyphs? Will this method have any performance benefits/disadvantages? The spec says 65535 is the array limit, will this be hit?

Thanks

Mehdi

On 11 November 2010 20:03, wrote:
> Author: jeremias
> Date: Thu Nov 11 20:03:43 2010
> New Revision: 1034094
>
> URL: http://svn.apache.org/viewvc?rev=1034094&view=rev
> Log:
> PostScript Output: Bugfix for the occasional badly rendered glyph on HP
> Laserjets.
> Reason: the /sfnts entry should split strings at glyph boundaries which
> currently doesn't happen.
> Solution: Switch to the /GlyphDirectory approach described in the section
> "Incremental Definition of Type 42 Fonts" in the PS language reference. This
> way all glyphs are separated into single strings which seems to solve the
> problem. It is also much closer to the approach taken by the various
> PostScript printer drivers on Windows.
>
> Modified:
>
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
>
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/render/ps/PSFontUtils.java
>
> Modified:
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
> URL:
> http://svn.apache.org/viewvc/xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java?rev=1034094&r1=1034093&r2=1034094&view=diff
> ==
> ---
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
> (original)
> +++
> xmlgraphics/fop/branches/Temp_TrueTypeInPostScript/src/java/org/apache/fop/fonts/truetype/TTFSubSetFile.java
> Thu Nov 11 20:03:43 2010
> @@ -20,7 +20,6 @@
>  package org.apache.fop.fonts.truetype;
>
>  import java.io.IOException;
> -import java.util.Iterator;
>  import java.util.List;
>  import java.util.Map;
>
> @@ -35,6 +34,10 @@ import java.util.Map;
>   */
>  public class TTFSubSetFile extends TTFFile {
>
> +    private static enum OperatingMode {
> +        PDF, POSTSCRIPT_GLYPH_DIRECTORY
> +    }
> +
>      private byte[] output = null;
>      private int realSize = 0;
>      private int currentPos = 0;
> @@ -43,37 +46,27 @@ public class TTFSubSetFile extends TTFFi
>       * Offsets in name table to be filled out by table.
>       * The offsets are to the checkSum field
>       */
> -    private int cvtDirOffset = 0;
> -    private int fpgmDirOffset = 0;
> +    private Map offsets = new java.util.HashMap Integer>();
>      private int glyfDirOffset = 0;
>      private int headDirOffset = 0;
> -    private int hheaDirOffset = 0;
>      private int hmtxDirOffset = 0;
>      private int locaDirOffset = 0;
>      private int maxpDirOffset = 0;
> -    private int prepDirOffset = 0;
>
>      private int checkSumAdjustmentOffset = 0;
>      private int locaOffset = 0;
>
> -    /**
> -     * Initalize the output array
> -     */
> -    private void init(int size) {
> -        output = new byte[size];
> -        realSize = 0;
> -        currentPos = 0;
> -
> -        // createDirectory()
> -    }
> -
> -    private int determineTableCount() {
> +    private int determineTableCount(OperatingMode operatingMode) {
>          int numTables = 4; //4 req'd tables: head,hhea,hmtx,maxp
>          if (isCFF()) {
>              throw new UnsupportedOperationException(
>                      "OpenType fonts with CFF glyphs are not supported");
>          } else {
> -            numTables += 2; //1 req'd table: glyf,loca
> +            if (operatingMode == OperatingMode.POSTSCRIPT_GLYPH_DIRECTORY) {
> +                numTables++; //1 table: gdir
> +            } else {
> +                numTables += 2; //2 req'd tables: glyf,loca
> +            }
>              if (hasCvt()) {
>                  numTables++;
>              }
> @@ -90,8 +83,8 @@ public class TTFSubSetFile extends TTFFi
>      /**
>       * Create the directory table
>       */
> -    pr
Printing FOP generated PDF using PCL6 drivers
Dear FOP devs,

I am working on rounded corner support in FOP (see branch Temp_RoundedCorners for work in progress) and I have hit a problem whilst trying to print PDF to a printer using a PCL6 driver.

Borders in PDF are created using graphics streams of primitive drawing commands, and the rounded variant makes use of cubic Bezier curves. I am intermittently unable to print rounded borders, and I am hoping that a snippet of the graphics stream for two border sections may give a FOP developer enough information to debug the problem.

The first snippet is part of a PDF that is successfully transformed to printable PCL:

    q
    1 0 0 1 -10 0 cm
    4.393 4.393 m
    7.205 1.581 11.023 0 14.999 0 c
    383.720001 0 l
    387.696014 0 391.514008 1.581 394.325989 4.393 c
    387.255005 11.464 l
    386.317993 10.527 385.045013 10 383.720001 10 c
    15 10 l
    13.674 10 12.401 10.527 11.464 11.464 c
    h
    W
    n
    0 G
    [] 0 d
    15 w
    0 7.5 m
    398.720001 7.5 l
    S
    Q

The next snippet does not work:

    q
    1 0 0 1 51.022999 785.195007 cm
    -0 -1 1 -0 0 0 cm
    8.302 8.302 m
    13.616 2.988 20.830999 0 28.344999 0 c
    700.156982 0 l
    707.671021 0 714.885986 2.988 720.200012 8.302 c
    716.192017 12.31 l
    711.940002 8.059 706.169006 5.668 700.156982 5.669 c
    28.346001 5.669 l
    22.333 5.669 16.562 8.059 12.31 12.31 c
    h
    W
    n
    0.85098 0.14902 0.254902 RG
    [] 0 d
    28.346001 w
    0 14.173 m
    728.502991 14.173 l
    S
    Q

I am aware that the problem may be in the print driver (outside the scope of this list), or due to a wider context in the PDF, but I am consistently able to print embedded SVGs that FOP maps to equivalent graphics streams, and this leads me to conclude there may be a problem with the border generation code.

Whilst debugging this issue I did notice that the coordinates are formatted to 6 decimal places in the border painting, yet to 8 decimal places in org.apache.fop.svg.PDFGraphics2D (the SVG to PDF bridge). Changing PDFBorderPainter to use 8 decimal places did not solve my problem; however, I am wondering why the discrepancy exists.

Please prompt me for more details if you are able to offer any help.

Thanks in advance,

Pete
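Editorial sketch: the 6 versus 8 decimal place question can be tried out in isolation with the snippet below. It is not FOP's formatting code; the class CoordinateFormatSketch and its format method are hypothetical. It also illustrates one possible reading of values such as 383.720001 in the streams above: that is exactly what a 32-bit float 383.72f prints as once it is widened to double and rounded to six places. Whether the border painter really goes through float is an assumption, and nothing here suggests this is related to the printing problem.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper for experimenting with coordinate precision; not part of FOP. */
class CoordinateFormatSketch {

    /** Rounds to the given number of decimal places and drops trailing zeros. */
    static String format(double value, int decimals) {
        return new BigDecimal(value)
                .setScale(decimals, RoundingMode.HALF_EVEN)
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        double widened = 383.72f;                  // float widened to double
        System.out.println(format(widened, 6));    // 383.720001
        System.out.println(format(widened, 8));    // 383.72000122
        System.out.println(format(383.72, 6));     // 383.72 (value kept as double throughout)
    }
}
```

If the assumption holds, the extra digits are float rounding noise rather than real precision, so raising the number of decimal places would not be expected to change the geometry.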
DO NOT REPLY [Bug 50245] [PATCH] Upgrade to Java 1.5 - Added type-safe parameters to collections in Fonts
https://issues.apache.org/bugzilla/show_bug.cgi?id=50245

Mehdi Houshmand changed:

           What    |Removed |Added
  Attachment #26280|0       |1
        is obsolete|        |

--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
DO NOT REPLY [Bug 50245] [PATCH] Upgrade to Java 1.5 - Added type-safe parameters to collections in Fonts
https://issues.apache.org/bugzilla/show_bug.cgi?id=50245

--- Comment #5 from Mehdi Houshmand 2010-11-12 05:43:06 EST ---
Created an attachment (id=26285)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=26285)
Upgraded collections in o.a.f.fonts with amendments

Ok, I've made the requested changes and a few more; it now passes all the JUnit tests.

I should mention that the URL/URI/File issue is still present. I have ensured it will pass the tests by making it similar to how it was before; however, if we're dealing with local/network files we should be using URI or File objects respectively. I'd suggest using URI since it allows greater flexibility, though I haven't dealt with it. Dealing with that would require knowledge of the commons-io source and also the batik source. This is something I'll look into if I get the time, but the basic upgrade is there.
DO NOT REPLY [Bug 49827] integration of barcode in fop 1.0
https://issues.apache.org/bugzilla/show_bug.cgi?id=49827 --- Comment #6 from Jeremias Maerki 2010-11-12 03:43:36 EST --- I get side-tracked constantly lately. I've made some progress preparing Barcode4J for release but there are still a few things I need to finish first. Paying clients first, open source second. Sorry.
Re: TrueType Font Embedding
On 12.11.2010 09:17:44 Chris Bowditch wrote:
> Thanks for the detailed explanation. I think I follow what you mean.
> IIUC what you say above then when we fully embedded the CID TTF it would
> not have been extractable? In the same way a subsetted font is
> meaningless when extracted.

I think so, but I'm not 100% sure. Theoretically, if the Unicode cmap tables are preserved (or even generated for the subset fonts), that information is retained. But with a separate CMap resource that is detached from the actual cidfont resource, it's difficult. Of course, there's also the font resource that combines the CMap with the cidfont again. So to make this work all three resources have to be kept together somehow. Example:

    %%BeginResource: font EAAACC+HYb1gj
    /EAAACC+HYb1gj /Identity-H [/EAAACC+HYb1gj] composefont pop
    %%EndResource

> If this is true then clearly there is little
> value in making this configurable without also adding the extra tables
> you mention above, which I am guessing is a lot of work and probably not
> worth it.
>
> What about Type1 fonts? Do we always embed the font fully and can they
> be extracted for re-use?

The good thing about Type1 fonts is that they are PostScript programs which can be embedded with almost no changes. And you've also always got each glyph referenced by its Adobe glyph name. But then we're also not talking about CID Type1 fonts, where the same problem probably applies.

>
>
> Thanks,
>
> Chris

Jeremias Maerki
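Editorial sketch: the composefont example quoted in this message ties a CMap and a CIDFont together under one font name. The snippet below only rebuilds that same text from its two variable parts, to make visible which names have to agree. The class ComposeFontSketch and its method are hypothetical and not FOP's PostScript generator; the line breaks are an assumption, and reusing the CIDFont name as the composite font name simply mirrors the example.

```java
/** Hypothetical illustration of the resource wrapper quoted above; not FOP code. */
class ComposeFontSketch {

    /**
     * Builds a DSC-wrapped composefont call that combines the named CMap
     * with the CIDFont of the same name as the resulting composite font.
     */
    static String composeFontResource(String fontName, String cmapName) {
        return "%%BeginResource: font " + fontName + "\n"
                + "/" + fontName + " /" + cmapName + " [/" + fontName + "] composefont pop\n"
                + "%%EndResource\n";
    }

    public static void main(String[] args) {
        System.out.print(composeFontResource("EAAACC+HYb1gj", "Identity-H"));
    }
}
```

The point made in the message is that this resource is meaningful only while the CMap and the CIDFont it names are both available, so extracting one of the three on its own is of little use.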
DO NOT REPLY [Bug 49827] integration of barcode in fop 1.0
https://issues.apache.org/bugzilla/show_bug.cgi?id=49827 --- Comment #5 from Karsten 2010-11-12 03:38:14 EST --- Hi folks, I'm just having the very same issues. Any news about a build that works well with fop 1.0? Cheers, Karsten
Re: TrueType Font Embedding
On 11.11.2010 22:10:57 Eric Douglas wrote:
> If using installed fonts is an option to save space in the file / data
> stream, using embedded fonts still needs to be an option.

Eric, we're not talking about removing anything. We're talking about adding TrueType support to PostScript output and handling referenced TrueType fonts with possibly full Unicode support.

> I am assigning specific fonts from specific files to get consistent
> output so everything must be embedded. I don't want to have to care
> what is installed where. I am glad to fix this headache I've had with
> Windows 98 trying to use Courier New fonts and different PCs with the
> same OS had a different font file, and trying to render on the server
> versus the client having one not installed or different fonts installed
> with the same name.
>
> The problem I'm currently having with output is rendering special
> unicode glyphs. I sent one unicode as a 25AB with the font file
> LTYPE.TTF which came installed with Windows XP. In FOP 0.95 it produced
> a square which is what I want. That character is supposed to be a
> square. If I'm wrong and that character is not in the font then the
> square was the default print for character not found. I'd like to be
> able to run a routine through FOP to get out a list of all unicodes and
> what characters they go with for a particular font. When I tried FOP
> 1.0, that same code produced a pound #.

Hmm, sounds like a regression. I guess we'll have to look into that then. And such a glyph dump utility is definitely something FOP could profit from. Has anybody already written something like that? We could integrate it into org.apache.fop.tools.fontlist maybe.

> The biggest problem I'm having running FOP 0.95 is the threading. I've
> tried calling it from a Java SwingWorker and it's not resolving the
> issue. I'm running a javax.swing.JProgressBar as indeterminate and it
> freezes while I'm transforming FOP output, so the users think the
> program is just stuck and I have to explain to them it's supposed to do
> that the first time. If they run it twice in a row the second one is
> much smoother.

I've never used FOP in a way that it interacts with a Swing GUI. Maybe there's some interaction with AWT/Java2D since FOP uses Java2D extensively depending on the output format. But it absolutely makes sense to run FOP in a different thread than AWT's event loop.

> Getting smaller results is nice but not necessarily a priority.
> Reducing a 2 MB file to 35 K is high priority. Reducing a 46 K file to
> 35 K is not a big deal. Getting consistent output is top priority.
>
>
> -----Original Message-----
> From: Jeremias Maerki [mailto:d...@jeremias-maerki.ch]
> Sent: Thursday, November 11, 2010 3:35 PM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: TrueType Font Embedding
>
> Hi Chris
>
> I fully understand the desire to install the font on a PostScript
> printer to keep the PS files smaller. To answer your question: I did not
> ask for the business use case. The problem I'm struggling with in this
> context is how to know about the CID meaning of the font, i.e. the
> multi-byte encoding of the font.
>
> When we do subsets in FOP, we re-index the glyphs starting with index 1
> (or 3) by occurrence in the document. Only FOP knows which Unicode
> character is represented by which CID. That's why we need the ToUnicode
> CMap in PDF. Otherwise, text extraction would not be so easy.
>
> In single-byte mode, the whole font is embedded (right now probably with
> the same problems I've just fixed with rev1034094 for the TTF subset).
> In this mode the Adobe character names map into the font, so 8-bit
> encodings can be built to properly address the right characters even if
> the font is not embedded. That's also how we currently do referenced TTF
> fonts for PDF output.
>
> If we fully embed the font as a CID font, we currently lose the
> knowledge about which index represents which Unicode character.
> Combining the font with a suitable CMap resolves the problem but at the
> moment we only use Identity-H which is a 1:1 mapping. One solution would
> be to turn the Unicode "cmap" table in the TrueType font into a custom
> PS CMap and then use 16-bit Unicode characters directly. FOP currently
> doesn't support that.
>
> Also, if some PS platform allows to upload naked TrueType fonts, how
> will they be represented in the PS VM? Are they CID fonts then or
> single-byte fonts? If they are CID fonts, which CID system are they
> following? I have no idea. The only way to be sure about this is by
> installing a CID font plus CMap that is generated by FOP (which can be
> done by extracting these resources from one of the PS streams). After
> that, the font can be referenced, but it may not be portable to other
> PS-generating applications.
>
> And then, as Glen mentioned we have to have a strategy to deal with
> glyphs with no representation in Unicode. I think I get where he goes
> with that and it se
Re: TrueType Font Embedding
On 11/11/2010 20:35, Jeremias Maerki wrote:
> Hi Chris

Hi Jeremias,

> I fully understand the desire to install the font on a PostScript
> printer to keep the PS files smaller. To answer your question: I did not
> ask for the business use case. The problem I'm struggling with in this
> context is how to know about the CID meaning of the font, i.e. the
> multi-byte encoding of the font.
>
> When we do subsets in FOP, we re-index the glyphs starting with index 1
> (or 3) by occurrence in the document. Only FOP knows which Unicode
> character is represented by which CID. That's why we need the ToUnicode
> CMap in PDF. Otherwise, text extraction would not be so easy.
>
> In single-byte mode, the whole font is embedded (right now probably with
> the same problems I've just fixed with rev1034094 for the TTF subset).
> In this mode the Adobe character names map into the font, so 8-bit
> encodings can be built to properly address the right characters even if
> the font is not embedded. That's also how we currently do referenced TTF
> fonts for PDF output.
>
> If we fully embed the font as a CID font, we currently lose the
> knowledge about which index represents which Unicode character.
> Combining the font with a suitable CMap resolves the problem but at the
> moment we only use Identity-H which is a 1:1 mapping. One solution would
> be to turn the Unicode "cmap" table in the TrueType font into a custom
> PS CMap and then use 16-bit Unicode characters directly. FOP currently
> doesn't support that.

Thanks for the detailed explanation. I think I follow what you mean. IIUC what you say above then when we fully embedded the CID TTF it would not have been extractable? In the same way a subsetted font is meaningless when extracted. If this is true then clearly there is little value in making this configurable without also adding the extra tables you mention above, which I am guessing is a lot of work and probably not worth it.

What about Type1 fonts? Do we always embed the font fully and can they be extracted for re-use?

Thanks,

Chris
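Editorial sketch: the re-indexing "by occurrence in the document" described in the quoted text can be pictured with the toy model below. It is not FOP's subsetting code. The class SubsetIndexSketch is hypothetical, it maps Unicode code points straight to new CIDs (the real process goes through the font's glyph indices), and the starting index is a constructor parameter because the message mentions both 1 and 3. What it shows is why the reverse table only exists on the producing side: without something like a ToUnicode CMap, a consumer sees CIDs 1, 2, 3 and has no way back to the characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical toy model of subset re-indexing; not FOP's implementation. */
class SubsetIndexSketch {

    private final Map<Integer, Integer> codePointToCid = new LinkedHashMap<Integer, Integer>();
    private final int firstCid;

    SubsetIndexSketch(int firstCid) {
        this.firstCid = firstCid;
    }

    /** Returns the CID for a code point, assigning the next free index on first use. */
    int cidFor(int codePoint) {
        Integer cid = codePointToCid.get(codePoint);
        if (cid == null) {
            cid = firstCid + codePointToCid.size();
            codePointToCid.put(codePoint, cid);
        }
        return cid;
    }

    /** Reverse table (CID to code point): the information a ToUnicode CMap carries. */
    Map<Integer, Integer> toUnicode() {
        Map<Integer, Integer> reverse = new LinkedHashMap<Integer, Integer>();
        for (Map.Entry<Integer, Integer> e : codePointToCid.entrySet()) {
            reverse.put(e.getValue(), e.getKey());
        }
        return reverse;
    }

    public static void main(String[] args) {
        SubsetIndexSketch subset = new SubsetIndexSketch(1);
        for (char c : "FOP FO".toCharArray()) {
            System.out.println(c + " -> CID " + subset.cidFor(c));
        }
        System.out.println(subset.toUnicode());
    }
}
```

A different document using the same font would assign different CIDs to the same characters, which is why an extracted subset is of little use elsewhere.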