the character is encoded as <c3a4> (which is correct imho)
but then mapped to ISOLatin1Encoding.

\u00e4 (a-umlaut) encoded as ISO 8859-1 should be just the single byte "e4".
What you have above is UTF-8, whereas the PS printing path is definitely
expecting 8859-1. I looked and found that when I reviewed this change I
commented that it should probably be 8859-1, but I didn't make a sufficient
point of it :-(
I thought that since we returned latin1 for the charset name we'd get the
right encoding, but apparently not, and I imagine that whatever testing was
done either didn't cover this range or the bug was overlooked.
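
To make the byte-level difference concrete, here is a minimal standalone sketch (not JDK code; the class name UmlautBytes and the hex helper are made up for illustration) that prints how \u00e4 comes out under each charset, formatted like a PostScript hex string:

```java
import java.nio.charset.StandardCharsets;

class UmlautBytes {
    // Render bytes as lowercase hex, the way they appear inside <...> in PostScript.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00e4"; // a-umlaut
        // UTF-8 needs two bytes for this character.
        System.out.println("UTF-8:      <" + hex(s.getBytes(StandardCharsets.UTF_8)) + ">");
        // ISO 8859-1 needs exactly one byte.
        System.out.println("ISO-8859-1: <" + hex(s.getBytes(StandardCharsets.ISO_8859_1)) + ">");
    }
}
```

The first line shows the <c3a4> seen in the generated file; the second shows the <e4> that a font re-encoded with ISOLatin1Encoding expects.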

The following is the quick fix I think we need, since I think printing, and
ONLY printing, ever uses this code when we are using fontconfig :-

diff --git a/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java b/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
--- a/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
+++ b/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
@@ -180,7 +180,7 @@
         String[] componentFaceNames = cfi[idx].getComponentFaceNames();
         FontDescriptor[] ret = new FontDescriptor[componentFaceNames.length];
         for (int i = 0; i < componentFaceNames.length; i++) {
-            ret[i] = new FontDescriptor(componentFaceNames[i], StandardCharsets.UTF_8.newEncoder(), new int[0]);
+            ret[i] = new FontDescriptor(componentFaceNames[i], StandardCharsets.ISO_8859_1.newEncoder(), new int[0]);
         }
         return ret;
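
As a rough illustration of why the encoder choice matters, the standalone sketch below (not JDK code; the class EncoderCoverage and its methods are hypothetical names) compares which characters the two encoders accept. What the printing code does with a character the encoder rejects is not shown here; that it would take some other rendering path is only an assumption.

```java
import java.nio.charset.StandardCharsets;

class EncoderCoverage {
    // True if a fresh ISO 8859-1 encoder can represent the character.
    static boolean latin1Covers(char c) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(c);
    }

    // True if a fresh UTF-8 encoder can represent the character.
    static boolean utf8Covers(char c) {
        return StandardCharsets.UTF_8.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char umlaut = '\u00e4'; // inside the Latin-1 range
        char euro = '\u20ac';   // outside the Latin-1 range
        System.out.println("latin1 covers U+00E4: " + latin1Covers(umlaut));
        System.out.println("latin1 covers U+20AC: " + latin1Covers(euro));
        System.out.println("utf8 covers U+00E4:   " + utf8Covers(umlaut));
        System.out.println("utf8 covers U+20AC:   " + utf8Covers(euro));
    }
}
```

The UTF-8 encoder accepts both characters, so a descriptor built with it would claim every string, while the ISO 8859-1 encoder only accepts what the PostScript encoding vector can actually name.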

-phil.



On 11/07/2014 08:36 AM, Mario Torre wrote:
Hi all,

I've been working on a strange issue recently; it seems to affect all
recent versions of OpenJDK as well as Oracle JDK.

The issue appears to be related to this change:

http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/rev/fbe9320339ea

As far as I could find by debugging OpenJDK, the issue is a mix of a
couple of things.

This change was addressing a PostScript size explosion, where missing font
descriptors in versions prior to this fix were causing characters to be
rendered as paths.

The new code creates an actual descriptor array, so fonts can be
rendered directly by PostScript. However, it seems that the PostScript
code assumes ISO_8859_1 encoding, so if I pass some characters with,
say, an umlaut, like 'ä', then instead of creating a path the character
is encoded as <c3a4> (which is correct imho) but then mapped to
ISOLatin1Encoding.
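
The effect of that mapping can be sketched in a few lines of standalone Java (not JDK code; the class name Latin1Misread is made up): the two UTF-8 bytes are each read as a separate Latin-1 character, which is what an interpreter using ISOLatin1Encoding would do with them.

```java
import java.nio.charset.StandardCharsets;

class Latin1Misread {
    // Interpret raw bytes one-to-one as ISO 8859-1 characters.
    static String readAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u00e4".getBytes(StandardCharsets.UTF_8); // c3 a4
        String seen = readAsLatin1(utf8);
        System.out.println("characters seen: " + seen.length());
        // Print code points rather than glyphs to avoid console-encoding issues.
        for (char c : seen.toCharArray()) {
            System.out.println(String.format("U+%04X", (int) c));
        }
    }
}
```

One input character turns into two: U+00C3 and U+00A4, which ISOLatin1Encoding names Atilde and currency.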

This is a snippet of the generated PostScript file; the file is
generated using a modified version of the PrintSE.java test in OpenJDK:

http://cr.openjdk.java.net/~neugens/psDieresisBug/PrintSEUmlauts.java

/ISOF {
      dup findfont dup length 1 add dict begin {
              1 index /FID eq {pop pop} {D} ifelse
      } forall /Encoding ISOLatin1Encoding D
      currentdict end definefont
} BD
/NZ {dup 1 lt {pop 1} if} BD
/S {
      moveto 1 index stringwidth pop NZ sub
      1 index length 1 sub NZ div 0
      3 2 roll ashow newpath} BD
12.0 12 F
<c3a4> 7.44 100.0 100.0 S
pgSave restore

I'm not really confident with PostScript at this level, so I would like
some hints on where to look for an actual fix.

I have a workaround that seems to work, something like:

GlyphVector gv = font.createGlyphVector(frc, "ä");
g2d.drawGlyphVector(gv, 250, 220);

which basically forces the glyph path again. And of course I could
revert the original change, but neither option seems correct.

My guess is that we should either somehow force ISO_8859_1 when calling
CharsetString[] makeMultiCharsetString from PSPrinterJob, or have a
proper fix for the Postscript file.

Any idea or hint is very much appreciated.

Cheers,
Mario


