PDTrueTypeFont limits number of glyph widths to 256. This can easily be removed.
--------------------------------------------------------------------------------

                 Key: PDFBOX-696
                 URL: https://issues.apache.org/jira/browse/PDFBOX-696
             Project: PDFBox
          Issue Type: Improvement
          Components: Parsing
    Affects Versions: 1.1.0
         Environment: Ubuntu Karmic
            Reporter: Michael Berg


Currently, support for fonts with exotic glyphs is limited at best. Making 
PDFBox render Chinese characters has proved to be a bit of a pain ... :-)

One blocker we ran into is that glyph widths are limited to 256 entries. 
In PDTrueTypeFont.java, we find this in loadDescriptorDictionary():

            int firstChar = 0;
            int maxWidths=256;
            HorizontalMetricsTable hMet = ttf.getHorizontalMetrics();
            int[] widthValues = hMet.getAdvanceWidth();
            List widths = new ArrayList(maxWidths);

The "int maxWidths=256" caps the remaining code, so glyph widths for 
character codes of 256 and above are ignored. We found that there is no need to 
impose such a limitation, and that having it makes it impossible to generate a 
proper /W array when generating a CIDFontType2 font. Simply replacing the 
hard-coded value 256 with the following seems to be a perfectly usable solution:

            int firstChar = 0;
            // int maxWidths = 256;  <---- no hard-coded value
            // Rather use the counted number of codepoints:
            int maxWidths = glyphToCCode.length;
            HorizontalMetricsTable hMet = ttf.getHorizontalMetrics();
            int[] widthValues = hMet.getAdvanceWidth();
            List widths = new ArrayList(maxWidths);
            Integer zero = new Integer( 250 );

Is it possible to have this change added to 1.2.0?
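
To illustrate the effect of the cap, here is a minimal standalone sketch. It is 
not the PDFBox code: the class GlyphWidthSketch and the method collectWidths are 
hypothetical names, plain arrays stand in for the font tables, and the upper 
bound is passed in as a parameter so both behaviours can be compared.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of PDFBox: builds a width list indexed by
// character code from a glyph-id -> character-code array and the advance
// widths (one per glyph id).
final class GlyphWidthSketch {

    static List<Float> collectWidths(int[] glyphToCCode, int[] advanceWidths,
                                     int maxWidths, float scaling, float defaultWidth) {
        // Start with the default width for every code in [0, maxWidths).
        List<Float> widths = new ArrayList<>(Collections.nCopies(maxWidths, defaultWidth));
        int glyphCount = Math.min(glyphToCCode.length, advanceWidths.length);
        for (int gid = 0; gid < glyphCount; gid++) {
            int code = glyphToCCode[gid];
            // Codes outside the bound are silently dropped; with a bound of
            // 256 this is what loses the widths of most CJK glyphs.
            if (code >= 0 && code < maxWidths) {
                widths.set(code, advanceWidths[gid] * scaling);
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        int[] glyphToCCode = {0, 65, 0x4E2D};   // made-up sample data
        int[] advanceWidths = {500, 600, 1000};
        List<Float> capped = collectWidths(glyphToCCode, advanceWidths, 256, 1f, 250f);
        List<Float> uncapped = collectWidths(glyphToCCode, advanceWidths, 0x4E2D + 1, 1f, 250f);
        System.out.println("capped entries:   " + capped.size());
        System.out.println("uncapped entries: " + uncapped.size());
    }
}
```

Note that the sketch takes the bound from the caller. Whether the glyph count or 
the highest character code plus one is the better bound for a given font is left 
open here.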

Also, we would be more than happy to contribute some code that shows how you can 
use PDFBox to produce PDFs containing special characters (Asian, Chinese, etc.) 
by using codepoint-to-glyph mapping, with copy-paste working (/ToUnicode). The 
code allows API users to simply use UTF-8 strings and not worry about any of 
the tricky font handling details.
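
As a rough idea of what codepoint-to-glyph mapping involves, here is a minimal 
sketch. It is not the code offered above: GlyphStringSketch, toHexGlyphString 
and the codePointToGlyph map are hypothetical, and it assumes two-byte glyph 
ids written as a PDF hex string, with unmapped code points falling back to 
glyph 0.

```java
import java.util.Map;

// Hypothetical helper, not part of PDFBox: turns a Java string into a PDF
// hex string of two-byte glyph ids using a caller-supplied mapping.
final class GlyphStringSketch {

    static String toHexGlyphString(String text, Map<Integer, Integer> codePointToGlyph) {
        StringBuilder hex = new StringBuilder("<");
        // codePoints() yields whole Unicode code points, so characters
        // outside the Basic Multilingual Plane are not split into surrogates.
        text.codePoints().forEach(cp -> {
            int gid = codePointToGlyph.getOrDefault(cp, 0); // assumption: 0 = missing glyph
            hex.append(String.format("%04X", gid & 0xFFFF));
        });
        return hex.append('>').toString();
    }
}
```

In a real font the map would be derived from the font's cmap table, and a 
matching /ToUnicode CMap would record the reverse direction so that copy-paste 
yields the original text.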



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
