Ben Short wrote:
> subType is /Type3
>
> Does this help identify the problem?
Yes, but it doesn't bring us closer to a solution. Type 3 fonts are "user-defined fonts". See for instance:
http://itextpdf.com/examples/index.php?page=example&id=200

In that example, two glyphs shaped like a delta and a sigma were defined, corresponding to the characters 'D' and 'S'. However, the example would have worked just as well with any other characters. Another example: we could define a glyph that looks like the symbol for 'The Artist Formerly Known As Prince' and make it correspond to the character 'P'.

That's what Type 3 fonts are about: they can be used when a user needs a glyph that isn't provided by any other font. That is also why it's very hard to extract text drawn with such a font: how are you going to know that the glyph corresponding to 'P' needs to be 'translated' to 'The Artist Formerly Known As Prince'? I don't think there's a Unicode code point for that glyph.

I think you've hit a limitation of text extraction in general.

-- 
This answer is provided by 1T3XT BVBA
http://www.1t3xt.com/ - http://www.1t3xt.info
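To make the limitation concrete, here is a minimal sketch in plain Java. It does not use the iText API. It assumes that an extractor only sees the character codes in the content stream, plus an optional code-to-Unicode table (the PDF specification allows a font to carry such a table as a ToUnicode CMap). The class name, the `extract` method, and the choice of U+0394 and U+03A3 for the delta and sigma glyphs are all hypothetical and only there for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class Type3ExtractionSketch {

    /**
     * Hypothetical extractor: turns character codes into text.
     * toUnicode stands in for an optional code-to-Unicode table;
     * when a code has no entry, the raw code is all we can return.
     */
    static String extract(byte[] codes, Map<Integer, String> toUnicode) {
        StringBuilder text = new StringBuilder();
        for (byte b : codes) {
            int code = b & 0xFF; // treat the byte as an unsigned character code
            String mapped = toUnicode.get(code);
            // Without a mapping, nothing tells us what the glyph looks like,
            // so we fall back to the character the code happens to match.
            text.append(mapped != null ? mapped : String.valueOf((char) code));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // The content stream shows the codes for 'D' and 'S', even though
        // the Type 3 font draws them as delta and sigma shapes.
        byte[] shown = "DS".getBytes(StandardCharsets.US_ASCII);

        // Case 1: the font carries no mapping. The extractor yields "DS".
        Map<Integer, String> noMapping = new HashMap<>();
        System.out.println(extract(shown, noMapping));

        // Case 2: the font's creator supplied a mapping (assumed here to be
        // U+0394 and U+03A3). Only then can the intended text be recovered.
        Map<Integer, String> withMapping = new HashMap<>();
        withMapping.put((int) 'D', "\u0394");
        withMapping.put((int) 'S', "\u03A3");
        System.out.println(extract(shown, withMapping));
    }
}
```

In the first case the extractor has no way of knowing that 'D' was drawn as a delta. For a glyph that has no Unicode code point at all, such as the Prince symbol, even a mapping table would have nothing sensible to map to.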