[ https://issues.apache.org/jira/browse/PDFBOX-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948889#comment-16948889 ]
Struve Pierre commented on PDFBOX-4667:
---------------------------------------

Thanks for the reply. I do not know either whether it is a bug or not :)

I wanted to know whether this behavior had already been reported. I found no related issue, but I'm a poor searcher.

So here is some code from PDFBox.

{code:java}
// extract from org.apache.pdfbox.pdmodel.font.FontMapperImpl#isCharSetMatch
long JIS_JAPAN = 1 << 17;
long CHINESE_SIMPLIFIED = 1 << 18;
long KOREAN_WANSUNG = 1 << 19;
long CHINESE_TRADITIONAL = 1 << 20;
long KOREAN_JOHAB = 1 << 21;

if (cidSystemInfo.getOrdering().equals("GB1") &&
        (codePageRange & CHINESE_SIMPLIFIED) == CHINESE_SIMPLIFIED)
{
    return true;
}
...
{code}

The issue is that, since -1 is 0xFFFFFFFFFFFFFFFF (every bit set), each of these bit tests succeeds and the provided font matches all of the code page ranges.

I will try to provide some code, but I'm not sure the problem would be visible on a PC: it might not appear because of the preinstalled fonts.

To reproduce it locally I had to write my own implementations of FontMapper and FontProvider so that fonts are loaded from a folder instead of from the system. I might have missed something, though, and this may already be possible in PDFBox.

So I'll try to provide the code and a PDF file tomorrow or at the beginning of next week.

> Issue in FontMapperImpl#isCharSetMatch when font codePageRange is -1
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-4667
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4667
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.16
>            Reporter: Struve Pierre
>            Priority: Trivial
>         Attachments: OcrB Regular.ttf, screenshot-1.png
>
>
> Hi I met an issue with a font.
> It seems to me that code page range has not been set and then in
> org.apache.pdfbox.pdmodel.font.FontMapperImpl#isCharSetMatch -1 is used.
> It seems to me that -1 means "open bar".
> I was trying to find a font that matches CHINESE_SIMPLIFIED
> (cidSystemInfo.getOrdering().equals("GB1")),
> and due to the -1 this font was matched and unluckily it was the one picked.
> Do you think we can make a special case for -1 (return false)?
> Is there any font currently that matches all code page ranges?
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
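
For reference, the effect of -1 on the bit tests discussed above can be reproduced without any font or PDF. The following is only a minimal standalone sketch: the constants are copied from the quoted excerpt, while the class name `CodePageRangeDemo` and the methods `hasBit` and `hasBitGuarded` are hypothetical names made up for this illustration and are not part of PDFBox. The guarded variant shows the special case proposed in the issue (return false for -1); it is an assumption about one possible fix, not a description of what PDFBox does.

```java
// Standalone illustration; not PDFBox code.
class CodePageRangeDemo {
    // Bit values copied from the excerpt of FontMapperImpl#isCharSetMatch.
    static final long JIS_JAPAN = 1L << 17;
    static final long CHINESE_SIMPLIFIED = 1L << 18;
    static final long KOREAN_WANSUNG = 1L << 19;
    static final long CHINESE_TRADITIONAL = 1L << 20;
    static final long KOREAN_JOHAB = 1L << 21;

    // Same bit test as in the excerpt: true when charsetBit is set in codePageRange.
    static boolean hasBit(long codePageRange, long charsetBit) {
        return (codePageRange & charsetBit) == charsetBit;
    }

    // Hypothetical guard: treat -1 as "not set" instead of "supports everything".
    static boolean hasBitGuarded(long codePageRange, long charsetBit) {
        if (codePageRange == -1L) {
            return false;
        }
        return hasBit(codePageRange, charsetBit);
    }

    public static void main(String[] args) {
        long unset = -1L;
        // In two's complement, -1 has all 64 bits set.
        System.out.println(Long.toHexString(unset)); // ffffffffffffffff

        long[] bits = { JIS_JAPAN, CHINESE_SIMPLIFIED, KOREAN_WANSUNG,
                        CHINESE_TRADITIONAL, KOREAN_JOHAB };
        for (long bit : bits) {
            // Unguarded test is true for every bit; guarded test is false.
            System.out.println(hasBit(unset, bit) + " / " + hasBitGuarded(unset, bit));
        }

        // A value with only the CHINESE_SIMPLIFIED bit behaves the same in both variants.
        System.out.println(hasBitGuarded(CHINESE_SIMPLIFIED, CHINESE_SIMPLIFIED)); // true
        System.out.println(hasBitGuarded(CHINESE_SIMPLIFIED, JIS_JAPAN));          // false
    }
}
```

With the guard in place, a font whose code page range is -1 would never be selected through this charset check, so whether that is desirable depends on the answer to the last question in the issue (whether any real font legitimately matches all code page ranges).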