Okay, that makes sense. Back to my root problem. I have a string which I want to write to a PDF. This string contains both Latin and Arabic characters. I can't use Arial Unicode MS since we are running on a Linux system, and I have been told we have some licensing concerns as well since PDFBox embeds the font within the PDF file. I have found that Noto Naskh Arabic does support Arabic characters but it doesn't support Latin characters, so my string, which happens to contain both, throws a "no glyph" exception when I try to print it.
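To make the problem concrete, here is a minimal sketch of breaking such a mixed string into runs that a single font could each cover. It uses only the JDK's Character.UnicodeScript and no PDFBox calls; the names ScriptRuns, Run and split are hypothetical, and it assumes that classifying characters by Unicode script is a good enough stand-in for "this font has a glyph", which may not hold for every font. It also does nothing about right-to-left ordering or Arabic shaping.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class ScriptRuns {

    /** Consecutive characters that one font is assumed to cover. */
    static final class Run {
        final boolean arabic; // true: use the Arabic font; false: use the Latin font
        final String text;

        Run(boolean arabic, String text) {
            this.arabic = arabic;
            this.text = text;
        }
    }

    /** Splits text into alternating Arabic and non-Arabic runs. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean currentArabic = false;

        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);

            // Assumption: spaces, digits and punctuation (COMMON/INHERITED)
            // stay with the run they follow, so both fonts need those glyphs.
            boolean neutral = script == UnicodeScript.COMMON
                    || script == UnicodeScript.INHERITED;
            boolean arabic = neutral ? currentArabic : script == UnicodeScript.ARABIC;

            // A change of script closes the current run and starts a new one.
            if (arabic != currentArabic && current.length() > 0) {
                runs.add(new Run(currentArabic, current.toString()));
                current.setLength(0);
            }
            currentArabic = arabic;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(new Run(currentArabic, current.toString()));
        }
        return runs;
    }

    public static void main(String[] args) {
        // "Hello", an Arabic word written with escapes, then "world".
        for (Run run : split("Hello \u0645\u0631\u062D\u0628\u0627 world")) {
            System.out.println((run.arabic ? "arabic: " : "latin:  ") + run.text);
        }
    }
}
```

Each run could then be written with its own font, switching fonts between runs, so the position of the Latin characters would not need to be known in advance.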
One idea I have seen is to use a TrueType collection, but when loading a TTC you seem to reference only a single font within the collection, which won't allow characters from both Latin and Arabic to be printed from a single string. Yes, in theory I could split this string, but I can't be guaranteed where the Latin characters occur. Maybe I'm misunderstanding how TTC works.

Thanks,
Dan

-----Original Message-----
From: Andreas Lehmkuehler [mailto:[email protected]]
Sent: Tuesday, October 18, 2016 2:21 PM
To: [email protected]
Subject: Re: Supporting multiple languages, including CJK

On 18.10.2016 at 15:32, Daniel King wrote:
> I'm curious why you shouldn't load fonts that are scanned in by PDFBox using
> org.apache.fontbox.util.autodetect.FontDirFinder and instead reference a hard
> coded system directory?

Because you don't know what you get when asking the FontMapper for "Arial", especially if you run your code on different environments or operating systems. You may get a simple Arial font with a limited charset, or you may get "Arial Unicode MS", which has wide support for non-Latin charsets, or you may get any Arial-like font. IMHO there are too many "may"s, especially if you are looking for a CJK-capable font.

As John already said, the best idea is to choose the font on your own to be sure you get what you are looking for.

BR
Andreas

>
> -----Original Message-----
> From: John Hewson [mailto:[email protected]]
> Sent: Tuesday, October 18, 2016 3:09 AM
> To: [email protected]
> Subject: Re: Supporting multiple languages, including CJK
>
>
>> On 12 Oct 2016, at 05:24, Daniel King <[email protected]> wrote:
>>
>> Hi,
>>
>> I'm attempting to write text to a PDF in situations where I need to
>> support multiple languages on a single PDF. This may include regular
>> Latin characters as well as CJK characters. I've made many attempts
>> to do this and have it load the character sets from the OS, without
>> much success. The farthest I have gotten is support for Latin characters,
>> some Russian and, I believe, Vietnamese characters found in the
>> embedded fonts example here:
>> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
>>
>> I'm taking a similar approach to the example, but I believe I'm using
>> the FileSystemFontProvider provided by the FontMappers class by doing
>> something such as
>>
>>   TrueTypeFont ttf = FontMappers.instance().getTrueTypeFont("Arial", null).getFont();
>>   PDFont font = PDType0Font.load(signatureDocument, ttf.getOriginalData());
>
> Don’t load fonts like this. Follow the approach from the EmbeddedFonts
> example and load them from the filesystem.
>
>> As I mentioned, I seem to be able to support the text in the EmbeddedFonts
>> example but can't seem to determine how I can also support CJK. I’m
>> currently using 2.0.2 of PDFBox but could potentially upgrade to 2.0.3 if
>> that would help at all.
>
> If you have a font which supports CJK then PDFBox should be able to use it. I
> recommend “Arial Unicode MS” as a good starting point, as it provides many
> more Unicode characters than plain “Arial”. Google’s Noto fonts also provide
> a great selection of characters.
>
> -- John
>
>> Thanks for the help,
>> Dan
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

