Okay, that makes sense. Back to my root problem: I have a string that I want 
to write to a PDF, and it contains both Latin and Arabic characters. I can't 
use Arial Unicode MS because we are running on a Linux system, and I have been 
told there are licensing concerns as well, since PDFBox embeds the font within 
the PDF file. I have found that Noto Naskh Arabic supports Arabic characters 
but not Latin characters, so a string that contains both throws a "no glyph" 
exception when I try to write it.

One idea I have seen is to use a TrueType collection, but when loading the TTC 
you seem to reference only a single font within the collection, which won't 
allow both Latin and Arabic characters to be printed from a single string. 
Yes, in theory I could split the string, but I can't be sure where the Latin 
characters occur. Maybe I'm misunderstanding how TTC works.
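
To make the splitting idea concrete: the position of the Latin characters 
does not need to be known in advance if the string is cut wherever the Unicode 
script changes. The following is only a minimal sketch using the JDK's 
Character.UnicodeScript. The class name ScriptRuns, the Run type and the font 
keys "latin" and "arabic" are made up for illustration and are not part of 
PDFBox. Spaces, digits and punctuation (script COMMON or INHERITED) are simply 
kept with the run they follow, which assumes the chosen font has glyphs for 
them.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

/** Splits mixed-script text into runs so each run can be drawn with a font that covers it. */
final class ScriptRuns {

    /** A contiguous piece of text together with the (hypothetical) font key chosen for it. */
    static final class Run {
        final String fontKey;
        final String text;

        Run(String fontKey, String text) {
            this.fontKey = fontKey;
            this.text = text;
        }
    }

    // Hypothetical keys; map them to whichever fonts you load yourself.
    static final String LATIN_FONT = "latin";
    static final String ARABIC_FONT = "arabic";

    /** Returns a font key for the code point, or null if it is script-neutral. */
    static String fontKeyFor(int codePoint) {
        UnicodeScript script = UnicodeScript.of(codePoint);
        if (script == UnicodeScript.ARABIC) {
            return ARABIC_FONT;
        }
        if (script == UnicodeScript.COMMON || script == UnicodeScript.INHERITED) {
            return null; // spaces, digits, punctuation, combining marks
        }
        return LATIN_FONT; // assumption: everything else goes to the Latin font
    }

    /** Cuts the text wherever the required font changes. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentKey = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String key = fontKeyFor(cp);
            if (key != null && currentKey != null && !key.equals(currentKey)) {
                // Script changed: close the run collected so far.
                runs.add(new Run(currentKey, current.toString()));
                current.setLength(0);
            }
            if (key != null) {
                currentKey = key;
            }
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            // Text made only of neutral characters falls back to the Latin font.
            runs.add(new Run(currentKey == null ? LATIN_FONT : currentKey, current.toString()));
        }
        return runs;
    }
}
```

Each run could then presumably be written with its own font by calling 
setFont and showText on the PDPageContentStream once per run. Arabic shaping 
and right-to-left ordering are a separate concern that this sketch does not 
address.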

Thanks,
Dan

-----Original Message-----
From: Andreas Lehmkuehler [mailto:[email protected]] 
Sent: Tuesday, October 18, 2016 2:21 PM
To: [email protected]
Subject: Re: Supporting multiple languages, including CJK

Am 18.10.2016 um 15:32 schrieb Daniel King:
> I'm curious why you shouldn't load fonts that are scanned in by PDFBox using 
> org.apache.fontbox.util.autodetect.FontDirFinder and instead reference a hard 
> coded system directory?
Because you don't know what you will get when asking the FontMapper for 
"Arial", especially if you run your code in different environments or on 
different operating systems.

You may get a simple Arial font with a limited charset, or you may get "Arial 
Unicode MS", which has wide support for non-Latin charsets, or you may get any 
Arial-like font.

IMHO there are too many "may"s, especially if you are looking for a 
CJK-capable font.

As John already said, the best idea is to choose the font yourself, so you can 
be sure you get what you are looking for.

BR
Andreas

>
> -----Original Message-----
> From: John Hewson [mailto:[email protected]]
> Sent: Tuesday, October 18, 2016 3:09 AM
> To: [email protected]
> Subject: Re: Supporting multiple languages, including CJK
>
>
>> On 12 Oct 2016, at 05:24, Daniel King <[email protected]> wrote:
>>
>> Hi,
>>
>> I'm attempting to write text to a PDF in situations where I need to 
>> support multiple languages in a single PDF. This may include regular 
>> Latin characters as well as CJK characters. I've made many attempts 
>> to do this and have it load the character sets from the OS, without 
>> much success. The farthest I have gotten is supporting Latin characters, 
>> some Russian and, I believe, Vietnamese characters found in the 
>> embedded fonts example here:
>> https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/EmbeddedFonts.java?view=markup
>>
>> I'm doing a similar approach from the example but I believe I'm using 
>> the FileSystemFontProvider provided by the FontMappers class by doing 
>> something such as
>>
>> TrueTypeFont ttf = FontMappers.instance().getTrueTypeFont("Arial", null).getFont();
>> PDFont font = PDType0Font.load(signatureDocument, ttf.getOriginalData());
>
> Don’t load fonts like this. Follow the approach from the EmbeddedFonts 
> example and load them from the filesystem.
>
>> As I mentioned I seem to be able to support the text in the EmbeddedFonts 
>> example but can't seem to determine how I can also support CJK. I’m 
>> currently using 2.0.2 of PDFBox but could potentially upgrade to 2.0.3 if 
>> that would help at all.
>
> If you have a font which supports CJK then PDFBox should be able to use it. I 
> recommend “Arial Unicode MS” as a good starting point, as it provides many 
> more Unicode characters than plain “Arial”. Google’s Noto fonts also provide 
> a great selection of characters.
>
> — John
>
>> Thanks for the help,
>> Dan
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>


