Hi Volker,

As you suggested, my problem wasn't the encoding; it was caused by a missing font file. Besides "AR PL UMing HK", you can use any font of your choice by modifying fontconfig.py (as long as you don't violate copyright ;)
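In case it helps anyone choosing a font: as far as I understand, the font has to contain glyphs for every character in the article, and characters it lacks are what show up as white boxes. Below is a minimal sketch (plain Python 3, standard library only; the helper name `describe_chars` is my own and not part of mwlib) that lists the code points and Unicode names of the non-ASCII characters in a sample string, so they can be checked against a font's coverage by hand.

```python
import unicodedata


def describe_chars(text):
    """Return (code point, Unicode name) pairs for each distinct non-ASCII character."""
    seen = []
    for ch in text:
        # Skip plain ASCII and characters already recorded.
        if ord(ch) > 0x7F and ch not in seen:
            seen.append(ch)
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")) for ch in seen]


# Sample string containing Japanese punctuation (hypothetical input, not taken from the article).
for code, name in describe_chars("NASA、。"):
    print(code, name)
```

This only tells you which characters are present in the text; whether a given font actually covers them still has to be checked in the font itself.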
I still have a minor problem here: Japanese punctuation (comma <U+3001> and full stop <U+3002>) is still rendered as white boxes. At the moment I cannot tell whether mwlib covers this issue.

BTW, I will try your linebreaking patch later.

Thanks for your help.

BR,
Yoichi

On July 22, 7:51 PM, Volker Haas <[email protected]> wrote:
> Hi,
>
> I believe the real problem might not be a false encoding.
>
> Which pdfviewer did you use to view the PDF? If you are not using the
> Adobe Reader japanese characters will probably not display because Adobe
> fonts are used to render japanese.
>
> Since you're using a linux machine you probably don't want to use the
> Adobe Reader. In that case you need to install several fonts as
> described in the README.txt in mwlib.rl .
> You can start by only installing "AR PL UMing HK" since this is the only
> font you need for japanese.
>
> Please let me know if that solves your problem.
>
> Furthermore any help/hints etc. regarding rendering PDFs in japanese is
> highly appreciated. If you'd like to contribute (not necessarily code)
> this would be great ;)
>
> Regards,
> Volker
>
> Yoichi KATO wrote:
> > Hello -
> >
> > I've tested mw-render as mentioned in mw-render examples page on my
> > CentOS box and found that encoding setting for mw-render isn't
> > correctly handled for Japanese environment.
> >
> > Here's what I tried:
> >
> > $ mw-render -c :en -w rl -o nasa.pdf NASA
> >
> > to get English Wikipedia, which works properly, but for Japanese WP:
> >
> > $ mw-render --config=http://ja.wikipedia.org/w/-L ja -w rl -o nasa_ja.pdf NASA
> >
> > gives me a PDF file with white boxes for Japanese-specific
> > characters. When I cut/paste garbled characters from pdf to text
> > editor, everything looks fine.
> >
> > PDF file property window says it has UniGB-UCS2-H encoding. That is for
> > a Chinese language environment, and it should be UniJIS-UCS2-H encoding
> > (as described in mwlib/reportlab/pdfbase/_cidfontdata.py).
> >
> > Can somebody tell how to generate pdf file with a correct Japanese
> > encoding?
> >
> > Cheers,
>
> --
> volker haas                 brainbot technologies ag
> fon +49 6131 2116394        boppstraße 64
> fax +49 6131 2116392        55118 mainz
> [email protected]            http://www.brainbot.com/
