[ 
https://issues.apache.org/jira/browse/PDFBOX-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407489#comment-17407489
 ] 

Tilman Hausherr commented on PDFBOX-4189:
-----------------------------------------

This would have to be implemented in the source code. See Language.java in 
fontbox; it describes what needs to be done to support a new language 
(implement a new GsubWorker). Currently there is only 
GsubWorkerForBengali.java. You would need to understand what Palash Ray has 
done and why. I assume you'd need to know about Bengali and Malayalam glyphs, 
i.e. how the substitutions are done. Maybe it's a similar principle, maybe it 
isn't. Nobody on the team knows this, AFAIK. You also need to be able to build 
from source. The current implementation is incomplete: the visual output is 
fine, but text extraction is wrong. You're welcome to try.
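
For orientation only, here is a rough sketch of the general idea behind 
glyph substitution: sequences of glyph IDs are replaced by a single 
compound glyph, with the longest matching sequence winning. This is not the 
fontbox API and not what GsubWorkerForBengali.java does. The class name, 
the method and the glyph IDs are all made up for illustration, and a real 
worker also has to deal with script-specific features, lookup order and 
reordering, which this ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class GlyphSubstitutionSketch {

    // Hypothetical helper, NOT part of fontbox: greedily replaces the longest
    // matching glyph-ID sequence with the single glyph the rule maps it to.
    public static List<Integer> applySubstitutions(List<Integer> glyphIds,
                                                   Map<List<Integer>, Integer> rules) {
        // Longest rule length bounds how far ahead we need to look.
        int maxLen = 1;
        for (List<Integer> key : rules.keySet()) {
            maxLen = Math.max(maxLen, key.size());
        }

        List<Integer> result = new ArrayList<>();
        int i = 0;
        while (i < glyphIds.size()) {
            boolean matched = false;
            int longest = Math.min(maxLen, glyphIds.size() - i);
            // Try the longest candidate first, then shorter ones.
            for (int len = longest; len >= 1; len--) {
                Integer replacement = rules.get(glyphIds.subList(i, i + len));
                if (replacement != null) {
                    result.add(replacement);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                // No rule applies here: keep the glyph unchanged.
                result.add(glyphIds.get(i));
                i++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up glyph IDs; a real table would come from the font's GSUB data.
        Map<List<Integer>, Integer> rules = Map.of(
                List.of(10, 11), 100,
                List.of(10, 11, 12), 200);
        // prints [200, 13, 100]
        System.out.println(applySubstitutions(List.of(10, 11, 12, 13, 10, 11), rules));
    }
}
```

Which sequences map to which glyphs, and in what order the rules apply, is 
exactly the language-specific knowledge mentioned above.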

> Enable PDF creation with Indian languages, by reading and utilizing the GSUB 
> table
> ----------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4189
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4189
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox, PDModel
>            Reporter: Palash Ray
>            Priority: Major
>         Attachments: Bengali-text-after.pdf, Bengali-text-before.pdf, 
> BengaliPdfGenerationHelloWorld.java, bengali-example.pdf, 
> bengali-example2.pdf, bengali-example3.pdf, bengali-word-lohit-bad.pdf, 
> bengali-word-lohit-good.pdf, committed.patch, pdf-output.png, screenshot.png
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Implemented proper rendering of Indian languages, which need extensive Glyph 
> substitution. The GSUB table has been read and used effectively to replace 
> some compound words with their respective Glyphs. All tests are passing. I 
> have tested this for the Bengali font. Please review these changes and let me 
> know if it makes sense to incorporate these.



