[
https://issues.apache.org/jira/browse/PDFBOX-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508390#comment-17508390
]
Jeyan edited comment on PDFBOX-4213 at 3/17/22, 8:04 PM:
---------------------------------------------------------
[~tilman] thanks for your time.
OK, I am working on adding a GsubWorker implementation for Tamil. I am using
3.0.0-RC1 because [~paawak]'s code is not available in 2.0.25.
In addition, I tried HarfBuzz, as suggested by a SO user [SO
Link|https://stackoverflow.com/a/71325241/1341062]. It appears to work: if I
provide a font and Unicode text as input to hb-shape, it returns glyph IDs in
the expected order, with substitutions and reordering applied. However, as
expected, it handles only the glyph ID shaping part. I am looking into whether
it can be used with PDFBox for complex glyph ID substitution and ordering.
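To make the hb-shape experiment concrete, here is a minimal sketch of calling it from Java and reading back the glyph IDs. It assumes the hb-shape executable is on the PATH, that the text is a single line, and that the flags --no-glyph-names, --no-positions and --no-clusters reduce the output to the form [12|34|56]. The class and method names are hypothetical and not part of PDFBox or HarfBuzz.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

class HbShapeGlyphIds {

    // Parses one line of hb-shape output of the assumed form "[12|34|56]",
    // i.e. with glyph names, positions and clusters switched off.
    static int[] parseGlyphIds(String line) {
        String s = line.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Unexpected hb-shape output: " + line);
        }
        String body = s.substring(1, s.length() - 1);
        if (body.isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(body.split("\\|")).mapToInt(Integer::parseInt).toArray();
    }

    // Runs hb-shape (assumed to be installed and on the PATH) for one line of
    // text and returns the shaped glyph IDs in visual order.
    static int[] shape(Path fontFile, String text) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("hb-shape", "--no-glyph-names", "--no-positions",
                "--no-clusters", fontFile.toString(), text)
                .redirectErrorStream(true)
                .start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (p.waitFor() != 0) {
            throw new IOException("hb-shape failed: " + out);
        }
        return parseGlyphIds(out);
    }
}
```

This drops the positioning data on purpose. Kerning and mark placement would need the advances and offsets as well, which hb-shape prints when --no-positions is omitted.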
Suppose the sequence is as follows:
# Parse the TTF/OTF font file - PDFBox
# Receive the input text (showText) - PDFBox
# Convert the input text into Unicode code points - PDFBox
# Get the GID/CID (cmap lookup, CID cmap, GSUB, GPOS) from the font file/parser
object for the code points, in the required order - PDFBox
# Get the glyph from the font file using the GID/CID - PDFBox
# Embed the glyph subset - PDFBox
# Generate the byte stream for the writer to create the PDF file - PDFBox
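As an illustration of the third step, the sketch below lists the code points of the word from the issue description using only the JDK. The class name is hypothetical. The second code point, U+0BC7, is a vowel sign that is stored after its consonant but displayed to its left, which is why a plain cmap lookup in logical order is not enough for Tamil.

```java
class TamilCodePoints {

    // Converts a Java string (UTF-16) into Unicode code points.
    static int[] toCodePoints(String text) {
        return text.codePoints().toArray();
    }

    public static void main(String[] args) {
        // The word from the issue description, written with Unicode escapes
        // so the example does not depend on the source file encoding.
        String word = "\u0BAA\u0BC7\u0BB8\u0BCD\u0BAA\u0BC1\u0B95\u0BCD";
        for (int cp : toCodePoints(word)) {
            // Print each code point with its official Unicode character name.
            System.out.printf("U+%04X %s%n", cp, Character.getName(cp));
        }
    }
}
```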
Can I assume that the fourth item in the above list can be replaced with the
HarfBuzz shaping engine? HarfBuzz would take care of arranging the glyphs
correctly, provided we send it the Unicode code points of the input text and the
font file. PDFBox would then receive the ordered glyph IDs and continue with
fetching the glyphs from the font file, subsetting, and writing the byte stream.
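To show what handing ordered glyph IDs back to the PDF side could look like, here is a minimal JDK-only sketch. It assumes the font is embedded as a composite (Type 0) font with Identity-H encoding and that the CID equals the glyph ID in the font file, so each glyph is written as a two-byte big-endian code. The class and method names are hypothetical. The sketch does not cover subsetting, ToUnicode mapping or positioning, and it does not assume any particular PDFBox API for injecting pre-shaped glyphs.

```java
class GlyphIdEncoder {

    // Encodes each glyph ID as two big-endian bytes, the code format
    // assumed here for a composite font with Identity-H encoding.
    static byte[] encode(int[] glyphIds) {
        byte[] out = new byte[glyphIds.length * 2];
        for (int i = 0; i < glyphIds.length; i++) {
            int gid = glyphIds[i];
            if (gid < 0 || gid > 0xFFFF) {
                throw new IllegalArgumentException("Glyph ID out of range: " + gid);
            }
            out[2 * i] = (byte) (gid >> 8);
            out[2 * i + 1] = (byte) gid;
        }
        return out;
    }

    // Formats the bytes as a PDF hex string, usable as a text-showing operand.
    static String toHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder("<");
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.append('>').toString();
    }
}
```

For example, the illustrative glyph IDs 0x12 and 0x134 would be written as <00120134> in front of a Tj operator. The subsetter would still need to know which glyph IDs were used.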
What would the challenges be? I can think of the following:
# Kerning and positioning?
# Subsetting
# Integration with a C++ codebase
Could you please comment on this when you have time?
> UNICODE fonts UTF8
> -------------------
>
> Key: PDFBOX-4213
> URL: https://issues.apache.org/jira/browse/PDFBOX-4213
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox, PDModel
> Affects Versions: 2.0.7
> Reporter: tritmain
> Priority: Major
> Attachments: pdf_utf_iss.png
>
>
> When we use a Tamil font with UTF-8 support in PDFBox with
> String testSting=" பேஸ்புக் " in a Java application, the output in the PDF
> looks like the attached image pdf_utf_iss.png, which is wrong.
> Some other fonts work perfectly: "ஆஈஊஐஏளறனடணச"
> Please help us to resolve the issue.
>
>
> ----------
> File tamilFontFilePattinatharGist = new File(
>     this.getServletContext().getRealPath("/fonts/GIST-TAM-OTPattinathar_N_Ship.ttf"));
> // Not OK with பேஸ்புக்
> PDType0Font fontPattinatharGist = PDType0Font.load(document, tamilFontFilePattinatharGist);
> contentStream.setFont(fontPattinatharGist, 15);
> String testSting = "ஆஈ பேஸ்புக் ஆஈஊஐஏளறனடணசஞஇஉஎகபமதநயழரலஙவொஓஔ\\r\\nஆஈஊஐஏளறனடணசஞஇஉஎகபமதநயழரலஙவொஓஔ";
> contentStream.showText(testSting);
> System.out.println(testSting);
>