[
https://issues.apache.org/jira/browse/PDFBOX-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Navnath Kumbhar updated PDFBOX-3986:
------------------------------------
Description:
Hello Support Team,
I am working on a task where I have to extract formulas from a PDF document
and convert them into images.
But when I extract them using PDFBox, some symbols such as *Summation*,
*Integral*, or *Big Parenthesis* get mixed up with the previous line.
I checked the output of the DrawPrintTextLocations example with that
particular PDF document and the result does not look right.
The red boxes are not aligned properly in the output, as you will see in the
attached files.
I am attaching the output of two pages and the PDF document itself.
*Please refer to page 34 or 37 for this issue.*
Thank you in advance!
was:
Hello Support Team,
I am working on a task where I have to extract formulas from a PDF document
and convert them into images.
But when I extract them using PDFBox, some symbols such as *Summation*,
*Integral*, or *Big Parenthesis* get mixed up with the previous line.
I checked the output of the DrawPrintTextLocations example with that
particular PDF document and the result does not look right.
The red boxes are not aligned properly in the output, as you can see.
I am attaching the output of two pages and the PDF document itself.
*Please refer to page 34 or 37 for this issue.*
Thank you in advance!
> Bounding box of mathematical symbols are not proper
> ---------------------------------------------------
>
> Key: PDFBOX-3986
> URL: https://issues.apache.org/jira/browse/PDFBOX-3986
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Environment: Windows 7 (64 bit)
> Reporter: Navnath Kumbhar
> Attachments: formula-marked-34.png, formula-marked-37.png, formula.pdf
>
>
> Hello Support Team,
> I am working on a task where I have to extract formulas from a PDF document
> and convert them into images.
> But when I extract them using PDFBox, some symbols such as *Summation*,
> *Integral*, or *Big Parenthesis* get mixed up with the previous line.
> I checked the output of the DrawPrintTextLocations example with that
> particular PDF document and the result does not look right.
> The red boxes are not aligned properly in the output, as you will see in the
> attached files.
> I am attaching the output of two pages and the PDF document itself.
> *Please refer to page 34 or 37 for this issue.*
> Thank you in advance!