On 25.06.2019 at 06:53, [email protected] wrote:
> Hi,
> Currently, ExtractText does not separate superscripts from the regular
> text of the source PDF document. A helpful hint I found is here
> https://stackoverflow.com/questions/27700500/superscript-and-subscript-differentiation-using-pdf-box/32671815
> but it seems to be outdated.
> Is there a way to wrap each superscript in an arbitrary tag, such as
> <sup>exponent-1</sup> ........<sup>exponent-n</sup>?
I can't offer further help. PDFTextStripper hasn't changed much in many
years: the core algorithm is still the same because none of us dares to
touch it. So the Stack Overflow hint is probably not outdated, and you
should try it.
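
As a rough illustration of the kind of heuristic such an approach relies on,
here is a minimal sketch, not PDFBox code. It assumes you have already
collected, per text line, each glyph's text, baseline and font size (in
PDFBox these could presumably come from TextPosition.getUnicode(),
getYDirAdj() and getFontSizeInPt() inside an overridden
PDFTextStripper.writeString(String, List<TextPosition>)). The Glyph class,
the tagLine method and both thresholds are hypothetical and would need
tuning on real documents.

```java
import java.util.Arrays;
import java.util.List;

class SuperscriptTagger {

    /** Hypothetical stand-in for the per-glyph data a TextPosition carries. */
    static final class Glyph {
        final String text;
        final double baselineY; // y grows downward, as with getYDirAdj()
        final double fontSize;  // in points, as with getFontSizeInPt()

        Glyph(String text, double baselineY, double fontSize) {
            this.text = text;
            this.baselineY = baselineY;
            this.fontSize = fontSize;
        }
    }

    // Assumed thresholds, not taken from PDFBox.
    static final double SIZE_RATIO = 0.9;  // superscript must be smaller than this share of body size
    static final double RAISE_RATIO = 0.1; // and raised by at least this share of body size

    /** Wraps each run of raised, smaller glyphs in <sup>...</sup>. */
    static String tagLine(List<Glyph> line) {
        if (line.isEmpty()) {
            return "";
        }
        // Treat the first glyph with the largest font size as body text.
        Glyph ref = line.get(0);
        for (Glyph g : line) {
            if (g.fontSize > ref.fontSize) {
                ref = g;
            }
        }
        StringBuilder out = new StringBuilder();
        boolean inSup = false;
        for (Glyph g : line) {
            boolean sup = g.fontSize < SIZE_RATIO * ref.fontSize
                    && g.baselineY < ref.baselineY - RAISE_RATIO * ref.fontSize;
            if (sup && !inSup) {
                out.append("<sup>");
            }
            if (!sup && inSup) {
                out.append("</sup>");
            }
            out.append(g.text);
            inSup = sup;
        }
        if (inSup) {
            out.append("</sup>"); // close a run that reaches the end of the line
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Glyph> line = Arrays.asList(
                new Glyph("x", 100.0, 12.0),
                new Glyph("2", 95.0, 8.0),
                new Glyph(" + y", 100.0, 12.0));
        System.out.println(tagLine(line)); // prints x<sup>2</sup> + y
    }
}
```

Glyphs that are smaller but sit below the baseline (subscripts) are left
untagged here; a second check with the sign of the offset reversed could
handle them the same way.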
Tilman
PS: in the future (next topic), please use the users mailing list, not
the dev mailing list.