Hello,
There are several similar issues that concern formatting after PDF to
HTML conversion:
https://issues.apache.org/jira/browse/PDFBOX-6
https://issues.apache.org/jira/browse/PDFBOX-271
I would like to work on them (I see that some work has been done by
rrufai already, but the PDFBox code has changed since then, so I may need
to make some additional changes), but I see that the severity of all these
issues is minor and there have not been any comments on them for a long
time. That's why I am not sure whether it makes sense to work on them.
If not, then maybe they can be closed?
Thank you,
LX