Here is another note from René:

On Mon, Sep 13, 2010 at 11:53, Rene Rebe <r...@exactcode.de> wrote:

Note that I wrote the initial hOCR annotation in cuneiform, ... :-)

If they desperately want to keep this new format, one could add 2
different hOCR formats, like hocr and hocr-detailed or so to cuneiform.

Alternatively, one could use both my initial style of per-character bbox
annotation and their new whole-span bbox. However, I see no reason for
this imprecise and hard-to-parse span-after-span x_bboxes format: per-glyph
bboxes would retain compatibility with hocr2pdf and also give higher
precision, as the x_bboxes do not contain the individual heights, ...

       René

-- 
Font size not correct in merged sandvich PDF
https://bugs.launchpad.net/bugs/623438
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.

Status in Linux port of Cuneiform: Confirmed
Status in “exactimage” package in Ubuntu: New

Bug description:
After processing with Cuneiform for Linux 1.0.0 and the hOCR to PDF converter,
version 0.7.4 (should be the most current version), I get a sandwich PDF that
looks nice until I select text.

See the sample 5AADFEE1-0000.* files in the attachment and the result.pdf.
The effect is shown in screen087.png

For another file (Test10pages.pdf) the effect is even worse - basically I
can no longer select text to copy, because I can only guess where to
move the mouse.

It looks like the font size in the HTML is somehow not correct - I am not
an expert, but this link might help you:
http://www.emdpi.com/fontsize.html



