Cuneiform outputs both boundingboxes: <p><span class='ocr_line' id='line_1' title="bbox 36 93 580 123">This is a lot of 12 point text to test the <span class='ocr_cinfo' title="x_bboxes 36 93 55 117 57 93 71 117 ... [and so on and so on]
The line is in the first span and the letters in the second. -- Font size not correct in merged sandvich PDF https://bugs.launchpad.net/bugs/623438 You received this bug notification because you are a member of Cuneiform Linux, which is the registrant for Cuneiform for Linux. Status in Linux port of Cuneiform: Invalid Bug description: After processing with Cuneiform for Linux 1.0.0 and hOCR to PDF converter, version 0.7.4 (should be the most current version) I get a sandvich pdf that looks nice until I select text. See the sample 5AADFEE1-0000.* files in the attachment and the result.pdf. The effect is shown in screen087.png For another file (Test10pages.pdf) the effect is either worse - basically I cannot really select any more text to copy because I only can guess where to move with the mouse. It looks like that the font size in the HTML is somehow not correct - I am not an expert, but this link might help you: http://www.emdpi.com/fontsize.html _______________________________________________ Mailing list: https://launchpad.net/~cuneiform Post to : cuneiform@lists.launchpad.net Unsubscribe : https://launchpad.net/~cuneiform More help : https://help.launchpad.net/ListHelp