>I find the specification somewhat difficult to interpret at times, but 
>it is my understanding that character bbox info goes within the 
>ocr_line tag element. Whether it goes before or after the textual 
>elements is irrelevant. E.g.
>       <span class='ocr_line' id='line_18' title="bbox 363 1253 581 1289">
>               <b>BYGGNADER </b>
>               <span class='ocr_cinfo' title="x_bboxes 363 1253 382 1279 383 
> 1254 407 1281 409 1255 431 1283 434 1256 458 1284 460 1258 485 1285 486 1260 
> 511 1286 514 1261 538 1287 541 1260 560 1289 561 1261 581 1289 -1 -1 -1 -1 ">
>       </span>

Apart from not being valid HTML, this doesn't make sense. And this was 
already pointed out a year ago(!):
https://lists.launchpad.net/cuneiform/msg00450.html

>and
>       <span class='ocr_line' id='line_18' title="bbox 363 1253 581 1289">
>        <span class='ocr_cinfo' title="x_bboxes 363 1253 382 1279 383 1254 407 
> 1281 409 1255 431 1283 434 1256 458 1284 460 1258 485 1285 486 1260 511 1286 
> 514 1261 538 1287 541 1260 560 1289 561 1261 581 1289 -1 -1 -1 -1 ">
>               <b>BYGGNADER </b>
>       </span>
>are equally correct; it is the association with the correct line that matters.

If you don't close the <span>, it's not even valid HTML...
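
For what it's worth, the unclosed element in both quoted fragments can be 
found mechanically. Below is a minimal sketch using only Python's standard 
html.parser module. The names TagBalanceChecker and unclosed_tags are made 
up for this illustration and are not part of Cuneiform or of any hOCR tool. 
The check is also stricter than HTML itself, which lets a few elements 
(e.g. <p>, <li>) omit their end tags; that does not matter for <span> and 
<b>, which always need one.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open elements on a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open, not yet closed elements
        self.errors = []   # problems found so far

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if tag not in self.stack:
            self.errors.append(f"stray </{tag}>")
            return
        # Pop back to the matching start tag; anything above it was
        # left open when this end tag arrived.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.errors.append(f"<{top}> not closed before </{tag}>")


def unclosed_tags(fragment):
    """Return a list of tag-balance problems in an HTML fragment."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    return checker.errors + [f"<{t}> never closed" for t in checker.stack]


# The first quoted fragment, with the x_bboxes list shortened.
fragment = """
<span class='ocr_line' id='line_18' title="bbox 363 1253 581 1289">
    <b>BYGGNADER </b>
    <span class='ocr_cinfo' title="x_bboxes 363 1253 382 1279">
</span>
"""
print(unclosed_tags(fragment))
```

The single </span> closes the inner ocr_cinfo element, so the outer 
ocr_line span is the one reported as never closed.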

-- 
Jakub Wilk

-- 
Font size not correct in merged sandvich PDF
https://bugs.launchpad.net/bugs/623438