>I find the specification somewhat difficult to interpret at times but
>it is my understanding that character bbox info goes within the
>ocr_line tag element. whether it goes before or after the textual
>elements is irrelevant. E.g.
>  <span class='ocr_line' id='line_18' title="bbox 363 1253 581 1289">
>  <b>BYGGNADER </b>
>  <span class='ocr_cinfo' title="x_bboxes 363 1253 382 1279 383
> 1254 407 1281 409 1255 431 1283 434 1256 458 1284 460 1258 485 1285 486 1260
> 511 1286 514 1261 538 1287 541 1260 560 1289 561 1261 581 1289 -1 -1 -1 -1 ">
>  </span>

Apart from not being valid HTML, this doesn't make sense. And this was
already pointed out a year ago(!):
https://lists.launchpad.net/cuneiform/msg00450.html

>and
>  <span class='ocr_line' id='line_18' title="bbox 363 1253 581 1289">
>  <span class='ocr_cinfo' title="x_bboxes 363 1253 382 1279 383 1254 407
> 1281 409 1255 431 1283 434 1256 458 1284 460 1258 485 1285 486 1260 511 1286
> 514 1261 538 1287 541 1260 560 1289 561 1261 581 1289 -1 -1 -1 -1 ">
>  <b>BYGGNADER </b>
>  </span>
>are equally correct, it is the association to the correct line which matters.

If you don't close <span>, it's not even valid HTML...

--
Jakub Wilk

--
Font size not correct in merged sandvich PDF
https://bugs.launchpad.net/bugs/623438
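
For anyone who wants to reproduce the "unclosed <span>" observation, here is a minimal sketch using only Python's standard html.parser module. The names TagBalance and unclosed_tags are made up for illustration, the x_bboxes value is shortened from the example quoted above, and this is a rough tag-balance check, not an HTML or hOCR validator (void elements such as <br> would be reported as unclosed).

```python
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Keep a stack of start tags that have not yet seen an end tag."""

    def __init__(self):
        super().__init__()
        self.stack = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Remove the most recently opened tag with the same name, if any.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i] == tag:
                del self.stack[i]
                break


def unclosed_tags(fragment):
    """Return the names of tags opened in `fragment` but never closed."""
    parser = TagBalance()
    parser.feed(fragment)
    parser.close()
    return parser.stack


# Fragment modelled on the quoted output; x_bboxes shortened for brevity.
fragment = (
    "<span class='ocr_line' id='line_18' title=\"bbox 363 1253 581 1289\">"
    "<b>BYGGNADER </b>"
    "<span class='ocr_cinfo' title=\"x_bboxes 363 1253 382 1279\">"
    "</span>"
)

print(unclosed_tags(fragment))
```

The single </span> closes the inner ocr_cinfo element, so the outer ocr_line span is what remains on the stack.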
