Martin, I'm not using this functionality myself, so you most likely know best, but OCRAD produces ORF output with the "-x" command-line option. According to the README, the ORF file contains bounding boxes for the OCRed characters and lines.
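If line bounding boxes are available (from ORF or from hOCR), a column-aware reading order could in principle be derived from them. Below is a minimal sketch, not anything OCRAD or cuneiform does itself. It assumes the boxes have already been parsed into (left, top, width, height, text) tuples, that "top" grows downward, and that the page has exactly two columns split at the page middle. The name reading_order and the sample data are hypothetical.

```python
from typing import List, Tuple

# Hypothetical line record: (left, top, width, height, text).
# Parsing ORF or hOCR into this shape is not shown here.
Line = Tuple[int, int, int, int, str]


def reading_order(lines: List[Line], page_width: int) -> List[str]:
    """Return line texts with the left column first, then the right column.

    Assumes a simple two-column page split at the horizontal middle.
    """
    middle = page_width / 2
    left: List[Line] = []
    right: List[Line] = []
    for line in lines:
        x, _, w, _, _ = line
        center = x + w / 2
        # Assign the line to a column by the horizontal center of its box.
        (left if center < middle else right).append(line)

    # Within a column, read top to bottom (smaller "top" first).
    def by_top(line: Line) -> int:
        return line[1]

    ordered = sorted(left, key=by_top) + sorted(right, key=by_top)
    return [line[4] for line in ordered]


# Sample data: two columns, two lines each, given in row-by-row order.
lines = [
    (10, 10, 80, 10, "L1"), (110, 10, 80, 10, "R1"),
    (10, 30, 80, 10, "L2"), (110, 30, 80, 10, "R2"),
]
print(reading_order(lines, page_width=200))
```

Real pages would need proper column detection (headings spanning both columns, more than two columns, skewed scans), so this only illustrates the idea.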
Igor

On Fri, 2010-09-10 at 17:52 +0000, Martin Wildam wrote:
> @Igor: I searched quite a while - I don't remember ocrad explicitly now,
> but I am quite sure I came across it. I also found in other places (blog
> posts) that cuneiform seems to be the only one producing hocr output.
>
> I would be glad if there were more choices. I have written a common
> file converter that currently has a plugin using ABBYY to produce OCRed
> PDF, and I am also writing a plugin for cuneiform. If there were other
> options, I would immediately start another plugin for them.
>
> @Don: Thanks, I know VLinux - I have a visually impaired friend, and
> VLinux was also mentioned on the goinglinux podcast.
> Back to topic: Regarding the sandvich PDF: AFAIK sandvich PDF means
> having the text below the image, so that the text is linked to the
> position on the page where it belongs. This is more than just having the
> text as one long string (as usually delivered if you get the OCR result
> as text from a TIFF without producing a PDF). In theory you could then
> group text columns so that a screenreader reads them as required for the
> visually impaired (I know of the issues you are talking about). But as
> far as I know cuneiform cannot build such groups. The hocr output
> positions each single character or a whole line. I think ABBYY
> Finereader is currently the best out there, producing really good
> results (but it costs money).
>
> @Yury: What he is asking basically is: using cuneiform + hocr2pdf,
> would he have a chance to get a PDF output that a screenreader
> (for visually impaired people) would read in the correct order?
> (E.g. if you have a page with a left and a right column of text, it
> should read first the left column and then the right column, and not
> the first line of the left column, then the first line of the right
> column, then the second line of the left column, and so on...)
>

--
Font size not correct in merged sandvich PDF
https://bugs.launchpad.net/bugs/623438