See https://github.com/tesseract-ocr/tesseract/commit/06b7a7b188b2ed21a101cd179b4dd3cfc13aaf30
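
In case a concrete starting point helps, here is a minimal Python sketch of how the per-character hOCR route mentioned in the quoted reply might look. It rests on assumptions not confirmed in this thread: that your Tesseract build supports the `hocr_char_boxes` config variable, that the Greek traineddata is installed as `ell`, and that each character appears in the hOCR as an `ocrx_cinfo` span whose `title` carries `x_bboxes left top right bottom`. The helper names (`tesseract_command`, `parse_char_boxes`) are made up for illustration.

```python
from html.parser import HTMLParser


def tesseract_command(image_path, output_base, lang="ell"):
    """Build (but do not run) a Tesseract call requesting per-character hOCR.

    Assumption: the installed Tesseract knows the hocr_char_boxes variable.
    """
    return ["tesseract", image_path, output_base, "-l", lang,
            "-c", "hocr_char_boxes=1", "hocr"]


class _CharBoxParser(HTMLParser):
    """Collect text and box of every <span class='ocrx_cinfo'> element."""

    def __init__(self):
        super().__init__()
        self.boxes = []
        self._bbox = None
        self._text = []
        self._in_char = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag != "span" or "ocrx_cinfo" not in classes:
            return
        # The title is assumed to look like "x_bboxes 10 20 30 40; x_conf 98.5".
        self._bbox = None
        for field in (attr.get("title") or "").split(";"):
            parts = field.split()
            if len(parts) == 5 and parts[0] == "x_bboxes":
                self._bbox = tuple(int(p) for p in parts[1:])
        self._text = []
        self._in_char = True

    def handle_data(self, data):
        if self._in_char:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._in_char:
            self._in_char = False
            if self._bbox is not None:
                x0, y0, x1, y1 = self._bbox
                # Convert corner coordinates to position plus width/height.
                self.boxes.append({
                    "char": "".join(self._text),
                    "x": x0,
                    "y": y0,
                    "width": x1 - x0,
                    "height": y1 - y0,
                })


def parse_char_boxes(hocr_text):
    """Return a list of dicts with char, x, y, width, height."""
    parser = _CharBoxParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.boxes
```

To try it, you could run the command with `subprocess.run(tesseract_command("page.png", "out"), check=True)` and pass the contents of `out.hocr` to `parse_char_boxes`. Tesseract does not read PDF input directly, so the PDF pages would first need to be rasterized to images (for example TIFF or PNG) with a separate tool.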

On Fri, May 31, 2019 at 9:00 PM Shree Devi Kumar <[email protected]> wrote:

> I think the hOCR output has an option to output bounding info per
> character also.
>
> On Fri, 31 May 2019, 19:07 G. S., <[email protected]> wrote:
>
>> Dear all,
>>
>> I have a PDF image file (in Greek).
>>
>> I would appreciate your help on how I could
>>
>> a) get output similar to what pdfalto produces,
>>
>> but, more importantly, get the position, width, and height info on a
>> per-character basis.
>>
>> Up to now, pdfalto considers each word to be a token, so the output is on
>> a per-word basis.
>>
>> https://github.com/kermitt2/pdfalto/issues/34
>>
>> Please tell me how you would approach this with
>>
>> https://github.com/tesseract-ocr
>>
>> Which command and which parameters would you use?
>>
>> Thank you very much in advance.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/32091990-88b9-426d-94f0-2c5278a9b9da%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.

--
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduWuHAwOj4p-st0yEqUM5jcU20itH6tJAjNogb9Lr__LLA%40mail.gmail.com.

