Thanks for your answer, Quan Nguyen, and sorry for my unclear question! I can get hOCR output, but it does not contain any "<em>" tags when OCR'ing italic text. Is this working for anybody?
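In case anyone wants to check their own output, here is a minimal sketch of how I look for the tags. It assumes you already have the hOCR text as a std::string (for example, copied from what GetHOCRText returns). The helper count_em_tags is a made-up name for illustration and is not part of Tesseract.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Tesseract function): counts how many times the
// literal opening tag "<em>" appears in an hOCR string.
std::size_t count_em_tags(const std::string& hocr) {
    const std::string tag = "<em>";
    std::size_t count = 0;
    std::size_t pos = 0;
    // Search repeatedly, resuming just past each match.
    while ((pos = hocr.find(tag, pos)) != std::string::npos) {
        ++count;
        pos += tag.size();
    }
    return count;
}
```

For my italic test pages this returns 0, which is why I suspect the italic flag is never set.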
On Apr 29, 5:46 am, Quan Nguyen <[email protected]> wrote:
> http://groups.google.com/group/tesseract-ocr/browse_thread/thread/2f4...
> http://code.google.com/p/tesseract-ocr/issues/detail?id=377#c5
>
> On Apr 28, 7:54 am, Nikse <[email protected]> wrote:
> > I can see that in baseapi.cpp in method "GetHOCRText" there seems to
> > be support for italic in line 936/937:
> >   if (word->italic > 0)
> >     hocr_str += "<em>";
> >
> > Does anybody know if that's supposed to work?
> >
> > TIA
> > Nikolaj

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

