Well, you never know until you try ;-) I did not intend to give you a final solution. I just wanted to point out that there is a GetComponent function that could at least help you get bounding boxes for the text areas.
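To make the bounding-box idea concrete, here is a minimal sketch of what "one box per text line" could look like for a subtitle bitmap. It does not call Tesseract at all. It uses a plain horizontal projection (rows with no ink separate the lines), which is a stand-in technique and not what Tesseract does internally. The names find_text_line_boxes, _tight_box and min_gap are hypothetical, and the input is assumed to be already binarized (1 = ink, 0 = background).

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def _tight_box(bitmap: List[List[int]], top: int, bottom: int) -> Box:
    """Shrink a band of rows [top, bottom) to the columns that contain ink."""
    cols = [x for row in bitmap[top:bottom] for x, v in enumerate(row) if v]
    return (min(cols), top, max(cols) + 1, bottom)


def find_text_line_boxes(bitmap: List[List[int]], min_gap: int = 1) -> List[Box]:
    """Split a binarized bitmap into text-line boxes.

    A line ends once at least `min_gap` consecutive blank rows follow it.
    Assumption: lines are roughly horizontal and separated by blank rows.
    """
    boxes: List[Box] = []
    start = None   # top row of the line currently being collected
    blank_run = 0  # blank rows seen since the last row with ink
    for y, row in enumerate(bitmap):
        if any(row):
            if start is None:
                start = y
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                # The last ink row is y - blank_run, so the band ends one past it.
                boxes.append(_tight_box(bitmap, start, y - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        # The bitmap ended while a line was still open.
        boxes.append(_tight_box(bitmap, start, len(bitmap) - blank_run))
    return boxes


# Tiny two-line example: works the same for one line or four.
sample = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(find_text_line_boxes(sample))  # [(1, 1, 4, 3), (2, 4, 5, 5)]
```

Each box could then be recognized on its own, so a subtitle with one line and a subtitle with four lines go through the same code path. Real subtitles with skew, outlines or noise would need something more robust than this.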
There are also other tools for detecting text (e.g. OpenCV [1]), so you can do the layout detection outside Tesseract and then use Tesseract just for the OCR...

[1] http://stackoverflow.com/questions/23506105/extracting-text-opencv

Zdenko

On Wed, Aug 12, 2015 at 6:10 PM, Anshul Maheshwari wrote:
> But I am not sure that there will always be one line in a bitmap subtitle;
> there could be 4 lines. It looks like correcting this case will break
> other cases of subtitles.
>
> I am not sure, but will your solution also work on multi-line input?
>
> -Anshul

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8yQq6HLEnEEi2GcOweMTcXESfY9HVAfVDkYu-rnovvJ4A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

