Hi, I'm interested in using OCRopus for a big OCR project for some older (18th century) texts. It looks like OCRopus would work well for this, especially with the learning capabilities.
As a test, I picked a page image at random to work with. Unfortunately, when I ran page2lines on it, one line was skipped, and two other lines that I expected to be separated were not split. In the first case, I notice that in the 0001.pseg.png image the missing line is shown in yellow. Does this mean anything? As for the two lines that weren't split, the second is an attribution for a quote, and horizontally it overlaps only a small section of the preceding line. Any suggestions on how to recover my "missing" line and get the other line to split the way I expected?

Thanks!
Dave
