[ocropus] Handling highly overlapping lines?

Derek Dohler Tue, 12 Jul 2011 08:31:37 -0700

Hi all,

I was wondering how well OCRopus handles (or could be made to handle) text 
lines with very little vertical separation. Specifically, I have a set of 
documents that contain blocks of text where the vertical spacing between lines 
is negative, resulting in frequent overlap between characters on different 
lines. It seems to me that OCRopus's strategy of over segmentation followed by 
alignment with a language model could work well at recognizing these 
characters, especially if segments can be assigned to multiple lines. However, 
that requires getting past the line segmentation stage; these lines are so 
closely spaced that they often appear to be a single block and are mistaken for 
oversized non-text by OCRopus' default page pre-processing.
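
To make the failure mode concrete, here is a minimal Python sketch. It is not
OCRopus code and does not reflect how OCRopus segments pages; the function
names and the 0.5 valley threshold are assumptions chosen purely for
illustration. It shows why a segmenter that looks for blank rows merges two
overlapping lines into one block, and how splitting at valleys of the row
ink profile might still separate them.

```python
def row_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def split_at_gaps(profile):
    """Gap-based segmentation: a line is a maximal run of rows with ink."""
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y
        elif ink == 0 and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines


def split_at_valleys(profile, min_drop=0.5):
    """Split each gap-delimited run at interior valleys of the profile.

    A valley is a stretch of rows whose ink count falls below
    min_drop * (peak of the run) and that touches neither edge of the run.
    min_drop is a hypothetical tuning parameter.
    """
    result = []
    for start, end in split_at_gaps(profile):
        threshold = min_drop * max(profile[start:end])
        # Group consecutive low-ink rows into stretches.
        stretches = []
        for y in range(start, end):
            if profile[y] < threshold:
                if stretches and y == stretches[-1][-1] + 1:
                    stretches[-1].append(y)
                else:
                    stretches.append([y])
        # A stretch touching an edge is the run tapering off, not a valley.
        cuts = [min(s, key=lambda r: profile[r])
                for s in stretches if s[0] > start and s[-1] < end - 1]
        bounds = [start] + cuts + [end]
        result.extend(zip(bounds[:-1], bounds[1:]))
    return result


# Synthetic page: two dense text lines whose descenders and ascenders
# share row 3, so no row between them is completely blank.
blank = [0] * 10
dense = [1] * 8 + [0] * 2
overlap = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
page = [blank, dense, dense, overlap, dense, dense, blank]

profile = row_profile(page)
print(split_at_gaps(profile))     # [(1, 6)]: both lines merged into one block
print(split_at_valleys(profile))  # [(1, 3), (3, 6)]: split at the overlap row
```

A hard cut like this assigns the shared row to only one line (here, the lower
one), which is exactly why allowing segments to belong to multiple lines
seems attractive to me.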


Thanks!

Derek
