Hello All - I thought I'd share my project interest and pose a question. I'm hoping to use ocropus to import timetables as the data corpus for my master's thesis (a Geography thesis comparing the public transit systems of Berlin, Germany, from 1989 to 2009).
I understand that ocropus is geared towards importing and indexing books, so one of my first hurdles is getting it to segment lines of numbers. As you can see in the example pseg image below, ocropus is segmenting the page into columns, whereas I'd like it to treat that space as just blank space:

http://afterthewall.info/images/0004.pseg.png
(Original: http://afterthewall.info/images/0004.png)

Is there a way to customize how column space is handled? (I also hit a "MAX COLUMNS" error during the translate step.)

Also, has anybody had any luck restricting ocropus to interpreting only numbers? (For example, a lot of zeros come out as O's.)

Thanks much, I look forward to digging into ocropus!

Chipp Jansen
