I use Ocropus to recognise a lot of early modern text set in Fraktur 
typeface. I'm very impressed by its high accuracy. While it recognises 
individual characters quite well, it doesn't seem to separate words 
correctly in my use cases.

Example (pre-processed):

<https://lh3.googleusercontent.com/-yEJqvSDKd9E/UqUP-VVcHYI/AAAAAAAAABk/AqoB5fQO-T4/s1600/Bildschirmfoto+2013-12-09+um+01.33.09.png>
Output: MittlcindischenMeer gehabt-welcherso bald die Sonn untergangen kein 
siickgesehenxund aber durchdas essenrauer Leberen von Hùˆnerenist zu 
rechtgebrachtworden ) von diserLeberdes Fisches
What I'd expect: Mittlcindischen Meer gehabt-welcher so bald die Sonn 
untergangen kein siick gesehen und aber durch das essen rauer Leberen von 
Hùˆnerenist zu recht gebracht worden ) von diser Leber des Fisches

The spaces between words seem quite clear to me, relative to the character size.
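
To make that a bit more concrete, here is a rough sketch of how I would measure the gaps on a binarised text line and compare them with the line height. This is not how Ocropus handles spaces internally; the function names `column_gaps` and `likely_word_gaps` and the `ratio=0.3` threshold are my own assumptions for illustration, and the input is assumed to be a list of rows of 0/1 values (1 = ink).

```python
def column_gaps(line):
    """Return (start, width) for every run of ink-free columns between glyphs.

    `line` is a binarised text line: a list of rows, each a list of 0/1
    values where 1 means ink. Blank margins left and right are ignored.
    """
    # Indices of columns that contain at least one ink pixel.
    ink_cols = [x for x, col in enumerate(zip(*line)) if any(col)]
    gaps = []
    for left, right in zip(ink_cols, ink_cols[1:]):
        if right - left > 1:  # at least one empty column in between
            gaps.append((left + 1, right - left - 1))
    return gaps


def likely_word_gaps(line, ratio=0.3):
    """Keep only gaps wider than `ratio` times the inked line height.

    The ratio is a hypothetical threshold, not a value taken from Ocropus.
    """
    ink_rows = [y for y, row in enumerate(line) if any(row)]
    if not ink_rows:
        return []
    height = ink_rows[-1] - ink_rows[0] + 1
    return [(start, width) for start, width in column_gaps(line)
            if width > ratio * height]


# Toy line, 10 pixels high: two glyphs 1 column apart, then a 4-column gap.
row = [1] * 5 + [0] + [1] * 5 + [0] * 4 + [1] * 6 + [0] * 9
toy_line = [row[:] for _ in range(10)]
print(likely_word_gaps(toy_line))  # -> [(11, 4)]
```

On my scans the word gaps stand out in the same way as the 4-column gap in the toy line, which is why I would have expected them to be recognised as spaces.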

*Is there anything I can do to improve Ocropus's handling of the spaces 
between words?*

Thanks already.
