Hi,

I'm using the OCRopus code from Mercurial and getting good results!
One big problem is that words run together in the recognized text.
(I'm combining the image and text with hocrtopdf.)

"are in an informal setting in a conference room, we must"

is recognized as:

"areinaninformalset6nginaconferenceroom,wemust g"

See http://hero.com/ken/trainme.pdf

1) Can I fix this with training?

2) How do I generate the different file types in

http://ocropus.googlegroups.com/web/lines.tgz to start training?

3) Is my problem related to the fact that the ground truth files, like

lines/0001/0080.gt.txt

don't contain spaces either?  E.g.,

"SNL,andF.HarveyDove,PNL,fortheirsuggestions"

is image text that reads:

"SNL, and F. Harvey Dove, PNL, for their suggestions"
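
In case it helps narrow this down, here is a rough sketch of how one
might list the ground truth files that have no spaces at all. It
assumes the archive unpacks to lines/<page>/<line>.gt.txt, as in the
path above. The function name and the length threshold are my own
choices, not anything from OCRopus.

```python
from pathlib import Path


def find_spaceless_gt(root, min_length=20):
    """Return .gt.txt files under root whose text is long but has no spaces.

    Assumes a layout of root/<page>/<line>.gt.txt (an assumption based on
    the example path lines/0001/0080.gt.txt).
    """
    suspicious = []
    for path in sorted(Path(root).glob("*/*.gt.txt")):
        text = path.read_text(encoding="utf-8", errors="replace").strip()
        # A long line with no whitespace at all probably had its spaces dropped.
        if len(text) >= min_length and not any(ch.isspace() for ch in text):
            suspicious.append(path)
    return suspicious


if __name__ == "__main__":
    for path in find_spaceless_gt("lines"):
        print(path)
```

If most of the files show up in that list, the missing spaces are
probably systematic in the ground truth rather than a quirk of a few
lines.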

Thanks,
Ken