Thanks for reporting this; it is a genuine bug. We're replacing the string classes in preparation for better Unicode support, and something has gone wrong. I've also added this to the nightly build/test.
Tom

On Jul 14, 11:18 pm, james <[email protected]> wrote:
> Using the 'lines' files to train ocropus (Ubuntu 9.04) fails for every
> file, telling me either that the cseg file contains one extra
> character than the transcript:
>
> 0000 0001: transcript doesn't agree with cseg (transcript 11, cseg 12)
> FIXME
>
> or the cseg doesn't exist
>
> [info] 0000 0005: no such cseg
>
> I've had a look at the files and the character count is correct for
> the cseg, not the transcript but the transcript does in fact have all
> the required characters in the file. In the case of missing cseg
> files, they are indeed not there. I used the file
> at http://ocropus.googlegroups.com/web/lines.tgz as per the instructions.
> Any help would be much appreciated, but go easy - I'm such a newbie
> I'm not even sure if newbie is the right term...
