Hi, your problem is that you are using tesseractTrainer.py, which dates from 2007, rather than pyTesseractTrainer.py (2010), which fixes this issue. I would suggest using http://code.google.com/p/pytesseracttrainer/downloads/detail?name=pyTesseractTrainer-1.01.py or, if you are brave enough, the devel version: http://pytesseracttrainer.googlecode.com/svn/trunk/pyTesseractTrainer.py. Either way, you do not need to solve problems that have already been solved.
Anyway, please post issues regarding tesseractTrainer.py/pyTesseractTrainer.py to http://code.google.com/p/pytesseracttrainer/issues/list or [email protected]

BR,
Zd.

On Sat, Aug 21, 2010 at 10:39 AM, tt <[email protected]> wrote:
> This Trainer variant won't open v3 box file:
> Traceback (most recent call last):
>   File "/home/ty/files/tesseractTrainer.py", line 546, in doFileOpen
>     self.loadImageAndBoxes(fileName, chooser)
>   File "/home/ty/files/tesseractTrainer.py", line 471, in loadImageAndBoxes
>     self.boxes = loadBoxData(boxName, height)
>   File "/home/ty/files/tesseractTrainer.py", line 129, in loadBoxData
>     (text, left, bottom, right, top) = line.split()
> ValueError: too many values to unpack
>
> It needs something like this diff to proceed (I made this recently for
> own use, and I didn't care about 6th field semantics, yet):
>
> --- tesseractTrainer.py.prev___^2009-04-07 12:18:08.000000000 +0300
> +++ tesseractTrainer.py^2010-08-17 12:05:31.000000000 +0300
> @@ -60,6 +60,7 @@
>      right = 0
>      top = 0
>      bottom = 0
> +    something = 0
>      bold = False
>      italic = False
>      underline = False
> @@ -126,7 +127,8 @@
>      prevRight = -1
> .
>      for line in f:
> -        (text, left, bottom, right, top) = line.split()
> +        #print "%s\n" % (line)
> +        (text, left, bottom, right, top, something) = line.split()
>          s = Symbol()
> .
>          if (text.startswith('@')):
> @@ -589,9 +596,9 @@
>              if s.bold:
>                  text = '@' + text
>              #endif
> -            f.write('%s %d %d %d %d\n' %
> +            f.write('%s %d %d %d %d %d\n' %
>                      (text, s.left, height - s.bottom, s.right,
> -                     height - s.top))
> +                     height - s.top, s.something))
>          #endfor
>      #endfor
>      f.close()
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected]<tesseract-ocr%[email protected]>
> .
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
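
For reference, the ValueError in the traceback comes from unpacking a six-field line into five names. Below is a minimal sketch of a parser that accepts both line shapes. It is not code from either script: parse_box_line is a hypothetical helper name, and it assumes the sixth field of a v3 box file is a page number (the quoted diff just calls it "something" and treats it as opaque).

```python
def parse_box_line(line):
    """Parse one box-file line into (text, left, bottom, right, top, page).

    Hypothetical helper, not part of tesseractTrainer.py or
    pyTesseractTrainer.py. Assumes the optional sixth field is a page number.
    """
    fields = line.split()
    if len(fields) == 5:
        # Older box format has no sixth field; default it to 0.
        fields.append('0')
    elif len(fields) != 6:
        raise ValueError('unexpected box line: %r' % line)
    text = fields[0]
    left, bottom, right, top, page = [int(v) for v in fields[1:]]
    return text, left, bottom, right, top, page
```

Something like this could replace the fixed-length unpacking in loadBoxData, so that old and new box files both load.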

