Re: Announcement: new version of pyTesseractTrainer available

zdenko podobny Sat, 21 Aug 2010 01:51:51 -0700

Hi,

Your problem is that you are using tesseractTrainer.py, which was written in
2007, and not pyTesseractTrainer.py (2010), which corrected this issue. I would
suggest using
http://code.google.com/p/pytesseracttrainer/downloads/detail?name=pyTesseractTrainer-1.01.py
or (if you are brave enough) the development version:
http://pytesseracttrainer.googlecode.com/svn/trunk/pyTesseractTrainer.py
In that case you do not need to solve problems that have already been solved.


Anyway, please post issues regarding tesseractTrainer.py/pyTesseractTrainer.py
to http://code.google.com/p/pytesseracttrainer/issues/list or
[email protected]

BR,

Zd.

On Sat, Aug 21, 2010 at 10:39 AM, tt <[email protected]> wrote:

> This Trainer variant won't open a v3 box file:
> Traceback (most recent call last):
>   File "/home/ty/files/tesseractTrainer.py", line 546, in doFileOpen
>     self.loadImageAndBoxes(fileName, chooser)
>   File "/home/ty/files/tesseractTrainer.py", line 471, in loadImageAndBoxes
>     self.boxes = loadBoxData(boxName, height)
>   File "/home/ty/files/tesseractTrainer.py", line 129, in loadBoxData
>     (text, left, bottom, right, top) = line.split()
> ValueError: too many values to unpack
>
> It needs something like this diff to proceed (I made this recently for
> my own use, and I haven't looked into the semantics of the 6th field yet):
>
> --- tesseractTrainer.py.prev___  2009-04-07 12:18:08.000000000 +0300
> +++ tesseractTrainer.py  2010-08-17 12:05:31.000000000 +0300
> @@ -60,6 +60,7 @@
>     right = 0
>     top = 0
>     bottom = 0
> +    something = 0
>     bold = False
>     italic = False
>     underline = False
> @@ -126,7 +127,8 @@
>     prevRight = -1
> .
>     for line in f:
> -        (text, left, bottom, right, top) = line.split()
> +        #print "%s\n" % (line)
> +        (text, left, bottom, right, top, something) = line.split()
>         s = Symbol()
> .
>         if (text.startswith('@')):
> @@ -589,9 +596,9 @@
>                 if s.bold:
>                     text = '@' + text
>                 #endif
> -                f.write('%s %d %d %d %d\n' %
> +                f.write('%s %d %d %d %d %d\n' %
>                         (text, s.left, height - s.bottom, s.right,
> -                         height - s.top))
> +                         height - s.top, s.something))
>             #endfor
>         #endfor
>         f.close()
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected]<tesseract-ocr%[email protected]>
> .
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
>
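
The unpacking error in the quoted traceback comes from a box-file line that
has six whitespace-separated fields where the 2007 script expects exactly
five. A minimal sketch of a loader that tolerates both layouts follows. It is
not the actual pyTesseractTrainer code: parse_box_line and format_box_line are
hypothetical helper names, and treating the sixth field as a page number
(defaulting to 0) is an assumption based on the Tesseract 3 box format.

```python
def parse_box_line(line):
    """Parse one box-file line with five or six whitespace-separated fields.

    Hypothetical helper, not part of tesseractTrainer.py.
    Returns (text, left, bottom, right, top, page).
    """
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got %d: %r"
                         % (len(fields), line))
    text = fields[0]
    left, bottom, right, top = (int(v) for v in fields[1:5])
    # Assumption: the optional sixth field is a page number; use 0 if absent.
    page = int(fields[5]) if len(fields) == 6 else 0
    return text, left, bottom, right, top, page


def format_box_line(text, left, bottom, right, top, page=0):
    """Write a box back out in the six-field layout."""
    return "%s %d %d %d %d %d\n" % (text, left, bottom, right, top, page)
```

Unlike the quoted diff, this accepts old five-field files as well, so the same
script can open boxes produced by either version. The coordinate flip against
the image height done in the original code is left out here for brevity.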

