This Trainer variant won't open a v3 box file:
Traceback (most recent call last):
File "/home/ty/files/tesseractTrainer.py", line 546, in doFileOpen
self.loadImageAndBoxes(fileName, chooser)
File "/home/ty/files/tesseractTrainer.py", line 471, in loadImageAndBoxes
self.boxes = loadBoxData(boxName, height)
File "/home/ty/files/tesseractTrainer.py", line 129, in loadBoxData
(text, left, bottom, right, top) = line.split()
ValueError: too many values to unpack
It needs something like this diff to proceed (I made this recently for
my own use, and I haven't looked into the semantics of the 6th field yet):
--- tesseractTrainer.py.prev	2009-04-07 12:18:08.000000000 +0300
+++ tesseractTrainer.py	2010-08-17 12:05:31.000000000 +0300
@@ -60,6 +60,7 @@
right = 0
top = 0
bottom = 0
+ something = 0
bold = False
italic = False
underline = False
@@ -126,7 +127,8 @@
prevRight = -1
.
for line in f:
- (text, left, bottom, right, top) = line.split()
+ #print "%s\n" % (line)
+ (text, left, bottom, right, top, something) = line.split()
s = Symbol()
.
if (text.startswith('@')):
@@ -589,9 +596,9 @@
if s.bold:
text = '@' + text
#endif
- f.write('%s %d %d %d %d\n' %
+ f.write('%s %d %d %d %d %d\n' %
(text, s.left, height - s.bottom, s.right,
- height - s.top))
+ height - s.top, s.something))
#endfor
#endfor
f.close()
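
The diff above makes the loader expect exactly six fields, so it would then fail on older 5-field box files in the same way. A more tolerant parsing step might look like the sketch below. Note that parse_box_line is a hypothetical helper, not something that exists in tesseractTrainer.py, and it assumes the sixth field is an integer (in v3 box files it appears to be a page number) that can default to 0 when absent.

```python
def parse_box_line(line):
    """Parse one box-file line into (text, left, bottom, right, top, extra).

    Hypothetical helper: accepts the 5-field layout and the 6-field v3
    layout. The sixth field is assumed to be an integer and defaults to 0
    when the line has only five fields.
    """
    fields = line.split()
    if len(fields) == 5:
        # Older layout: no sixth field, so fill in a default.
        fields.append('0')
    elif len(fields) != 6:
        raise ValueError('expected 5 or 6 fields, got %d: %r'
                         % (len(fields), line))
    text = fields[0]
    # Convert the coordinates and the extra field from strings to ints.
    left, bottom, right, top, extra = [int(v) for v in fields[1:]]
    return text, left, bottom, right, top, extra


if __name__ == '__main__':
    print(parse_box_line('a 10 20 30 40'))
    print(parse_box_line('a 10 20 30 40 0'))
```

With something like this, the unpacking in loadBoxData would no longer depend on the exact field count, and the writer could emit the sixth field only when it was present in the input.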