Okay, I hadn't noticed that the link in the announcement leads to the
original Trainer. The diff itself is valid, however: Trainer patched
with it works with v3 boxes on my system.

Also, regarding the Trainer (not the authors' 1.01 but the original
with v3 box read/write support added):

The incredibly slow opening of the image file might be due to changes
in the distros, which I'd tentatively place somewhere between
April-July 2009 and now. I never had the problem back then (Slackware
with Python 2.5). Could the move to Python 2.6 be the cause?

It isn't clear whether Trainer depends on both the Numeric and numpy
libs -- does it need both, or could it be made to depend on only one?
PyGTK expects numpy, not Numeric, at least in 2.16, while Trainer
wants Numeric for some of its functionality. Trainer here (patched for
the v3 boxes) produces the following when I try to split a box
(Ctrl+2):

./tesseractTrainer.py:649: DeprecationWarning: PyArray_FromDimsAndDataAndDescr: use PyArray_NewFromDescr.
  pixels = subpixbuf.get_pixels_array()
Traceback (most recent call last):
  File "./tesseractTrainer.py", line 677, in doCommandsSplit
    this.right = self.findSplitPoint(this)
  File "./tesseractTrainer.py", line 658, in findSplitPoint
    numPixels = countBlackPixels(pixels, x)
  File "./tesseractTrainer.py", line 196, in countBlackPixels
    if isBlack(row[x]):
  File "./tesseractTrainer.py", line 204, in isBlack
    return pixel[0][0] + pixel[1][0] + pixel[2][0] < 128 * 3
IndexError: invalid index to scalar variable.
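
The IndexError suggests that pixel[0] is already a plain scalar here,
so the second [0] in isBlack has nothing to index. My guess (an
assumption, not verified against the Trainer source beyond the
traceback above) is that the code was written for a pixel layout where
each channel is a length-1 sequence, and that the array returned by
get_pixels_array() now holds flat scalars instead. A minimal sketch of
a check that tolerates both layouts follows; is_black and
channel_value are hypothetical names, not functions from Trainer:

```python
def channel_value(channel):
    """Return one colour channel as a plain int.

    Accepts either a length-1 sequence (the layout the original
    pixel[0][0] indexing implies) or a bare scalar (the layout the
    IndexError in the traceback implies).
    """
    try:
        return int(channel[0])
    except (TypeError, IndexError):
        # Python ints raise TypeError when indexed; numpy scalars
        # raise IndexError ("invalid index to scalar variable").
        return int(channel)


def is_black(pixel, threshold=128):
    """True if the sum of the first three channels is below 3 * threshold.

    Same comparison as the original isBlack; converting to int first
    also avoids 8-bit overflow when the channels are added.
    """
    total = sum(channel_value(c) for c in pixel[:3])
    return total < threshold * 3
```

If the guess is right, swapping something like this in for isBlack
should get past the IndexError; whether the rest of the split code has
similar layout assumptions I haven't checked.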
