Pierre,

Please confirm whether training succeeded with your command line:
"tesseract OCRB.tif ./cst.OCRB.page001 nobatch box.train.logfile"
[Please note that box.train.logfile is used on Windows platforms such as WinXP.]
Could you kindly upload OCRB.tif so that I can get hands-on experience with it?
I would like to use your command line for an Indic language such as Kannada.
Thanks for your research, Pierre.
-With regards,
-sriranga(77yrsold)


On Mon, Apr 12, 2010 at 9:02 PM, MARTIN Pierre <[email protected]> wrote:

> Replying to myself so you can understand why it fails. Solution follows.
>
>
> I'm getting:
> Tesseract Open Source OCR Engine with Leptonica
> APPLY_BOXES:
> Boxes read from boxfile:     290
> Initially labelled blobs:    290 in 8 rows
> Box failures detected:            0
> Duped blobs for rebalance:     0
> "<" has fewest samples:     1
> Total unlabelled words:        0
> Final labelled words:        290
> Generating training data
> And then it just crashes without an error message. I'm unable to debug the
> application (for some reason, the Visual Studio project shipped with the svn
> version can't read the debugging information; I've tried to load the
> debugging symbols dynamically, with no luck).
>
>
> The crash is triggered in blobclass.cpp, in the function LearnBlob, when it
> tries to get the "firstdot" variable from the "filename" variable.
> After debugging this, I figured out that the "filename" variable was set to
> "junk", because I had just followed the wiki training doc.
> In fact, there seems to be a new filename format, as stated in this comment
> in that C++ file:
> // filename is expected to be of the form [lang].[fontname].exp[num]
> // The [lang], [fontname] and [num] fields should not have '.' characters.
> So instead of calling:
> tesseract OCRB.tif junk nobatch box.train.stderr
> You have to call:
> tesseract OCRB.tif ./cst.OCRB.page001 nobatch box.train.stderr
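
The naming convention quoted above can be checked before running tesseract. What follows is only a minimal sketch in Python, not the code from blobclass.cpp: the helper name split_training_name is hypothetical, and it assumes only what the quoted comment says, namely that the base name consists of three fields separated by '.' and that no field contains a '.' itself. It does not check that the last field has the form exp[num].

```python
import os


def split_training_name(output_base):
    """Split an output base such as './cst.OCRB.page001' into its fields.

    Hypothetical helper: returns (lang, fontname, suffix), or None when the
    base name is not exactly three non-empty fields separated by '.'.
    """
    name = os.path.basename(output_base)  # drop any leading directory part
    fields = name.split(".")
    if len(fields) != 3 or not all(fields):
        return None
    return tuple(fields)


print(split_training_name("./cst.OCRB.page001"))  # ('cst', 'OCRB', 'page001')
print(split_training_name("junk"))                # None
```

A None result for "junk" corresponds to the case that failed in the report above, while "./cst.OCRB.page001" splits into three fields.
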
>
> Thanks me,
> Me.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>

