Hi. I hope you can help me. Could you clearly explain what "fixed", "serif" and "fraktur" mean? :) If you know, please answer; I really need it. Thanks so much!
On Sunday, July 10, 2011 10:30:43 PM UTC+7, [email protected] wrote:
>
> Hi
> Just name the property file "font_properties" and keep it in the same directory as tesseract.exe.
> The file contains the following:
> magang.18Nodasharial 0 0 1 1 0
> magang.18NoDashariali 1 1 0 1 0
> magang.18Nodasharialb 0 1 1 1 0
>
> Each line of the font_properties file is formatted as follows:
>
> <fontname> <italic> <bold> <fixed> <serif> <fraktur>
>
> B.R. IQ.YU
>
> At 2011-07-10 21:05:20, "MARTIN Pierre" <[email protected]> wrote:
>
> No, it didn't work... I created a file named OCRB_font_properties and then ran this at the clustering step:
> mftraining -F ../Fonts/OCRB_font_properties -U unicharset -O cst.unicharset cst.OCRB.Full1.tr
> Output was:
> Reading cst.OCRB.Full1.tr ...
> cst.OCRB.Full1 has no defined properties.
> !"Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 287
>
> What can it be?
>
> On 10 July 2011, at 14:58, MARTIN Pierre wrote:
>
> Replying to myself... I just discovered the new font_properties file requirement. Maybe the error comes from that file being missing in my scripts. I'll add what is required and tell you if it worked.
>
> Thanks :)
> Pierre.
>
> Hello,
>
> I'm using a set of batch files I created with a previous version of Tesseract (I'm now using the svn HEAD one).
>
> Can anyone explain to me what changed in the new Tesseract training process? What is the error "Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 287"?
>
> Below is a snippet of the command lines I'm using and their output.
>
> Thanks a lot,
> Pierre.
>
> First, I'm extracting boxes and training boxes. The command line is:
> tesseract ./Boxing/OCRBFull1.tif ./Generated/cst.OCRB.Full1 nobatch box.train.stderr
> The output is:
> Tesseract Open Source OCR Engine v3.01 with Leptonica
> TIFFReadDirectory: Warning, TIFFstream: wrong data type 7 for "RichTIFFIPTC"; tag ignored.
> APPLY_BOXES:
>  Boxes read from boxfile: 1260
>  Boxes failed resegmentation: 0
>  Found 1260 good blobs and 0 unlabelled blobs in 0 words.
>  0 remaining unlabelled words deleted.
> TRAINING ... Font name = OCRB
> Generated training data for 153 words
>
> Then, I'm extracting the unicharset. The command line is:
> unicharset_extractor ../Boxing/OCRBFull1.box
> The output is:
> Extracting unicharset from ../Boxing/OCRBFull1.box
> Wrote unicharset file ./unicharset.
>
> After that, cntraining. The command line is:
> cntraining cst.OCRB.Full1.tr
> Output is:
> Reading cst.OCRB.Full1.tr ...
> Clustering ...
> Writing normproto ...
>
> Now mftraining (which fails, and I have no idea why). The command line is:
> mftraining cst.OCRB.Full1.tr
> Output is:
> Reading cst.OCRB.Full1.tr ...
> cst.OCRB.Full1 has no defined properties.
> !"Missing font_properties entry is a fatal error!":Error:Assert failed:in file ..\training\mftraining.cpp, line 287
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to [email protected]
> For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
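
For reference, here is a minimal sketch of how the font_properties lines quoted above can be read, assuming every line has exactly the six whitespace-separated fields `<fontname> <italic> <bold> <fixed> <serif> <fraktur>` and every flag is 0 or 1. As far as I understand the flags, "fixed" means fixed-pitch (monospaced, every character has the same width), "serif" means the letters have serifs (small finishing strokes, as in Times), and "fraktur" means a blackletter typeface (the old German style). The names `FontProperties` and `parse_font_properties` are hypothetical helpers for illustration only; they are not part of Tesseract.

```python
from typing import Dict, NamedTuple


class FontProperties(NamedTuple):
    """One line of a font_properties file (hypothetical helper, not a Tesseract API)."""
    name: str
    italic: bool
    bold: bool
    fixed: bool    # fixed-pitch (monospaced) font
    serif: bool    # letters have serifs
    fraktur: bool  # blackletter typeface


def parse_font_properties(text: str) -> Dict[str, FontProperties]:
    """Parse '<fontname> <italic> <bold> <fixed> <serif> <fraktur>' lines."""
    fonts: Dict[str, FontProperties] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(f"line {lineno}: expected 6 fields, got {len(fields)}")
        name, *flags = fields
        if any(flag not in ("0", "1") for flag in flags):
            raise ValueError(f"line {lineno}: flags must be 0 or 1")
        fonts[name] = FontProperties(name, *(flag == "1" for flag in flags))
    return fonts


# The three sample lines quoted in this thread.
SAMPLE = """\
magang.18Nodasharial 0 0 1 1 0
magang.18NoDashariali 1 1 0 1 0
magang.18Nodasharialb 0 1 1 1 0
"""

fonts = parse_font_properties(SAMPLE)
for props in fonts.values():
    print(props)
```

A lookup such as `fonts.get("OCRB")` returning `None` would mean there is no entry for that font name. That seems to be what the mftraining message "has no defined properties" in this thread is reporting, though I have not checked how mftraining derives the font name from the .tr file name.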

