On Sun, Sep 16, 2012 at 6:23 AM, rohit mittal <[email protected]> wrote:
> Hello,
>
> I am working on Tesseract for the Punjabi language, and I don't have the
> data files for Punjabi, i.e. .bigrams, .params, .lm, .nn, .word-freq, as
> are present for English and Hindi.
>
> Can I get these files for Punjabi, or can I generate them with some
> software? Please suggest a direction to go further.
>
> Thanks and regards,
> Rohit Mittal

There are two OCR engine modes [1]:

1. tesseract
2. cube

There is a description of tesseract training [2], but there is no information on cube training. The files you mentioned belong to the cube part.

[1] http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccstruct/publictypes.h#234
[2] http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

--
Zdenko
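
To see which of the cube files named in the question a given language is missing, a small check along these lines may help. It is only a sketch: the "<lang>.cube.<suffix>" naming pattern is an assumption, not something stated in this thread, and the function name is hypothetical.

```python
from pathlib import Path

# Suffixes taken from the question in this thread.
CUBE_SUFFIXES = ("bigrams", "params", "lm", "nn", "word-freq")


def missing_cube_files(tessdata_dir, lang):
    """Return the assumed cube file names for `lang` not found in tessdata_dir.

    Assumption: cube data files are named "<lang>.cube.<suffix>".
    Adjust the pattern if your tessdata directory uses a different layout.
    """
    base = Path(tessdata_dir)
    expected = [f"{lang}.cube.{suffix}" for suffix in CUBE_SUFFIXES]
    # Keep only the names that do not exist as regular files.
    return [name for name in expected if not (base / name).is_file()]
```

For example, missing_cube_files("/path/to/tessdata", "pan") would list the assumed Punjabi cube files that are absent; both the path and the "pan" language code are placeholders here.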

