On Monday, 20 May 2013 13:29:38 UTC, jimregan wrote:
>
> On Saturday, 18 May 2013 12:51:54 UTC+1, [email protected] wrote:
>>
>> Hi,
>>
>
> Hi Fran.
>
>> If I wanted to integrate a "spellchecker" (or wordlist) other than the
>> DAWG one that is bundled with Tesseract, how might I go about it?
>>
>
> There was a version of Tesseract that did this, using OpenFST, in one of
> the Android trees (I think the original AOSP tree), but you'd have to dig
> through old revisions to find it.
>
That sounds like exactly what I'm looking for!

>> In dict/dawg.cpp there is
>>
>> /// Returns true if the given word is in the Dawg.
>> bool word_in_dawg(const WERD_CHOICE &word) const;
>>
>> But then I don't see any reference to it in the code outside of dict/ and
>> it just seems to be used for constructing the Trie.
>>
>> There is also:
>>
>> cube/word_list_lang_model.h and cube/lang_model.h
>>
>
> Cube is basically a second OCR engine, but (last time I checked, at least)
> there aren't tools or documentation for preparing data for it, so it's not
> something too many people on the list can comment on.
>

Ok, so I can forget looking there. So the only relevant "language model" (spellchecky) code is in dict/ ?

Fran

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

