Re: integrating other spellchecking tools

francis . tyers Mon, 20 May 2013 07:06:59 -0700


On Monday, 20 May 2013 13:29:38 UTC, jimregan wrote:
>
> On Saturday, 18 May 2013 12:51:54 UTC+1, [email protected] wrote:
>>
>> Hi,
>>
>>
> Hi Fran.
>  
>
>> If I wanted to integrate a "spellchecker" (or wordlist) other than the 
>> DAWG one that is bundled with Tesseract, how might I go about it ? 
>>
>>
> There was a version of Tesseract that did this, using OpenFST, in one of 
> the Android trees (I think the original AOSP tree), but you'd have to dig 
> through old revisions to find it.
>


That sounds like exactly what I'm looking for! 
 

>  
>
>> In dict/dawg.cpp there is 
>>
>>   /// Returns true if the given word is in the Dawg.
>>   bool word_in_dawg(const WERD_CHOICE &word) const;
>>
>> But then I don't see any reference to it in the code outside of dict/ and 
>> it just seems to be used for constructing the Trie.
>>
>> There is also:
>>
>> cube/word_list_lang_model.h and cube/lang_model.h
>>
>>
> Cube is basically a second OCR engine, but (last time I checked, at least) 
> there aren't tools or documentation for preparing data for it, so it's not 
> something too many people on the list can comment on.
>  
>

OK, so I can stop looking there. So the only relevant "language model" 
(spellchecker-like) code is in dict/?

Fran


