bazaar is nothing but a config file that sets values for a handful of config
variables. Please see

https://code.google.com/p/tesseract-ocr/source/browse/tessdata/configs/bazaar

So, if patterns are helpful, you can apply the same settings as a config.
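
As a rough illustration of what that could look like, here is a minimal Python sketch that renders a config in the "variable value, one per line" format and builds a single user-patterns line for a 17-character word whose last 5 characters are digits. The variable names and values in BAZAAR_LIKE are what I believe the stock bazaar config sets, so please check them against the linked file. The helper names (render_config, vin_pattern) are made up for this example. The \n (alphanumeric) and \d (digit) escapes are the ones described in the dict/trie.h comments on read_pattern_list(). Patterns only bias recognition toward matching words, so whether this helps is something to test. Some of these variables may only be read when Tesseract is initialized, so setting them afterwards through a wrapper may not take effect.

```python
# Variables the stock "bazaar" config is believed to set (assumption:
# verify against the tessdata/configs/bazaar file before relying on them).
BAZAAR_LIKE = {
    "load_system_dawg": "F",
    "load_freq_dawg": "F",
    "user_words_suffix": "user-words",
    "user_patterns_suffix": "user-patterns",
}


def render_config(variables):
    """Render a config: one 'name value' pair per line, space separated."""
    return "".join(f"{name} {value}\n" for name, value in variables.items())


def vin_pattern(total=17, trailing_digits=5):
    """Build one user-patterns line: alphanumerics first, then digits.

    Uses the backslash-n (alphanumeric) and backslash-d (digit) escapes
    from the trie.h pattern format; the 17/5 split is the VIN preference
    described in this thread.
    """
    return r"\n" * (total - trailing_digits) + r"\d" * trailing_digits


if __name__ == "__main__":
    print(render_config(BAZAAR_LIKE), end="")
    print(vin_pattern())
```

The first output would go into a config file (or be applied key by key through whatever variable-setting call your wrapper offers), and the pattern line would go into the eng.user-patterns file next to the traineddata.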

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Wed, Nov 12, 2014 at 9:09 PM, Steven Norris <[email protected]> wrote:

> In a way. I can set values for keys that would appear in a config file.
> Like the below:
>
> [tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
>
>
> On Wed, Nov 12, 2014 at 12:30 AM, ShreeDevi Kumar <[email protected]>
> wrote:
>
>> Are you able to pass a configuration variable with the iOS CocoaPod?
>>
>> *-c configvar=value*
>>
>> Set value for control parameter. Multiple -c arguments are allowed.
>>
>>
>> *configfile*
>>
>> The name of a config to use. A config is a plaintext file which contains
>> a list of variables and their values, one per line, with a space separating
>> variable from value.
>>
>> ShreeDevi
>>
>> On Wed, Nov 12, 2014 at 10:33 AM, Steven Norris <[email protected]>
>> wrote:
>>
>>> I did see that. Unfortunately I cannot use bazaar, as the final version
>>> of what I'm using will be using an iOS CocoaPod that does not support the
>>> bazaar functionality of Tesseract.
>>>
>>> On Tue, Nov 11, 2014 at 8:51 PM, ShreeDevi Kumar <[email protected]>
>>> wrote:
>>>
>>>> On Wed, Nov 12, 2014 at 2:13 AM, <[email protected]> wrote:
>>>>
>>>>>
>>>>>
>>>>> The user-patterns looks helpful, but I can't find any documentation on
>>>>> formatting or how it works. Is there documentation on this somewhere?
>>>>>
>>>>
>>>>
>>>> Did you see the man page? I had also sent a link to a related discussion
>>>> in the past. Search the archives for other tips.
>>>>
>>>> https://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html
>>>> says
>>>> "if you pass the word *bazaar* as a trailing command line parameter to
>>>> Tesseract, Tesseract will not bother loading the system dictionary nor the
>>>> dictionary of frequent words and will load and use the eng.user-words and
>>>> eng.user-patterns files you provided. The former is a simple word list, one
>>>> per line. The format of the latter is documented in dict/trie.h on
>>>> read_pattern_list()."
>>>>
>>>> https://code.google.com/p/tesseract-ocr/source/browse/dict/trie.h
>>>> see lines 199-232
>>>>
>>>>>
>>>>>
>>>>> On Tuesday, November 11, 2014 10:50:57 AM UTC-6, [email protected]
>>>>> wrote:
>>>>>>
>>>>>> I am working on getting Tesseract to recognize VINs for an
>>>>>> application I am developing. I have a clean VIN image (worked around to
>>>>>> be black text on a white background). I have traineddata using the fonts
>>>>>> Courier, HelveticaNeue, LatoBold, LatoLight, OpenSans, and RobotoSlab as
>>>>>> a first attempt. I've also limited the unicharset to A-Z except I and O,
>>>>>> and 0-9.
>>>>>>
>>>>>> The result is not very good. It returns far more characters than are
>>>>>> actually present (17). Is there a way to limit Tesseract to detecting
>>>>>> only a single 17-character word on one line? I'd also like to have
>>>>>> Tesseract prefer, but not require, the last 5 characters to be digits.
>>>>>> There are a few other preferences that may help too, but I want to start
>>>>>> with these. I'm not sure how to go about setting up those preferences.
>>>>>>
>>>>>> Also, any suggestions beyond these for cleaning up the OCR so it reads
>>>>>> more correctly would be helpful. I can't post full data and images here
>>>>>> (they're VINs, and I'd need permission to do so), but I can say that in
>>>>>> one instance WM is coming back as 6W6M, and in another the digits 67258
>>>>>> are coming back as 572S5.
>>>>>>
>>>>>> Any guidance would be appreciated. I'll provide whatever information
>>>>>> I can.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>  --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/065a4b64-bcba-4d02-bc81-461d9ae11655%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/065a4b64-bcba-4d02-bc81-461d9ae11655%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> *Steven T. Norris*
>>> *Software Engineer - Forty AU*
>>>
>>> *p: (615)997-0836*
>>> *e: [email protected]*
>>> *w: http://www.linkedin.com/in/steventnorris*
>>>
>>>
>>
>>
>
>

