bazaar is nothing but a config file that sets values for a set of config variables; please see
https://code.google.com/p/tesseract-ocr/source/browse/tessdata/configs/bazaar

So, if patterns are helpful, you can use that as a config.

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Wed, Nov 12, 2014 at 9:09 PM, Steven Norris <[email protected]> wrote:

> In a way. I can set values for keys that would appear in a config file,
> like the one below:
>
> [tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
>
> On Wed, Nov 12, 2014 at 12:30 AM, ShreeDevi Kumar <[email protected]>
> wrote:
>
>> Are you able to pass a configuration variable with the iOS CocoaPod?
>>
>> *-c configvar=value*
>>
>> Set value for control parameter. Multiple -c arguments are allowed.
>>
>> *configfile*
>>
>> The name of a config to use. A config is a plaintext file which contains
>> a list of variables and their values, one per line, with a space separating
>> variable from value.
>>
>> ShreeDevi
>>
>> On Wed, Nov 12, 2014 at 10:33 AM, Steven Norris <[email protected]>
>> wrote:
>>
>>> I did see that. Unfortunately I cannot use bazaar, as the final version
>>> of what I'm building will use an iOS CocoaPod that does not support the
>>> bazaar functionality of Tesseract.
>>>
>>> On Tue, Nov 11, 2014 at 8:51 PM, ShreeDevi Kumar <[email protected]>
>>> wrote:
>>>
>>>> On Wed, Nov 12, 2014 at 2:13 AM, <[email protected]> wrote:
>>>>
>>>>> The user-patterns looks helpful, but I can't find any documentation on
>>>>> formatting or how it works. Is there documentation on this somewhere?
>>>>
>>>> Did you see the man page? I had also sent a link to a related discussion
>>>> in the past. Search the archives for other tips.
>>>>
>>>> https://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html
>>>> says
>>>> "if you pass the word *bazaar* as a trailing command line parameter to
>>>> Tesseract, Tesseract will not bother loading the system dictionary nor the
>>>> dictionary of frequent words and will load and use the eng.user-words and
>>>> eng.user-patterns files you provided. The former is a simple word list, one
>>>> per line. The format of the latter is documented in dict/trie.h on
>>>> read_pattern_list()."
>>>>
>>>> https://code.google.com/p/tesseract-ocr/source/browse/dict/trie.h
>>>> see lines 199-232
>>>>
>>>>> On Tuesday, November 11, 2014 10:50:57 AM UTC-6, [email protected]
>>>>> wrote:
>>>>>>
>>>>>> I am working on getting Tesseract to recognize VINs for an
>>>>>> application I am developing. I have a clean VIN image (worked around to
>>>>>> be black text on a white background). I have traineddata using the fonts
>>>>>> Courier, HelveticaNeue, LatoBold, LatoLight, OpenSans, and RobotoSlab as
>>>>>> a first attempt. I've also limited the unicharset to A-Z except I and O,
>>>>>> and 0-9.
>>>>>>
>>>>>> The result is not very good. It returns far more characters than the
>>>>>> 17 that are present. Is there a way to limit tesseract to only detecting
>>>>>> a 17 character word in one line? I'd also like to have tesseract prefer,
>>>>>> but not require, the last 5 characters to be digits. There are a few
>>>>>> other preferences that may help too, but I want to start with these. I'm
>>>>>> not sure how to go about setting up those preferences.
>>>>>>
>>>>>> Also, any suggestions beyond these on cleaning up the OCR so that it
>>>>>> reads more correctly would be helpful. I can't post full data and images
>>>>>> here (they're VINs. I'd need permission to do so), but I can say that in
>>>>>> one instance WM is coming back as 6W6M and that the digits 67258 are
>>>>>> coming back as 572S5 in another.
>>>>>>
>>>>>> Any guidance would be appreciated. I'll provide whatever information
>>>>>> I can.
>>>>>>
>>>>>> Thanks!
>>>
>>> --
>>> *Steven T. Norris*
>>> *Software Engineer - Forty AU*
>>>
>>> *p: (615)997-0836*
>>> *e: [email protected]*
>>> *w: http://www.linkedin.com/in/steventnorris*

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduUJHWJbm1ku0dV8K-Wd_6O2i2%2B8%3DkgzK%2B7F2kmTmjMYeQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
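
To make the config-file format quoted in the thread concrete (one variable per line, a space separating variable from value), here is a minimal Python sketch that renders a set of variables both as config-file text and as repeated `-c configvar=value` arguments. The helper names `render_config` and `as_c_arguments` are made up for illustration. `tessedit_char_whitelist` comes from the thread; the other four variable names are my recollection of what the bazaar config sets and are an assumption, so please check them against the linked bazaar file before relying on them.

```python
# Sketch only: helper names are hypothetical, not part of Tesseract.

def render_config(variables):
    """Return config-file text: one 'name value' pair per line."""
    lines = []
    for name, value in variables.items():
        name, value = str(name), str(value)
        # The name is separated from the value by a space, so it cannot
        # itself contain whitespace; a value must stay on a single line.
        if not name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid variable name: {name!r}")
        if "\n" in value or "\r" in value:
            raise ValueError(f"value for {name} must be a single line")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def as_c_arguments(variables):
    """Return the same settings as repeated '-c name=value' arguments."""
    args = []
    for name, value in variables.items():
        args.extend(["-c", f"{name}={value}"])
    return args


# Example settings for the VIN case discussed in the thread.
SETTINGS = {
    # ASSUMPTION: these four names are recalled from the bazaar config.
    "load_system_dawg": "F",
    "load_freq_dawg": "F",
    "user_words_suffix": "user-words",
    "user_patterns_suffix": "user-patterns",
    # From the thread: A-Z except I and O, plus the digits 0-9.
    "tessedit_char_whitelist": "ABCDEFGHJKLMNPQRSTUVWXYZ0123456789",
}

if __name__ == "__main__":
    print(render_config(SETTINGS), end="")
    print(" ".join(as_c_arguments(SETTINGS)))
```

With the iOS wrapper, the same name/value pairs could presumably be passed one at a time through `setVariableValue:forKey:` as in Steven's message. Whether the wrapper applies them early enough for the dictionary-loading variables to take effect is something I have not verified.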

