I forgot to mention that I have already built the LIB_Debug version, "dawg2wordlistd.exe". When run in CMD, it displays the same message as the LIB_Release version of the exe file.

On Thu, Mar 8, 2012 at 12:50 PM, Sriranga(78yrsold) <[email protected] > wrote:
> Reg:kan.traineddata: revised screenshot attached for perusal.
> Successfully generated output for the kan.punc-dawg and kan.numeric-dawg.
> I could not understand why should failed for kan.word-dawg and
> kan.freq.dawg only?.
>
> Reg tel.traineddata: - tested. Same problem similar/identical to kan
> faced by me.
>
> Since I am not programmer nor developer- difficult to understand and
> follow.
>
> Re: tessdata: - no problem with tessdata folder - since it is located
> above all exe files.
> able to generate traineddata files.
> With regards,
> -sriranga(79yrs)
>
> On Thu, Mar 8, 2012 at 11:35 AM, TP <[email protected]> wrote:
>
>> On Wed, Mar 7, 2012 at 7:55 PM, Sriranga(78yrs)
>> <[email protected]> wrote:
>> > David,
>> > Thank you for the valuable guidance. I followed your steps still problem of
>> > window's exe encounter - vide screenshot is attached. WinXP(sp3) tesseract
>> > -r-700
>> > With warmest regards,
>> > -sriranga(79yrs)
>> >
>> > On Thu, Mar 8, 2012 at 12:42 AM, David Eger <[email protected]> wrote:
>> >>
>> >> $ combine_tessdata -u ./third_party/tesseract/tessdata/
>> >> kan.traineddata ./kan.
>> >> Extracting tessdata components from ./third_party/tesseract/tessdata/
>> >> kan.traineddata
>> >> Wrote ./kan.unicharset
>> >> Wrote ./kan.inttemp
>> >> Wrote ./kan.pffmtable
>> >> Wrote ./kan.normproto
>> >> Wrote ./kan.punc-dawg
>> >> Wrote ./kan.word-dawg
>> >> Wrote ./kan.number-dawg
>> >> Wrote ./kan.freq-dawg
>> >>
>> >> $ ls kan.*
>> >> kan.freq-dawg kan.inttemp kan.normproto kan.number-dawg
>> >> kan.pffmtable kan.punc-dawg kan.unicharset kan.word-dawg
>> >>
>> >> $ dawg2wordlist kan.unicharset kan.word-dawg word.wordlist
>> >> Loading word list from kan.word-dawg
>> >> Reading squished dawg
>> >> Word list loaded.
>> >>
>> >> $ wc -l word.wordlist
>> >> 18720 word.wordlist
>> >>
>> >> Looks like there are 18,720 words in the Kannada word dawg, safely
>> >> uncompressed...
>> >>
>> >> On Mar 7, 8:43 am, "Sriranga(78yrs)" <withblessing.sriranga.
>> >> [email protected]> wrote:
>> >> > David,
>> >> > just now I checked with kan.punc-dawg(1KB) and kan.number-dawg(1KB) also.
>> >> > it works fine In both cases the output were not empty. Only
>> >> > word-dawg(181KB) and freq-dawg(2KB) does not work but with M$ windows's exe
>> >> > encounter message were displayed.
>> >> > this is brought to your kind notice. Even attached files of kan.word-dawg
>> >> > and kan.freq.dawg - for your investigation and valuable guidance.
>> >> > With warmest regards,
>> >> > -sriranga(79yrs)
>> >> >
>> >> > On Wed, Mar 7, 2012 at 9:44 AM, Sriranga(78yrs) <[email protected]> wrote:
>> >> > > David,
>> >> > > Thanks for the valuable guidance.
>> >> > > Copied dawg2wordlist.exe pasted in the folder n:\Newfolder\ wherein
>> >> > > extracted files Kan.unicharset, kan.word-dawg, kan.freq-dawg are located.
>> >> >
>> >> > > extract of cmd is reproduced below - with encounter.exe windows messages
>> >> > > displayed for word-dawg and freq-dawg.
>> >> > > M:\New Folder>dawg2wordlist.exe -h
>> >> > > Print all the words in a given dawg.
>> >> > > Usage: dawg2wordlist.exe <unicharset> <dawgfile> <wordlistfile>
>> >> >
>> >> > > M:\New Folder>dawg2wordlist.exe kan.unicharset kan.word-dawg testwordlist
>> >> > > Loading word list from kan.word-dawg
>> >> > > Reading squished dawg
>> >> >
>> >> > > M:\New Folder>dawg2wordlist.exe kan.unicharset kan.freq-dawg testwordlist
>> >> > > Loading word list from kan.freq-dawg
>> >> > > Reading squished dawg
>> >> > > Word list loaded.
>> >> > > M:\New Folder>
>> >> >
>> >> > > [Note: testwordlist contains 0(zero)kb for kan.freq-dawg which contains
>> >> > > 2KB -
>> >> > > whereas testwordlist did not generate for kan.word-dawg which
>> >> > > contains 181KB]
>> >> > > Awaiting further valuable guidance.
>> >> > > With regards,
>> >> > > -sriranga(79yrs)
>> >> >
>> >> > > Still i could not understand where I made mistake?
>> >> > > With regards,
>> >> > > -sriranga(79yrs)
>> >> >
>> >> > > On Wed, Mar 7, 2012 at 2:41 AM, David Eger <[email protected]> wrote:
>> >> >
>> >> > >> Where you put wordlist2dawg.exe, try putting the name of the output
>> >> > >> list instead.
>> >> >
>> >> > >> On Friday, March 2, 2012 2:39:33 AM UTC-8, sriranga(79yrsold) wrote:
>> >> >
>> >> > >>> I had extracted kan.word-dawg from the Kan.traineddata. I am trying to
>> >> > >>> convert dawg to wordlist using commandline in cmd as follows:
>> >> >
>> >> > >>> ***M:\r684\BuildFolder\tesseract-ocr>dawg2wordlist "m:\New
>> >> > >>> Folder\kan.unicharset" "
>> >> > >>> m:\New Folder\kan.word-dawg" wordlist2dawg.exe
>> >> > >>> Loading word list from m:\New Folder\kan.word-dawg
>> >> > >>> Reading squished dawg
>> >> >
>> >> > >>> M:\r684\BuildFolder\tesseract-ocr>
>> >> > >>> *
>> >> > >>> Unfortunately windows encounter exe displayed. Where I made a mistake?
>> >> > >>> Awaiting solution?
>> >> >
>> >> > kan.word-dawg (243K)
>> >> > kan.freq-dawg (2K)
>> >> > kan.punc-dawg (< 1K)
>> >> > kan.number-dawg (< 1K)
>>
>> Just looking at that screenshot you supplied, it starts with a ERROR
>> message about TESSDATA_PREFIX not correctly pointing to the parent
>> folder of TESSDATA folder?
>>
>> Have you fixed this by setting TESSDATA_PREFIX? This is prominently
>> mentioned in the README [1] It should now probably point at your SVN
>> working directory (and make sure it ends with a / character).
>>
>> And sorry to say, if you keep running into problems like this, you
>> might want to think about learning to use the Visual Studio 2008
>> Debugger :) It's pretty easy, and very handy for figuring out exactly
>> where a program crashes.
>>
>> 1) You already know how to build tesseract with VS, so just set your
>> build configuration to LIB_Debug (when debugging the training apps).
>>
>> 2) Make the training app project (in this case dawg2wordlist) you are
>> trying to debug, the Default Startup project (by right clicking it
>> and choosing Set as Startup Project).
>>
>> 3) Open up the training app project's properties (by again
>> right-clicking it and choosing Properties).
>>
>> 4) Make sure at the top Configuration: is LIB_Debug.
>>
>> 5) In the Configuration Properties | Debugging Category, set the
>> following fields:
>>
>> Command Arguments: (whatever you specified on the command line) so set
>> it to:
>>
>> kan.unicharset kan.word-dawg word.wordlist
>>
>> Working Directory should be your working directory so:
>>
>> M:\New Folder\New Folder
>>
>> (a terrible name for folders BTW :P )
>>
>> 6) Now for the exciting part, right-click the dawg2wordlist project and
>> choose Debug -> Start new instance from the popup menu.
>>
>> A new command window will show up (possible hidden by Visual Studio),
>> displaying all of dawg2wordlist's output.
>>
>> When the program crashes, you should see a window in the debugger that
>> shows exactly where the program was when it crashed and what the error
>> reason is. From that either you (hopefully) or we can better figure
>> out what is going wrong.
>>
>> [1] http://code.google.com/p/tesseract-ocr/wiki/ReadMe
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en

