You have to be clear about which files you are combining. The command you have given overwrites the Japanese traineddata. Is that what you want to do?

> training/combine_tessdata -o tessdata/jpn.traineddata

- Look at the help for all options of combine_tessdata.
- Figure out which files (lstm, dawg, etc.) you want to combine.
- Give the appropriate command options and files to create the new traineddata.

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Tue, Jun 13, 2017 at 5:25 PM, Ibr <[email protected]> wrote:

> It seems so. To add or merge the new LSTM files into the traineddata, is
> this the correct command to use?
>
>     training/combine_tessdata -o tessdata/jpn.traineddata ~/tesstutorial/eng_from_chi/.lstm
>
> It gave me the following:
>
> TessdataManager can't determine which tessdata component is represented by lstmf
> TessdataManager combined tesseract data files.
> Offset for type 0 (.traineddataconfig ) is 172
> Offset for type 1 (.traineddataunicharset ) is 2745
> Offset for type 2 (.traineddataunicharambigs ) is 283372
> Offset for type 3 (.traineddatainttemp ) is 288048
> Offset for type 4 (.traineddatapffmtable ) is 30906394
> Offset for type 5 (.traineddatanormproto ) is 30942955
> Offset for type 6 (.traineddatapunc-dawg ) is 31395690
> Offset for type 7 (.traineddataword-dawg ) is 31398292
> Offset for type 8 (.traineddatanumber-dawg ) is 32406214
> Offset for type 9 (.traineddatafreq-dawg ) is 32406256
> Offset for type 10 (.traineddatafixed-length-dawgs ) is -1
> Offset for type 11 (.traineddatacube-unicharset ) is -1
> Offset for type 12 (.traineddatacube-word-dawg ) is -1
> Offset for type 13 (.traineddatashapetable ) is 32407402
> Offset for type 14 (.traineddatabigram-dawg ) is -1
> Offset for type 15 (.traineddataunambig-dawg ) is -1
> Offset for type 16 (.traineddataparams-model ) is 33071948
> Offset for type 17 (.traineddatalstm ) is 33072647
> Offset for type 18 (.traineddatalstm-punc-dawg ) is 43371656
> Offset for type 19 (.traineddatalstm-word-dawg ) is 43374258
> Offset for type 20 (.traineddatalstm-number-dawg ) is 44380188
>
> Any idea?
> Thanks
>
> On Tuesday, June 13, 2017 at 2:36:54 PM UTC+3, shree wrote:
>
>> tesseract image results -l ara --tessdata-dir ./tessdata --oem 1
>>
>> uses the LSTM files that are there in ara.traineddata in your tessdata
>> directory.
>>
>> Just placing lstm files in the tesseract folder is not going to change
>> anything.
>>
>> You need to create a new traineddata with the new lstm files and then
>> test with it.
>>
>> ShreeDevi
>> ____________________________________________________________
>> Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com
>>
>> On Tue, Jun 13, 2017 at 3:17 PM, Ibr <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> When I run detection with tesseract 4.00.00alpha, I use the command:
>>>
>>>     tesseract image results -l ara --tessdata-dir ./tessdata --oem 1
>>>
>>> The oem here means "Neural nets LSTM only", and there is no argument in
>>> tesseract to specify where to find the LSTM files. How does tesseract
>>> find them? I used to place the LSTM files inside the tesseract folder,
>>> but I tried detection after deleting the LSTM files, with the argument
>>> --oem 1 (which means LSTM only), and detection still happened. So does
>>> tesseract search other folders for LSTM files? I had LSTM files in
>>> different folders.
>>>
>>> Thanks.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/eefc8290-c407-4075-b845-4b226094e752%40googlegroups.com.
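The steps suggested in the reply (check the help, decide which component files to combine, then build a new traineddata) can be sketched roughly as below. This is only a dry-run sketch under stated assumptions: the file names (jpn_new.traineddata, work/jpn_new.lstm) and both helper functions are hypothetical, the list of component suffixes is copied from the "Offset for type N" listing in this thread, and the suffix check is a guess at what the "can't determine which tessdata component is represented by lstmf" message is complaining about. It prints the commands instead of running them, and it works on a copy so the original jpn.traineddata is not overwritten.

```shell
#!/bin/sh
# Dry-run sketch only. Paths and helper names are hypothetical.

# Component suffixes, copied from the "Offset for type N" output in this thread.
KNOWN_COMPONENTS="config unicharset unicharambigs inttemp pffmtable normproto
punc-dawg word-dawg number-dawg freq-dawg fixed-length-dawgs cube-unicharset
cube-word-dawg shapetable bigram-dawg unambig-dawg params-model lstm
lstm-punc-dawg lstm-word-dawg lstm-number-dawg"

# Succeeds if the text after the last dot of the file name is a known suffix.
# Assumption: this mirrors how the tool decides what a component file is.
is_known_component() {
    name=${1##*/}                 # strip any directory part
    case "$name" in
        *.*) ;;                   # has a suffix, keep going
        *) return 1 ;;            # no dot at all: cannot be classified
    esac
    suffix=${name##*.}
    for c in $KNOWN_COMPONENTS; do
        [ "$c" = "$suffix" ] && return 0
    done
    return 1
}

# Prints (does not run) the commands that copy SRC to DST and then overwrite
# the given components inside DST with "combine_tessdata -o".
build_overwrite_cmds() {
    src=$1; dst=$2; shift 2
    for f in "$@"; do
        if ! is_known_component "$f"; then
            echo "skip: $f has no recognised component suffix" >&2
            return 1
        fi
    done
    echo "cp $src $dst"
    echo "combine_tessdata -o $dst $*"
}

# Example with hypothetical paths: put a newly trained LSTM model into a copy.
build_overwrite_cmds tessdata/jpn.traineddata tessdata/jpn_new.traineddata work/jpn_new.lstm
```

If the printed commands look right, they can be run by hand (with training/combine_tessdata when working from the build tree, as in the thread), and the result tested with something like "tesseract image results -l jpn_new --tessdata-dir ./tessdata --oem 1". A file ending in .lstmf is rejected by the check above, which matches the error reported in the thread.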

