> Also one more doubt is when I use lstm.train command a text file also gets generated with lstmf file

You can ignore that txt file. Only the lstmf file is used for further processing.

On Wed, Jun 19, 2019 at 2:44 PM hrishikesh kaulwar <hpkaul...@gmail.com> wrote:

> Hello shree,
> I tried again with a .tif file, and the lstm.train command again generated
> a .txt file along with the lstmf file. I don't think that's the error.
> Thanks for helping.
>
> On Wednesday, June 19, 2019 at 2:02:54 PM UTC+5:30, shree wrote:
>>
>> > eng.Arial_Regular.exp0.png
>>
>> The script expects tif files, not png.
>>
>> On Wed, Jun 19, 2019 at 1:42 PM hrishikesh kaulwar <hpka...@gmail.com>
>> wrote:
>>
>>> Thank you for your help. I have checked it many times. Could you tell me
>>> where I am going wrong? For example, it takes my 3 tiff/box pairs and
>>> copies them into the training directory. Then it overwrites the exp0.tif
>>> file using randomly generated text and the text2image tool. Although the
>>> 3 tiff/box pairs are accepted, it only creates an lstmf file for the
>>> first file generated by text2image and ignores the rest.
>>> I have attached the generate_training_data.sh script and a screenshot of
>>> the folder where the lstmf files are generated.
>>>
>>> Also, one more doubt: when I use the lstm.train command, a text file also
>>> gets generated along with the lstmf file.
>>> I have named the image files as per the convention:
>>> tesseract eng.Arial_Regular.exp0.png eng.Arial_Regular.exp0 lstm.train
>>> The image is attached above, and the two generated files are also attached.
>>>
>>> On Tuesday, June 18, 2019 at 3:08:19 PM UTC+5:30, shree wrote:
>>>>
>>>> It should work if your files follow a similar naming convention.
>>>>
>>>> lang.xxxnnn.exp0.tif
>>>> lang.xxxnnn.exp0.box
>>>>
>>>> Where lang is your language code, e.g. eng
>>>>
>>>> xxxnnn is any unique random string (fontname in files generated by
>>>> text2image)
>>>>
>>>> On Tue, Jun 18, 2019 at 2:54 PM hrishikesh kaulwar <hpka...@gmail.com>
>>>> wrote:
>>>>
>>>>> Greetings,
>>>>> I just got to know that tesstrain.sh has been modified to support
>>>>> user-provided box/tiff pairs by adding a tiff/box directory flag. I used
>>>>> that version of the tesseract source to use my own tiff/box pairs.
>>>>> But when I ran tesstrain.sh, I found that it just copies the tiff/box
>>>>> pairs provided by me to the training directory, but the .lstmf file is
>>>>> generated from the eng.training_text file. My tiff/box pairs are not
>>>>> being used to create the training data. Can someone point out what
>>>>> mistake I am making, or some way to use only user-provided tiff/box
>>>>> pairs to create training data?
>>>>> Thanks in advance.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesser...@googlegroups.com.
>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/f49566cf-0b6c-4b84-8c47-014ee31d3f60%40googlegroups.com
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>> --
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
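
For anyone hitting the same problem, the naming convention described in the thread (lang.xxxnnn.exp0.tif with a matching lang.xxxnnn.exp0.box) can be checked before running the training script. The following is only a rough sketch in Python, not part of tesstrain.sh: the function name `find_pairs`, the regular expression, and the choice to accept any exp number rather than only exp0 are assumptions made for illustration.

```python
import re

# Convention quoted in the thread: lang.xxxnnn.exp0.tif / lang.xxxnnn.exp0.box
# Assumption: lang is letters/underscores, xxxnnn is any string without dots,
# and any integer after "exp" is accepted (the thread only shows exp0).
NAME_RE = re.compile(
    r"^(?P<lang>[A-Za-z_]+)\.(?P<tag>[^.]+)\.exp(?P<num>\d+)\.(?P<ext>tif|box)$"
)


def find_pairs(names):
    """Return (pairs, problems) for an iterable of file names.

    pairs    -- sorted base names that have both a .tif and a .box file
    problems -- human-readable notes on names that break the convention
    """
    seen = {}
    problems = []
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            problems.append(f"{name}: does not match lang.xxxnnn.expN.tif/.box")
            continue
        base = name.rsplit(".", 1)[0]
        seen.setdefault(base, set()).add(match.group("ext"))

    pairs = sorted(base for base, exts in seen.items() if exts == {"tif", "box"})
    for base, exts in sorted(seen.items()):
        if exts != {"tif", "box"}:
            missing = ({"tif", "box"} - exts).pop()
            problems.append(f"{base}: missing .{missing}")
    return pairs, problems
```

Calling `find_pairs(os.listdir(my_dir))` on the directory holding the tiff/box pairs (where `my_dir` is whatever path you use) would flag a file such as eng.Arial_Regular.exp0.png, since the script expects tif files, not png.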