Re: [tesseract-ocr] Re: LSTM files

2018-08-14 Thread Khosrobeigy.zohreh
OK, but I have several tif and box files for each font, for example:
fas.B_Mitra.exp0.box
fas.B_Mitra.exp0.tif
fas.B_Mitra.exp1.box
fas.B_Mitra.exp1.tif
fas.B_Mitra.exp2.box
fas.B_Mitra.exp2.tif
.
.
.
How can I make an lstmf file for each of them?



On Tue, Aug 14, 2018 at 4:56 PM,  wrote:

> I mean: put all the file paths in this file, then run lstmtraining.
> # cat eng.training_files.txt
> /home/tess-ocr/model_output/test//eng.Arial.exp0.lstmf
> /home/tess-ocr/model_output/test//eng.Microsoft_YaHei.exp0.lstmf
> /home/tess-ocr/model_output/test//eng.Times_New_Roman.exp0.lstmf



-- 
Zohreh Khosrobeygi
University of Tehran, 2016
Tel: +989196042887
khosrobeygi.zo...@ut.ac.ir 

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAE1QSgz5Y6JxrWSBi5ODSbK0cmphFAwro7qW02b0-n_AujKdQQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: [tesseract-ocr] Re: LSTM files

2018-08-14 Thread zwwtsinghua
I mean: put all the file paths in this file, then run lstmtraining.
# cat eng.training_files.txt
/home/tess-ocr/model_output/test//eng.Arial.exp0.lstmf
/home/tess-ocr/model_output/test//eng.Microsoft_YaHei.exp0.lstmf
/home/tess-ocr/model_output/test//eng.Times_New_Roman.exp0.lstmf
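A list file like the one above can also be generated rather than typed by hand. The following is only a sketch: the function name write_training_list is hypothetical, and it assumes all the .lstmf files sit in one directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): write the path of every
# .lstmf file found in a directory to a list file, one path per line.
write_training_list() {
    dir=$1
    listfile=$2
    : > "$listfile"                  # start with an empty list file
    for f in "$dir"/*.lstmf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        printf '%s\n' "$f" >> "$listfile"
    done
}
```

For example, write_training_list /home/tess-ocr/model_output/test /home/tess-ocr/model_output/test/eng.training_files.txt would recreate the file shown above. The list file is then passed to lstmtraining with --train_listfile, next to whatever other options your training run already uses.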


On Tuesday, August 14, 2018 at 6:04:48 PM UTC+8, Zohreh Khosrobeygi wrote:
>
> Sorry, I didn't understand.
> Could you please explain this further: "and then put all the lstm files
> together in training_files.txt"



Re: [tesseract-ocr] Re: LSTM files

2018-08-14 Thread Khosrobeigy.zohreh
Sorry, I didn't understand.
Could you please explain this further: "and then put all the lstm files
together in training_files.txt"

On Tue, Aug 14, 2018 at 1:19 PM,  wrote:

> You should run the tesseract command for each of your box/tif pairs:
> tesseract ${dir}/lang.font.exp0.tif ${dir}/lang.font.exp0 lstm.train
> and then list all the resulting lstmf files together in training_files.txt.



-- 
Zohreh Khosrobeygi
University of Tehran, 2016
Tel: +989196042887
khosrobeygi.zo...@ut.ac.ir 



[tesseract-ocr] Re: LSTM files

2018-08-14 Thread zwwtsinghua
You should run the tesseract command for each of your box/tif pairs:
tesseract ${dir}/lang.font.exp0.tif ${dir}/lang.font.exp0 lstm.train
and then list all the resulting lstmf files together in training_files.txt.
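
With 18 box/tif pairs, the command above can be wrapped in a loop. This is a minimal sketch, not an official script: the function name make_lstmf_files and the TESSERACT override are hypothetical, and it assumes each .tif has its .box file next to it under the same base name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): run the lstm.train config
# once for every .tif in a directory that has a matching .box file.
# Set TESSERACT=echo to print the arguments instead of running tesseract.
make_lstmf_files() {
    dir=$1
    for tif in "$dir"/*.tif; do
        [ -e "$tif" ] || continue            # the glob matched nothing
        base=${tif%.tif}                     # e.g. .../fas.B_Mitra.exp0
        if [ ! -f "$base.box" ]; then
            echo "skipping $tif: no matching .box file" >&2
            continue
        fi
        # Same invocation as above; on success it should write $base.lstmf
        "${TESSERACT:-tesseract}" "$tif" "$base" lstm.train
    done
}
```

Calling make_lstmf_files on the directory that holds fas.B_Mitra.exp0.tif through the last expN.tif should leave one lstmf file per pair, and those are the files to list in training_files.txt.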

On Monday, August 13, 2018 at 6:16:09 PM UTC+8, Zohreh Khosrobeygi wrote:
>
> Hi,
> I have been training the Persian language. My text is too large, so I had to
> generate 18 box files and 18 tifs for one text. Then I made one unicharset
> for all 18 files. Now when I try to make the lstmf files, it creates just one,
> "fas.B_Mitra.exp0.lstmf", and can't create them for 1 to 18.
> I tested something else. I also created "fas.B_Nazanin.exp0.lstmf" and
> ran lstmtraining, but it used only fas.B_Mitra.exp0.lstmf and did not use
> the other one.
> How can I make an lstmf file for all my boxes?
> Thanks.
>
