[tesseract-ocr] Doubt with handwritten texts.

2019-06-20 Thread Vagner Belfort
Hello, I'm new to data science and I have a project to read bank checks. 
I can extract the typed characters, but I cannot extract the 
handwritten text.
My question: is there any way to do this with Tesseract?
I researched and found nothing about it. Thank you.
An example of what I'm trying to do is attached.

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/bb853a71-5e33-4db4-bf13-dd876d7f81fd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Re: table ocr with tesseract(tess4j)

2019-06-20 Thread Quan Nguyen
Included with tess4j are some utility methods such as Remove Lines. You can 
see a demonstration of these functions in VietOCR, which uses the library.

https://sourceforge.net/projects/vietocr 


On Wednesday, June 19, 2019 at 7:40:36 AM UTC-5, Momene Vigal wrote:
>
> Hello, I'm a beginner with Tesseract and am currently using it with Java.
> Can anyone please help me with how to OCR a table with Tesseract
> in Python or Java?
>
>



[tesseract-ocr] tessdata files for tesseract 3.3 nuget package

2019-06-20 Thread kamal
Which version of the trained data files can be used with the NuGet package 
Tesseract 3.3 (https://www.nuget.org/packages/Tesseract)?



Re: [tesseract-ocr] Re: FontAwesome and Tesseract

2019-06-20 Thread Shree Devi Kumar
See https://github.com/Shreeshrii/tessdata_emoji

Font Awesome uses PUA Unicode range for the icons. So it did not work with
text2image. I used other emoji fonts.

The script and training data used are also in the repo.

On Tue, Jun 18, 2019 at 12:04 AM Jason  wrote:

> Can I "bump" this?
>
> Even if I only get a high-level description of the process?
> - How to make a box file (for v4) of unicode chars
> - How to make the training size invariant?
> Etc.
>
> Many thanks!
>
>
>
> On Tuesday, May 21, 2019 at 10:09:57 AM UTC-4, Jason wrote:
>>
>> I would like to be able to detect shapes like those contained in
>> FontAwesome. Take for example a gear: (
>> https://fontawesome.com/icons?d=gallery=gear) This is unicode
>> character \uf013
>> I think this would be as simple as training a font, via
>> http://trainyourtesseract.com/, but this did not work. I am not sure why
>> it failed, but any insight on how to do this would be appreciated. I am
>> thinking the unicode range is the issue?
>> Also, I would be fundamentally training characters, not words.
>>
>> Thank you.
>>


-- 


भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com



Re: [tesseract-ocr] Increasing RAM while using tesseract

2019-06-20 Thread _ Flaviu
Moreover, if I read a lot of images at once, RAM usage grows so much 
that my app stops with an assertion failure:

[image: Untitled.png]


On Thursday, June 20, 2019 at 5:04:04 PM UTC+3, _ Flaviu wrote:

> I have uncommented lines:
>
> pTess->Clear();
>
> pTess->End();
>
>
> and I noticed a little improvement, but the RAM eaten by app is still high 
> … when you wrote "END()" you meant pTess->End() ?
>
>
>
>
>
> On Thursday, June 20, 2019 at 2:42:49 PM UTC+3, zdenop wrote:
>
>> IMO you have to call END() to correctly close tesseract instance.
>>
>> Zdenko
>>
>>
>> On Thu, 20 Jun 2019 at 13:02, _ Flaviu wrote:
>>
>>> I am using tesseract 4 on a VC++ (MFC) app, to read text from images (A4 
>>> sizes). I noticed that while I using this app on several PCs (Win10 64 bit, 
>>> *GB RAM), the RAM occupied by my app (that use tesseract) is increasing on 
>>> and on. If I read 14 images, my app eat 470 MB, and if I didn't close the 
>>> app and read again all these 14 images, the RAM eaten by my app is 
>>> increasing until ~800MB, and if continue to read the same images, the RAM 
>>> is increasing on an on. Why is this happen ?
>>>
>>> Here is the code that I am using for tesseract:
>>>
>>> BOOL CMyClass::GetTextFromImage()
>>> {
>>> PIX* pix = NULL;
>>> tesseract::TessBaseAPI* pTess = new tesseract::TessBaseAPI;
>>>
>>> do
>>> {
>>> if (pTess->Init(...))
>>> {
>>> m_sError.Format(_T("OCRTesseract: Could not initialize tesseract."));
>>> break;
>>> }
>>> // setup
>>> // read image
>>> PIX* pix = pixRead(m_sFileName);
>>> if (! pix)
>>> {
>>> break;
>>> }
>>> // recognize
>>> pTess->SetImage(pix);
>>> }
>>> while (FALSE);
>>>
>>> // cleanup
>>> // pTess->Clear();
>>> // pTess->End();
>>> delete pTess;
>>> pTess = NULL;
>>> pixDestroy();
>>>
>>> return TRUE;
>>> }
>>>
>>>
>>> Is there anything wrong here ? Why is increasing RAM while I am using 
>>> tesseract ?
>>>
>>



Re: [tesseract-ocr] Increasing RAM while using tesseract

2019-06-20 Thread _ Flaviu
I have uncommented these lines:

pTess->Clear();

pTess->End();


and I noticed a little improvement, but the RAM used by the app is still 
high. When you wrote "END()", did you mean pTess->End()?





On Thursday, June 20, 2019 at 2:42:49 PM UTC+3, zdenop wrote:

> IMO you have to call END() to correctly close tesseract instance.
>
> Zdenko
>
>
> On Thu, 20 Jun 2019 at 13:02, _ Flaviu wrote:
>
>> I am using tesseract 4 on a VC++ (MFC) app, to read text from images (A4 
>> sizes). I noticed that while I using this app on several PCs (Win10 64 bit, 
>> *GB RAM), the RAM occupied by my app (that use tesseract) is increasing on 
>> and on. If I read 14 images, my app eat 470 MB, and if I didn't close the 
>> app and read again all these 14 images, the RAM eaten by my app is 
>> increasing until ~800MB, and if continue to read the same images, the RAM 
>> is increasing on an on. Why is this happen ?
>>
>> Here is the code that I am using for tesseract:
>>
>> BOOL CMyClass::GetTextFromImage()
>> {
>> PIX* pix = NULL;
>> tesseract::TessBaseAPI* pTess = new tesseract::TessBaseAPI;
>>
>> do
>> {
>> if (pTess->Init(...))
>> {
>> m_sError.Format(_T("OCRTesseract: Could not initialize tesseract."));
>> break;
>> }
>> // setup
>> // read image
>> PIX* pix = pixRead(m_sFileName);
>> if (! pix)
>> {
>> break;
>> }
>> // recognize
>> pTess->SetImage(pix);
>> }
>> while (FALSE);
>>
>> // cleanup
>> // pTess->Clear();
>> // pTess->End();
>> delete pTess;
>> pTess = NULL;
>> pixDestroy();
>>
>> return TRUE;
>> }
>>
>>
>> Is there anything wrong here ? Why is increasing RAM while I am using 
>> tesseract ?
>>
>



[tesseract-ocr] Re: Suggest a method to improve tesseract results

2019-06-20 Thread hrishikesh kaulwar
Any other suggestions or ideas will be more than helpful.
Thanks in advance for your input.



Re: [tesseract-ocr] Re: Custom Tiff/Box pairs support in tesstrain.sh

2019-06-20 Thread hrishikesh kaulwar
That was a crystal-clear explanation. Thank you for explaining, Shree. I 
get it now.
On Thursday, June 20, 2019 at 1:55:20 PM UTC+5:30, shree wrote:
>
> if [[ ${MY_BOXTIFF_DIR} != "" ]]; then
> tlog "\n=== Copy existing box/tiff pairs from '${MY_BOXTIFF_DIR}'"
> cp  ${MY_BOXTIFF_DIR}/*.box ${TRAINING_DIR} | true
> cp  ${MY_BOXTIFF_DIR}/*.tif ${TRAINING_DIR} | true
> ls -l  ${TRAINING_DIR}
> fi
>
> copies the files to training directory
>
> phase_I_generate_image 8
>
> generates box/tiff pairs from the training text and fonts specified. 
> Please note that if you had same name files copied from my_boxtiff_dir, 
> they will get overwritten,
>
> phase_UP_generate_unicharset
>
> generates unicharset from all box files in training directory (meeting the 
> file naming convention lang.xxx.exp0.box)
>
> phase_E_extract_features " --psm 6 lstm.train " 8 "lstmf"
>
> this created lstmf files from all the box/tiff pairs
>
> make__lstmdata
>
> creates the list of lstmf files
> moves all required files from tmp directory to output directory
>
>
> On Thu, Jun 20, 2019 at 10:55 AM hrishikesh kaulwar  > wrote:
>
>>
>> Hey shree could you tell me what line in tesstrain.sh takes care of user 
>> provided tiff box pairs. Like what is the line which creates lstmf files 
>> from those pairs and then puts the name of lstmf files in training_list. 
>> Thanks in advance.
>> On Tuesday, June 18, 2019 at 2:54:09 PM UTC+5:30, hrishikesh kaulwar 
>> wrote:
>>>
>>> Greetings,
>>> I just got to know that tesstrain.sh is modified to support user 
>>> provided box/tiff pairs by adding a tiff/box directory flag. I used that 
>>> version of tesseract source to use my own tiff/box pairs. But when I ran 
>>> tesstrain.sh I got to know that it just copies tiff/box pairs provided by 
>>> me to training directory but .lstmf file is generated from 
>>> eng.training_text file. My tiff/box pairs are not getting used in creating 
>>> training data. Can someone point out what mistake I am making? or some way 
>>> to only use user provided tiff/box pairs to create training data?
>>>  Thanks in advance.
>>>
>
>
> -- 
>
> 
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>



[tesseract-ocr] Divergence in Trained data

2019-06-20 Thread Pooja Kamra
Hi,
For training I set target_error_rate to 4.
But during training the error rate dropped to 4.5, then started 
increasing and never went back down.
What can be done in this scenario?



Re: [tesseract-ocr] Increasing RAM while using tesseract

2019-06-20 Thread Zdenko Podobny
IMO you have to call END() to correctly close the tesseract instance.

Zdenko


On Thu, 20 Jun 2019 at 13:02, _ Flaviu wrote:

> I am using tesseract 4 on a VC++ (MFC) app, to read text from images (A4
> sizes). I noticed that while I using this app on several PCs (Win10 64 bit,
> *GB RAM), the RAM occupied by my app (that use tesseract) is increasing on
> and on. If I read 14 images, my app eat 470 MB, and if I didn't close the
> app and read again all these 14 images, the RAM eaten by my app is
> increasing until ~800MB, and if continue to read the same images, the RAM
> is increasing on an on. Why is this happen ?
>
> Here is the code that I am using for tesseract:
>
> BOOL CMyClass::GetTextFromImage()
> {
> PIX* pix = NULL;
> tesseract::TessBaseAPI* pTess = new tesseract::TessBaseAPI;
>
> do
> {
> if (pTess->Init(...))
> {
> m_sError.Format(_T("OCRTesseract: Could not initialize tesseract."));
> break;
> }
> // setup
> // read image
> PIX* pix = pixRead(m_sFileName);
> if (! pix)
> {
> break;
> }
> // recognize
> pTess->SetImage(pix);
> }
> while (FALSE);
>
> // cleanup
> // pTess->Clear();
> // pTess->End();
> delete pTess;
> pTess = NULL;
> pixDestroy();
>
> return TRUE;
> }
>
>
> Is there anything wrong here ? Why is increasing RAM while I am using
> tesseract ?
>



[tesseract-ocr] Increasing RAM while using tesseract

2019-06-20 Thread _ Flaviu
I am using Tesseract 4 in a VC++ (MFC) app to read text from images (A4 
size). I noticed that when I use this app on several PCs (Win10 64-bit, 
*GB RAM), the RAM occupied by my app (which uses Tesseract) keeps 
increasing. If I read 14 images, my app uses 470 MB; if I don't close the 
app and read the same 14 images again, its RAM usage increases to 
~800 MB, and if I continue to read the same images, it keeps on 
increasing. Why does this happen?

Here is the code that I am using for tesseract:

BOOL CMyClass::GetTextFromImage()
{
    PIX* pix = NULL;
    tesseract::TessBaseAPI* pTess = new tesseract::TessBaseAPI;

    do
    {
        if (pTess->Init(...))
        {
            m_sError.Format(_T("OCRTesseract: Could not initialize tesseract."));
            break;
        }
        // setup
        // read image
        PIX* pix = pixRead(m_sFileName);
        if (! pix)
        {
            break;
        }
        // recognize
        pTess->SetImage(pix);
    }
    while (FALSE);

    // cleanup
    // pTess->Clear();
    // pTess->End();
    delete pTess;
    pTess = NULL;
    pixDestroy();

    return TRUE;
}


Is there anything wrong here? Why does RAM usage keep increasing while I 
am using Tesseract?



Re: [tesseract-ocr] Re: Custom Tiff/Box pairs support in tesstrain.sh

2019-06-20 Thread Shree Devi Kumar
if [[ ${MY_BOXTIFF_DIR} != "" ]]; then
    tlog "\n=== Copy existing box/tiff pairs from '${MY_BOXTIFF_DIR}'"
    cp  ${MY_BOXTIFF_DIR}/*.box ${TRAINING_DIR} | true
    cp  ${MY_BOXTIFF_DIR}/*.tif ${TRAINING_DIR} | true
    ls -l  ${TRAINING_DIR}
fi

copies the files to the training directory.

phase_I_generate_image 8

generates box/tiff pairs from the training text and fonts specified. Please
note that if files with the same names were copied from MY_BOXTIFF_DIR, they
will get overwritten.

phase_UP_generate_unicharset

generates the unicharset from all box files in the training directory (those
meeting the file naming convention lang.xxx.exp0.box).

phase_E_extract_features " --psm 6 lstm.train " 8 "lstmf"

creates lstmf files from all the box/tiff pairs.

make__lstmdata

creates the list of lstmf files and moves all required files from the tmp
directory to the output directory.


On Thu, Jun 20, 2019 at 10:55 AM hrishikesh kaulwar 
wrote:

>
> Hey shree could you tell me what line in tesstrain.sh takes care of user
> provided tiff box pairs. Like what is the line which creates lstmf files
> from those pairs and then puts the name of lstmf files in training_list.
> Thanks in advance.
> On Tuesday, June 18, 2019 at 2:54:09 PM UTC+5:30, hrishikesh kaulwar wrote:
>>
>> Greetings,
>> I just got to know that tesstrain.sh is modified to support user
>> provided box/tiff pairs by adding a tiff/box directory flag. I used that
>> version of tesseract source to use my own tiff/box pairs. But when I ran
>> tesstrain.sh I got to know that it just copies tiff/box pairs provided by
>> me to training directory but .lstmf file is generated from
>> eng.training_text file. My tiff/box pairs are not getting used in creating
>> training data. Can someone point out what mistake I am making? or some way
>> to only use user provided tiff/box pairs to create training data?
>>  Thanks in advance.
>>


-- 


भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com



Re: [tesseract-ocr] Re: Custom Tiff/Box pairs support in tesstrain.sh

2019-06-20 Thread Shree Devi Kumar
See tesstrain_utils.sh

On Thu, 20 Jun 2019, 10:55 hrishikesh kaulwar,  wrote:

>
> Hey shree could you tell me what line in tesstrain.sh takes care of user
> provided tiff box pairs. Like what is the line which creates lstmf files
> from those pairs and then puts the name of lstmf files in training_list.
> Thanks in advance.
> On Tuesday, June 18, 2019 at 2:54:09 PM UTC+5:30, hrishikesh kaulwar wrote:
>>
>> Greetings,
>> I just got to know that tesstrain.sh is modified to support user
>> provided box/tiff pairs by adding a tiff/box directory flag. I used that
>> version of tesseract source to use my own tiff/box pairs. But when I ran
>> tesstrain.sh I got to know that it just copies tiff/box pairs provided by
>> me to training directory but .lstmf file is generated from
>> eng.training_text file. My tiff/box pairs are not getting used in creating
>> training data. Can someone point out what mistake I am making? or some way
>> to only use user provided tiff/box pairs to create training data?
>>  Thanks in advance.
>>
