[tesseract-ocr] In lstmtraining, have you ever seen an error like "Logistic outputs not implemented yet!"?

2018-03-27 Thread notoriousterran
Hi all,

In lstmtraining, have you ever seen an error like "Logistic outputs not 
implemented yet!"?





https://github.com/tesseract-ocr/tesseract/blob/master/lstm/lstmtrainer.cpp

// Performs forward-backward on the given trainingdata.
// Returns a Trainability enum to indicate the suitability of the sample.
Trainability LSTMTrainer::TrainOnLine(const ImageData* trainingdata,
                                      bool batch) {
  NetworkIO fwd_outputs, targets;
  Trainability trainable =
      PrepareForBackward(trainingdata, &fwd_outputs, &targets);
  ++sample_iteration_;
  if (trainable == UNENCODABLE || trainable == NOT_BOXED) {
    return trainable;  // Sample was unusable.
  }
  bool debug = debug_interval_ > 0 &&
               training_iteration() % debug_interval_ == 0;
  // Run backprop on the output.
  NetworkIO bp_deltas;
  if (network_->IsTraining() &&
      (trainable != PERFECT ||
       training_iteration() >
           last_perfect_training_iteration_ + perfect_delay_)) {
    network_->Backward(debug, targets, &scratch_space_, &bp_deltas);
    network_->Update(learning_rate_, batch ? -1.0f : momentum_, adam_beta_,
                     training_iteration_ + 1);
  }
#ifndef GRAPHICS_DISABLED
  if (debug_interval_ == 1 && debug_win_ != nullptr) {
    delete debug_win_->AwaitEvent(SVET_CLICK);
  }
#endif  // GRAPHICS_DISABLED
  // Roll the memory of past means.
  RollErrorBuffers();
  return trainable;
}
// Prepares the ground truth, runs forward, and prepares the targets.
// Returns a Trainability enum to indicate the suitability of the sample.
Trainability LSTMTrainer::PrepareForBackward(const ImageData* trainingdata,
                                             NetworkIO* fwd_outputs,
                                             NetworkIO* targets) {
  if (trainingdata == nullptr) {
    tprintf("Null trainingdata.\n");
    return UNENCODABLE;
  }
  // Ensure repeatability of random elements even across checkpoints.
  bool debug = debug_interval_ > 0 &&
               training_iteration() % debug_interval_ == 0;
  GenericVector<int> truth_labels;
  if (!EncodeString(trainingdata->transcription(), &truth_labels)) {
    tprintf("Can't encode transcription: '%s' in language '%s'\n",
            trainingdata->transcription().string(),
            trainingdata->language().string());
    return UNENCODABLE;
  }
  bool upside_down = false;
  if (randomly_rotate_) {
    // This ensures consistent training results.
    SetRandomSeed();
    upside_down = randomizer_.SignedRand(1.0) > 0.0;
    if (upside_down) {
      // Modify the truth labels to match the rotation:
      // Apart from space and null, increment the label. This changes the
      // script-id to the same script-id but upside-down.
      // The labels need to be reversed in order, as the first is now the last.
      for (int c = 0; c < truth_labels.size(); ++c) {
        if (truth_labels[c] != UNICHAR_SPACE && truth_labels[c] != null_char_)
          ++truth_labels[c];
      }
      truth_labels.reverse();
    }
  }
  int w = 0;
  while (w < truth_labels.size() &&
         (truth_labels[w] == UNICHAR_SPACE || truth_labels[w] == null_char_))
    ++w;
  if (w == truth_labels.size()) {
    tprintf("Blank transcription: %s\n",
            trainingdata->transcription().string());
    return UNENCODABLE;
  }
  float image_scale;
  NetworkIO inputs;
  bool invert = trainingdata->boxes().empty();
  if (!RecognizeLine(*trainingdata, invert, debug, invert, upside_down,
                     &image_scale, &inputs, fwd_outputs)) {
    tprintf("Image not trainable\n");
    return UNENCODABLE;
  }
  targets->Resize(*fwd_outputs, network_->NumOutputs());
  LossType loss_type = OutputLossType();
  if (loss_type == LT_SOFTMAX) {
    if (!ComputeTextTargets(*fwd_outputs, truth_labels, targets)) {
      tprintf("Compute simple targets failed!\n");
      return UNENCODABLE;
    }
  } else if (loss_type == LT_CTC) {
    if (!ComputeCTCTargets(truth_labels, fwd_outputs, targets)) {
      tprintf("Compute CTC targets failed!\n");
      return UNENCODABLE;
    }
  } else {
    tprintf("Logistic outputs not implemented yet!\n");
    return UNENCODABLE;
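
In the code quoted above, the message is printed by the final else branch, that is, whenever OutputLossType() returns something other than LT_SOFTMAX or LT_CTC. The following is only a rough sketch of that dispatch, assuming the loss type follows from the type letter of the output layer in the network spec (for example the "c" in "O1c111": c for CTC, s for softmax, l for logistic). LossKind, LossKindFromOutputSpec and HasTrainingTargets are hypothetical names made up for illustration; they are not part of Tesseract.

```cpp
#include <string>

// Hypothetical stand-in for the loss types tested in PrepareForBackward.
enum class LossKind { kNone, kCtc, kSoftmax, kLogistic };

// Assumption: an output layer spec looks like O<dims><type><classes>,
// e.g. "O1c111", and the third character selects the output type.
LossKind LossKindFromOutputSpec(const std::string& output_spec) {
  if (output_spec.size() < 3 || output_spec[0] != 'O') return LossKind::kNone;
  switch (output_spec[2]) {
    case 'c': return LossKind::kCtc;       // softmax trained with CTC
    case 's': return LossKind::kSoftmax;   // plain softmax
    case 'l': return LossKind::kLogistic;  // logistic outputs
    default:  return LossKind::kNone;
  }
}

// True only for the two loss kinds that the quoted code computes targets for.
// Everything else ends up in the "not implemented yet" branch.
bool HasTrainingTargets(LossKind kind) {
  return kind == LossKind::kCtc || kind == LossKind::kSoftmax;
}
```

If this assumption holds, HasTrainingTargets(LossKindFromOutputSpec("O1l111")) is false, so it may be worth checking whether the output layer of the net spec used for training ends in an "l" type rather than "c".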

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/b729b585-6720-472f-8acf-fe97c55f5971%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [tesseract-ocr] Any suggestions for more accurate Text conversion?

2018-03-27 Thread Bhargav Kanakiya
I tried using version 4.0 by building it from source.

However, I get the following messages and, not surprisingly, the output is 
totally bizarre.

Failed to load any lstm-specific dictionaries for lang eng-numCAPS!!
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 233

Output: *4JTX9T*

I understand that the DPI message has been there since older versions (I had 
it in 3.05 as well). Is the 'lstm-specific' message probably coming from the 
traineddata file? Is the only other option to train or fine-tune on my own set?

On Tuesday, March 27, 2018 at 1:37:36 PM UTC-7, shree wrote:
>
> Version mismatch. That traineddata is for 4.0.
>
> Wiki has pages for training. Look for one appropriate for your version of 
> tesseract.
>
> On Wed 28 Mar, 2018, 1:23 AM ,  wrote:
>
>> Hi Shree,
>>
>> I just tried using the training data file you provided but it seems that 
>> there is some problem with Tesseract recognizing this file. I should have 
>> mentioned before that I am using version '3.05.01'.
>>
>> Below is the sequence of commands I ran:
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ tesseract topcrop1.jpg out -l 
>> end-numCAPS
>>
>> Error opening data file 
>> /usr/local/Cellar/tesseract/3.05.01/share/tessdata/end-numCAPS.traineddata
>>
>> Please make sure the TESSDATA_PREFIX environment variable is set to the 
>> parent directory of your "tessdata" directory.
>>
>> Failed loading language 'end-numCAPS'
>>
>> Tesseract couldn't load any languages!
>>
>> Could not initialize tesseract.
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ ls 
>> /usr/local/Cellar/tesseract/3.05.01/share/tessdata/
>>
>> configs eng.traineddata pdf.ttf
>>
>> eng-numCAPS.traineddata osd.traineddata tessconfigs
>>
>> Bhargavs-MacBook-Pro-2:LPR bhargav$ echo $TESSDATA_PREFIX
>>
>> /usr/local/share/tessdata
>>
>> Please let me know if I have done something wrong, or if the traineddata file 
>> has a version mismatch or is corrupted.
>>
>> Thanks,
>> Bhargav
>>
>> On Tuesday, March 27, 2018 at 11:24:36 AM UTC-7, bha...@automot.us wrote:
>>>
>>> Thank you Shree. I will give it a shot with the attached train data!
>>>
>>> About fine-tuning, are there any example tutorials on the Tesseract 
>>> wiki? I am not sure. I will try to find one, but if you know of one and can 
>>> post the link, I would really appreciate that!
>>>
>>> Thanks. 
>>>
>>> On Tuesday, March 27, 2018 at 3:00:06 AM UTC-7, shree wrote:

>>>> You can try finetune training.
>>>>
>>>> Test with attached traineddata file.




Re: [tesseract-ocr] Any suggestions for more accurate Text conversion?

2018-03-27 Thread ShreeDevi Kumar
Version mismatch. That traineddata is for 4.0.

Wiki has pages for training. Look for one appropriate for your version of
tesseract.

On Wed 28 Mar, 2018, 1:23 AM ,  wrote:

> Hi Shree,
>
> I just tried using the training data file you provided but it seems that
> there is some problem with Tesseract recognizing this file. I should have
> mentioned before that I am using version '3.05.01'.
>
> Below is the sequence of commands I ran:
>
> Bhargavs-MacBook-Pro-2:LPR bhargav$ tesseract topcrop1.jpg out -l
> end-numCAPS
>
> Error opening data file
> /usr/local/Cellar/tesseract/3.05.01/share/tessdata/end-numCAPS.traineddata
>
> Please make sure the TESSDATA_PREFIX environment variable is set to the
> parent directory of your "tessdata" directory.
>
> Failed loading language 'end-numCAPS'
>
> Tesseract couldn't load any languages!
>
> Could not initialize tesseract.
>
> Bhargavs-MacBook-Pro-2:LPR bhargav$ ls
> /usr/local/Cellar/tesseract/3.05.01/share/tessdata/
>
> configs eng.traineddata pdf.ttf
>
> eng-numCAPS.traineddata osd.traineddata tessconfigs
>
> Bhargavs-MacBook-Pro-2:LPR bhargav$ echo $TESSDATA_PREFIX
>
> /usr/local/share/tessdata
>
> Please let me know if I have done something wrong, or if the traineddata file
> has a version mismatch or is corrupted.
>
> Thanks,
> Bhargav
>
> On Tuesday, March 27, 2018 at 11:24:36 AM UTC-7, bha...@automot.us wrote:
>>
>> Thank you Shree. I will give it a shot with the attached train data!
>>
>> About fine-tuning, are there any example tutorials on the Tesseract wiki?
>> I am not sure. I will try to find one, but if you know of one and can post
>> the link, I would really appreciate that!
>>
>> Thanks.
>>
>> On Tuesday, March 27, 2018 at 3:00:06 AM UTC-7, shree wrote:
>>>
>>> You can try finetune training.
>>>
>>> Test with attached traineddata file.
>>>



Re: [tesseract-ocr] Any suggestions for more accurate Text conversion?

2018-03-27 Thread bhargav
Hi Shree,

I just tried using the training data file you provided but it seems that 
there is some problem with Tesseract recognizing this file. I should have 
mentioned before that I am using version '3.05.01'.

Below is the sequence of commands I ran:

Bhargavs-MacBook-Pro-2:LPR bhargav$ tesseract topcrop1.jpg out -l 
end-numCAPS

Error opening data file 
/usr/local/Cellar/tesseract/3.05.01/share/tessdata/end-numCAPS.traineddata

Please make sure the TESSDATA_PREFIX environment variable is set to the 
parent directory of your "tessdata" directory.

Failed loading language 'end-numCAPS'

Tesseract couldn't load any languages!

Could not initialize tesseract.

Bhargavs-MacBook-Pro-2:LPR bhargav$ ls 
/usr/local/Cellar/tesseract/3.05.01/share/tessdata/

configs eng.traineddata pdf.ttf

eng-numCAPS.traineddata osd.traineddata tessconfigs

Bhargavs-MacBook-Pro-2:LPR bhargav$ echo $TESSDATA_PREFIX

/usr/local/share/tessdata

Please let me know if I have done something wrong, or if the traineddata file 
has a version mismatch or is corrupted.

Thanks,
Bhargav
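
One detail visible in the transcript above: the command passes -l end-numCAPS, while the directory listing shows eng-numCAPS.traineddata. The sketch below only illustrates the lookup as the error message suggests it works, assuming the file name is the language name plus ".traineddata" inside the tessdata directory. TrainedDataPath and LanguageInstalled are hypothetical helpers written for this note, not Tesseract functions.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the path shown in the "Error opening data file" message:
// <tessdata_dir>/<lang>.traineddata (assumed naming rule).
std::string TrainedDataPath(const std::string& tessdata_dir,
                            const std::string& lang) {
  std::string path = tessdata_dir;
  if (!path.empty() && path.back() != '/') path += '/';
  return path + lang + ".traineddata";
}

// Returns true if "<lang>.traineddata" appears in a directory listing,
// such as the output of ls copied from the message.
bool LanguageInstalled(const std::vector<std::string>& listing,
                       const std::string& lang) {
  return std::find(listing.begin(), listing.end(), lang + ".traineddata") !=
         listing.end();
}
```

With the listing from the message, LanguageInstalled(listing, "end-numCAPS") is false and LanguageInstalled(listing, "eng-numCAPS") is true, so the spelling of the -l argument is worth ruling out separately from any version mismatch.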

On Tuesday, March 27, 2018 at 11:24:36 AM UTC-7, bha...@automot.us wrote:
>
> Thank you Shree. I will give it a shot with the attached train data!
>
> About fine-tuning, are there any example tutorials on the Tesseract wiki? 
> I am not sure. I will try to find one, but if you know of one and can post 
> the link, I would really appreciate that!
>
> Thanks. 
>
> On Tuesday, March 27, 2018 at 3:00:06 AM UTC-7, shree wrote:
>>
>> You can try finetune training.
>>
>> Test with attached traineddata file.
>>
>



Re: [tesseract-ocr] Any suggestions for more accurate Text conversion?

2018-03-27 Thread bhargav
Thank you Shree. I will give it a shot with the attached train data!

About fine-tuning, are there any example tutorials on the Tesseract wiki? I 
am not sure. I will try to find one, but if you know of one and can post the 
link, I would really appreciate that!

Thanks. 

On Tuesday, March 27, 2018 at 3:00:06 AM UTC-7, shree wrote:
>
> You can try finetune training.
>
> Test with attached traineddata file.
>



Re: [tesseract-ocr] Unable to use tesseract api installed with a nuget pkg

2018-03-27 Thread ShreeDevi Kumar
I don't use Visual Studio. However, I know that we support Visual Studio builds
via cppan and cmake. Please follow those directions.

On Tue 27 Mar, 2018, 9:24 PM sonu sainju,  wrote:

> Hey Shree, Thanks for replying. No I didn't build using cppan and cmake. I
> used vcpkg install command. Isn't vcpkg supposed to acquire and install
> everything?
>
> On Sunday, March 25, 2018 at 5:46:06 PM UTC-7, shree wrote:
>>
>> Did you build using cppan and cmake?
>>
>> On Mon 26 Mar, 2018, 1:50 AM sonu sainju,  wrote:
>>
>>> Hi,
>>>
>>> I followed the instructions at
>>> https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows to
>>> build Tesseract and use it in a VS2015 project. After installing Tesseract
>>> via vcpkg, I exported it as a nuget pkg and added it to my project like any
>>> other nuget pkg, but I am getting link errors like:
>>> Severity Code Description Project File Line Suppression State
>>> Error LNK2001 unresolved external symbol closesocket Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>>> Error LNK2001 unresolved external symbol connect Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>>> Error LNK2001 unresolved external symbol htons Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>>> ...
>>>
>>> Is there something I have missed? Has anybody tried using tesseract api
>>> this way in vs 2015?
>>>



Re: [tesseract-ocr] Unable to use tesseract api installed with a nuget pkg

2018-03-27 Thread sonu sainju
Hey Shree, Thanks for replying. No I didn't build using cppan and cmake. I 
used vcpkg install command. Isn't vcpkg supposed to acquire and install 
everything?
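
A note on the quoted link errors: closesocket, connect and htons are Winsock functions, which on Windows come from the system library Ws2_32.lib. One possible cause, assuming the vcpkg-built tesseract305.lib is a static library that does not bring that system library in by itself, is that the consuming project has to link Ws2_32.lib. The snippet below is a minimal sketch of a translation unit that references htons and asks MSVC to link the library; PortToNetworkOrder and FirstWireByte are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <cstring>

#ifdef _WIN32
#include <winsock2.h>
#ifdef _MSC_VER
// MSVC-specific: request Ws2_32.lib at link time from source code.
#pragma comment(lib, "Ws2_32.lib")
#endif
#else
#include <arpa/inet.h>  // htons on POSIX systems
#endif

// Converts a port number to network byte order. On Windows this call is
// what creates a link-time dependency on Ws2_32.lib.
std::uint16_t PortToNetworkOrder(std::uint16_t port) { return htons(port); }

// Returns the first byte in memory of the network-order value, which is
// the high byte of the port because network order is big-endian.
unsigned char FirstWireByte(std::uint16_t port) {
  const std::uint16_t network = PortToNetworkOrder(port);
  unsigned char bytes[2];
  std::memcpy(bytes, &network, sizeof network);
  return bytes[0];
}
```

The same effect can be had without the pragma by adding Ws2_32.lib to the project's additional linker dependencies in Visual Studio.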

On Sunday, March 25, 2018 at 5:46:06 PM UTC-7, shree wrote:
>
> Did you build using cppan and cmake?
>
> On Mon 26 Mar, 2018, 1:50 AM sonu sainju,  wrote:
>
>> Hi,
>>
>> I followed the instructions at
>> https://github.com/tesseract-ocr/tesseract/wiki/Compiling#windows to 
>> build Tesseract and use it in a VS2015 project. After installing Tesseract 
>> via vcpkg, I exported it as a nuget pkg and added it to my project like any 
>> other nuget pkg, but I am getting link errors like:
>> Severity Code Description Project File Line Suppression State
>> Error LNK2001 unresolved external symbol closesocket Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>> Error LNK2001 unresolved external symbol connect Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>> Error LNK2001 unresolved external symbol htons Project1 c:\Users\sonu's\documents\visual studio 2015\Projects\Project1\Project1\tesseract305.lib(svutil.cpp.obj) 1
>> ...
>>
>> Is there something I have missed? Has anybody tried using tesseract api 
>> this way in vs 2015?
>>



[tesseract-ocr] I want to train traineddata from scratch. I have to install ScrollView.jar

2018-03-27 Thread notoriousterran



Hi, I have a question.

I want to train a traineddata file from scratch, and I have to install ScrollView.jar.

Thank you.

I encountered this error in lstmtraining.

ㅜㅜㅜ



[tesseract-ocr] Hi, could you tell me how to install ScrollView.jar?

2018-03-27 Thread notoriousterran
Hi, could you tell me how to install ScrollView.jar?

I looked at the Tesseract GitHub page, but I couldn't make sense of it.

I installed Tesseract 4.00 beta.1 and cloned the git repositories (1) langdata 
and (2) tesseract, on Ubuntu 16.04.3 LTS.

Thank you 



[tesseract-ocr] Any suggestions for more accurate Text conversion?

2018-03-27 Thread bhargav
Hello,

I am working on a project where I extract license plates and try to 
get the plate number automatically.

After applying some computer vision and image processing, I have come up 
with the following result.



The OCR output generated with Tesseract is: 6JZX97L

whereas the actual plate is 6JZX974.

I am very new to Tesseract, and it seemed like a very easy-to-use 
library for my task. However, I do not have any idea how to tackle a 
scenario like this. If anyone has worked on solving such a 
problem, please share your thoughts.

Some other error-prone digit/letter pairs: 0-O, 1-I, 2-Z, 5-S, 8-B...

Thanks!
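
One common way to handle the confusions listed above is a post-processing step that uses the known plate layout. The sketch below is only an illustration under stated assumptions: the plate follows a fixed pattern of digit and letter positions ("DLLLDDD" is assumed here because it matches 6JZX974), and the confusion pairs are the ones from this message plus L read for 4 from the example. CorrectPlate and the pattern notation are hypothetical and have nothing to do with the Tesseract API.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Fixes look-alike characters using a layout pattern in which 'D' marks a
// digit position and 'L' a letter position (assumed notation).
std::string CorrectPlate(const std::string& ocr_text,
                         const std::string& pattern) {
  // Letters that are likely misread digits (pairs taken from the message).
  static const std::unordered_map<char, char> kLetterToDigit = {
      {'O', '0'}, {'I', '1'}, {'Z', '2'}, {'S', '5'}, {'B', '8'}, {'L', '4'}};
  // Digits that are likely misread letters.
  static const std::unordered_map<char, char> kDigitToLetter = {
      {'0', 'O'}, {'1', 'I'}, {'2', 'Z'}, {'5', 'S'}, {'8', 'B'}};

  // Without a matching layout there is nothing safe to correct.
  if (ocr_text.size() != pattern.size()) return ocr_text;

  std::string fixed = ocr_text;
  for (std::size_t i = 0; i < fixed.size(); ++i) {
    const bool is_digit = std::isdigit(static_cast<unsigned char>(fixed[i])) != 0;
    if (pattern[i] == 'D' && !is_digit) {
      const auto it = kLetterToDigit.find(fixed[i]);
      if (it != kLetterToDigit.end()) fixed[i] = it->second;
    } else if (pattern[i] == 'L' && is_digit) {
      const auto it = kDigitToLetter.find(fixed[i]);
      if (it != kDigitToLetter.end()) fixed[i] = it->second;
    }
  }
  return fixed;
}
```

Under these assumptions CorrectPlate("6JZX97L", "DLLLDDD") yields "6JZX974". The approach only helps when the plate layout is known in advance; it cannot resolve a confusion between two digits or two letters.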
