Correction: the fast version is ocrb_int (not ocrb-int).
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to
see https://github.com/Shreeshrii/tessdata_ocrb
Retrained to add missing X using 3 fonts at 3 exposures and a larger
training text compared to previous version.
Both float/best and integer/fast versions are provided.
- Download best version
If you can provide another 40-50 lines of training data (text file) I will
rerun the training
On Mon, 8 Apr 2019, 22:11 Jankees Korstanje, wrote:
Hi Shree,
We have tried your traineddata file for MRZ and noticed that it does not
detect the character X.
Looking at
https://github.com/Shreeshrii/tessdata_ocrb/blob/master/eng.MRZ.training_text
We see that there are no X in there.
In addition it might be good to add a couple of lines that
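A quick way to spot a gap like the missing X is to compare the training text against the MRZ character set. The sketch below assumes that set is A-Z, 0-9 and the filler character `<`; the function name `missing_mrz_chars` and the sample text are hypothetical and not taken from the repository.

```python
import string

# Assumed MRZ alphabet: uppercase letters, digits and the '<' filler.
MRZ_ALPHABET = set(string.ascii_uppercase + string.digits + "<")


def missing_mrz_chars(training_text):
    """Return the MRZ characters that never occur in training_text, sorted."""
    return sorted(MRZ_ALPHABET - set(training_text))


if __name__ == "__main__":
    # Hypothetical sample lines, not the contents of eng.MRZ.training_text.
    sample = "P<UTOERIKSSON<<ANNA<MARIA<<<<<<\nL898902C36UTO7408122F1204159"
    print("missing:", "".join(missing_mrz_chars(sample)))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to `missing_mrz_chars`.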
Thanks everyone.
With the suggestions here and by following this link,
"https://www.youtube.com/watch?v=WZLJucXZy-g", I was able to run a demo
training for a font.
I used Shreeshrii's GitHub repo, "https://github.com/Shreeshrii/tessdata_ocrb".
I need some help with the points below: If there any documentation
You should uninstall (purge) v3 first. Then build the v4 from scratch.
On Tue, Oct 16, 2018 at 12:23 PM Vinod Gattani
wrote:
> Robert/ Zdenko
>
> Yes, in the log I see version "3.4v".
>
> To install v4, I used the link "https://github.com/tesseract-ocr/tesseract".
> I thought it has tesseract
You evidently forgot to uninstall tesseract 3.04.
You cannot have two installations of tesseract unless you know your system
and how to handle that kind of situation.
Whatever you do, you should understand what you are doing.
Zdenko
Tue, 16 Oct 2018 at 8:53, Vinod Gattani
Robert/ Zdenko
Yes, in the log I see version "3.4v".
To install v4, I used the link "https://github.com/tesseract-ocr/tesseract".
I thought it has tesseract v4, as the Readme file says "Source code for the
new LSTM based 4.0 version is available from the master branch on GitHub."
So, I did a git
Robert is pointing you in the right direction. Did you read the log you
posted here?
"Tesseract Open Source OCR Engine v3.04.01 with Leptonica"
You are mixing tesseract versions, so the problems are no surprise.
Zdenko
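One way to catch this kind of version mix-up early is to pull the major version out of the banner line before starting a training run. This is only a sketch: the helper name `tesseract_major_version` is hypothetical, and the regular expression assumes the banner contains a dotted version such as the `v3.04.01` quoted above.

```python
import re

# Matches a dotted version such as "3.04.01" or "4.0.0", with an optional "v".
_VERSION_RE = re.compile(r"\bv?(\d+)\.(\d+)(?:\.(\d+))?")


def tesseract_major_version(banner):
    """Return the major version found in a banner line, or None if absent."""
    match = _VERSION_RE.search(banner)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Banner line quoted from the log discussed in this thread.
    log_line = "Tesseract Open Source OCR Engine v3.04.01 with Leptonica"
    major = tesseract_major_version(log_line)
    if major is not None and major < 4:
        print("LSTM training needs 4.x; this log comes from version", major)
```

The same function can be fed the first line printed by `tesseract --version` to confirm which installation is actually on the PATH.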
Tue, 16 Oct 2018 at 8:26, Vinod Gattani
wrote:
Hi,
Typo: "Why the version is not 4.0?"
I installed using "git pull https://github.com/tesseract-ocr/tesseract".
And then followed the instructions on training page.
Regards
On Tue, Oct 16, 2018 at 11:53 AM Robert Kamiński <
kaminski.robert...@gmail.com> wrote:
Hi,
"Why the version is 4.0." What do you mean by that? The log states that
it is v3.04: "Tesseract Open Source OCR Engine v3.04.01 with Leptonica".
The problem might be that version 4 uses lstm files, whereas you have
version 3.04, which uses box files instead. Try to check the version
Hi All,
I have started a project to do OCR on Identity Cards. I am learning to
train tesseract models with custom fonts.
Please help me on this.
Steps till now:
1. git pull https://github.com/tesseract-ocr/tesseract
2. Then I followed instructions on training package till command "sudo make
Thank you Shreeshrii for reply!
Manual customization of these files might be rather annoying. If I need
to experiment with the dawg files and achieve something, I'll surely let
you know whether there is any difference. Again, thank you for your
help and time :)
> When it's combining language model I've spotted that it's making some
dawg files.
Yes, it takes the files from langdata repo specified in the training
command.
You could change langdata/pol/pol.wordlist to have only the LAST NAMES and
GIVEN NAMES, pol.punc to have only < and change number
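As a rough illustration of trimming a wordlist down to names only, the sketch below builds a one-word-per-line list from a collection of names. The helpers `mrz_wordlist` and `write_wordlist` are hypothetical, and it assumes that only entries made purely of A-Z are wanted; names with hyphens or non-ASCII letters are simply dropped here rather than transliterated.

```python
import re

# Only plain uppercase ASCII words are kept (an assumption for MRZ-style text).
_LETTERS_ONLY = re.compile(r"[A-Z]+")


def mrz_wordlist(names):
    """Uppercase the names, drop duplicates and non A-Z entries, then sort."""
    upper = {name.strip().upper() for name in names}
    return sorted(word for word in upper if _LETTERS_ONLY.fullmatch(word))


def write_wordlist(names, path):
    """Write the filtered names to path, one word per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for word in mrz_wordlist(names):
            handle.write(word + "\n")
```

Calling `write_wordlist(names, "pol.wordlist")` on a copy of the langdata directory would produce the replacement file; the file name is taken from the message above, the rest is an assumption.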
Thank you for your reply Shreeshrii!
Indeed, the finetune method is a much better solution for my problem. Thanks
to the logs and data provided in your repo, I realized that I don't need to
generate every single MRZ code separately (I'm sure it was mentioned
somewhere). In fact the process of making
See https://github.com/Shreeshrii/tessdata_ocrb
for the files and traineddata.
On Wed, Sep 5, 2018 at 8:51 PM, Shree Devi Kumar
wrote:
I think finetune will be a better option than training from scratch.
Using a small training/test text - 40 lines, I get
-
+ lstmeval --verbosity 0 --model /home/ubuntu/tessdata_best/script/Latin.traineddata --eval_listfile
Hi,
(I might butcher English grammar - you have been warned!)
For some time I've been trying to teach tesseract to read MRZ
codes. Unfortunately it's not going very well. I'm using the latest version
of tesseract (4.0), so I'm trying to train it by the lstm method. I've managed
to pull it off and got