Re: [tesseract-ocr] Re: train tesseract OCR 4.0

2017-04-04 Thread ShreeDevi Kumar
Read https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Finetune

[tesseract-ocr] Re: train tesseract OCR 4.0

2017-04-04 Thread srnsp92
Can you please post some of your experiences in this thread, as there are no posts on training Tesseract 4? 1) Also, is there any way to add a newly trained data file to the old trained data file without replacing the old file? 2) If we don't know what font we may get in our images, then how should we

Re: [tesseract-ocr] train tesseract OCR 4.0

2017-04-04 Thread Saurabh Srivastav
Thank you, Shree, you always help me. But I still have one problem: I wrote a bash script which finds all images with the .jpg extension and names each output file after its image. I want the script to also pick up images with other extensions when I run it, like .jpg, .jpeg
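A minimal sketch of one way to handle several extensions in such a script. The original script is not shown in the thread, so the function names `find_images` and `ocr_dir` and the extension list are assumptions made for illustration. It relies on the `-iname` and `-maxdepth` options of GNU/BSD find, and on tesseract appending `.txt` to the output base name.

```shell
#!/usr/bin/env bash

# Print every regular file in directory $1 whose extension is in the list
# below, matched case-insensitively. The list is an assumption; edit it.
find_images() {
    find "$1" -maxdepth 1 -type f \
        \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \
           -o -iname '*.tif' -o -iname '*.tiff' \) | sort
}

# Run tesseract on each image found in directory $1. The output base is the
# image path without its extension, so photo.jpeg produces photo.txt.
ocr_dir() {
    local img
    find_images "$1" | while IFS= read -r img; do
        # Redirect stdin so tesseract cannot consume the list of file names.
        tesseract "$img" "${img%.*}" < /dev/null
    done
}
```

Calling `ocr_dir /path/to/images` would then process every matching file in that directory. Two images that differ only in extension (for example a.jpg and a.png) would write to the same output file, so a real script may want to keep the extension in the output name.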

[tesseract-ocr] Re: train tesseract OCR 4.0

2017-04-04 Thread Saurabh Srivastav
Yes, I trained my Tesseract for the eng font and made it read the characters from the image.

Re: [tesseract-ocr] train tesseract OCR 4.0

2017-04-04 Thread ShreeDevi Kumar
tesstrain.sh generates a file called eng.training_files.txt. You are using the command without the .txt extension. Check the name of the generated file and use that. I have found that editing that file also gives errors.
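A small sketch of how the generated file name could be checked before passing it to the training command. The helper name `list_training_files` and the directory argument are hypothetical; the only detail taken from the message is the `eng.training_files.txt` file name.

```shell
#!/usr/bin/env bash

# Print the training file list(s) for language $2 (default: eng) found
# directly inside directory $1. Prints nothing if no such file exists.
list_training_files() {
    local dir="$1" lang="${2:-eng}"
    find "$dir" -maxdepth 1 -type f -name "${lang}.training_files*"
}
```

Running `list_training_files /path/to/training/output` prints the exact path, extension included, which can then be copied into the command unchanged.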

Re: [tesseract-ocr] train tesseract OCR 4.0

2017-04-04 Thread srnsp92
I am trying to train Tesseract 4, and I am getting the following error. Command used: mkdir -p /home/p/Documents/T/engoutput /home/p/Documents/T/tesseract-master/training/lstmtraining -U /home/p/Documents/T/img_frm_3/unicharset \ --script_dir /home/p/Documents/T/TESS_4_ALPHA/langdata-master

Re: [tesseract-ocr] train tesseract OCR 4.0

2017-04-04 Thread ShreeDevi Kumar
See https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain.sh https://github.com/tesseract-ocr/tesseract/blob/master/training/tesstrain_utils.sh https://github.com/tesseract-ocr/tesseract/blob/master/training/language-specific.sh

Re: [tesseract-ocr] train tesseract OCR 4.0

2017-04-04 Thread srnsp92
Hello ShreeDevi, https://medium.com/apegroup-texts/training-tesseract-for-labels-receipts-and-such-690f452e8f79 This link is a full-fledged tutorial on using and training Tesseract version 3.0. Can you please clarify the points below...?

[tesseract-ocr] Default tesseract OCR 3.05 on some "fuzzy" text

2017-04-04 Thread Javier
Hi, I have been trying to read the text in the attached images but I haven't been successful yet. I am using Tesseract 3.05 and I have tested several parameters but I haven't been able to read it well. Could you please tell me if my images are too bad for Tesseract? If not, what

[tesseract-ocr] Whitelisting apostrophes problem

2017-04-04 Thread Chris H
I am having trouble whitelisting and OCRing apostrophes (English single right quotes). Given something like the attached image, without specifying a whitelist, apostrophes are output: $ tesseract --user-words ./.user.words /tmp/test-ocr.png stdout Doctor‘s Mask But due to noise (not