[tesseract-ocr] Error: traineddata file must contain at least (a unicharset fileand inttemp) OR an lstm file.

2023-10-22 Thread Hammurabi
I have both of these files. I don't understand. They are both prefixed with .eng in my tessdata directory. I am so close... -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from

[tesseract-ocr] Error using tesstrain with START_MODEL - failed to continue

2023-10-19 Thread Keith Smith
Hi, Could someone help me understand why I am getting the following error when using tesstrain with the START_MODEL option? Failed to continue from: data/micr_ref/micr.lstm From my local tesstrain repo (cloned from https://github.com/tesseract-ocr/tesstrain), I have the following in

[tesseract-ocr] Error while training tesseract

2023-06-27 Thread Daniel Azubuine
I get this error when training tesseract unicharset_extractor --output_unicharset "data/ibo/unicharset" --norm_mode 2 "data/ibo/all-gt" Bad box coordinates in boxfile string! A na-atụ ilu sị na "Nkụ dị na mba na-eghere mba nri." Extracting unicharset from plain text file data/ibo/all-gt Other case

[tesseract-ocr] Error in training

2022-12-15 Thread Soumen Halder
*Please help me* TESSDATA_PREFIX=../tesseract/tessdata make training MODEL_NAME=foo START_MODEL=eng TESSDATA=../tesseract/tessdata MAX_ITERATIONS=10 combine_lang_model \ --input_unicharset data/foo/unicharset \ --script_dir data/langdata \ --numbers data/foo/foo.numbers \ --puncs

Re: [tesseract-ocr] Error training on new font

2022-06-12 Thread Zdenko Podobny
Please report your problems to the author of the tutorial you follow. The official training procedure is at https://github.com/tesseract-ocr/tesstrain The official documentation can be found at https://tesseract-ocr.github.io/tessdoc/ | https://github.com/tesseract-ocr/tessdoc Zdenko On Mon, Jun 13, 2022 at

[tesseract-ocr] Error training on new font

2022-06-12 Thread Allan Gongora
I am following this tutorial but when running this command (step 5): ``` mftraining -F font_properties -U unicharset -O eng.unicharset eng.dharma-gothic-e.exp0.tr ``` I get this error: ``` Warning: No shape table file present: shapetable Reading

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-07 Thread Lucas L.
Also, I feel compelled to mention that I think I have seen this on some of my unupdated VMs running 4.1.1, also built from source, on the same document. Sorry for the spam, I wish I could edit. I think it may be tied to leptonica specifically or something else in the environment? The same

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-07 Thread Lucas L.
Sure, I will write that up. Thanks for helping, zdenop. Would you happen to know which is the most recent version that does not exhibit this issue so I can switch to that? On Tuesday, June 7, 2022 at 12:27:08 AM UTC-5 zdenop wrote: > Can you please create an issue at >

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Zdenko Podobny
Can you please create an issue at https://github.com/tesseract-ocr/tesseract/issues? I confirm a problem with recent tesseract and leptonica, so it should be fixed for the next release... Zdenko On Mon, Jun 6, 2022 at 22:47, Lucas L. wrote: > OK, I have a sample document to share now. I've

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Lucas L.
No luck sadly, when I edited the image in Irfanview to block out the sensitive parts and tried to OCR it again, the error didn't occur. I'm not sure what changed in the .tiff image file. Any ideas on what kind of image metadata can possibly cause this "selectDefaultPdfEncoding" error? Only

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Lucas L.
Oh yeah, here's the output of tesseract -v: tesseract 5.1.0 leptonica-1.79.0 libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1 Found AVX2 Found AVX Found FMA Found SSE4.1 Found OpenMP 201511 Found libarchive

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Lucas L.
It seems to be specific to the document in question. However I'm afraid I can't post the document because it has sensitive information on it. I guess I can try to scrub the info using an image editing tool and see if the error still occurs. On Monday, June 6, 2022 at 11:21:25 AM UTC-5 zdenop

Re: [tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Zdenko Podobny
Can you please share ocrIn_1.tif + info which tessdata version you use? + output of 'tesseract -v' Zdenko On Mon, Jun 6, 2022 at 17:53, Lucas L. wrote: > Hi, I'm trying to upgrade Tesseract in our Ubuntu 20.04 VMs used to OCR > documents to Tesseract 5.1 from 4.1.1, both versions were built

[tesseract-ocr] "Error in selectDefaultPdfEncoding: type selection failure" on Tesseract 5.1.0 in Ubuntu

2022-06-06 Thread Lucas L.
Hi, I'm trying to upgrade Tesseract in our Ubuntu 20.04 VMs used to OCR documents to Tesseract 5.1 from 4.1.1, both versions were built from source on that VM. 4.1.1 worked, but 5.1 throws an error that I can't seem to find anywhere else online: sudo -u userx tesseract --loglevel ALL --oem 1

Re: [tesseract-ocr] Error during training

2022-03-26 Thread Alberto Ramirez
Also, none of the software I was able to find to make training faster works with version 5

Re: [tesseract-ocr] Error during training

2022-03-26 Thread Alberto Ramirez
If somebody ever has the same issues, just download tesseract 4.0; there are finally some good tutorials and guides which do work, for example: https://www.youtube.com/watch?v=1v8BPw0Dn0I. There's no point wasting time on the 5.0 "documentation"; it's too twisted for beginners, and if

Re: [tesseract-ocr] Error during training

2022-03-26 Thread Alberto Ramirez
tesseract custom_font/train.my.exp0.tif custom_font/train.my.exp0 batch.nochop makebox tesseract custom_font/train.my.exp0.tif custom_font/train.my.exp0 --psm 6 lstm.train echo "temp 0 0 1 0 0" > custom_font/font_properties unicharset_extractor custom_font/train.my.exp0.box That's all i was

Re: [tesseract-ocr] Error during training

2022-03-26 Thread Alberto Ramirez
This documentation sucks; the steps are not even complete. I am trying to understand how to make that unicharset, and there's not even one example: https://tesseract-ocr.github.io/tessdoc/tess5/TrainingTesseract-5.html

[tesseract-ocr] Error on Build Version on Oracle Cloud

2022-03-19 Thread Moritz Weibold
Hey there, I am using Tesseract in my Quarkus Java HTTP Server with the following code. The weird thing is, that it works perfectly fine on my Windows PC in the DEV Version, but as soon as I build the app and run it on my Ubuntu Virtual Machine it suddenly stops at String result =

Re: [tesseract-ocr] Error during training

2022-03-19 Thread Zdenko Podobny
Follow the latest official training instructions. Otherwise, nobody will help you. What you show seems like faking (e.g. not really following and understanding) the training process for tesseract 3.x version. Zdenko On Sat, Mar 19, 2022 at 15:17, Alberto Ramirez wrote: > This doesn't explain why

Re: [tesseract-ocr] Error during training

2022-03-19 Thread Alberto Ramirez
This doesn't explain why that thing doesn't work; the first 4 steps work fine. I already tried to use the first link, but the guide is too chaotic. There are some steps missing, and the error messages which I got are missing as well. The only thing I understood is that the method I am trying to

Re: [tesseract-ocr] Error during training

2022-03-19 Thread Zdenko Podobny
https://tesseract-ocr.github.io/tessdoc/#training-for-tesseract-5 https://github.com/tesseract-ocr/tesstrain Zdenko On Sat, Mar 19, 2022 at 7:59, Alberto Ramirez wrote: > this is the commands i use > tesseract eng.font1.exp0.tif train.my.exp0 batch.nochop makebox > tesseract eng.font1.exp0.tif

[tesseract-ocr] Error during training

2022-03-19 Thread Alberto Ramirez
These are the commands I use: tesseract eng.font1.exp0.tif train.my.exp0 batch.nochop makebox tesseract eng.font1.exp0.tif train.my.exp0 box.train unicharset_extractor train.my.exp0.box echo "temp 0 0 1 0 0" > font_properties mftraining -F font_properties -U unicharset -O eng.unicharset

[tesseract-ocr] Error: Transaction test error: install conflicts package

2022-03-06 Thread Megidd Git
I was referred to this forum: https://github.com/tesseract-ocr/tesseract/issues/3764#issuecomment-1058948289 Can anybody help with that problem: https://github.com/tesseract-ocr/tesseract/issues/3764 Thanks :)

[tesseract-ocr] Error : Can't Run the fine-Tune Model after fine-tuning multi- fonts of khmer language

2022-02-06 Thread Manuth Vann
Hi ! Everyone . I have just tested my fine-tuning model with multi-fonts of khmer languages and I got this error message Error opening data file C:\Program Files\Tesseract-OCR\/tessdata/10fonts.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your

[tesseract-ocr] Error while training.

2021-12-18 Thread PRASHANTH V
Hi, I'm Prashanth. I started training OCR on a new font, but I'm facing an issue right at the start and have failed to resolve it. Please tell me what the problem is. Thank you.

[tesseract-ocr] Error while running unicharset_extractor.exe

2021-09-28 Thread Samruddhi Dhake
Hi, I have installed Tesseract v4.1.0 on my Windows machine. I have downloaded the installer tesseract-ocr-w64-setup-v4.1.0.20190314.exe from the website https://digi.bib.uni-mannheim.de/tesseract/ Running

[tesseract-ocr] Error while running lstmtraining.exe --stop_training

2021-09-27 Thread megha patil
Hi, I want to create my own traineddata and I am using tesseract 4.1 on windows. After running below command, I got checkpoints as output. lstmtraining --model_output="D:\Test\" --continue_from="D:\Test\Dim_test.lstmf" --train_listfile="D:\Test\eng.training_files.txt"

[tesseract-ocr] Error while running ocrmypdf in CentOS 8

2021-09-10 Thread Pankaj S Y
Hi, Below errors are coming when I run ocrmypdf in CentOS 8 -> 1 [tesseract] read_params_file: Can't open pdf 1 [tesseract] read_params_file: Can't open txt Command is -> *ocrmypdf image.pdf text.pdf* What to do ?

Re: [tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2021-04-24 Thread kamra....@gmail.com
The issue got resolved. libtiff was missing in the system so not working with tif files On Friday, April 23, 2021 at 12:18:43 AM UTC+5:30 kamra@gmail.com wrote: > I am facing the same issue. I have used following command: > /tesstrain.sh --fonts_dir /usr/share/fonts/ --lang eng

Re: [tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2021-04-22 Thread kamra....@gmail.com
I am facing the same issue. I have used following command: /tesstrain.sh --fonts_dir /usr/share/fonts/ --lang eng --linedata_only --noextract_font_properties --exposures "0" --langdata_dir /home/administrator/Downloads/tesseract-4.0.0/langdata --tessdata_dir

Re: [tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2021-04-22 Thread kamra....@gmail.com
Facing the same issue. === Starting training for language 'eng' [Fri Apr 23 00:13:06 IST 2021] /usr/bin/text2image --fonts_dir=/usr/share/fonts/ --font=FreeMono --outputbase=/tmp/font_tmp.7XXGMDw4DE/sample_text.txt --text=/tmp/font_tmp.7XXGMDw4DE/sample_text.txt

[tesseract-ocr] Error: Assert failed: in file tessdatamanager.cpp

2020-11-11 Thread Teem
when executing the command "combine_tessdata -e tesseract / tessdata / eng.traineddata eng.lstm" I get the error "tesseract :: TessdataManager :: TessdataTypeFromFileName (filename, & type): Error: Assert failed: in file tessdatamanager.cpp, line 297 Illegal instruction (core dumped) " What

[tesseract-ocr] Error while retrieving text from image

2020-11-10 Thread prasanth kotagiri
Hi, I am getting the error as like below for the code snippet. cannot convert from 'System.Drawing.Bitmap' to 'Tesseract.Pix' Code: using System.Drawing; using Tesseract; Bitmap img = (Bitmap)Bitmap.FromFile("/Users/prkotagi/Desktop/Test.bmp"); TesseractEngine engine = new

[tesseract-ocr] Error while building using cmake

2020-10-26 Thread Kirankumar Chincholi
I am getting an error while building with cmake using msys2 on Windows 10. The error is: ‘isascii’ was not declared in this scope. I have also attached a screenshot. Please help me if anyone knows the solution. Thanks & Regards Kirankumar Chincholi

Re: [tesseract-ocr] error: shared library version mismatch

2020-10-15 Thread Eric Ihli
I know this message is old but I wanted to chime in with a more specific answer so that if anyone else came across this thread they wouldn't be stuck. TLDR: Training requires tesseract and unicharset_extractor to be the same version. If you update one without updating the
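
The version requirement in the TLDR above can be checked before training starts. The following is a minimal sketch, not an official tool: the helper names and the tool list are hypothetical, and it assumes each binary on the PATH prints its version when called with `--version` (older releases may print to stderr, which the sketch also reads).

```python
import subprocess

# Hypothetical list of binaries to compare; adjust to the tools you actually use.
TOOLS = ["tesseract", "unicharset_extractor", "lstmtraining"]


def reported_version(output):
    """Return the last whitespace-separated token of the first non-empty line."""
    for line in output.splitlines():
        if line.strip():
            return line.split()[-1]
    return ""


def mismatched(versions):
    """True when the tools do not all report the same version string."""
    return len(set(versions.values())) > 1


def collect_versions(tools=TOOLS):
    """Run each tool with --version (assumed to be supported) and record the result."""
    versions = {}
    for tool in tools:
        proc = subprocess.run([tool, "--version"], capture_output=True, text=True)
        versions[tool] = reported_version(proc.stdout or proc.stderr)
    return versions
```

Calling `mismatched(collect_versions())` returns `True` when, for example, `tesseract` reports `4.1.0-rc2-34-gb2fc3` while a training tool still reports `4.1.0-rc1-170-gb6bf`, which is the situation the mismatch error describes. A tool that is not installed raises `FileNotFoundError`.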

[tesseract-ocr] Error during install Tesseract training tools in MacOS Catalina

2020-08-04 Thread minh...@gmail.com
Dear friend, I need help installing tesseract and the training tools on Mac OS 10.15.6 Catalina, following the guide in: https://github.com/tesseract-ocr/tesseract/issues/1453 The tesseract version: tesseract 5.0.0-alpha-773-gd33ed leptonica-1.80.0 libgif 5.2.1 : libjpeg 9d : libpng

[tesseract-ocr] Error jpn_vert is not a valid language code

2020-05-21 Thread Dave
Trying to run *tesstrain.sh* for jpn_vert and I'm getting ERROR: Error: jpn_vert is not a valid language code (when I pass it in --lang) Is jpn_vert supposed to be trained as jpn or am I missing something else?

[tesseract-ocr] Error Please Help am New to this Please Help

2020-05-06 Thread Ysakh Yp
Uncaught thiagoalessio\TesseractOCR\UnsuccessfulCommandException: Error! The command did not produce any output. Generated command: "tesseract" "sampleorg.pdf.03.jpg" "C:\Users\Ysh\AppData\Local\Temp\ocr6A8D.tmp" Returned message: Tesseract Open Source OCR Engine v3.02 with Leptonica Empty

[tesseract-ocr] error

2020-02-12 Thread bosh sherikar
Can you please help? I'm a beginner and I don't know why this issue is happening.

Re: [tesseract-ocr] ERROR: Program text2image failed. Abort.

2019-09-06 Thread Zdenko Podobny
Do you understand the error message ("/usr/local/bin/text2image: error while loading shared libraries: libtesseract.so.5: cannot open shared object file: No such file or directory")? IMO it is clear. Zdenko On Fri, Sep 6, 2019 at 17:07, Jundong Qiao wrote: > Hi all, > > I am generating training

[tesseract-ocr] ERROR: Program text2image failed. Abort.

2019-09-06 Thread Jundong Qiao
Hi all, I am generating training files for tesseract, after installing all necessary packages. My code: tesstrain.sh --fonts_dir fonts \ --fontlist "OCR-A Medium"\ --lang eng \ --linedata_only \ --langdata_dir langdata_lstm \ --tessdata_dir tesseract/tessdata

[tesseract-ocr] Error: failed to load user-words

2019-07-21 Thread Abdou
Please help me with this error when I use user-words with tesseract 5.0. Is it possible to use user-words or not?

[tesseract-ocr] error when training

2019-06-10 Thread Jingjing Lin
I was going through the training tutorial below https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 In part training from scratch, I copied the command in the link above and ran: mkdir -p ~/tesstutorial/engoutput lstmtraining --debug_interval 100 \ --traineddata

Re: [tesseract-ocr] error when make training

2019-06-05 Thread Shree Devi Kumar
You are probably missing the last step sudo make training-install Usual Build and Install instructions git clone https://github.com/tesseract-ocr/tesseract/ cd tesseract ./autogen.sh ./configure make sudo make install sudo ldconfig make training sudo make training-install On Wed, Jun 5,

Re: [tesseract-ocr] error when make training

2019-06-05 Thread Shree Devi Kumar
If training tools are made correctly, you should have all those programs. AT least that's how it is on Linux and Windows. On Wed, Jun 5, 2019 at 6:40 PM Jingjing Lin wrote: > Actually I found all the following are not there. Am I missing something? > text2image > unicharset_extractor >

Re: [tesseract-ocr] error when make training

2019-06-05 Thread Jingjing Lin
In my tesseract/src/training folder, all these text2image unicharset_extractor set_unicharset_properties combine_lang_model lstmtraining lstmeval are there, including .cpp, .o and without filename extension, three types of them. On Wednesday, June 5, 2019 at 9:10:43 AM UTC-4, Jingjing Lin wrote: > > Actually I

Re: [tesseract-ocr] error when make training

2019-06-05 Thread Jingjing Lin
Actually I found all the following are not there. Am I missing something? text2image unicharset_extractor set_unicharset_properties combine_lang_model lstmtraining lstmeval On Wednesday, June 5, 2019 at 7:37:57 AM UTC-4, Jingjing Lin wrote: > > I think my make training went successfully by manually linking several

Re: [tesseract-ocr] error when make training

2019-06-05 Thread Jingjing Lin
I think my make training went successfully by manually linking several libraries via "LDFLAGS". The compiling gives no error anymore. But it seems it is still not working because 'text2image' command is not recognized. Is there anything else I need to do after 'make training'? Thanks. Another

Re: [tesseract-ocr] Error when I try to run the command ./configure

2019-06-03 Thread Ricardo Junior
Thank you for your reply. I'll try On Monday, June 3, 2019 at 14:53:46 UTC-3, zdenop wrote: > > you need to start here: > > https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation > > > Zdenko > > > On Mon, Jun 3, 2019 at 19:42, Ricardo Junior > > wrote:

Re: [tesseract-ocr] Error when I try to run the command ./configure

2019-06-03 Thread Zdenko Podobny
you need to start here: https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation Zdenko On Mon, Jun 3, 2019 at 19:42, Ricardo Junior wrote: > Hello everybody, > I am trying to train the tesseract following this tutorial: >

[tesseract-ocr] Error when I try to run the command ./configure

2019-06-03 Thread Ricardo Junior
Hello everybody, I am trying to train the tesseract following this tutorial: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 I downloaded the git repository and run the cmake and make. When I try to run the ./configure command, I get the follow error:

Re: [tesseract-ocr] error when make training

2019-06-03 Thread Jingjing Lin
The above error was fixed with brew reinstall icu4c. now ./configure CC=gcc-8 CXX=g++-8 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib gives: checking pkg-config is at least version 0.9.0... yes checking for lept >= 1.74... yes checking for libarchive... yes

Re: [tesseract-ocr] error when make training

2019-06-03 Thread Jingjing Lin
Thanks for your reply. Are these two related to icu4c? If yes 'brew info icu4c' gives me: icu4c: stable 64.2 (bottled) [keg-only] C/C++ and Java libraries for Unicode and globalization https://ssl.icu-project.org/ /usr/local/Cellar/icu4c/64.2 (257 files, 69.2MB) Poured from bottle on

Re: [tesseract-ocr] error when make training

2019-06-02 Thread Zdenko Podobny
On Mon, Jun 3, 2019 at 4:10, Jingjing Lin wrote: > > checking for icu-uc >= 52.1... no > > checking for icu-i18n >= 52.1... no > This is the problem: you do not have icu, or not the right version...

[tesseract-ocr] error when make training

2019-06-02 Thread Jingjing Lin
Hi I installed tesseract about one month ago using brew install and I'm now trying to set up training tool in MacOS Mojave following instructions here: https://github.com/tesseract-ocr/tesseract/wiki/Compiling#macos using homebrew and was having problems one after another. Currently the bug

[tesseract-ocr] Error codes

2019-05-09 Thread Massimo Redaelli
I'm running tesseract via the python interface pytesseract, and sometimes I get something like this: pytesseract.pytesseract.TesseractError: (-9, 'Tesseract Open Source OCR Engine v4.0.0-beta.1 with Leptonica Warning. Invalid resolution 0 dpi. Using 70 instead.') The error code returned by
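
One way to read a negative code such as -9 is sketched below. It assumes pytesseract passes the child process's return code through unchanged (this is an assumption about the wrapper, not something the thread confirms). Python's `subprocess` reports a negative return code `-N` on POSIX when the child was terminated by signal `N`; the helper name is hypothetical.

```python
import signal


def describe_return_code(code):
    """Translate a subprocess-style return code into a readable description."""
    if code < 0:
        # On POSIX, subprocess uses -N to mean "terminated by signal N".
        try:
            return f"terminated by signal {signal.Signals(-code).name}"
        except ValueError:
            return f"terminated by unknown signal {-code}"
    return f"exited with status {code}"


print(describe_return_code(-9))
```

On Linux this reports SIGKILL for -9, which points to the process being killed from outside (for example by a timeout or by the kernel under memory pressure) rather than to an error raised inside tesseract itself.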

Re: [tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread Zdenko Podobny
If you installed with "make install" you should uninstall with " make uninstall " ;-) If you use apt you should learn how to use it also for uninstall. You should be familiar with tools you use. Zdenko On Thu, May 9, 2019 at 10:42, anne wrote: > Umm, can you please elaborate on what you mean

Re: [tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread anne
Umm, can you please elaborate on what you mean by "similar way as you installed it"? I actually installed tesseract twice, first by following the instructions here: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 and then when I encountered this error, I thought, hey

Re: [tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread Zdenko Podobny
The similar way as you installed it. BTW: If you are not familiar with your system, do not install sw from source - you can cause a lot of problems. Zdenko On Thu, May 9, 2019 at 10:30, anne wrote: > How exactly do I uninstall tesseract? I've found some "solutions" online > but they didn't really

Re: [tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread anne
How exactly do I uninstall tesseract? I've found some "solutions" online but they didn't really work. When I checked for tesseract version, it shows me that it is still installed.

Re: [tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread Zdenko Podobny
That means you mixed different tesseract versions (you forgot to uninstall the previous tesseract installation). Zdenko On Thu, May 9, 2019 at 10:14, anne wrote: > Hi, I was running the command unicharset_extractor but I got this error: > > *ERROR: shared library version mismatch (was

[tesseract-ocr] error: shared library version mismatch

2019-05-09 Thread anne
Hi, I was running the command unicharset_extractor but I got this error: *ERROR: shared library version mismatch (was 4.1.0-rc1-170-gb6bf, expected 4.1.0-rc2-34-gb2fc3* So I tried the other commands by checking their versions but all of them were giving me this same error more or less. Thank

[tesseract-ocr] error running -- psm 1 Automatic page segmentation with OSD : (1, "Failed loading language 'heb' Tesseract couldn't load any languages! Could not initialize tesseract.")

2019-04-11 Thread Ido
I have been trying to extract text from an image using Tesseract default PSM mode and succeed then I tried to switch to mode 1 and received the following error : (1, "Failed loading language 'heb' Tesseract couldn't load any languages! Could not initialize tesseract.") I assume in order to

Re: [tesseract-ocr] Error by using own model

2019-03-21 Thread Shree Devi Kumar
A checkpoint is NOT a traineddata file. Use --stop_training to build the traineddata. eg. echo " stop training " ~/tesseract/bin/src/training/lstmtraining \ --stop_training \ --continue_from ./devaplus_z1/plus_checkpoint \ --traineddata
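
The conversion step described in this reply can be sketched as a small command builder. This is an illustration under stated assumptions: the function name and all three paths are hypothetical placeholders, and only the flags `--stop_training`, `--continue_from`, `--traineddata` and `--model_output` are taken from lstmtraining itself.

```python
import shlex


def stop_training_command(checkpoint, starter_traineddata, output_traineddata):
    """Build the lstmtraining call that turns a checkpoint into a traineddata file."""
    return [
        "lstmtraining",
        "--stop_training",
        "--continue_from", checkpoint,          # checkpoint written during training
        "--traineddata", starter_traineddata,   # starter traineddata used for training
        "--model_output", output_traineddata,   # final file to copy into tessdata
    ]


# Hypothetical paths, for illustration only.
cmd = stop_training_command(
    "output/jens_checkpoint",
    "train/jen/jen.traineddata",
    "output/jen.traineddata",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`. Only the file written by `--model_output` belongs in the tessdata directory; renaming the checkpoint itself does not produce a loadable model.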

[tesseract-ocr] Error by using own model

2019-03-21 Thread Jens Humrich
Hey everyone, after training my model and gaining a sufficient accuracy, I copied the checkpoint into the TESSDATA_PREFIX folder and renamed it to jen.traineddata: cp jens_checkpoint /usr/share/tesseract/4/tessdata/jen.traineddata When trying to run tesseract with the new language I get the

[tesseract-ocr] ERROR: shared library version mismatch (was 4.0.0-279-gec8f, expected 4.0.0-255-gfc55

2019-02-07 Thread 한정협
I was try to use /src/training/tesstrain.sh with my own .tif/box files my tesseract version is below tesseract 4.0.0-279-gec8f leptonica-1.74.4 libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 === Starting training for language 'eng' ERROR: shared library

Re: [tesseract-ocr] Error running the BEST Model of "eng.traineddata"

2018-12-19 Thread Shree Devi Kumar
Tesseract Open Source OCR Engine v3.04.01 with Lept The tessdata_best models are for use with tesseract 4 On Wed, 19 Dec 2018, 06:32 I am trying to improve my accuracy to OCR tool which I built using > pytesseract. > As I was not getting good results using default eng.traineddata, I saw a >

[tesseract-ocr] Error running the BEST Model of "eng.traineddata"

2018-12-19 Thread gkranthi . kiran . 99
I am trying to improve my accuracy to OCR tool which I built using pytesseract. As I was not getting good results using default eng.traineddata, I saw a repo of Best LSTM Models of tesseract and downloaded "eng.traineddata" from there. Then I copied this to /usr/share/tesseract-ocr/tessdata/

Re: [tesseract-ocr] Error Segmentation fault (core dumped)

2018-10-03 Thread Zdenko Podobny
Maybe you forgot to read the FAQ or google ;-) https://github.com/tesseract-ocr/tesseract/wiki/FAQ-Old#actual_tessdata_num_entries_-tessdata_num_entrieserrorassert-failedin-file-ccutiltessdatamanagercpp-line-55_ Zdenko On Wed, Oct 3, 2018 at 15:50, AjeetM wrote: > Hi, > I am running the command: >

[tesseract-ocr] Error Segmentation fault (core dumped)

2018-10-03 Thread AjeetM
Hi, I am running the command: tesseract ocr_abc.jpg output hocr The output is: Tesseract Open Source OCR Engine v3.03 with Leptonica actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53 Segmentation fault (core dumped) Basis what I found

[tesseract-ocr] error: Must provide a --traineddata see training wiki

2018-09-30 Thread Zohreh Khosrobeygi
I am trying to fine-tune Tesseract. I've created a new fas.traineddata, extracted the best traineddata for Persian, and then ran the commands below: combine_tessdata -e tessdata/best/fas.traineddata \ /home/zohreh/Desktop/tesseract-master/tessdata/ext-best/fas.lstm training/lstmtraining

Re: [tesseract-ocr] Error when trying to run lstmtraining: Can't encode transcription

2018-09-08 Thread Shree Devi Kumar
> Warning: given outputs 111 not equal to unicharset of 90. Your starter traineddata has a unicharset of 90. In your --net_spec you have specified the number of unichars as 111. > Encoding of string failed! It means that some of the characters in the displayed string are NOT in the unicharset of
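
Both checks mentioned in this reply can be approximated before starting a long training run. The sketch below uses hypothetical helper names and makes two assumptions: that the output layer is the last element of the net spec in the form `O1c<N>`, and that the first line of a unicharset file holds its entry count. The character check is only a rough per-codepoint comparison and does not reproduce Tesseract's own normalization.

```python
import re


def net_spec_outputs(net_spec):
    """Return N from a trailing output layer such as 'O1c111', or None if absent."""
    match = re.search(r"O\d[lsc](\d+)\s*\]?\s*$", net_spec.strip())
    return int(match.group(1)) if match else None


def unicharset_size(unicharset_text):
    """Read the entry count from the first line of a unicharset file's contents."""
    return int(unicharset_text.splitlines()[0].strip())


def unencodable(text, known_chars):
    """List the non-space characters of text that are missing from known_chars."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in known_chars})


spec = "[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]"
print(net_spec_outputs(spec))
```

Comparing `net_spec_outputs(spec)` with `unicharset_size(...)` for the starter traineddata's unicharset flags the 111 versus 90 mismatch from the warning, and `unencodable` gives a first hint about which characters in a ground-truth line trigger "Encoding of string failed".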

[tesseract-ocr] Error when trying to run lstmtraining: Can't encode transcription

2018-09-08 Thread Shandigutt
Hi, *I was trying to run lstmtraining script using below command,* ./build/src/training/lstmtraining --debug_interval 100 \ --traineddata ../training/sintrain/sin/sin.traineddata \ --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \ --model_output

Re: [tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2018-09-02 Thread Shandigutt
Thank you Shree. Now it works fine On Sunday, September 2, 2018 at 6:41:28 AM UTC+3, shree wrote: > > > read_params_file: Can't open lstm.train > > lstm.train is a config file which is not found. > > It is there in tesseract/tessdata/configs > > Make sure it is there in your tessdata directory

Re: [tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2018-09-01 Thread Shree Devi Kumar
> read_params_file: Can't open lstm.train lstm.train is a config file which is not found. It is there in tesseract/tessdata/configs Make sure it is there in your tessdata directory or your path and can be found. On Sun, Sep 2, 2018 at 3:40 AM, Shandigutt wrote: > Hi, > > I was trying to
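
A quick existence check along the lines of this reply is sketched below. The helper name is hypothetical, and the `configs` subdirectory layout follows the `tesseract/tessdata/configs` location named above.

```python
from pathlib import Path


def missing_configs(tessdata_dir, names=("lstm.train",)):
    """Return the config names that are not present under <tessdata_dir>/configs."""
    configs = Path(tessdata_dir) / "configs"
    return [name for name in names if not (configs / name).is_file()]
```

For example, `missing_configs("../tesseract/tessdata")` (a placeholder path) returns `["lstm.train"]` when the file has not been copied into the tessdata directory that the training script is pointed at, and an empty list once it is in place.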

[tesseract-ocr] Error in creating LSTM training data using tesstrain.sh

2018-09-01 Thread Shandigutt
Hi, I was trying to create LSTM training data using tesstrain.sh. I got the below error. Can somebody explain me what has gone wrong, *Command I used:* ./src/training/tesstrain.sh --fonts_dir ../Support/font --lang sin --linedata_only \ --noextract_font_properties --langdata_dir ../langdata

[tesseract-ocr] Error when build C++ Visual Studio 2017 project after include tesseract 4.0 library

2018-08-20 Thread Tiến Nguyễn
I am using windows 10 and manually build my own tesseract 4.0 and leptonica-1.76.0 library. When I include file .h and library in configuring of the project. It shows error like below [image: error.png] 1>-- Rebuild All started: Project: Test_Tesseractv4, Configuration: Debug x64 --

[tesseract-ocr] error while converting pdf file to tiff using command

2018-08-06 Thread thiyamjennil
hello everyone, for testing tesseract i convert the pdf file to tiff file and after 10 files(each contains 7000-8000 characters), there is this error that says convert-im6.q16: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244. convert-im6.q16: cache

Re: [tesseract-ocr] Error on combine_lang_model script; Null char=2 Invalid format in radical table at line 4: 3400 1.4 Creation of encoded unicharset failed!! Error writing recoder!!

2018-08-05 Thread Shree Devi Kumar
You are using an old version of tesseract. Please use the latest version from github. Make sure you remove/uninstall the old version. Your error is related to the radical stroke file in langdata. Make sure you use the latest version of the langdata repo. >Invalid format in radical table at line 4: 34001.4

[tesseract-ocr] Error on combine_lang_model script; Null char=2 Invalid format in radical table at line 4: 3400 1.4 Creation of encoded unicharset failed!! Error writing recoder!!

2018-08-05 Thread Shandigutt
Hi, I am trying to train Tesseract for Sinhala language. I was following training guidelines mentioned in Github wiki. I get an error with reference to the 4th step which is "Creating

Re: [tesseract-ocr] error while loading shared libraries: libtesseract.so.4: cannot open shared object file: No such file or directory

2018-06-27 Thread 振宇韩
Run it as the user who installed it, e.g. root. On Sunday, August 27, 2017 at 11:55:47 PM UTC+8, Dan9er wrote: > > It worked!! Thank you! > > On Sunday, August 27, 2017 at 11:45:25 AM UTC-4, shree wrote: >> >> Did you do >> >> sudo ldconfig >> >> And try to run tesseract after that. >> >> On 27-Aug-2017 7:53 PM, "Dan9er" wrote: >> >>>

[tesseract-ocr] error running configure; how do I start over?

2018-06-09 Thread Shinehah-Gnolaum
I made an error running configure. The first time I ran the line given in the instructions at https://github.com/tesseract-ocr/tesseract/wiki/Compiling#macos It didn't work because the c++ compiler couldn't make executables, it said. The second time I didn't set the environment variables. How

Re: [tesseract-ocr] error

2018-06-09 Thread ShreeDevi Kumar
You are probably using a wrong traineddata file i.e. 3.0x version file with latest 4.0x code from master branch. ShreeDevi भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com On Sat, Jun 9, 2018 at 3:33 PM Vishal Jha wrote: > 1,

[tesseract-ocr] error

2018-06-09 Thread Vishal Jha
1, 'read_params_file: parameter not found: enable_new_segsearch')

Re: [tesseract-ocr] error in lstm training

2018-06-03 Thread nick
Hi Shree, thanks for your reply. I will check it as soon as possible. On Saturday, June 2, 2018 at 3:56:39 PM UTC+4:30, shree wrote: > > > !int_mode_:Error:Assert failed:in file weightmatrix.cpp, line 244 > > You can only continue_from models in tessdata_best repo which are float > models. The

Re: [tesseract-ocr] error in lstm training

2018-06-02 Thread ShreeDevi Kumar
> !int_mode_:Error:Assert failed:in file weightmatrix.cpp, line 244 You can only continue_from models in the tessdata_best repo, which are float models. The integer models in tessdata and tessdata_fast cannot be used for that purpose. ShreeDevi
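A minimal sketch of the workflow this reply implies: extract the LSTM component from a float (tessdata_best) traineddata file, then pass it to `lstmtraining --continue_from`. The helpers only build argument lists and run nothing. The function names and every path are hypothetical placeholders; the flags shown are the ones quoted in the original question plus `--model_output` and `--train_listfile`, which a real run would also need.

```python
import shlex


def extract_lstm_cmd(traineddata, lstm_out):
    """Build the combine_tessdata call that extracts the LSTM component."""
    return ["combine_tessdata", "-e", str(traineddata), str(lstm_out)]


def finetune_cmd(lstm, traineddata, model_output, train_listfile,
                 max_iterations=400):
    """Build an lstmtraining call that continues from an extracted model."""
    return [
        "lstmtraining",
        "--continue_from", str(lstm),
        "--traineddata", str(traineddata),
        "--model_output", str(model_output),
        "--train_listfile", str(train_listfile),
        "--max_iterations", str(max_iterations),
    ]


if __name__ == "__main__":
    # Hypothetical paths: a float model taken from the tessdata_best repo.
    best = "tessdata_best/eng.traineddata"
    print(shlex.join(extract_lstm_cmd(best, "work/eng.lstm")))
    print(shlex.join(finetune_cmd("work/eng.lstm", best,
                                  "work/out", "work/eng.training_files.txt")))
```

Pointing `best` at a file from tessdata or tessdata_fast would hand the trainer an integer model, which is the case that triggers the assert quoted above.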

[tesseract-ocr] error in lstm training

2018-06-02 Thread nick
Hi, I tried to fine-tune eng.traineddata. LSTM training raised this error: lstmtraining --continue_from ./tesseract-4.0.0-beta.1.20180414/tessdata/eng.lstm --traineddata ./tesseract-4.0.0-beta.1.20180414/tessdata/eng.traineddata --max_iterations 400 --debug_interval 0

[tesseract-ocr] error in running tesseract with API example PYTHON

2018-05-21 Thread nick
Hi, I want to run the Tesseract API from Python with the code below, but it raises an error: import os import ctypes lang = "eng" filename = "/usr/src/tesseract-ocr/phototest.tif" libname = "/usr/local/lib64/libtesseract.so.3" TESSDATA_PREFIX = os.environ.get('TESSDATA_PREFIX') if not TESSDATA_PREFIX:

[tesseract-ocr] error in running tesseract with API example

2018-05-21 Thread nick
Hi, I want to run Tesseract with this code: #include <tesseract/baseapi.h> #include <leptonica/allheaders.h> int main() { char *outText; tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI(); // Initialize tesseract-ocr with English, without specifying tessdata path if (api->Init(NULL, "eng")) {

[tesseract-ocr] Error in executing new .traineddata file

2018-05-15 Thread Eman Sawalha
Hello. Recently, I worked on training Tesseract to detect Old South Arabian script, and I produced the .traineddata file. To test it, I copied the file into the tessdata folder inside the Tesseract installation. My problem is that whenever I try to execute it in cmd.exe, it gives me this

Re: [tesseract-ocr] error: required directory

2018-04-25 Thread Marius Amado-Alves
Zdenko, your latest fix of the makefile has solved this problem :-) Thanks a lot.

Re: [tesseract-ocr] error: required directory

2018-04-25 Thread Zdenko Podobny
We are reorganizing Tesseract. Using the latest code is not recommended at all, especially if you do not follow the developers' communications. Zdenko 2018-04-25 19:59 GMT+02:00 Marius Amado-Alves : > Trying to install on a Mac, cannot pass the autogen.sh step.

[tesseract-ocr] error: required directory

2018-04-25 Thread Marius Amado-Alves
Trying to install on a Mac, cannot pass the autogen.sh step. Any tips highly appreciated. Current directory is /tesseract bash-3.2# ./autogen.sh Running aclocal Running /opt/local/bin/glibtoolize glibtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'config'. glibtoolize: copying file

Re: [tesseract-ocr] Error opening traineddata files on Mac High Sierra

2018-04-10 Thread Zdenko Podobny
If you followed someone's tutorial, you should complain to its author ;-). I am not familiar with Mac, but on Linux you can do it (on the command line) this way: export TESSDATA_PREFIX=/usr/loca/share/ Maybe it is similar on Mac. Try to google how to set an environment variable on Mac. Zdenko 2018-04-10

Re: [tesseract-ocr] Error opening traineddata files on Mac High Sierra

2018-04-10 Thread Firlefanz
Thank you for your reply. I used the command following this guide https://www.youtube.com/watch?v=QhJiOCwz-_I -- if it's wrong, then I will not follow this guide anymore. Yes, I have Fraktur.traineddata in usr/loca/share/tessdata I do not know how to change "the TESSDATA_PREFIX environment

Re: [tesseract-ocr] Error opening traineddata files on Mac High Sierra

2018-04-10 Thread Zdenko Podobny
First of all: your command is wrong. It should be constructed this way: tesseract image output [options] See tesseract --help for more details. Next: the error message is clear: Error opening data file ./tessdata/Fraktur.traineddata You (or your installation) instructed to look for traineddata
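A minimal sketch of the two points in this reply: check where a traineddata file would be looked up, and build the command in the order `tesseract image output [options]`. It assumes, as a hedge, that depending on the Tesseract version `TESSDATA_PREFIX` may name either the tessdata directory itself or its parent, so both candidates are checked. The helper names are hypothetical and nothing is executed.

```python
import os
from pathlib import Path


def candidate_traineddata_paths(lang, prefix):
    """List the places a <lang>.traineddata file might sit under prefix."""
    prefix = Path(prefix)
    name = f"{lang}.traineddata"
    return [prefix / name, prefix / "tessdata" / name]


def locate_traineddata(lang, prefix=None):
    """Return the first existing candidate path, or None if none exists."""
    if prefix is None:
        # Fall back to the current directory when the variable is unset.
        prefix = os.environ.get("TESSDATA_PREFIX", ".")
    for path in candidate_traineddata_paths(lang, prefix):
        if path.is_file():
            return path
    return None


def build_command(image, output_base, lang):
    """Image first, then the output base name, then the options."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


if __name__ == "__main__":
    print(locate_traineddata("Fraktur"))
    print(build_command("file.tiff", "Fraktur", "Fraktur"))
```

A `None` result from `locate_traineddata` corresponds to the "Error opening data file" situation described in the thread: the file exists somewhere, but not where the prefix points.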

[tesseract-ocr] Error opening traineddata files on Mac High Sierra

2018-04-10 Thread Firlefanz
I downloaded deu_frak.traineddata, Fraktur.traineddata and frk.traineddata to usr/loca/share/tessdata. But when using $ tesseract file.tiff -l Fraktur Fraktur I get the error message Error opening data file ./tessdata/Fraktur.traineddata Please make sure the TESSDATA_PREFIX environment

Re: [tesseract-ocr] ERROR: exp0.box does not exist or is not readable

2018-04-07 Thread Fanatico
Thanks for the reply, but I just fixed this bug. The problem was that the variable PANGOCAIRO_BACKEND was empty on Mac OS X, so I needed to set it before executing the code. Something like this: PANGOCAIRO_BACKEND=fc \ ../../tesseract/training/tesstrain.sh \ --fonts_dir /Library/Fonts \ --lang eng
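The fix described here, setting `PANGOCAIRO_BACKEND` before launching the training script, can be sketched from Python as an environment dictionary to hand to a subprocess. This is an illustrative sketch only: `tesstrain_env` is a made-up helper name, and the value `fc` is the one given in the post.

```python
import os


def tesstrain_env(base_env=None, backend="fc"):
    """Return a copy of the environment with PANGOCAIRO_BACKEND filled in.

    The variable is set only when it is missing or empty, which is the
    situation the post describes; an existing non-empty value is kept.
    """
    env = dict(os.environ if base_env is None else base_env)
    if not env.get("PANGOCAIRO_BACKEND"):
        env["PANGOCAIRO_BACKEND"] = backend
    return env


if __name__ == "__main__":
    print(tesstrain_env({})["PANGOCAIRO_BACKEND"])
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when invoking tesstrain.sh, which mirrors the shell prefix `PANGOCAIRO_BACKEND=fc` shown in the post.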

Re: [tesseract-ocr] ERROR: exp0.box does not exist or is not readable

2018-04-07 Thread ShreeDevi Kumar
Look in your tmp directory, in the subfolders referred to in the console output. Check the log file and other files there. On Sat 7 Apr, 2018, 11:00 AM Fanatico, wrote: > Yes the location is correct, I tried to put the full path to the folder > and got the same error. > >
