[tesseract-ocr] Re: Commercial use of trained data

2023-11-25 Thread Tom Morris
On Monday, November 20, 2023 at 11:41:23 PM UTC-5 Leon wrote: Is there any problem in embedding the trained data published below into commercial products? https://github.com/Shreeshrii/tessdata_ocrb It is assumed that the OSS license terms are followed. Since that repository has no stated

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread Zdenko Podobny
Tesseract 3.x is unsupported. I am not a Java developer, but according to https://github.com/nguyenq/tess4j/releases tess4j-5.8.0 should support Tesseract 5.3.2, so I would start from that. If there is still a problem, have a look at their wiki (https://github.com/nguyenq/tess4j/wiki) and issue

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread 'sanogo sy' via tesseract-ocr
Too stupid, my bad! Could someone give me some advice on installing the required version? I use tess4j 5.4.0.jar in my application. Locally, on Windows, I tried another version of tess4j but it didn't work, so I kept tess4j 5.4.0. Now I have to make it run on Linux (CentOS 7). I tried many

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread Zdenko Podobny
You used an old, unsupported version of your tools (not sure whether the problem is in the installed wrapper or in the Tesseract library...); the cube engine was removed from Tesseract several years ago... Zdenko On Sat, 25 Nov 2023 at 15:31, 'sanogo sy' via tesseract-ocr <tesseract-ocr@googlegroups.com>

[tesseract-ocr] Newspaper segmentation techniques

2023-11-25 Thread shacky
Hello everyone, I'm using Tesseract to OCR some newspapers and it works very well. I am researching how to automatically segment a newspaper page into single articles by post-processing the generated hOCR files, and I found some academic papers which
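As a starting point for the hOCR post-processing mentioned above, here is a minimal sketch that collects paragraph bounding boxes from an hOCR file. It assumes the hOCR marks paragraphs with the `ocr_par` class and stores coordinates as `bbox x0 y0 x1 y1` in the `title` attribute, as the hOCR format describes. The names `ParagraphBoxes` and `paragraph_boxes` are hypothetical, and grouping the boxes into articles is not covered here.

```python
import re
from html.parser import HTMLParser

# hOCR stores geometry in the title attribute, e.g. title="bbox 36 92 580 122".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class ParagraphBoxes(HTMLParser):
    """Collect (x0, y0, x1, y1) for every element carrying the ocr_par class."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "ocr_par" not in classes:
            return
        match = BBOX_RE.search(attributes.get("title") or "")
        if match:
            self.boxes.append(tuple(int(value) for value in match.groups()))


def paragraph_boxes(hocr_text):
    """Return paragraph bounding boxes in document order."""
    parser = ParagraphBoxes()
    parser.feed(hocr_text)
    return parser.boxes
```

The resulting boxes could then be clustered by column position and vertical gaps, which is one possible heuristic for article segmentation rather than an established method.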

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread 'sanogo sy' via tesseract-ocr
But in my app, which runs on a WildFly 24 server, I got an error saying: Failed loading language 'eng'. In my log file I got: Failed loading language 'eng' Cube ERROR (CubeRecoContext::Load): unable to read cube language model params from /tmp/tess4j/tessdata/fra.cube.lm Cube ERROR

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread 'sanogo sy' via tesseract-ocr
If I understood correctly, by tesseract (executable) you mean running the tesseract command to check how it works. I just ran the command: tesseract path_of_my_image.jpg output.txt My output file is empty. It seems that it doesn't work, because I got this message on the command line: Estimating

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread Zdenko Podobny
And the result is? Zdenko On Sat, 25 Nov 2023 at 13:07, 'sanogo sy' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote: > I forgot to mention that I use CentOS 7. > I tried this command: tesseract img.jpg out > > As a result I got a message like: > > Estimating resolution as 181 > Error

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread 'sanogo sy' via tesseract-ocr
I forgot to mention that I use CentOS 7. I tried this command: tesseract img.jpg out As a result I got a message like: Estimating resolution as 181 Error in boxClipToRectangle: box outside rectangle Error in pixScanForForeground: invalid box On Saturday, November 25, 2023 at 10:31:49 AM UTC

Re: [tesseract-ocr] Re: Training from Scratch

2023-11-25 Thread Simon
Yes, in general I want to recognize this part, "< 0,05 A", except that the < is actually ∠, the character for angularity. The segmentation process of Tesseract can't be edited, right? So you mean I would need to write a program, independent of Tesseract, that localizes the boxes, crops them out and

Re: [tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread Zdenko Podobny
Does tesseract (the executable) have the same problem? If yes, then check the content of /usr/share/tesseract-ocr/4/tessdata/ If not, follow the code of the tesseract executable. Zdenko On Sat, 25 Nov 2023 at 11:07, 'sanogo sy' via tesseract-ocr <tesseract-ocr@googlegroups.com> wrote: > Hi everyone. I got
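The directory check suggested above can also be scripted. The following is a minimal sketch, assuming language models are stored as `<lang>.traineddata` files directly inside the tessdata directory. The helper names `has_language` and `list_languages` are hypothetical and not part of Tesseract or tess4j.

```python
from pathlib import Path


def has_language(tessdata_dir, lang="eng"):
    """Return True if <tessdata_dir>/<lang>.traineddata exists and is non-empty."""
    model = Path(tessdata_dir) / f"{lang}.traineddata"
    return model.is_file() and model.stat().st_size > 0


def list_languages(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file in the directory."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(path.stem for path in directory.glob("*.traineddata"))
```

For the path from this thread, `has_language("/usr/share/tesseract-ocr/4/tessdata/")` returning False would suggest that the `eng` model is missing or empty in that directory. A True result only confirms that the file is present, not that it matches the installed Tesseract version.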

[tesseract-ocr] Failed loading language 'eng'

2023-11-25 Thread 'sanogo sy' via tesseract-ocr
Hi everyone. I got an error with Tesseract. When I try to use it in my app, I get an error like "Failed loading language eng". I installed Tesseract 5 with Leptonica 1.79. To solve the problem I tried this command: export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4/tessdata/ I cloned from git