The files from tessdata_best only support LSTM mode, i.e. --oem 1. Please check which engine mode your web service is using.
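One way to confirm this from the command line is to run the same file with each engine mode explicitly. This is only a rough sketch: the paths and language are the ones from your message, the variable names are my own, and I have not run it against your data. I would expect --oem 0 (legacy only) to fail with the tessdata_best file and --oem 1 (LSTM only) to work.

```shell
# Values copied from the original message; the variable names are
# hypothetical, adjust them for your system.
TESSDATA_DIR=/opt/data/tessdata-new/
LANG_CODE=swe
IMAGE=test.png

# Run only if tesseract is installed and the test image exists.
if command -v tesseract >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # --oem 0 = legacy engine only, --oem 1 = LSTM engine only
    for oem in 0 1; do
        echo "== --oem $oem =="
        tesseract --tessdata-dir "$TESSDATA_DIR" -l "$LANG_CODE" \
            --oem "$oem" "$IMAGE" stdout || echo "--oem $oem failed"
    done
fi
```

If --oem 0 fails here, the equivalent thing to check in the web service is the engine mode it passes when it initialises Tesseract.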
On Fri, Mar 6, 2020, 19:27 Adam Funk <[email protected]> wrote:

> Hi again,
>
> I've updated the web service to use a newer version:
>
> compile group: 'org.bytedeco', name: 'tesseract-platform', version: '4.1.0-1.5.2'
>
> It still segfaults when I try to use swe.traineddata but at least the
> service recovers instead of dying in place.
>
> Adam
>
>
> On 06/03/2020 12:45, Adam Funk wrote:
> > Hi,
> >
> > I've downloaded some of the *.traineddata files from
> > <https://github.com/tesseract-ocr/tessdata_best> --- as far as I can
> > tell, all the ones I have tested work on the command line, e.g.,
> >
> > $ tesseract --tessdata-dir /opt/data/tessdata-new/ --list-langs
> > ...
> > swe
> > ...
> >
> > $ tesseract --tessdata-dir /opt/data/tessdata-new/ -l swe test.png stdout
> > [produces output with no errors]
> >
> > $ tesseract --version
> > tesseract 4.1.0
> > leptonica-1.78.0
> > libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.1) : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
> > Found AVX2
> > Found AVX
> > Found SSE
> >
> >
> > but when I try to use the same swe.traineddata file in a web service
> > built with grails and running on Tomcat, something causes a segfault and
> > such a massive problem that the whole Tomcat server has to be killed and
> > restarted. The grails service has the following dependency:
> >
> > compile group: 'org.bytedeco', name: 'tesseract-platform', version: '4.0.0-1.5'
> >
> > which is a slightly lower version, but the data files are supposed to
> > work with Tesseract 4.
> >
> > Any ideas why?
> >
> > Thanks,
> > Adam
> >
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/b13d76da-24ec-633c-b281-29bbcf9cb0e0%40sheffield.ac.uk.

