Output when the file is run with python2:
<__main__.LP_TessPageIterator object at 0x102369950>
Output when the file is run with python3:
Failed loading language 'osd'
Tesseract couldn't load any languages!
Warning: Auto orientation and script detection requested, but osd language
failed to load
<__main__.LP_TessPageIterator object at 0x10194b950>
Also, I know it is finding the tessdata folder, because when I change the
prefix value it gives the following error in both python2 and python3:
Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the
parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Segmentation fault: 11
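
For what it's worth, a quick way to rule out a missing data file is to check
the tessdata folder directly before calling into the library. This is only a
rough sketch: the helper name missing_traineddata is made up for illustration,
and it assumes the usual layout of <prefix>/tessdata/<lang>.traineddata, which
is what the error message above suggests but may not hold for every install.

```python
import os


def missing_traineddata(prefix, languages=("eng", "osd")):
    """Return the languages with no .traineddata file under prefix/tessdata.

    Hypothetical helper, not part of Tesseract. Assumes the data files live
    at <prefix>/tessdata/<lang>.traineddata.
    """
    tessdata = os.path.join(prefix, "tessdata")
    return [
        lang
        for lang in languages
        if not os.path.isfile(os.path.join(tessdata, lang + ".traineddata"))
    ]


if __name__ == "__main__":
    # Same default prefix as in the script quoted below.
    prefix = os.environ.get("TESSDATA_PREFIX", "/opt/local/share")
    print(missing_traineddata(prefix))
```

If this prints an empty list under both interpreters, the files are present
and the python3 failure is probably not a simple path problem.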
On Wednesday, May 25, 2016 at 4:35:15 PM UTC+3, Reuben Cummings wrote:
>
> Using the C API and python3, the 'osd' language file fails to load. It
> works fine with python2. I'm using Mac OS X 10.9.5.
>
> code:
>
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
>
> from __future__ import print_function, division, absolute_import
>
> from os import environ
> from ctypes import CDLL, POINTER, Structure, c_char_p, c_bool
> from ctypes.util import find_library
>
> LIBTESS = find_library('libtesseract')
> LIBLEPT = find_library('liblept')
> TESSDATA_PREFIX = environ.get('TESSDATA_PREFIX', '/opt/local/share')
>
>
> class TessBaseAPI(Structure):
>     pass
>
>
> class Pix(Structure):
>     pass
>
>
> class TessPageIterator(Structure):
>     pass
>
>
> def create_tess_api(prefix=TESSDATA_PREFIX):
>     tesseract = CDLL(LIBTESS)
>     leptonica = CDLL(LIBLEPT)
>     base_api = POINTER(TessBaseAPI)
>     p_iter = POINTER(TessPageIterator)
>     argtypes = [base_api, c_char_p, c_char_p]
>
>     tesseract.TessBaseAPICreate.restype = base_api
>     tesseract.TessBaseAPIInit3.argtypes = argtypes
>     tesseract.TessBaseAPIInit3.restype = c_bool
>     tesseract.TessBaseAPISetImage2.restype = None
>     tesseract.TessBaseAPISetImage2.argtypes = [base_api, POINTER(Pix)]
>     tesseract.TessBaseAPIAnalyseLayout.argtypes = [base_api]
>     tesseract.TessBaseAPIAnalyseLayout.restype = p_iter
>
>     api = tesseract.TessBaseAPICreate()
>     tesseract.TessBaseAPIInit3(api, prefix.encode('utf-8'), b'eng')
>
>     leptonica.pixRead.argtypes = [c_char_p]
>     leptonica.pixRead.restype = POINTER(Pix)
>     return tesseract, leptonica, api
>
> tesseract, leptonica, api = create_tess_api()
> path = b'eurotext.tif'
> tesseract.TessBaseAPISetPageSegMode(api, 1)
> pix = leptonica.pixRead(path)
> tesseract.TessBaseAPISetImage2(api, pix)
> print(tesseract.TessBaseAPIAnalyseLayout(api))
>
> tesseract 3.04.00
> leptonica-1.71
> libgif 4.2.3 : libjpeg 9a : libpng 1.6.21 : libtiff 4.0.6 : zlib 1.2.8 :
> libwebp 0.5.0 : libopenjp2 2.1.0
>
>
>