1. Make sure you have the latest version of tesseract.
Then try this script and provide the exact, full error message:

import tempfile

import cv2
import pytesseract
from PIL import Image
from pytesseract import Output

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

img = cv2.imread('images/invoice-sample.jpg')

if img is None:
    # cv2.imread returns None when the file cannot be read
    print("Can not open input file")
else:
    # check temp file
    temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
    print(temp_file.name)
    image = Image.fromarray(img)
    image.save(temp_file.name + '.png', format='png', **image.info)
    temp_file.close()

    print("Image shape:", img.shape)
    data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
    n_boxes = len(data_dict['level'])
    for i in range(n_boxes):
        (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i],
                        data_dict['width'][i], data_dict['height'][i])
        # draw a box around each detected element
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
    cv2.imshow('img', img)
    cv2.waitKey(0)
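
Since the traceback ends in remove(filename) on a file under the Temp directory, it may also help to check whether Python itself can create and delete a file there, without involving tesseract at all. The following is only a minimal sketch: the helper name check_temp_dir is made up for illustration and is not part of pytesseract, and the 'tess_' prefix is used only to mimic the file name seen in the traceback.

```python
import os
import tempfile


def check_temp_dir(prefix="tess_"):
    """Create, write, and delete a file in the temp directory.

    Hypothetical helper. Returns (ok, detail), where detail is the
    temp file path on success or the error text on failure.
    """
    try:
        # mkstemp creates the file and returns an open OS-level handle.
        fd, path = tempfile.mkstemp(prefix=prefix)
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"test")
        # Deleting is the step that failed in the traceback.
        os.remove(path)
        return True, path
    except OSError as exc:
        # PermissionError is a subclass of OSError.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    print("Temp directory:", tempfile.gettempdir())
    ok, detail = check_temp_dir()
    print("OK" if ok else "FAILED", detail)
```

If this fails under "pipenv run python" but works elsewhere, the problem is probably with the permissions of the Python process rather than with tesseract.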




Zdenko


On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <raynus.blue...@gmail.com>
wrote:

> Sure
>
> >>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
> >>> pytesseract.get_tesseract_version()
> LooseVersion ('5.0.0-alpha.20200223')
>
>
> On Sunday, 1 March 2020 at 00:21:26 UTC+7, zdenop wrote:
>>
>> This means there is a problem with pytesseract/Python permissions.
>>
>> Can you post the output of pytesseract.get_tesseract_version()?
>>
>> Zdenko
>>
>>
>> On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <raynus...@gmail.com>
>> wrote:
>>
>>> No, tesseract runs successfully and writes its output to a text file.
>>>
>>> (base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
>>>
>>>
>>>
>>> However, WinError 5 arises again when running from Python (with
>>> pipenv):
>>> (base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
>>> Traceback (most recent call last):
>>>   File ".\app2.py", line 10, in <module>
>>>     d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>     }[output_type]()
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>     return output_file.read().decode('utf-8').strip()
>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>     next(self.gen)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>     cleanup(f.name)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>     raise e
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>     remove(filename)
>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Saturday, 29 February 2020 at 16:19:41 UTC+7, zdenop wrote:
>>>>
>>>> Can you replicate the problem with the command-line ("pure") tesseract? E.g.:
>>>> 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <raynus...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> I'm new to tesseract and am trying to follow a tutorial on Windows 10
>>>>> using the code below:
>>>>>
>>>>> import cv2
>>>>> import pytesseract
>>>>> from pytesseract import Output
>>>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>>
>>>>>
>>>>> img=cv2.imread('images/invoice-sample.jpg')
>>>>>
>>>>>
>>>>> d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>>>
>>>>> print(d.keys())
>>>>>
>>>>> The problem is that I keep getting PermissionError: [WinError 5]
>>>>> Access is denied when using image_to_data and image_to_string on
>>>>> Windows 10.
>>>>>
>>>>> The only advice I found on Stack Overflow is to set tesseract_cmd, PATH
>>>>> and TESSDATA_PREFIX, which did not work for me. Not even running from an
>>>>> administrator cmd prompt works.
>>>>>
>>>>> After spending a couple of hours, I found that setting the permissions
>>>>> for tesseract.exe (right-click, select Properties, and go to the Security
>>>>> tab) by checking Full control and Modify, as shown below, makes it work.
>>>>>
>>>>> I hope this helps other people struggling with the same problem.
>>>>>
>>>>>
>>>>> [image: 1582917756731.jpg][image: 1582917788913.jpg]
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to tesser...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>

