After digging into pytesseract.py, I found a possibly related issue in how it 
uses NamedTemporaryFile.

Following this Stack Overflow post 
(https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror), 
I added the delete=False argument to the NamedTemporaryFile call in the 
save() function of pytesseract.py:


# See https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror
@contextmanager
def save(image):
    try:
        # delete=False keeps the file on disk after the handle is closed
        with NamedTemporaryFile(prefix='tess_', delete=False) as f:
            if isinstance(image, str):
                yield f.name, realpath(normpath(normcase(image)))
                return
            # (remainder of save() left as in the original file)

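For reference, the general pattern behind this change can be sketched outside pytesseract as follows. This is a minimal illustration under stated assumptions, not the pytesseract implementation: `closed_temp_path` is a hypothetical helper name, and the idea is simply to create the file with delete=False, close the handle before anything else touches the path, and remove the file explicitly at the end.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def closed_temp_path(prefix="tess_"):
    """Yield the path of an already-closed temp file, then delete it.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    # delete=False: closing the handle does not remove the file, so the
    # path stays valid for other code (or another process) to use.
    with NamedTemporaryFile(prefix=prefix, delete=False) as f:
        path = f.name
    # The handle is closed at this point; only the file on disk remains.
    try:
        yield path
    finally:
        # Explicit cleanup replaces the automatic delete-on-close.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already removed elsewhere


if __name__ == "__main__":
    with closed_temp_path() as p:
        with open(p, "w") as out:
            out.write("hello")
        print("inside the block, file exists:", os.path.exists(p))
    print("after the block, file exists:", os.path.exists(p))
```

Because no handle is held while the caller works with the path, nothing in the same process is keeping the file open when the explicit removal runs.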


It has been working since then.
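One plausible explanation for why this helps comes from the Python tempfile documentation of that period, which notes that whether a NamedTemporaryFile can be opened a second time by name while it is still open varies by platform: it works on Unix but not on Windows. The probe below can be used to check the behaviour on a given machine. The helper name `can_reopen_by_name` is hypothetical, and whether this is the actual cause of the WinError 5 in this thread is an assumption, not something confirmed here.

```python
import os
import tempfile


def can_reopen_by_name(delete):
    """Return True if a still-open NamedTemporaryFile can be reopened by name.

    Hypothetical diagnostic helper; the result depends on the platform.
    """
    f = tempfile.NamedTemporaryFile(prefix="tess_", delete=delete)
    try:
        try:
            # Second open of the same path while the first handle is open.
            with open(f.name, "rb"):
                return True
        except PermissionError:
            return False
    finally:
        f.close()
        # With delete=False nothing removes the file automatically.
        if not delete and os.path.exists(f.name):
            os.remove(f.name)


if __name__ == "__main__":
    for delete in (True, False):
        print("delete=%s -> reopen by name works: %s"
              % (delete, can_reopen_by_name(delete)))
```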

I will forward this thread and the issue to the pytesseract project.

Thanks for your help.







On Sunday, 1 March 2020 at 23:40:26 UTC+7, zdenop wrote:
>
> Hello,
>
> I am not able to reproduce the error. It comes from here [1], where 
> pytesseract tries to clean up its temporary files.
> You should report it to the pytesseract project, as there is no option to 
> skip this code.
> Maybe you can try modifying this part of the pytesseract code [2]:
>
> finally:
>     cleanup(f.name)
>
> to 
>
> finally:
>     f.close()
>     cleanup(f.name)
>
>
> [1] 
> https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L131
>   
> [2]  
> https://github.com/madmaze/pytesseract/blob/7fef19ff176bd9f837753dc4c0ebc76b16267775/src/pytesseract.py#L176
>  
> Zdenko
>
>
> On Sun, 1 Mar 2020 at 14:11, Supharerk Thawillarp <[email protected]> wrote:
>
>> OK, it gave me WinError 5 again.
>>
>>
>> PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
>> C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
>> Image shape: (1150, 835, 3)
>> Traceback (most recent call last):
>>   File ".\test_tess.py", line 19, in <module>
>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>     }[output_type]()
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>     return output_file.read().decode('utf-8').strip()
>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>     next(self.gen)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>     cleanup(f.name)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>     raise e
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>     remove(filename)
>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'
>>
>>
>>
>>
>>
>> On Sunday, 1 March 2020 at 03:05:39 UTC+7, zdenop wrote:
>>>
>>> 1. Make sure you have the latest version of tesseract.
>>> Then try this script and post the exact, full error message:
>>>
>>> import tempfile
>>>
>>> import cv2
>>> import pytesseract
>>> from PIL import Image
>>> from pytesseract import Output
>>>
>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>
>>> img = cv2.imread('images/invoice-sample.jpg')
>>>
>>> # check temp file: write an image next to a tess_ temporary file
>>> temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
>>> print(temp_file.name)
>>> image = Image.fromarray(img)
>>> image.save(temp_file.name + '.png', format='png', **image.info)
>>> temp_file.close()
>>>
>>> if img.any():
>>>     print("Image shape:", img.shape)
>>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>     n_boxes = len(data_dict['level'])
>>>     # draw a rectangle around every detected element
>>>     for i in range(n_boxes):
>>>         (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i],
>>>                         data_dict['width'][i], data_dict['height'][i])
>>>         cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
>>>     cv2.imshow('img', img)
>>>     cv2.waitKey(0)
>>> else:
>>>     print("Can not open input file")
>>>
>>>
>>>
>>>
>>> Zdenko
>>>
>>>
>>> On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <[email protected]> wrote:
>>>
>>>> Sure
>>>>
>>>> >>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>> >>> pytesseract.get_tesseract_version()
>>>> LooseVersion ('5.0.0-alpha.20200223')
>>>>
>>>>
>>>> On Sunday, 1 March 2020 at 00:21:26 UTC+7, zdenop wrote:
>>>>>
>>>>> This means there is a problem with pytesseract/Python permissions.
>>>>>
>>>>> Can you post the output of pytesseract.get_tesseract_version()?
>>>>>
>>>>> Zdenko
>>>>>
>>>>>
>>>>> On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <[email protected]> wrote:
>>>>>
>>>>>> No, tesseract runs successfully from the command line and writes its 
>>>>>> output to a text file.
>>>>>>
>>>>>> (base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
>>>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
>>>>>>
>>>>>> However, WinError 5 arises again when running from Python (with 
>>>>>> pipenv):
>>>>>> (base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
>>>>>> Traceback (most recent call last):
>>>>>>   File ".\app2.py", line 10, in <module>
>>>>>>     d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>>>>     }[output_type]()
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>>>>     return output_file.read().decode('utf-8').strip()
>>>>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>>>>     next(self.gen)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>>>>     cleanup(f.name)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>>>>     raise e
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>>>>     remove(filename)
>>>>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Saturday, 29 February 2020 at 16:19:41 UTC+7, zdenop wrote:
>>>>>>>
>>>>>>> Can you reproduce the problem with "pure" command-line tesseract? E.g.:
>>>>>>> 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <[email protected]> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> I'm new to tesseract and am trying to follow a tutorial on Windows 10 
>>>>>>>> using the code below:
>>>>>>>>
>>>>>>>> import cv2
>>>>>>>> import pytesseract
>>>>>>>> from pytesseract import Output
>>>>>>>>
>>>>>>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>>>>>
>>>>>>>> img = cv2.imread('images/invoice-sample.jpg')
>>>>>>>> d = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>>>>>> # keys is a method, so it has to be called to list the field names
>>>>>>>> print(d.keys())
>>>>>>>>
>>>>>>>> The problem is that I keep getting PermissionError: [WinError 5] 
>>>>>>>> Access is denied when calling image_to_data and image_to_string on 
>>>>>>>> Windows 10.
>>>>>>>>
>>>>>>>> The only advice I found on Stack Overflow is to set tesseract_cmd, 
>>>>>>>> PATH and TESSDATA_PREFIX, which did not work for me. Not even running 
>>>>>>>> from an administrator command prompt works.
>>>>>>>>
>>>>>>>> After spending a couple of hours, I found that setting permissions on 
>>>>>>>> tesseract.exe (right-click, select Properties and go to the Security 
>>>>>>>> tab) by checking Full control and Modify, as shown below, makes it work.
>>>>>>>>
>>>>>>>> Hope this helps people struggling with the same problem.
>>>>>>>>
>>>>>>>>
>>>>>>>> [image: 1582917756731.jpg][image: 1582917788913.jpg]
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>

