Anyway, report it to the pytesseract project so it can be fixed there; otherwise
the next pytesseract update will overwrite your local change and the error will return.
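
For the upstream report, the pattern discussed in the quoted messages below (create the temporary file with delete=False, close the handle, then remove the file) can be sketched on its own. This is only an illustration under assumptions: temp_path is a hypothetical helper name, not part of pytesseract, and it uses only the standard library. Closing the handle before removal matters mostly on Windows, where an open handle may prevent the file from being deleted.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_path(prefix="tess_"):
    """Yield a temporary file path and remove the file on exit.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    # delete=False: closing the file does not remove it, so removal
    # happens only where we ask for it explicitly.
    f = tempfile.NamedTemporaryFile(prefix=prefix, delete=False)
    try:
        # Release our own handle before anyone else touches the file.
        f.close()
        yield f.name
    finally:
        try:
            os.remove(f.name)
        except FileNotFoundError:
            pass  # already gone, nothing left to clean up


with temp_path() as path:
    with open(path, "w") as out:
        out.write("hello")
    existed_inside = os.path.exists(path)

exists_after = os.path.exists(path)
```

If this standalone snippet runs cleanly on the affected machine, that would suggest the failure is specific to how pytesseract handles its temporary file rather than to the temp directory permissions.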

Zdenko


On Sun, 1 Mar 2020 at 18:17, Supharerk Thawillarp <[email protected]>
wrote:

> After digging into pytesseract.py, I found a possibly related issue with
> NamedTemporaryFile.
>
> Following this Stack Overflow post (
> https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror),
> I added the delete=False argument to the NamedTemporaryFile call in
> pytesseract.py:
>
>
> # https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror
> @contextmanager
> def save(image):
>     try:
>         # delete=False: the file is not removed automatically on close
>         with NamedTemporaryFile(prefix='tess_', delete=False) as f:
>             if isinstance(image, str):
>                 yield f.name, realpath(normpath(normcase(image)))
>                 return
>             # (rest of save() not shown)
>
>
>
> It has been working since then.
>
> I will forward this thread and the issue to the pytesseract project.
>
> Thanks for your help.
>
>
>
>
>
>
>
> On Sunday, 1 March 2020 at 23:40:26 UTC+7, zdenop wrote:
>>
>> Hello,
>>
>> I am not able to reproduce the error. It comes from here [1], where
>> pytesseract tries to clean up its temporary files.
>> You should report it to the pytesseract project, as there is no option to skip
>> this code.
>> In the meantime, you could try modifying this part of the pytesseract code [2]:
>>
>> finally:
>>     cleanup(f.name)
>>
>> to
>>
>> finally:
>>     f.close()
>>     cleanup(f.name)
>>
>>
>> [1]
>> https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L131
>>
>> [2]
>> https://github.com/madmaze/pytesseract/blob/7fef19ff176bd9f837753dc4c0ebc76b16267775/src/pytesseract.py#L176
>>
>> Zdenko
>>
>>
>> On Sun, 1 Mar 2020 at 14:11, Supharerk Thawillarp <[email protected]>
>> wrote:
>>
>>> OK, it gave me WinError 5 again.
>>>
>>>
>>> PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
>>> C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
>>> Image shape: (1150, 835, 3)
>>> Traceback (most recent call last):
>>>   File ".\test_tess.py", line 19, in <module>
>>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>     }[output_type]()
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>     return output_file.read().decode('utf-8').strip()
>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>     next(self.gen)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>     cleanup(f.name)
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>     raise e
>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>     remove(filename)
>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'
>>>
>>>
>>>
>>>
>>>
>>> On Sunday, 1 March 2020 at 03:05:39 UTC+7, zdenop wrote:
>>>>
>>>> 1. Make sure you have the latest version of tesseract.
>>>> 2. Then run this script and send the exact, full error message:
>>>>
>>>> import tempfile
>>>>
>>>> import cv2
>>>> import pytesseract
>>>> from PIL import Image
>>>> from pytesseract import Output
>>>>
>>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>
>>>> img = cv2.imread('images/invoice-sample.jpg')
>>>>
>>>> # check temp file
>>>> temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
>>>> print(temp_file.name)
>>>> image = Image.fromarray(img)
>>>> image.save(temp_file.name + '.png', format='png', **image.info)
>>>> temp_file.close()
>>>>
>>>> if img.any():
>>>>     print("Image shape:", img.shape)
>>>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>>     n_boxes = len(data_dict['level'])
>>>>     for i in range(n_boxes):
>>>>         (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i],
>>>>                         data_dict['width'][i], data_dict['height'][i])
>>>>         cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
>>>>     cv2.imshow('img', img)
>>>>     cv2.waitKey(0)
>>>> else:
>>>>     print("Can not open input file")
>>>>
>>>>
>>>>
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <[email protected]>
>>>> wrote:
>>>>
>>>>> Sure
>>>>>
>>>>> >>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>> >>> pytesseract.get_tesseract_version()
>>>>> LooseVersion ('5.0.0-alpha.20200223')
>>>>>
>>>>>
>>>>> On Sunday, 1 March 2020 at 00:21:26 UTC+7, zdenop wrote:
>>>>>>
>>>>>> This means the problem is with pytesseract/Python permissions, not with tesseract itself.
>>>>>>
>>>>>> Can you post the output of pytesseract.get_tesseract_version()?
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>>
>>>>>> On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> No, tesseract ran successfully and wrote its output to a text file.
>>>>>>>
>>>>>>> (base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
>>>>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
>>>>>>>
>>>>>>> However, WinError 5 occurs again when running from Python (with
>>>>>>> pipenv):
>>>>>>> (base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
>>>>>>> Traceback (most recent call last):
>>>>>>>   File ".\app2.py", line 10, in <module>
>>>>>>>     d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>>>>>     }[output_type]()
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>>>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>>>>>     return output_file.read().decode('utf-8').strip()
>>>>>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>>>>>     next(self.gen)
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>>>>>     cleanup(f.name)
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>>>>>     raise e
>>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>>>>>     remove(filename)
>>>>>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Saturday, 29 February 2020 at 16:19:41 UTC+7, zdenop wrote:
>>>>>>>>
>>>>>>>> Can you reproduce the problem with the command-line ("pure") tesseract? E.g.:
>>>>>>>> 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
>>>>>>>>
>>>>>>>> Zdenko
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm new to tesseract and am trying to follow a tutorial on Windows 10
>>>>>>>>> using the code below:
>>>>>>>>>
>>>>>>>>> import cv2
>>>>>>>>> import pytesseract
>>>>>>>>> from pytesseract import Output
>>>>>>>>>
>>>>>>>>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>>>>>>
>>>>>>>>> img = cv2.imread('images/invoice-sample.jpg')
>>>>>>>>>
>>>>>>>>> d = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>>>>>>>
>>>>>>>>> # keys() must be called; print(d.keys) only prints the bound method
>>>>>>>>> print(d.keys())
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The problem is that I keep getting "PermissionError: [WinError 5]
>>>>>>>>> Access is denied" when calling image_to_data and image_to_string on
>>>>>>>>> Windows 10.
>>>>>>>>>
>>>>>>>>> The only advice I found on Stack Overflow was to set tesseract_cmd,
>>>>>>>>> PATH and TESSDATA_PREFIX, which did not work for me. Even running from
>>>>>>>>> an administrator command prompt did not help.
>>>>>>>>>
>>>>>>>>> After a couple of hours I found that changing the permissions on
>>>>>>>>> tesseract.exe (right-click, select Properties, go to the Security tab) and
>>>>>>>>> checking Full control and Modify, as shown below, makes it work.
>>>>>>>>>
>>>>>>>>> I hope this helps other people struggling with the same problem.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [image: 1582917756731.jpg][image: 1582917788913.jpg]
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com
>>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>>

