After digging into pytesseract.py, I found one possibly related issue with NamedTemporaryFile.
Following this Stack Overflow post (https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror), I added the delete=False argument to the NamedTemporaryFile call in pytesseract.py:

    @contextmanager
    # https://stackoverflow.com/questions/55081022/python-tempfile-with-a-context-manager-on-windows-10-leads-to-permissionerror
    def save(image):
        try:
            with NamedTemporaryFile(prefix='tess_', delete=False) as f:
                if isinstance(image, str):
                    yield f.name, realpath(normpath(normcase(image)))
                    return

It has been working since then. I will forward this thread and the issue to pytesseract. Thanks for your help.

On Sunday, 1 March 2020 at 23:40:26 UTC+7, zdenop wrote:
>
> Hello,
>
> I am not able to reproduce the error. The errors come from here [1], where
> pytesseract tries to clean up temporary files.
> You should report it to the pytesseract project, as there is no option to
> skip this code.
> Maybe you can try modifying this part of the pytesseract code [2]:
>
>     finally:
>         cleanup(f.name)
>
> to
>
>     finally:
>         f.close()
>         cleanup(f.name)
>
> [1] https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L131
> [2] https://github.com/madmaze/pytesseract/blob/7fef19ff176bd9f837753dc4c0ebc76b16267775/src/pytesseract.py#L176
>
> Zdenko
>
> On Sun, 1 Mar 2020 at 14:11, Supharerk Thawillarp <[email protected]> wrote:
>
>> OK, it gave me WinError 5 again.
>>
>> PS C:\Users\Supharerk\ocr_server> pipenv run python .\test_tess.py
>> C:\Users\SUPHAR~1\AppData\Local\Temp\tess_g9e7avw0
>> Image shape: (1150, 835, 3)
>> Traceback (most recent call last):
>>   File ".\test_tess.py", line 19, in <module>
>>     data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>     }[output_type]()
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>     return output_file.read().decode('utf-8').strip()
>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>     next(self.gen)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>     cleanup(f.name)
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>     raise e
>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>     remove(filename)
>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_69cggzq3'
>>
>> On Sunday, 1 March 2020 at 03:05:39 UTC+7, zdenop wrote:
>>>
>>> 1. Make sure you have the latest version of tesseract.
>>> Then try this script and provide the exact, full error message:
>>>
>>>     import tempfile
>>>
>>>     import cv2
>>>     import pytesseract
>>>     from PIL import Image
>>>     from pytesseract import Output
>>>
>>>     pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>
>>>     img = cv2.imread('images/invoice-sample.jpg')
>>>
>>>     # check temp file
>>>     temp_file = tempfile.NamedTemporaryFile(prefix='tess_')
>>>     print(temp_file.name)
>>>     image = Image.fromarray(img)
>>>     image.save(temp_file.name + '.png', format='png', **image.info)
>>>     temp_file.close()
>>>
>>>     if img.any():
>>>         print("Image shape:", img.shape)
>>>         data_dict = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>         n_boxes = len(data_dict['level'])
>>>         for i in range(n_boxes):
>>>             (x, y, w, h) = (data_dict['left'][i], data_dict['top'][i],
>>>                             data_dict['width'][i], data_dict['height'][i])
>>>             cv2.rectangle(img, (x, y), (x + w, y + h), (255, 125, 125), 2)
>>>         cv2.imshow('img', img)
>>>         cv2.waitKey(0)
>>>     else:
>>>         print("Can not open input file")
>>>
>>> Zdenko
>>>
>>> On Sat, 29 Feb 2020 at 19:04, Supharerk Thawillarp <[email protected]> wrote:
>>>
>>>> Sure
>>>>
>>>> >>> pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>> >>> pytesseract.get_tesseract_version()
>>>> LooseVersion ('5.0.0-alpha.20200223')
>>>>
>>>> On Sunday, 1 March 2020 at 00:21:26 UTC+7, zdenop wrote:
>>>>>
>>>>> This means there is a problem with pytesseract/Python permissions.
>>>>>
>>>>> Can you get the output of pytesseract.get_tesseract_version()?
>>>>>
>>>>> Zdenko
>>>>>
>>>>> On Sat, 29 Feb 2020 at 12:10, Supharerk Thawillarp <[email protected]> wrote:
>>>>>
>>>>>> No, tesseract ran successfully, with the output generated in a text file.
>>>>>>
>>>>>> (base) PS C:\Users\Supharerk\ocr_server> & 'C:\Program Files\Tesseract-OCR\tesseract.exe' .\images\invoice-sample.jpg invoice-sample
>>>>>> Tesseract Open Source OCR Engine v5.0.0-alpha.20200223 with Leptonica
>>>>>>
>>>>>> However, WinError 5 arises again when running from Python (with pipenv):
>>>>>>
>>>>>> (base) PS C:\Users\Supharerk\ocr_server> pipenv run python .\app2.py
>>>>>> Traceback (most recent call last):
>>>>>>   File ".\app2.py", line 10, in <module>
>>>>>>     d=pytesseract.image_to_data(img,output_type=Output.DICT)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 426, in image_to_data
>>>>>>     }[output_type]()
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 424, in <lambda>
>>>>>>     Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 264, in run_and_get_output
>>>>>>     return output_file.read().decode('utf-8').strip()
>>>>>>   File "c:\users\supharerk\appdata\local\continuum\anaconda3\lib\contextlib.py", line 119, in __exit__
>>>>>>     next(self.gen)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 176, in save
>>>>>>     cleanup(f.name)
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 136, in cleanup
>>>>>>     raise e
>>>>>>   File "C:\Users\Supharerk\.virtualenvs\ocr_server-jUkFWk3u\lib\site-packages\pytesseract\pytesseract.py", line 133, in cleanup
>>>>>>     remove(filename)
>>>>>> PermissionError: [WinError 5] Access is denied: 'C:\\Users\\SUPHAR~1\\AppData\\Local\\Temp\\tess_y3d570lt'
>>>>>>
>>>>>> On Saturday, 29 February 2020 at 16:19:41 UTC+7, zdenop wrote:
>>>>>>>
>>>>>>> Can you replicate the problem with command-line ("pure") tesseract? E.g.:
>>>>>>>
>>>>>>>     'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' images/invoice-sample.jpg invoice-sample
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>> On Fri, 28 Feb 2020 at 20:31, Supharerk Thawillarp <[email protected]> wrote:
>>>>>>>
>>>>>>>> I'm new to tesseract and am trying to follow a tutorial on Windows 10
>>>>>>>> using the code below:
>>>>>>>>
>>>>>>>>     import cv2
>>>>>>>>     import pytesseract
>>>>>>>>     from pytesseract import Output
>>>>>>>>
>>>>>>>>     pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
>>>>>>>>
>>>>>>>>     img = cv2.imread('images/invoice-sample.jpg')
>>>>>>>>     d = pytesseract.image_to_data(img, output_type=Output.DICT)
>>>>>>>>     print(d.keys())
>>>>>>>>
>>>>>>>> The problem is that I keep getting "PermissionError: [WinError 5]
>>>>>>>> Access is denied" when calling image_to_data and image_to_string on
>>>>>>>> Windows 10.
>>>>>>>>
>>>>>>>> The only advice I found on Stack Overflow is to set tesseract_cmd,
>>>>>>>> PATH and TESSDATA_PREFIX, which did not work for me. Even running from
>>>>>>>> an administrator cmd did not work.
>>>>>>>>
>>>>>>>> After spending a couple of hours, I found that setting the permissions
>>>>>>>> for tesseract.exe (right-click, select Properties and go to the
>>>>>>>> Security tab) by checking Full control and Modify, as shown below,
>>>>>>>> makes it work.
>>>>>>>>
>>>>>>>> I hope this helps other people struggling with the same problem.
>>>>>>>>
>>>>>>>> [image: 1582917756731.jpg][image: 1582917788913.jpg]
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/2d9f9f66-40a5-4ce9-9f14-cca48307e9f5%40googlegroups.com
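
For anyone landing on this thread later: the pattern discussed above (create the temporary file with delete=False, close the handle, and only then remove the file yourself) can be sketched without pytesseract at all. This is only a minimal illustration under stated assumptions, not pytesseract's actual save() or cleanup() code; the helper name temp_path is hypothetical. The Python documentation notes that on Windows a NamedTemporaryFile cannot be opened a second time by name while it is still open, which is why the handle is closed before the path is handed out.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def temp_path(prefix="tess_"):
    """Yield the path of a closed temporary file and remove it afterwards.

    Hypothetical helper for illustration only; it is not part of pytesseract.
    """
    # delete=False: closing the handle does not remove the file, so we can
    # release the handle first and delete the file ourselves later.
    f = NamedTemporaryFile(prefix=prefix, delete=False)
    try:
        f.close()  # release our handle before other code uses the path
        yield f.name
    finally:
        try:
            os.remove(f.name)
        except FileNotFoundError:
            pass  # already removed, nothing left to clean up


if __name__ == "__main__":
    with temp_path() as path:
        # The path can be reopened because no handle is held on it.
        with open(path, "w") as out:
            out.write("hello")
        print(os.path.exists(path))  # True
    print(os.path.exists(path))  # False
```

Closing before removing mirrors the f.close() suggestion quoted above; whether that alone is enough for a given pytesseract version is something to confirm with the pytesseract project.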

