Re: [tesseract-ocr] pytesseract having high accuracy but performing very very slow

2021-03-25 Thread Zdenko Podobny
1,000,000 pages in one PDF? Seriously? Also, post your code.
pytesseract is not an effective tool when there are multiple images (disk I/O for each run/page).

Zdenko

On Thu, 25 Mar 2021 at 8:49, Vidya Chitragar < vidya.chitra...@lucidatechnologies.com> wrote:
> Hi everyone.
> I am using pytesseract with
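To illustrate the per-page overhead mentioned in the reply above: pytesseract starts the tesseract binary and goes through temporary files for every call, so a loop over pages pays that cost once per page. A minimal sketch of one alternative, assuming the pages have already been rasterized to image files and that the installed tesseract accepts a text file listing one image path per line (documented for the command-line tool, but worth verifying on 3.05.02). The function names and file names here are hypothetical and are not taken from the thread.

```python
import subprocess
import tempfile
from pathlib import Path


def build_batch_command(image_paths, output_base, list_file):
    """Write the page list to disk and return the tesseract command (not executed)."""
    # One image path per line; tesseract reads a text-file input as a list of pages.
    Path(list_file).write_text("\n".join(str(p) for p in image_paths) + "\n")
    # The trailing "pdf" config asks for a searchable PDF at <output_base>.pdf.
    return ["tesseract", str(list_file), str(output_base), "pdf"]


def ocr_pages_to_pdf(image_paths, output_base):
    """Run one tesseract process over all pages instead of one process per page."""
    with tempfile.TemporaryDirectory() as tmp:
        list_file = Path(tmp) / "pages.txt"  # hypothetical file name
        cmd = build_batch_command(image_paths, output_base, list_file)
        # Assumes the tesseract binary is on PATH; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return Path(f"{output_base}.pdf")
```

For example, `ocr_pages_to_pdf(["page-0001.png", "page-0002.png"], "scan")` would be expected to write `scan.pdf`. This removes only the per-call startup and temporary-file cost; the recognition time per page is unchanged, so for very large inputs the page list would probably still need to be split into chunks and processed in parallel.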

Re: [tesseract-ocr] pytesseract having high accuracy but performing very very slow

2021-03-25 Thread Shree Devi Kumar
Try a newer version of Tesseract.

On Thu, Mar 25, 2021, 13:19 Vidya Chitragar < vidya.chitra...@lucidatechnologies.com> wrote:
> Hi everyone.
> I am using pytesseract with tesseract-ocr version 3.05.02 for conversion
> of scanned pdf document of 1000k pages to searchable pdf document but my

[tesseract-ocr] pytesseract having high accuracy but performing very very slow

2021-03-25 Thread Vidya Chitragar
Hi everyone,

I am using pytesseract with tesseract-ocr version 3.05.02 to convert a scanned PDF document of 1000k pages into a searchable PDF, but my code takes more than 5 to 6 hours to produce the searchable PDF. Any suggestions would be very helpful.

Thanks,
Vidya