Please also see http://doc-creator.labri.fr/
which makes it easy to create synthetic data similar to manuscript pages.
On Tue, Jun 12, 2018 at 9:03 PM ShreeDevi Kumar
wrote:
Please see the project https://github.com/OCR-D/ocrd-train
It has support for training tesseract if you provide line images and
matching ground truth text.
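To illustrate what "line images and matching ground truth text" can look like on disk, here is a minimal sketch that pairs each line image with its transcription and reports images that have none. The `.tif` / `.gt.txt` naming, the directory name, and the function name are assumptions made for illustration; check the ocrd-train README for the layout it actually expects.

```python
from pathlib import Path


def find_training_pairs(ground_truth_dir, image_ext=".tif", text_ext=".gt.txt"):
    """Pair line images with same-named transcription files.

    The extensions are assumed defaults, not taken from the original message.
    Returns (pairs, missing): pairs is a list of (image, transcription) paths,
    missing is a list of images that have no transcription next to them.
    """
    root = Path(ground_truth_dir)
    pairs, missing = [], []
    for image in sorted(root.glob(f"*{image_ext}")):
        # line_0001.tif -> line_0001.gt.txt
        transcription = image.with_name(image.stem + text_ext)
        if transcription.is_file():
            pairs.append((image, transcription))
        else:
            missing.append(image)
    return pairs, missing


if __name__ == "__main__":
    # "data/ground-truth" is a hypothetical location.
    pairs, missing = find_training_pairs("data/ground-truth")
    print(f"{len(pairs)} usable pairs, {len(missing)} images without text")
```

Running a check like this before training makes it easier to spot line images whose transcription was never written or was saved under a different name.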
On Tue, Jun 12, 2018 at 8:19 PM wrote:
Same question here. I see that the documentation on training Tesseract 4
makes some reference to manuscripts:
As with base Tesseract, there is a choice between rendering synthetic
training data from fonts, or labeling some pre-existing images (like
ancient manuscripts for example).
So, if
Hi,
I tried running Tesseract OCR on the same image using the two approaches below:
1. Command line (tesseract version 3.05.01)
tesseract image.jpg out.txt
2. using pytesseract in python (pytesseract version 0.2.2)
import PIL
from PIL import Image
import pytesseract
text =
The actual DPI is unknown, as it depends on various factors (among them the
physical dimensions of the photographed object and the distance from which
the picture was taken). The easiest way to establish the real DPI is to take
a photo of a ruler and count the number of pixels spanning 1 inch. As an
example, there is approximately
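The ruler measurement reduces to a simple ratio of pixels to inches. A minimal sketch follows; the function names and the sample coordinates are hypothetical and not taken from the original message.

```python
import math


def pixel_distance(p1, p2):
    """Euclidean distance in pixels between two (x, y) points in the photo."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


def estimate_dpi(pixels, physical_inches=1.0):
    """Effective DPI: pixels counted across a known physical length in inches."""
    if pixels <= 0 or physical_inches <= 0:
        raise ValueError("both measurements must be positive")
    return pixels / physical_inches


if __name__ == "__main__":
    # Hypothetical pixel positions of two ruler marks that are 1 inch apart.
    mark_a, mark_b = (120, 340), (420, 340)
    print(estimate_dpi(pixel_distance(mark_a, mark_b)))
```

Measuring across several inches and dividing by that length averages out the error from picking the ruler marks by eye.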
Thank you for the info.
The following link also has helpful info.
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.2/com.ibm.xlc131.aix.doc/compiler_ref/omp_thread_limit.html
ShreeDevi
भजन - कीर्तन - आरती @
Has anybody developed a solution for this?