Please see https://github.com/OCR-D/ocrd-train

You can use it to train from image files paired with matching ground-truth text files, encoded in UTF-8.
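
Before starting a training run it may help to confirm that every image really has a matching transcription and that each transcription decodes as UTF-8. Below is a minimal sketch of such a check. The suffixes `.tif` and `.gt.txt` and the function name `check_ground_truth` are assumptions made for illustration, not something stated in this thread, so please compare them against the ocrd-train README before relying on them.

```python
from pathlib import Path


def check_ground_truth(directory, image_suffix=".tif", text_suffix=".gt.txt"):
    """Return (missing, not_utf8): image names that lack a transcription
    file, and image names whose transcription is not valid UTF-8.

    The suffixes are assumed naming conventions; adjust them to your data.
    """
    missing, not_utf8 = [], []
    for image in sorted(Path(directory).glob(f"*{image_suffix}")):
        # Build the expected transcription name: same base name, text suffix.
        base = image.name[: -len(image_suffix)]
        text = image.with_name(base + text_suffix)
        if not text.is_file():
            missing.append(image.name)
            continue
        try:
            # Decoding fails loudly if the file is not valid UTF-8.
            text.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            not_utf8.append(image.name)
    return missing, not_utf8
```

Calling `check_ground_truth("path/to/your/ground-truth")` (a placeholder path) returns two lists; if both are empty, every image found has a UTF-8 transcription next to it.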

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Tue, May 29, 2018 at 9:57 PM, <ramast....@gmail.com> wrote:

> Hi,
> I belong to a group who study an old Egyptian writing system called
> "Coptic".
> It's based mostly on Greek (with some variation).
>
> The large majority of books in Coptic were written during the last century
> and mostly use the same [typewriter] font.
> Here is a sample picture:
> https://imgur.com/a/ILRw6vm
> And sample book:
> https://archive.org/download/pistissophiaopu00petegoog
>
> We need to add Coptic to the languages supported by Tesseract but are not
> sure how. I tried following this document:
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
> but it's very difficult to understand.
>
> We need someone to help us with the initial setup so that we can dedicate our
> manpower to training the system.
> We are a nonprofit group, so we are hoping for free help, but we would also
> consider paid help, since the alternative is hundreds of hours of manual labor
> to digitize just a few books.
>
> Thanks everyone for contributing to this awesome project
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/08869d08-8b3a-4390-be79-fa811c78c0ca%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
