For a GUI, you can try VietOCR:
http://sourceforge.net/projects/vietocr/files/vietocr/

For language data for Sanskrit transliteration, try:
http://sourceforge.net/projects/tesseracthindi/files/Tesseract-3-02-SanskritTransliteration/
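
If you want to check the language data from the command line before setting up a GUI, here is a minimal sketch in Python. It assumes tesseract is on your PATH, that the downloaded .traineddata file has been copied into your tessdata folder, and that each PDF page has already been saved as an image, since tesseract itself does not take PDF input. The file names and the language code in the usage note are placeholders, not the actual names from the download; use whatever name the .traineddata file has, minus the extension.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base, lang):
    """Run tesseract on one page image and return the path of the text file.

    tesseract appends ".txt" to the output base name by default.
    """
    cmd = build_tesseract_command(image_path, output_base, lang)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(str(output_base) + ".txt")
```

For example, run_ocr("page1.png", "page1", "mylang") would write page1.txt, where "mylang" stands for the hypothetical language code of the downloaded data.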




Shree Devi Kumar
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com


On Tue, Nov 26, 2013 at 12:40 PM, Srivas <[email protected]> wrote:

> Hi!
> I have a bunch of PDF journal files and I need to get the text out of them.
> They contain a lot of romanized Sanskrit diacritical marks, and that creates
> a difficulty. I tried FineReader and OmniPage, but they cannot be trained to
> recognize those symbols. I just need an OCR program I can train to recognize
> any symbol required, and the above programs cannot do that.
>
> Where should I start? I feel like this program can do the job, but can
> you help me get started? I downloaded Tesseract and installed it
> (Windows). There are different GUIs available, and I think one would make it
> easier to work. Can you suggest a good one? I tried gImageReader, but it's
> too primitive and leaves a lot of work to be done afterwards on the
> overall text.
>
> I don't think this kind of language pack is available. How would I create
> one?
>
> I will attach one PDF and the fonts that were used to create it. Maybe someone
> would like to try it and let me know how to do it?
>
> Thank you for any help!
>
> Regards,
> Srivas
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
