Re: Need help in recognizing english texts with sanskrit roman diacritical marks.

Quan Nguyen Wed, 27 Nov 2013 10:26:06 -0800

Download and install 
http://sourceforge.net/projects/ghostscript/files/GPL%20Ghostscript/9.10/gs910w32.exe.
 
Then follow the steps for setting the Path environment variable as described in 
http://vietocr.sourceforge.net/usage.html.
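
To confirm that the Path change took effect, a small script along these lines may help. It is only a sketch: it looks in each directory listed in PATH for gsdll32.dll (the file named in the error message) and reports where it was found. The helper name find_on_path is made up for this example, and the hint about the Ghostscript "bin" folder is an assumption about a typical install layout, not something stated in the VietOCR instructions.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories in a PATH-style string that contain `filename`.

    `find_on_path` is a hypothetical helper for this example. If
    `path_value` is omitted, the current PATH environment variable is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        # PATH entries are sometimes quoted or padded with spaces.
        entry = entry.strip().strip('"')
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = find_on_path("gsdll32.dll")
    if found:
        print("gsdll32.dll found in:")
        for directory in found:
            print("  " + directory)
    else:
        # Assumption: the DLL normally sits in the "bin" folder of the
        # Ghostscript install directory.
        print("gsdll32.dll is not in any PATH directory.")
        print("Add the Ghostscript bin folder to Path, then reopen VietOCR.")
```

Run it from a newly opened command prompt, since programs that were already running usually do not see an updated Path.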


On Tuesday, November 26, 2013 9:50:57 PM UTC-6, Srivas wrote:
>
> Thanks, I have almost solved my problem, but I also want to try this out. 
> I'm quite sure I will need it, since I have some scanned Vedic texts that 
> I would like to get recognized as well.
>
> I'm encountering the following problem: after installing VietOCR and 
> trying to open a PDF file, the following error comes up: "The gsdll32.dll 
> wasn't found in default DLL search path. Please install GPL Ghostscript 
> and/or set the appropriate environment variable."
>
> I did download and install Ghostscript, but the error remains. What should 
> I do next?
>
> On Tuesday, November 26, 2013 6:53:03 PM UTC+7, shree wrote:
>>
>> For a GUI, you can try VietOCR:
>> http://sourceforge.net/projects/vietocr/files/vietocr/
>>
>> For language data for Sanskrit transliteration, try:
>> http://sourceforge.net/projects/tesseracthindi/files/Tesseract-3-02-SanskritTransliteration/
>>
>>
>>
>>
>> Shree Devi Kumar
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>  
>>
>> On Tue, Nov 26, 2013 at 12:40 PM, Srivas <[email protected]> wrote:
>>
>>> Hi!
>>> I have a bunch of PDF journal files and I need to get the text out of 
>>> them. They contain a lot of romanized Sanskrit diacritical marks, and 
>>> that creates a difficulty. I tried FineReader and OmniPage, but they 
>>> cannot be trained to recognize those symbols. I just need an OCR program 
>>> that I can train to recognize any symbol required, and the above 
>>> programs cannot do that. 
>>>
>>> Where should I start? I feel like this program can do the job, but 
>>> can you help me get started? I downloaded Tesseract and installed it 
>>> (on Windows). There are different GUIs available, and I think one would 
>>> make it easier to work. Can you suggest a good one? I tried gImageReader, 
>>> but it's too primitive and leaves a lot of cleanup work to be done 
>>> afterwards on the overall text.
>>>
>>> I don't think this kind of language pack is available. How would I 
>>> create one? 
>>>
>>> I will add one PDF and the fonts that were used to create it. Maybe 
>>> someone would like to try it and let me know how to do it?
>>>
>>> Thank you for any help!
>>>
>>> Regards,
>>> Srivas
>>>
>>> -- 
>>> -- 
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>  
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>

