Re: [tesseract-ocr] Need to understand Tesseract code

2016-06-15 Thread ravi katiyar
Hi, I really appreciate your prompt response; thank you for showing me some direction. I understand that modifying Tesseract will be an uphill task, and now, especially given that the source code has been developed entirely in C and C++, it seems even tougher. I did mention my use case is

Re: [tesseract-ocr] Need to understand Tesseract code

2016-06-15 Thread Allistair
Hi, your question is a little difficult to understand - it sounds like you are saying, on the one hand, that you have no OCR or image processing background, know Java, and want to modify Tesseract toward some aim that you do not specify? Tesseract, as far as I understand, is developed using C/C++ and not

[tesseract-ocr] Re: Possible to prioritise some characters over others during OCR?

2016-06-15 Thread Diederik Hattingh
Hi Stef, thanks for the reply (here and on SO). The fix mostly works, but unfortunately I am still seeing that Tesseract sometimes ignores the unicharambigs file I set for it. For example, I have the following two images:

Re: [tesseract-ocr] Re: Do we have Sanskrit training images and box files online?

2016-06-15 Thread ShreeDevi Kumar
You can check out the older version of sanskritocr from http://learnsanskrit.org/tools/ocr. The new version is commercial software, available as a demo for free, but requires payment for use. - Sent from my phone. Excuse the brevity. On 14-Jun-2016 3:44 am, "rohit saluja"

[tesseract-ocr] Need to understand Tesseract code

2016-06-15 Thread ravi katiyar
Hello all, I am new to the world of OCR and to image processing as well. I come from a Java background. Can someone tell me what the prerequisites are for understanding the Tesseract code? For example, the java.awt.image package, or digital image processing concepts? What would I need to be thorough with so that