Today I downloaded the vietocr NET 1.7 32 zip from
http://sourceforge.net/projects/vietocr/. Apart from Vietnamese, it also
supports other languages of the world, including Indic languages such as
Kannada. The said GUI front end has a built-in post-processor program for
DangAmbig, which is UTF-8. I think the problem of the simple program or script,
as suggested by Martin Pierre, is now solved. Only the testing remains to be done.
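
For testing outside the GUI, the merge step itself can be sketched in a few
lines of Python. This is only a sketch under assumptions: it assumes the OCR
output is UTF-8 text in which (A) and (B) were recognized separately and are
split by spaces, and it treats every Unicode combining mark (category Mn, Mc
or Me) as a (B) symbol. The name merge_dependent_signs is hypothetical; it is
not part of tesseract or VietOCR.

```python
import unicodedata


def is_dependent_sign(ch: str) -> bool:
    """True for combining marks (Unicode general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def merge_dependent_signs(text: str) -> str:
    """Drop spaces/tabs that separate a dependent sign (B) from its base (A).

    Hypothetical helper: assumes (A) and (B) arrive as separate symbols
    with only spaces or tabs between them.
    """
    out = []
    pending_ws = []  # whitespace seen since the last non-space character
    for ch in text:
        if ch in " \t":
            pending_ws.append(ch)
            continue
        if is_dependent_sign(ch) and out:
            # Discard the gap so the sign attaches to the preceding character.
            pending_ws.clear()
        out.extend(pending_ws)
        pending_ws.clear()
        out.append(ch)
    out.extend(pending_ws)
    return "".join(out)


# U+0CA6 is the consonant DA, U+0CC7 is the vowel sign EE.
separated = "\u0CA6 \u0CC7"
merged = merge_dependent_signs(separated)
print(len(separated), len(merged))  # 3 code points before, 2 after
```

In Unicode text (C) is simply (A) followed directly by (B), so no special
composition is needed; removing the gap is enough for the renderer to draw
the merged form.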
With Regards,
-sriranga(77yrsold)


On Thu, Apr 15, 2010 at 12:40 PM, 74yrs old <[email protected]> wrote:

> Pierre,
> Thanks for the clarification. I am explaining how the combination (C) is
> generated by merging (A) consonant + (B) dependent vowel, as noted below:
>
>   (C)  = (A) + (B)
>   ದೇ  = ದ + ೇ
>   ಗೋ  = ಗ + ೋ
>   ಸೌಂ = ಸೌ + ಂ
>
> Try it (B is moved left, towards A):
>
>   ದೇ  = ದ  ೇ
>   ಗೋ  = ಗ  ೋ
>   ಸೌಂ = ಸೌ  ಂ
>
> Here you can test how the (B) dependent vowel merges with the (A)
> consonant: press the backspace key to move the B above (say ೇ) towards A
> (say ದ). You will notice that (B) merges with (A) smoothly and becomes (C).
>
> In such cases, do (A) and (B) have to be trained as separate symbols, with
> a simple program then required to merge (B) with (A) to become (C)? I have
> been trying to get such a simple program from this forum for the past 2-3
> years. Unfortunately, no one has been able to write a simple program or
> script for tesseract for Indic and other world languages which have
> consonants plus dependent vowels.
> I seek your valuable guidance.
> With Regards,
> -sriranga(77yrsold)
>
>
>
> On Wed, Apr 14, 2010 at 7:22 PM, MARTIN Pierre <[email protected]> wrote:
>
>> As a *last* point: "3) In the tif file, I observed that the same paragraph
>> is repeated six or seven times. I am interested to know your logic for
>> repeating paragraphs. Is a *one-line* sentence presumed sufficient for
>> training purposes, OR should *more than one line* of the same sentence be
>> repeated in the tif for training purposes?"
>>
>> Already answered, in the same mail. Again: each paragraph is written using
>> the same font, but with different anti-aliasing. You won't notice it with
>> the naked eye; I'm not even sure it's in the TIF file I've provided.
>>
>> I request your valuable guidance on the above point for my knowledge. I
>> won't trouble you any more.
>> Awaiting your valuable guidance and the *screenshot you have marked in RED*.
>>
>> I've sent it; please check the previous mails. It is attached again.
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected]<tesseract-ocr%[email protected]>
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>
>
>
