Hi,

@sriranga, thanks for your encouraging reply.

I trained Tesseract on my handwriting and got very good results.

Now I wonder whether there is a way to recognize adjacent letters in
handwriting, as in the attached example.

I realize this may be beyond what OCR software can do, but perhaps you can
suggest something else.

Or do you have an idea for segmenting adjacent letters?
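
To make the question concrete, one naive idea is to cut the image at columns
that contain little ink (a vertical projection profile). Below is a minimal
sketch, assuming the image is already binarized into a list of rows of 0/1
values. The names column_ink, split_columns and max_ink are made up for
illustration; none of this is part of Tesseract, and it will likely fail on
slanted or heavily overlapping strokes.

```python
def column_ink(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def split_columns(image, max_ink=0):
    """Return (start, end) column ranges whose ink count exceeds max_ink.

    `end` is exclusive. Columns with at most `max_ink` ink pixels are
    treated as gaps between letters (hypothetical threshold parameter).
    """
    segments = []
    start = None
    for x, ink in enumerate(column_ink(image)):
        if ink > max_ink and start is None:
            start = x                      # a letter begins here
        elif ink <= max_ink and start is not None:
            segments.append((start, x))    # a letter ended before column x
            start = None
    if start is not None:                  # letter runs to the right edge
        segments.append((start, len(image[0])))
    return segments


# Two blobs joined by a one-pixel-thick connecting stroke in the middle row.
example = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
print(split_columns(example))             # [(0, 6)]
print(split_columns(example, max_ink=1))  # [(0, 2), (4, 6)]
```

With max_ink=0 the connecting stroke keeps the two letters together; allowing
one ink pixel per gap column separates them. Is there something better than
this?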

Thanks in advance.


2011/9/12 Sriranga(78yrsold) <[email protected]>

> Extract of cmd is reproduced below:
> M:\>tesseract example.png example batch.nochop makebox
> Tesseract Open Source OCR Engine with Leptonica
>
> M:\>tesseract example.tif example  nobatch box.train logfile
> Number of found pages: 1.
>
> M:\>unicharset_extractor.exe example.box
> Extracting unicharset from example.box
> Wrote unicharset file ./unicharset.
>
> M:\>mftraining.exe example.tr
> Reading example.tr ...
> example has no defined properties.
>
> Writing Merged Microfeat ...Done!
>
> M:\>cntraining.exe example.tr
> Reading example.tr ...
> Clustering ...
>
> Writing normproto ...
>
> M:\>tesseract example.tif example  nobatch box.train logfile
> Number of found pages: 1.
>
> M:\>unicharset_extractor.exe example.box
> Extracting unicharset from example.box
> Wrote unicharset file ./unicharset.
>
> M:\>unicharset_extractor.exe example.box
> Extracting unicharset from example.box
> Wrote unicharset file ./unicharset.
>
> M:\>mftraining.exe example.tr
> Reading example.tr ...
> example has no defined properties.
>
> Writing Merged Microfeat ...Done!
>
> M:\>cntraining.exe example.tr
> Reading example.tr ...
> Clustering ...
>
> Writing normproto ...
>
> M:\>combine_tessdata.exe ./han.
> Combining tessdata files
> TessdataManager combined tesseract data files.
> Offset for type 0 is -1
> Offset for type 1 is 108
> Offset for type 2 is -1
> Offset for type 3 is 368
> Offset for type 4 is 127673
> Offset for type 5 is 127715
> Offset for type 6 is -1
> Offset for type 7 is -1
> Offset for type 8 is -1
> Offset for type 9 is -1
> Offset for type 10 is -1
> Offset for type 11 is -1
> Offset for type 12 is -1
>
> M:\>tesseract example.png testexample -l han
> Tesseract Open Source OCR Engine with Leptonica
>
> For further details, please read the self-explanatory guide at
> http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
>
> On Mon, Sep 12, 2011 at 7:33 PM, merve t <[email protected]> wrote:
>
>> Hello, thanks for your encouraging answer.
>> I want to ask: what is han.traineddata?
>> Is it the training data used to train the system to recognize
>> handwriting? If so, it seems too small, doesn't it?
>> Also, can you share the method for converting PNG to TIF?
>> Thanks.
>>
>> 2011/9/12 Sriranga(78yrsold) <[email protected]>
>>
>>>  merve,
>>> It is possible in Tesseract; see the attached files, which are
>>> self-explanatory.
>>> Cheers,
>>> -sriranga(78yrs)
>>>
>>> On Mon, Sep 12, 2011 at 6:04 PM, merve t <[email protected]> wrote:
>>>
>>>> Hello,
>>>> A sample file is attached.
>>>> I must confess that I wrote it with a mouse, but the data we need to
>>>> recognize looks like this,
>>>> because we are developing a whiteboard application.
>>>> I tried to recognize it with OCRopus, but it could not.
>>>> I could not install Tesseract on its own; if you say it can handle this
>>>> picture, I will try again.
>>>> Thanks for your time.
>>>>
>>>> 2011/8/24 Dmitri Silaev <[email protected]>
>>>>
>>>>> Simple answer: in general, no.
>>>>> However, in particular, it might.
>>>>> Send sample images to get a more certain answer
>>>>>
>>>>> Warm regards,
>>>>> Dmitri Silaev
>>>>> www.CustomOCR.com
>>>>>
>>>>> On Wed, Aug 24, 2011 at 4:40 PM, merve <[email protected]> wrote:
>>>>> > Simple question, but i must be sure.
>>>>> > Thanks in advance
>>>>> >
>>>>> > --
>>>>> > You received this message because you are subscribed to the Google
>>>>> > Groups "tesseract-ocr" group.
>>>>> > To post to this group, send email to [email protected]
>>>>> > To unsubscribe from this group, send email to
>>>>> > [email protected]
>>>>> > For more options, visit this group at
>>>>> > http://groups.google.com/group/tesseract-ocr?hl=en
>>>>> >


<<attachment: example.png>>
