Thanks! You'll need to apply a threshold to remove the background noise and
get black-and-white images. Filtering out specks below a minimum dot size can
also help, but some basic ImageMagick transformations should be sufficient.
Can someone else comment with specifics?
--Sven
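
To make the two ideas above concrete (a global threshold, then dropping dots
below a minimum size), here is a minimal sketch in plain Python. It is not
ImageMagick and not anything Tesseract does internally. The choice of Otsu's
method for picking the threshold is an assumption, since the thread names no
particular method, and the function names (otsu_threshold, binarize,
remove_small_dots) and the min_size parameter are hypothetical. Images are
assumed to be lists of rows of 8-bit grey values (0 = black, 255 = white).

```python
from collections import deque


def otsu_threshold(pixels):
    """Pick the grey level that maximises between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate level
    sum0 = 0    # sum of grey values at or below the candidate level
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(rows):
    """Map pixels at or below the Otsu level to 0 (ink), the rest to 255."""
    t = otsu_threshold([p for row in rows for p in row])
    return [[0 if p <= t else 255 for p in row] for row in rows]


def remove_small_dots(bw, min_size):
    """Whiten 4-connected black blobs with fewer than min_size pixels."""
    h, w = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if out[y][x] != 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected black blob.
            blob, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and out[ny][nx] == 0):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) < min_size:
                for by, bx in blob:
                    out[by][bx] = 255
    return out
```

Typical use would be cleaned = remove_small_dots(binarize(rows), min_size=N),
with N tuned per scan. A single global threshold is the simplest option; for
unevenly faded pages a locally adaptive threshold may work better, but that
is beyond this sketch.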


On Wed, Nov 9, 2011 at 4:49 AM, Esteban Bordón <[email protected]> wrote:

> Hi,
>
> I am sending two examples of the case files.
>
> Thanks,
> Esteban.
>
>
> 2011/11/8 Sven Pedersen <[email protected]>
>
>> Hi Esteban,
>> Please show us a sample, even if it is a partial image of a word or two.
>> Thanks,
>> Sven
>>
>> On Mon, Nov 7, 2011 at 6:46 PM, Esteban Bordón <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I am currently working on digitizing old typewritten case files that
>>> have deteriorated.
>>> Tesseract does not work well with these images, and I would like to know
>>> which tools are best for enhancing them. I have a lot of images, so I
>>> need a semi-automated tool to perform binarisation and segmentation.
>>>
>>> Thanks a lot for any help.
>>>
>>> Esteban.
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>
>>
>>
>> --
>> “All that is gold does not glitter,
>>   not all those who wander are lost;
>> the old that is strong does not wither,
>>   deep roots are not reached by the frost.
>> From the ashes a fire shall be woken,
>>   a light from the shadows shall spring;
>> renewed shall be blade that was broken,
>>   the crownless again shall be king.”




