I applied some of the image processing that I commonly use to the image you
sent.

Before image processing, Tesseract outputs:
The Evolving Student
@

After processing, it outputs:
The Evolving Student
0 Children and Email
Classroom Requirem nts
Online Coursework Dependency
v Learning a Vital Social Skill

(The missing "e" in "Requirements" is due to the pre-processing, not Tesseract.)
The main thing I notice about the image that you sent is that most of the 
letters have very low contrast with their surroundings. If you add some 
pre-processing to intelligently convert the image to black and white, I expect 
that your results will improve significantly.
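
As a rough illustration of that kind of black-and-white conversion, here is a
minimal sketch using Otsu's method, a common global thresholding technique.
This is an assumption chosen for the example, not necessarily the
pre-processing I ran above. The function names otsu_threshold and binarize are
made up for this sketch, and the image is assumed to be already loaded as rows
of 8-bit grayscale values (0 to 255).

```python
def otsu_threshold(pixels):
    """Pick the gray level (0..255) that best separates two pixel classes.

    `pixels` is a flat list of 8-bit grayscale values. The returned
    threshold t maximizes the between-class variance of {p <= t} and {p > t}.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # nothing in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break               # nothing left in the bright class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(rows):
    """Map each pixel to 0 (black) or 255 (white) using the Otsu threshold."""
    flat = [p for row in rows for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in rows]


# Low-contrast toy image: "text" at gray 130 on a gray 100 background.
image = [[100, 100, 130],
         [130, 100, 100]]
print(binarize(image))
```

A single global threshold like this may not be enough for slides with
gradients or coloured backgrounds, where a locally adaptive threshold could do
better. If the text comes out white on black, inverting the result before OCR
may also help.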

Derek
On Feb 19, 2012, at 4:58 , Jason Funk wrote:

> My specific examples are screen captures of powerpoint slides. For
> example, what would need to be done to this image?
> 
> http://jasonfunk.net/example2.jpeg
> 
> On Feb 18, 6:03 pm, Sven Pedersen <sven.peder...@gmail.com> wrote:
>> Image processing, not age. :-)
>> 
>> On Saturday, February 18, 2012, Sven Pedersen wrote:
>>> Commercial options have lots of built-in age processing. You can do that
>>> with free software but it does not just do it automatically. Post examples
>>> and you'll get feedback about how to do it with tesseract.
>>> --Sven
>> 
>>> On Saturday, February 18, 2012, Jason Funk wrote:
>> 
>>>> But what if I am simply trying to do OCR on images that use standard
>>>> normal english fonts? Why isn't it working as well as the commercial
>>>> options which do beautifully? Does the default english language data
>>>> file not contain a lot of the typical fonts?
>> 
>>>> On Feb 18, 3:53 pm, "La Monte H. P. Yarroll" <piggy.yarr...@gmail.com>
>>>> wrote:
>>>>> A good example is fraktur (old German black-letter fonts). The only
>>>>> commercial option is over $10,000 for a single copy. There are some
>>>>> languages for which tesseract is the only option.
>> 
>>>>> On Sat, Feb 18, 2012 at 4:07 PM, Sven Pedersen <sven.peder...@gmail.com>
>>>>> wrote:
>> 
>>>>>> Tesseract is especially good for custom training for a particular
>>>>>> type of text. Accuracy can increase to over 98% for a given font.
>>>>>> Also, it can be trained for foreign languages.
>>>>>> --Sven
>> 
>>>>>> On Sat, Feb 18, 2012 at 1:43 PM, Jason Funk <jasonlf...@gmail.com>
>>>>>> wrote:
>> 
>>>>>>> I am testing tesseract against some other commercial products and the
>>>>>>> commercial products seem to blow tesseract out of the water in terms
>>>>>>> of quality and accuracy. Is this because tesseract is just not as good
>>>>>>> as the other products? Or perhaps tesseract is designed for a specific
>>>>>>> purpose other than what I am testing it for?
>> 
>>>>>>> Maybe a different question would be, for what applications are people
>>>>>>> using tesseract successfully?
>> 
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To post to this group, send email to tesseract-ocr@googlegroups.com
>>>>>>> To unsubscribe from this group, send email to
>>>>>>> tesseract-ocr+unsubscr...@googlegroups.com
>>>>>>> For more options, visit this group at
>>>>>>> http://groups.google.com/group/tesseract-ocr?hl=en
>> 
>>>>>> --
>>>>>> ``All that is gold does not glitter,
>>>>>>   not all those who wander are lost;
>>>>>> the old that is strong does not wither,
>>>>>>   deep roots are not reached by the frost.
>>>>>> From the ashes a fire shall be woken,
>>>>>>   a light from the shadows shall spring;
>>>>>> renewed shall be blade that was broken,
>>>>>>   the crownless again shall be king.”

