Re: Tesseract- Training dataset

Sven Pedersen Mon, 26 Sep 2011 10:05:06 -0700

Hi Mayce,
Abbyy does not disclose how they train their system, and Google
releases some information, but certainly not all. Are you concerned
about a particular type of document? You can train Tesseract yourself to
focus on a given domain, or ask about it here (check the archives
first).
--Sven



On Mon, Sep 26, 2011 at 9:08 AM, Mayce Al <[email protected]> wrote:
> Hi All,
> I was looking for more information about which datasets have been used to
> train Tesseract and Abbyy to recognize English documents. I could not find
> further information, except that Tesseract is tested on UNLV-datasets.
> Does anyone have any idea about this?
> Best Regards
> Mayce
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>



-- 
"All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king."
