Re: math formulas

Albert Zeyer Fri, 27 Aug 2010 14:59:34 -0700

On 27.08.10 11:53, Jimmy O'Regan wrote:

On 26 August 2010 16:27, albert <[email protected]> wrote:

Hi,


I need an open OCR library which is able to scan complex printed math
formulas (for example some formulas which were generated via LaTeX). I
want to get some LaTeX-like output (or just some AST-like data).

Can Tesseract do this? Is there something like this already? Or are
current OCR techniques only able to parse line-oriented text?

Tesseract does not do that. There's an open enhancement request that
might have more information:
http://code.google.com/p/tesseract-ocr/issues/detail?id=270

Ah, but I am asking for more than just being able to scan math symbols.
I want support for scanning full formulas, which can be quite complex:
a combination of \frac, \int, \sum, etc. It must not only detect the
symbols, it must also see how they belong together (for example, the
numerator and the denominator in a fraction).
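
To illustrate what "how they belong together" means, here is a minimal
sketch in Python. It is not something Tesseract provides. It assumes an
earlier stage has already recognized each symbol and its bounding box;
the Sym class, the "bar" label for a fraction line, and the to_latex
function are all hypothetical names made up for this illustration. It
only handles fractions (no \int, \sum, subscripts or superscripts) and
assumes "bar" is never a minus sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """One recognized symbol with its bounding box (hypothetical input format)."""
    label: str   # the symbol text, or "bar" for a horizontal fraction line
    x0: float    # left edge
    y0: float    # top edge; y grows downward, as in image coordinates
    x1: float    # right edge
    y1: float    # bottom edge

    @property
    def cx(self):
        return (self.x0 + self.x1) / 2

    @property
    def cy(self):
        return (self.y0 + self.y1) / 2


def to_latex(symbols):
    """Turn a flat list of symbol boxes into a LaTeX-like string with nested \\frac."""
    bars = [s for s in symbols if s.label == "bar"]
    if not bars:
        # No fraction line: read the symbols left to right.
        return " ".join(s.label for s in sorted(symbols, key=lambda s: s.x0))

    # Assumption: the widest bar belongs to the outermost fraction.
    bar = max(bars, key=lambda s: s.x1 - s.x0)
    rest = [s for s in symbols if s is not bar]

    # Split the remaining symbols by where their center lies relative to the bar.
    left = [s for s in rest if s.cx < bar.x0]
    right = [s for s in rest if s.cx > bar.x1]
    over = [s for s in rest if bar.x0 <= s.cx <= bar.x1 and s.cy < bar.cy]
    under = [s for s in rest if bar.x0 <= s.cx <= bar.x1 and s.cy >= bar.cy]

    # Recurse into each region, so fractions inside fractions are handled too.
    parts = [to_latex(left)] if left else []
    parts.append("\\frac{%s}{%s}" % (to_latex(over), to_latex(under)))
    if right:
        parts.append(to_latex(right))
    return " ".join(parts)


if __name__ == "__main__":
    # Boxes for "x = a / (b + c)" laid out as a printed fraction.
    formula = [
        Sym("x", 0, 10, 5, 20),
        Sym("=", 7, 12, 12, 18),
        Sym("bar", 15, 14, 45, 16),
        Sym("a", 28, 0, 32, 10),
        Sym("b", 16, 20, 20, 30),
        Sym("+", 24, 22, 28, 28),
        Sym("c", 32, 20, 36, 30),
    ]
    print(to_latex(formula))
```

The point is that the symbol labels alone are not enough: the output
depends on the two-dimensional layout of the boxes, which a purely
line-oriented recognizer does not look at.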

Is it possible to extend Tesseract to do this, or would it need a heavy
redesign of the whole engine (and some fundamentally different
techniques)?

