Indentation is currently not represented explicitly anywhere in
OCRopus.  However, you can do simple indentation recognition on the
hOCR output, independently of which OCR system produced it.
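
Here is a minimal sketch of what that could look like in Python, using
only the standard library. It assumes the hOCR marks text lines with
class "ocr_line" and a "bbox x0 y0 x1 y1" entry in the title attribute,
and that the page is a single text column. The names LineBoxes and
indented_lines and the min_indent threshold (in pixels) are made up for
this example and would need tuning for real scans.

```python
import re
from html.parser import HTMLParser
from statistics import median

# hOCR stores geometry in the title attribute, e.g. title="bbox 100 50 900 80"
BBOX = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class LineBoxes(HTMLParser):
    """Collect the bounding box of every element with class ocr_line."""

    def __init__(self):
        super().__init__()
        self.boxes = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "ocr_line" in (a.get("class") or "").split():
            m = BBOX.search(a.get("title") or "")
            if m:
                self.boxes.append(tuple(int(v) for v in m.groups()))


def indented_lines(hocr, min_indent=20):
    """Return indexes of lines whose left edge lies at least min_indent
    pixels to the right of the median left edge (assumed single column)."""
    parser = LineBoxes()
    parser.feed(hocr)
    if not parser.boxes:
        return []
    margin = median(box[0] for box in parser.boxes)
    return [i for i, box in enumerate(parser.boxes)
            if box[0] - margin >= min_indent]
```

The returned indexes are in document order, so they can be used to tag
the matching lines when generating the final HTML. For multi-column
pages the margin would have to be computed per column or per block
rather than per page.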

Tom

On Nov 18, 12:48 pm, alexanderff <[email protected]> wrote:
> Hi.
>
> Is it possible for any OCR to recognize the indentation of the first
> line of the paragraphs?
>
> Any mark, spaces, strategy, or anything that I can use to parse the
> result and build a better HTML (marking the first line of each
> paragraph to reproduce the original indentation)?
>
> I've read that OCRopus does layout analysis... I don't know if it's what
> I'm looking for...
>
> Anyway, it's painful to install OCRopus (at least in Gentoo). I
> could finally run it, but I don't know the right options or
> even if it's a good tool for the Portuguese language. I have tesseract
> working 100%.
>
> []s
> Alexander
> Brazil - Rio de Janeiro
