Hi,

> Anyway, it's painful to install Ocropus (at least in Gentoo),

There are Arch Linux PKGBUILDs in the AUR for both 0.4.4 and the Mercurial tip: iulib, ocropus, ocroswig, ocropy, and the same set with the -hg suffix. Maybe you could convert them to ebuilds or otherwise use them to ease the pain.

Cheers,
Ilya

On Nov 18, 12:48 pm, alexanderff <[email protected]> wrote:
> Hi.
>
> Is it possible for any OCR engine to recognize the indentation of the
> first line of each paragraph?
>
> Is there any mark, spacing, strategy, or anything else that I can use to
> parse the result and build better HTML (marking the first line of each
> paragraph to reproduce the original indentation)?
>
> I've read that Ocropus does layout analysis... I don't know if that's
> what I'm looking for...
>
> Anyway, it's painful to install Ocropus (at least in Gentoo), so I
> could finally run Ocropus, but I don't know the right options or
> even whether it's a good tool for the Portuguese language. I have
> tesseract working 100%.
>
> []s
> Alexander
> Brazil - Rio de Janeiro
