I would also like more information on how to make a UZN file appropriate to my image. Thanks!
On Sunday, October 14, 2007 10:38:29 AM UTC-7, Christoph Reimmann wrote:
> Hi Ray,
>
> thx for your answer.
>
> I've tried out OCR with in.uzn and .... it worked very well. Thanks.
>
> But when is a zone file correctly formatted? I can't find any
> documentation. Do you know whether there is any?
>
> Thx again in advance, Chris
>
> On 12 Oct., 18:47, "Ray Smith" <[email protected]> wrote:
> > If you have made a correctly formatted UNLV zone file, then you should
> > name it in.uzn and use this command line:
> > tesseract in.tif out.txt -l deu
> > The in.uzn file will be found based on the name of the input tif file.
> > Ray.
> >
> > On 10/12/07, [email protected] <[email protected]> wrote:
> > > Tess does not at this point support multiple columns. You can write
> > > zoning software yourself and then use the dll interface to recognize
> > > those parts of it.
> > >
> > > On Oct 12, 3:35 am, Reimmann <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > I'm trying out Tesseract 2.01. I have a document that has two columns
> > > > of text. The quality of Tesseract's recognition is very good, but the
> > > > columns are mixed, because Tesseract recognizes the characters line by
> > > > line. So I'd like to have two different zones that are recognized one
> > > > after the other. I have tried out a tiff image and a "zone file" that
> > > > I found on the UNLV site, but this does not work. My command line
> > > > looks like this:
> > > >
> > > > tesseract in.tif out.txt -l deu in.zone
> > > >
> > > > in.tif is not compressed.
> > > >
> > > > When I debug this, the program exits at line 234 in variables.cpp when
> > > > trying to read_variables.
> > > >
> > > > Can anyone help?
> > > >
> > > > Does anyone have a useful pair of tiff file and configuration file for
> > > > recognizing parts of a document?
> > > >
> > > > thx in advance,
> > > >
> > > > Chris from Aachen, Germany
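
Since the thread never shows what a zone file looks like, here is a rough sketch of how one could be generated for the two-column case. It rests on assumptions that are not confirmed anywhere in this thread: that each line of a UNLV zone file is `left top width height label`, in pixels, with the origin at the top-left corner of the image, and that the label (here `Text`) is free-form. The helper names `uzn_lines` and `write_uzn` and the page coordinates are made up for illustration; please check them against your own image.

```python
from pathlib import Path


def uzn_lines(zones):
    """Format zones as UZN lines.

    Each zone is (left, top, width, height, label) in pixels.
    Assumed field order: left top width height label (not confirmed in the thread).
    """
    lines = []
    for left, top, width, height, label in zones:
        if width <= 0 or height <= 0:
            raise ValueError(f"zone {label!r} has non-positive size")
        lines.append(f"{left} {top} {width} {height} {label}")
    return lines


def write_uzn(image_path, zones):
    """Write the zone file next to the image, same base name, .uzn suffix."""
    uzn_path = Path(image_path).with_suffix(".uzn")
    uzn_path.write_text("\n".join(uzn_lines(zones)) + "\n")
    return uzn_path


if __name__ == "__main__":
    # Hypothetical 2000 x 3000 pixel page with two text columns.
    # Zones are listed in the order they should be read.
    two_columns = [
        (100, 100, 850, 2800, "Text"),   # left column
        (1050, 100, 850, 2800, "Text"),  # right column
    ]
    print("\n".join(uzn_lines(two_columns)))
```

Calling `write_uzn("in.tif", two_columns)` would then produce `in.uzn` beside `in.tif`, which matches the naming Ray describes, so the command stays `tesseract in.tif out.txt -l deu`.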

