Would I call this script using ocroscript? I'm a little confused when it comes to the different packages. I installed Ocropus just using apt-get on Ubuntu. I believe the only command I really have access to at this point (without building the script) is ocroscript. I'm assuming this is just another script that ocroscript can be called with, like rec-tess and rec-ltess.
Is there any documentation for the script you posted in your reply?

Currently, I'm using a C++ program with ImageMagick to cut up my TIFF images into smaller PNG images and then calling "ocroscript rec-ltess subimage-n.png" for each region. It's working OK except when I have an image with very few characters (1 to 3). For those regions I paste another image onto the end, and for some reason that causes the OCR engine to recognize the characters I need.

Overall I'm seeing excellent results. I'm very impressed, but I would still like to know how I can increase my accuracy rate. Any tips you have would be appreciated.

On Mar 20, 12:09 am, Tom <[email protected]> wrote:
> Currently, this is what you need to do for recognizing individual
> lines:
>
> import ocropy
>
> # allocate objects
> image = ocropy.bytearray()
> fst = ocropy.make_OcroFST()
> s = ocropy.ustrg()
>
> # load the line recognizer
> linerec = ocropy.load_linerec("some.model")
> # load the image to be recognized
> ocropy.read_image_gray(image,"some.jpg")
> # perform the actual recognition and compute recognition alternatives
> linerec.recognizeLine(fst,image)
> # find the best solution
> fst.bestpath(s)
> # print the result
> print ocropy.ustrg_as_string(s)
>
> If you want to recognize blocks, you need to use one of the functions
> that segments blocks into lines, then recognize the lines. That looks
> something like this:
>
> # allocate data structures
> pseg = ocropy.intarray()
> segmenter = ocropy.make_ISegmentPage("SegmentPageByRAST1")
> regions = ocropy.RegionExtractor()
>
> # pseg is just a color image; regions is a little utility class that iterates
> # over differently colored regions in that color image
> segmenter.segment(pseg,image)
> regions.setPageLines(pseg)
> for i in range(1,regions.length()):
>     regions.extract(line,image,i,1)
>     ... continue with line recognition ...
>
> You can see all of this working in ocropy/ocropus-pages in 0.4.4
>
> We'll try to add some convenience functions for all of this to the
> library and document these use cases better.
>
> Tom
>
> On Mar 16, 12:59 pm, dataintelligence <[email protected]> wrote:
>
> > Is it possible for me to hand ocropus a list of bounding boxes and get
> > back the OCR results from those regions within an image?
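
P.S. For anyone following along, the crop-and-recognize loop I described above can be sketched roughly like this in Python. This is only an illustration of my workflow, not part of OCRopus: the function names (crop_command, ocr_command, recognize_regions) are made up for this example, it assumes ImageMagick's "convert" and "ocroscript" are both on the PATH, and it treats whatever rec-ltess prints as opaque output.

```python
import subprocess


def crop_command(src, box, dst):
    """Build the ImageMagick command that crops box=(x, y, w, h) out of src."""
    x, y, w, h = box
    geometry = "%dx%d+%d+%d" % (w, h, x, y)
    # +repage resets the virtual canvas so the PNG holds only the cropped area
    return ["convert", src, "-crop", geometry, "+repage", dst]


def ocr_command(path):
    """Build the ocroscript call used in this thread for one sub-image."""
    return ["ocroscript", "rec-ltess", path]


def recognize_regions(src, boxes, run=subprocess.check_output):
    """Crop every bounding box and run OCR on it; returns one output per box.

    `run` is injectable (hypothetical convenience) so the loop can be
    exercised without ImageMagick or ocroscript installed.
    """
    results = []
    for n, box in enumerate(boxes):
        dst = "subimage-%d.png" % n
        run(crop_command(src, box, dst))
        results.append(run(ocr_command(dst)))
    return results
```

Calling recognize_regions("page.tif", [(10, 20, 100, 30)]) would write subimage-0.png and return the raw output of rec-ltess for that region. The padding trick for very short regions is not included here.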
