Re: Specifying regions within an image

dataintelligence Sun, 21 Mar 2010 19:24:09 -0700

Would I call this script using ocroscript? I'm a little confused about
how the different packages fit together. I installed OCRopus using
apt-get on Ubuntu. I believe the only command I really have access to at
this point (without building the script) is ocroscript. I'm assuming
this is just another script that ocroscript can be called with, like
rec-tess and rec-ltess.


Is there any documentation for the script you created in your reply?

Currently, I'm using a C++ program with ImageMagick to cut my TIFF
images into smaller PNG images and then calling "ocroscript rec-ltess
subimage-n.png" for each region. It works OK except when a region
contains very few characters (1 to 3). For those regions I paste
another image onto the end, and for some reason that lets the OCR
engine recognize the characters I need.
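
The crop-then-recognize workflow described above could be sketched in Python roughly as follows. This is only an illustration, not the C++ program from this message: the helper names (crop_command, ocr_command, recognize_regions), the (x, y, width, height) box layout, and the assumption that "ocroscript rec-ltess" writes its text to standard output are all hypothetical. The ImageMagick options used are the standard "-crop WxH+X+Y" geometry and "+repage".

```python
import subprocess


def crop_command(src, box, dst):
    """Build an ImageMagick `convert` call that cuts box out of src.

    box is assumed to be (x, y, width, height) in pixels.
    """
    x, y, w, h = box
    geometry = "%dx%d+%d+%d" % (w, h, x, y)
    # +repage discards the virtual-canvas offset left behind by -crop
    return ["convert", src, "-crop", geometry, "+repage", dst]


def ocr_command(png):
    # Same invocation as in the message: ocroscript rec-ltess subimage-n.png
    return ["ocroscript", "rec-ltess", png]


def recognize_regions(src, boxes, run=subprocess.check_output):
    """Crop every box to its own PNG and collect the recognizer output.

    `run` is injectable so the command sequence can be checked without
    ImageMagick or ocroscript being installed.
    """
    results = []
    for n, box in enumerate(boxes):
        png = "subimage-%d.png" % n
        run(crop_command(src, box, png))
        results.append(run(ocr_command(png)))
    return results
```

Calling recognize_regions("page.tif", [(10, 20, 300, 40)]) would then run one crop and one recognition per bounding box and return the raw output for each region in order.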

Overall I'm seeing excellent results. I'm very impressed, but I would
still like to know how I can increase my accuracy rate. Any tips you
have would be appreciated.

On Mar 20, 12:09 am, Tom <[email protected]> wrote:
> Currently, this is what you need to do for recognizing individual
> lines:
>
> import ocropy
>
> # allocate objects
> image = ocropy.bytearray()
> fst = ocropy.make_OcroFST()
> s = ocropy.ustrg()
>
> # load the line recognizer
> linerec = ocropy.load_linerec("some.model")
> # load the image to be recognized
> ocropy.read_image_gray(image,"some.jpg")
> # perform the actual recognition and compute recognition alternatives
> linerec.recognizeLine(fst,image)
> # find the best solution
> fst.bestpath(s)
> # print the result
> print ocropy.ustrg_as_string(s)
>
> If you want to recognize blocks, you need to use one of the functions
> that segments blocks into lines, then recognize the lines.  That looks
> something like this:
>
> # allocate data structures
> pseg = ocropy.intarray()
> line = ocropy.bytearray()  # receives each extracted line image
> segmenter = ocropy.make_ISegmentPage("SegmentPageByRAST1")
> regions = ocropy.RegionExtractor()
>
> # pseg is just a color image; regions is a little utility class that
> # iterates over differently colored regions in that color image
> segmenter.segment(pseg,image)
> regions.setPageLines(pseg)
> for i in range(1,regions.length()):
>     regions.extract(line,image,i,1)
>     ... continue with line recognition ...
>
> You can see all of this working in ocropy/ocropus-pages in 0.4.4
>
> We'll try to add some convenience functions for all of this to the
> library and document these use cases better.
>
> Tom
>
> On Mar 16, 12:59 pm, dataintelligence <[email protected]> wrote:
>
> > Is it possible for me to hand ocropus a list of bounding boxes and get
> > back the OCR results from those regions within an image?

