I also ran into unnecessary "background seems to be white" and
"background seems to be black" exceptions. The fix that worked for me
was to comment out every place they are thrown and recompile.
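
A less invasive alternative might be to normalize the polarity of the
image before it reaches the recognizer, so the check never fires. Below
is a minimal sketch in Python, not OCRopus code: the function names and
the threshold of 128 are my own assumptions, and it presumes a grayscale
image stored as a list of rows with values 0..255 in which text covers
fewer pixels than the background.

```python
def mean_intensity(image):
    """Average gray value of a 2-D list of 0..255 pixels."""
    count = sum(len(row) for row in image)
    if count == 0:
        raise ValueError("image has no pixels")
    return sum(sum(row) for row in image) / count


def ensure_black_on_white(image, threshold=128):
    """Return the image as-is if the background looks white, else inverted.

    Assumption: the background dominates the page, so a mean below
    `threshold` suggests white-on-black input.
    """
    if mean_intensity(image) >= threshold:
        return image
    # Invert every pixel so dark backgrounds become light.
    return [[255 - p for p in row] for row in image]
```

This is only a heuristic: a page that is mostly ink, or one with large
dark illustrations, could be inverted by mistake, so it is worth
inspecting a few outputs before relying on it.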

On Jun 19, 2:08 am, Ishwor <[email protected]> wrote:
> Hi Tom,
>
> 2009/6/19 Thomas Breuel <[email protected]>:
>
>
>
> >> So, my question is why does it matter if the bg is white and why the
> >> check in place?
>
> > It's in place because the recognizer can only recognize black-on-white
> > characters.  It's a common programming mistake, and an occasional
> > problem in the input, that characters are input as white-on-black.
>
> Right. The characters are black in colour and the bg is white as it so
> happens in every single grayscale since there are only 2 possible
> colours.. Yep makes sense indeed.
>
> >> I _could_ possibly set bgcheck = false, recompile and get away with
> >> it.
>
> > I don't think you need to recompile; you should be able to set that
> > via an environment variable.
>
> This is good. I read about this in the docs (i.e., setting env
> variable prior to executing `ocroscript') but didn't actually think
> about it until you mentioned it. Schweet :)
>
> > So, you might ask: why is OCRopus not smart enough to do the right
> > thing for both kinds of characters?
>
> > It's actually not hard to do, but it would essentially mean trying
> > each character both ways, which would mean that OCRopus takes twice as
> > long to complete (there are some other ways one can do it).
>
> Are these other ways documented so I can read on them Tom? Thanks.
>
>
>
> >> (2) A binarize'd image was provided but then
> >> "check_page_segmentation(seg)" crushed it right away:
> >> ocroscript: segmenter.lua:39: CHECK
> >> ./ocr-utils/ocr-segmentations.cc:275 (column > 0 && column < 32) ||
> >> column == 254 || column == 255
> >> stack traceback:
> >>        [C]: in function 'check_page_segmentation'
> >>        segmenter.lua:39: in main chunk
> >>        [C]: ?
>
> >> The binarize'd image was provided using:
> >>  input = bytearray();
> >>  iulib.read_image_gray(input,arg[1]);
> >>  binarizer:binarize(image,input);
>
> >> Why?
>
> > OCRopus used to be more lenient in what kinds of segmentation images
> > it accepted.  Now it checks more carefully. However, the segmenters
> > other than RAST haven't been updated yet to give correct segmentation
> > output (since we use them rarely).
>
> Hrmm..
>
> > Please file a bug report about this so that we remember to fix it.
>
> Sure will do when time permits/ASAP.
>
> Thanks Tom.
>
> --
> Regards,
> Ishwor Gurung