Hi Tom,

2009/6/19 Thomas Breuel <[email protected]>:
>
>> So, my question is why does it matter if the bg is white and why the
>> check in place?
>
> It's in place because the recognizer can only recognize black-on-white
> characters.  It's a common programming mistake, and an occasional
> problem in the input, that characters are input as white-on-black.

Right. The characters are black and the bg is white, as it so happens
in every binarized image, since there are only 2 possible colours
there. Yep, makes sense indeed.

>> I _could_ possibly set bgcheck = false, recompile and get away with
>> it.
>
> I don't think you need to recompile; you should be able to set that
> via an environment variable.

This is good. I read about this in the docs (i.e., setting an env
variable prior to executing `ocroscript') but didn't actually think
about it until you mentioned it. Schweet :)
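
For the archives, here is a minimal sketch of how I'd expect to script
that from Python. Big assumptions on my part: that the variable is
literally named `bgcheck' (as in the parameter above) and that "0" is
read as false. The script and image names are placeholders, and the
helper names are my own, not part of OCRopus.

```python
import os
import shutil
import subprocess


def ocroscript_env(overrides, base=None):
    """Return a copy of the environment with parameter overrides applied.

    The caller's environment (os.environ by default) is not modified.
    """
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        env[name] = str(value)
    return env


def run_ocroscript(script, image, overrides):
    """Run `ocroscript <script> <image>` with the overrides exported."""
    exe = shutil.which("ocroscript")
    if exe is None:
        raise FileNotFoundError("ocroscript not found on PATH")
    return subprocess.run([exe, script, image],
                          env=ocroscript_env(overrides), check=False)


# ASSUMPTION: the variable is named `bgcheck` and "0" means false.
env = ocroscript_env({"bgcheck": 0}, base={"PATH": "/usr/bin"})
```

With that in place, something like
run_ocroscript("segmenter.lua", "page.png", {"bgcheck": 0}) should be
all that's needed, with no recompile.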

> So, you might ask: why is OCRopus not smart enough to do the right
> thing for both kinds of characters?
>
> It's actually not hard to do, but it would essentially mean trying
> each character both ways, which would mean that OCRopus takes twice as
> long to complete (there are some other ways one can do it).

Are these other ways documented anywhere so I can read up on them,
Tom? Thanks.
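
The only cheap alternative I can think of myself (a guess on my part,
not necessarily what you have in mind) is to decide the polarity once
per page instead of once per character: on a text page the ink usually
covers a minority of the pixels, so if most pixels are dark the page is
probably white-on-black and can be inverted up front. A rough sketch in
plain Python, with images as lists of rows of 0..255 gray values and
all names made up for illustration:

```python
def is_white_on_black(image, threshold=128):
    """True if dark pixels are the majority, i.e. the background is dark.

    Heuristic only: assumes ink covers less than half of the page.
    """
    total = sum(len(row) for row in image)
    dark = sum(1 for row in image for px in row if px < threshold)
    return total > 0 and dark * 2 > total


def normalize_polarity(image, threshold=128):
    """Return a black-on-white copy of the image, inverting if needed."""
    if is_white_on_black(image, threshold):
        return [[255 - px for px in row] for row in image]
    return [list(row) for row in image]


# A tiny white-on-black "page": two light pixels on a dark background.
page = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]
fixed = normalize_polarity(page)
```

This costs one pass over the page rather than doubling the recognition
time, but it would obviously be fooled by pages that are mostly ink.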

>> (2) A binarize'd image was provided but then
>> "check_page_segmentation(seg)" crushed it right away:
>> ocroscript: segmenter.lua:39: CHECK
>> ./ocr-utils/ocr-segmentations.cc:275 (column > 0 && column < 32) ||
>> column == 254 || column == 255
>> stack traceback:
>>        [C]: in function 'check_page_segmentation'
>>        segmenter.lua:39: in main chunk
>>        [C]: ?
>>
>> The binarize'd image was provided using:
>>  input = bytearray();
>>  iulib.read_image_gray(input,arg[1]);
>>  binarizer:binarize(image,input);
>>
>> Why?
>
> OCRopus used to be more lenient in what kinds of segmentation images
> it accepted.  Now it checks more carefully. However, the segmenters
> other than RAST haven't been updated yet to give correct segmentation
> output (since we use them rarely).

Hrmm..
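
To check my own understanding of that CHECK, here is a small Python
sketch of the condition from the error message. The condition itself is
copied from the message above; the assumption that the column number
sits in the top byte of a packed 0xRRGGBB pixel is mine and may not
match what ocr-segmentations.cc really does. Function names are
hypothetical.

```python
def column_of(pixel):
    """Extract the column number from a packed 0xRRGGBB pixel.

    ASSUMPTION: the column is stored in the top (red) byte.
    """
    return (pixel >> 16) & 0xFF


def column_ok(column):
    """Condition from the CHECK message in the traceback."""
    return (0 < column < 32) or column == 254 or column == 255


def find_bad_pixels(seg):
    """Return (x, y, column) for every pixel that would fail the check."""
    bad = []
    for y, row in enumerate(seg):
        for x, pixel in enumerate(row):
            column = column_of(pixel)
            if not column_ok(column):
                bad.append((x, y, column))
    return bad


seg = [
    [0xFFFFFF, 0x010001],  # white background, column 1
    [0x000000, 0x200001],  # plain black pixel, column 32
]
bad = find_bad_pixels(seg)
```

If that layout is right, any plain black pixel has column 0 and trips
the check, which would fit a segmenter that hands back something closer
to a binary image than a properly labelled segmentation.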

> Please file a bug report about this so that we remember to fix it.

Sure, will do as soon as time permits.

Thanks Tom.

-- 
Regards,
Ishwor Gurung
