Hi Tom,

2009/6/19 Thomas Breuel <[email protected]>:
>
>> So, my question is why does it matter if the bg is white and why the
>> check in place?
>
> It's in place because the recognizer can only recognize black-on-white
> characters. It's a common programming mistake, and an occasional
> problem in the input, that characters are input as white-on-black.

Right. The characters are black in colour and the bg is white as it so
happens in every single grayscale since there are only 2 possible
colours.. Yep makes sense indeed.

>> I _could_ possibly set bgcheck = false, recompile and get away with
>> it.
>
> I don't think you need to recompile; you should be able to set that
> via an environment variable.

This is good. I read about this in the docs (i.e., setting env variable
prior to executing `ocroscript') but didn't actually think about it
until you mentioned it. Schweet :)

> So, you might ask: why is OCRopus not smart enough to do the right
> thing for both kinds of characters?
>
> It's actually not hard to do, but it would essentially mean trying
> each character both ways, which would mean that OCRopus takes twice as
> long to complete (there are some other ways one can do it).

Are these other ways documented so I can read on them Tom? Thanks.

>> (2) A binarize'd image was provided but then
>> "check_page_segmentation(seg)" crushed it right away:
>> ocroscript: segmenter.lua:39: CHECK
>> ./ocr-utils/ocr-segmentations.cc:275 (column > 0 && column < 32) ||
>> column == 254 || column == 255
>> stack traceback:
>>         [C]: in function 'check_page_segmentation'
>>         segmenter.lua:39: in main chunk
>>         [C]: ?
>>
>> The binarize'd image was provided using:
>> 28 input = bytearray();
>> 29 iulib.read_image_gray(input,arg[1]);
>> 30 binarizer:binarize(image,input);
>>
>> Why?
>
> OCRopus used to be more lenient in what kinds of segmentation images
> it accepted. Now it checks more carefully. However, the segmenters
> other than RAST haven't been updated yet to give correct segmentation
> output (since we use them rarely).

Hrmm..

> Please file a bug report about this so that we remember to fix it.

Sure will do when time permits/ASAP. Thanks Tom.

--
Regards,
Ishwor Gurung
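
P.S. For anyone who hits the same CHECK: the failed expression in the
traceback says a column label is only accepted if it is 1..31, 254 or
255. Below is a rough, library-free Python sketch of that condition so
one can see which pixels of a segmentation image would trip it. It is
not OCRopus code. The names column_of, column_ok and bad_pixels are
made up for illustration, and the pixel layout (column label in bits
16-23 of a 24-bit value) is an assumption that is not confirmed
anywhere in this thread.

```python
def column_of(pixel: int) -> int:
    """Extract the column label from a packed pixel.

    ASSUMPTION: the column label sits in bits 16-23 of a 24-bit pixel.
    This layout is not stated in the thread; adjust if it differs.
    """
    return (pixel >> 16) & 0xFF


def column_ok(column: int) -> bool:
    """Mirror the condition quoted in the failed CHECK:
    (column > 0 && column < 32) || column == 254 || column == 255
    """
    return (0 < column < 32) or column in (254, 255)


def bad_pixels(pixels):
    """Return (index, pixel) pairs whose column label fails the condition."""
    return [(i, p) for i, p in enumerate(pixels) if not column_ok(column_of(p))]


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real segmentation image.
    sample = [0xFFFFFF, 0x010101, 0x000000, 0x200001]
    for index, pixel in bad_pixels(sample):
        print(f"pixel {index}: value {pixel:#08x}, column {column_of(pixel)}")
```

Under that assumed layout, an all-zero (black) pixel carries column
label 0 and fails the condition, while an all-0xFF (white) pixel
carries 255 and passes.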
