I just found this:
https://www.quora.com/How-do-I-fill-holes-in-image-using-image-processing/answer/V-Sri-Chakra-Kumar


On Wed, 8 May 2019 at 09:57, Lorenzo Bolzani <[email protected]>
wrote:

> Hi,
> you can try a few things, but you will need to write a small script (Python,
> etc.) or use ImageMagick. I suggest experimenting in GIMP first to find what
> works best, and then writing the code. You want dark text on a light
> background.
>
> For white text on red:
>
> 1. Invert the image, desaturate it, and increase the contrast.
>
> 2. Split the image into RGB channels and use the one that looks best
> (probably green or blue, since white and red are both bright in the red
> channel). Also try decomposing into HSV and see whether S or V looks good.
> In GIMP: Colors -> Components -> Decompose.
>
> 3. Invert the image and try thresholding (Otsu, etc.).
>
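The three steps quoted above can be sketched in a few lines. This is only a rough illustration in pure Python on a tiny synthetic image, not a tested recipe for real scans: the helper names (`invert`, `channel`, `otsu_threshold`, `binarize`) and the toy image are made up for this example. The green channel is used because pure white and pure red have the same red value, so the red channel would show no contrast here. On real images a library such as OpenCV (`cv2.threshold` with `cv2.THRESH_OTSU`) would normally do the thresholding.

```python
def invert(pixels):
    """Invert every RGB pixel: white (255,255,255) becomes black (0,0,0)."""
    return [[(255 - r, 255 - g, 255 - b) for (r, g, b) in row] for row in pixels]


def channel(pixels, index):
    """Extract one channel (0=R, 1=G, 2=B) as a grayscale image."""
    return [[px[index] for px in row] for row in pixels]


def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray values in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # unnormalized between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, t):
    """Map pixels above t to white (255) and the rest to black (0)."""
    return [[255 if v > t else 0 for v in row] for row in gray]


# Toy image (an assumption for illustration): a white stroke on red.
RED, WHITE = (255, 0, 0), (255, 255, 255)
image = [
    [RED, RED, RED, RED, RED],
    [RED, WHITE, WHITE, WHITE, RED],
    [RED, RED, RED, RED, RED],
]

green = channel(invert(image), 1)   # background 255, text 0
t = otsu_threshold(green)
result = binarize(green, t)         # dark text on a white background
```

After inversion the background is cyan and the text is black, so the green channel already has the text dark on a light background; Otsu then just makes it strictly black and white.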
> With a little programming you can identify and isolate black regions from
> white ones, but I do not know if this is something you want to do.
>
>
> Post the image if this does not help.
>
>
> Lorenzo
>
> On Wed, 8 May 2019 at 03:07, Jason <[email protected]> wrote:
>
>> I have a problem with the current Tesseract. I have documents that have
>> sections of varying background and text colors. I've read that Tesseract v3
>> was white/black invariant, so it didn't matter if I had white text on a red
>> background. But now it matters. The problem is that other parts of the same
>> image are black text on a white background. Tesseract 4 fails to identify
>> the white text on the red background at all.
>>
>> I have tried inverting the image colors so red (0xFF0000) becomes cyan
>> (0x00FFFF) and the white text (0xFFFFFF) becomes black (0x000000). I then
>> take the highest-confidence text for the region. This improves some
>> scenarios, but it does not work for the red/white scenario.
>>
>> Questions:
>> 1. How can I convert the image so that the text is black and the background
>> is white before running Tesseract?
>> 2. Is there a way to configure Tesseract to "just work"?
>>
>> I've been trying to figure out how to do this for some time, and I
>> haven't made any progress.
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/0c9cb359-bde4-4c2e-9643-1a9c56b639dc%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/0c9cb359-bde4-4c2e-9643-1a9c56b639dc%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
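
For pages that mix black-on-white and white-on-color sections, as in the original question, one possible approach is to binarize each region and then flip only the regions that look light-on-dark. A minimal sketch follows. It assumes the page has already been split into regions and binarized (neither step is shown), and that text strokes cover fewer pixels than the background. The names `is_light_on_dark` and `normalize_polarity` are invented for this example.

```python
def is_light_on_dark(binary_region):
    """Guess the polarity of a binarized region (values 0 or 255).

    Assumption: text covers fewer pixels than the background, so a
    mostly-dark region is taken to hold light text on a dark background.
    """
    pixels = [v for row in binary_region for v in row]
    dark = sum(1 for v in pixels if v == 0)
    return dark > len(pixels) / 2


def normalize_polarity(binary_region):
    """Return the region with dark text on a light background."""
    if is_light_on_dark(binary_region):
        return [[255 - v for v in row] for row in binary_region]
    return binary_region


# Two hypothetical regions from the same page, already binarized.
light_text = [
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0],
]
dark_text = [
    [255, 255, 255, 255, 255],
    [255, 0, 0, 0, 255],
    [255, 255, 255, 255, 255],
]

regions = [normalize_polarity(r) for r in (light_text, dark_text)]
```

Both regions come out as dark text on white. The majority-pixel heuristic can fail on very dense or very bold text, so it is worth checking on real samples.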
