Hi,

Yes, OpenCV would be the way to do it: find contours and filter by
colour, shape, etc.

I am not sure how you would separate adjacent rooms that share the same
colour, but you have the benefit of thick black borders, so your method
will need to make use of those borders somehow.

There is no real benefit to grayscaling the images beforehand, as Tesseract
already does this internally. You may, however, like to remove the coloured
background first. Since you know the background colour of each extracted
rectangle, you can remove it quite easily, replacing it with either white
(for black text) or black (for white text).
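
A minimal sketch of that background replacement, assuming NumPy and a crop
stored as a height x width x 3 uint8 array. The function name
replace_background and the per-channel tolerance tol are illustrative
assumptions:

```python
import numpy as np


def replace_background(crop, background, replacement=(255, 255, 255), tol=30):
    """Return a copy of crop with near-background pixels set to replacement."""
    # Signed arithmetic avoids uint8 wrap-around when subtracting.
    diff = np.abs(crop.astype(np.int16) - np.array(background, dtype=np.int16))
    # A pixel is background only if every channel is within the tolerance.
    mask = np.all(diff <= tol, axis=-1)
    cleaned = crop.copy()
    cleaned[mask] = replacement
    return cleaned


# Synthetic demo: an orange crop with a black bar standing in for the text.
demo_crop = np.full((10, 20, 3), (0, 200, 255), dtype=np.uint8)
demo_crop[4:6, 5:15] = (0, 0, 0)
cleaned = replace_background(demo_crop, background=(0, 200, 255))
```

For white text, passing replacement=(0, 0, 0) gives a black background
instead.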

I would be very interested to hear how you get on.

Cheers

On 21 August 2015 at 12:38, Rutger Rozendal <[email protected]> wrote:

> Ok, thanks
>
> We will try this method: first get the rectangles out as cropped pictures
> and then do the character recognition on each separate picture.
>
> For the rectangle-by-color extraction we are thinking of using OpenCV
> <http://opencv.org>, as it seems that Tesseract is not really meant for
> that, is it?
>
> We used OpenCV before to find the rooms (rectangular boxes), but that was
> based on the walls and their radius, and the analysis was done on an
> inverted black-and-white picture of the floorplan.
> We will now try with the color as an input source, but as some rooms have
> the same color and are beside each other, we are wondering what will
> happen to those.
>
> And then for the last step, the Tesseract recognition on the cropped
> picture of one room: is it advisable to use a grayscale image there?
> And can we feed Tesseract a kind of target list? For us it is
> important to find the location of the room (x and y on the picture); that
> is the overall goal of the assignment.
>
> Thanks again for any tips on this challenge.
>
> Rutger
>
> On Fri, Aug 21, 2015 at 11:25 AM, Allistair C <[email protected]> wrote:
>
>> The way I would do this is to use a rectangle-by-color extraction phase
>> that produces all the cropped-out colour rectangles with numbers, and then
>> perform OCR on each one, which should give a good success rate given the
>> quality of the text.
>>
>> On 21 Aug 2015, at 08:45, Rutger Rozendal <[email protected]> wrote:
>>
>> Dear People,
>>
>> We are using Tesseract to recognise room numbers on a floorplan of a ship
>> deck.
>> Two examples are attached to this email.
>>
>> We are trying different methods and have mixed results, recognising
>> between 20% and 70% of the room numbers.
>>
>> Because the images come in color, we are now wondering whether results
>> are better when we take out the colours upfront, i.e. making them black
>> and white or grayscale.
>> Can we use Tesseract to do this color conversion with a certain profile,
>> or do we need to use an external program for that?
>>
>> Also, we could maybe search for specific patterns, as these room
>> numbers most of the time consist of 4 digits.
>>
>> Any tips on the direction for a best configuration are helpful.
>>
>> Thanks in advance,
>>
>> Rutger
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/bf5d8e06-7e42-464d-ab50-61951a89447e%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/bf5d8e06-7e42-464d-ab50-61951a89447e%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>> <rci_hm_DECK09.jpg>
>>
>> <cel_rf_DECK08.jpg>
>>
>
>
>
> --
> Drs. Ing. R.D. Rozendal
>
> Noterik B.V.
>
> Tel. +31-(0)20-5929966
> Fax. +31-(0)20-5929969
>
> Check out the demo's of our tools:
> http://www.noterik.nl/video
>
