OK, thanks. We will try this method: first get the rectangles out as cropped pictures, then do the character recognition on each separate picture.
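A minimal sketch of that first step, under stated assumptions: the image is modelled as a plain grid of exact color values, and the names `color_boxes`, `crop` and the toy `deck` are hypothetical, made up for illustration. A real JPEG will need a color tolerance rather than exact equality, and in OpenCV this step would more likely be done with calls such as `cv2.inRange`, `cv2.findContours` and `cv2.boundingRect`. The sketch only shows the idea of grouping touching pixels of one color and keeping each group's bounding box, whose x and y are positions in the full picture.

```python
from collections import deque


def color_boxes(pixels, background):
    """Return (color, x, y, width, height) for each 4-connected region of one color.

    `pixels` is a list of rows; each entry is a hashable color value,
    for example an (r, g, b) tuple. Pixels equal to `background` are skipped.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            color = pixels[y0][x0]
            if seen[y0][x0] or color == background:
                continue
            # Breadth-first flood fill over touching pixels of exactly this color.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and pixels[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((color, min_x, min_y,
                          max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def crop(pixels, box):
    """Cut the bounding box out of the grid; this crop is what would go to OCR."""
    _, x, y, w, h = box
    return [row[x:x + w] for row in pixels[y:y + h]]


# Toy "deck": white background, one red room, one blue room,
# and two red rooms that touch each other in the bottom row.
W, R, B = "white", "red", "blue"
deck = [
    [W, W, W, W, W, W, W],
    [W, R, R, W, B, B, W],
    [W, R, R, W, B, B, W],
    [W, W, W, W, W, W, W],
    [W, R, R, R, R, W, W],
    [W, W, W, W, W, W, W],
]
boxes = color_boxes(deck, background=W)
for box in boxes:
    print(box)
```

With this approach the two touching red rooms in the bottom row come out as a single box, because nothing separates them in color alone.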
For the rectangle-by-color extraction we are thinking of using OpenCV <http://opencv.org>, since Tesseract does not seem to be meant for that, is it? We used OpenCV before to find the rooms (rectangular boxes), but that was based on the walls and their radius, and the analysis was done on an inverted black-and-white picture of the floor plan. We will now try with color as the input source, but as some rooms have the same color and are beside each other, we are wondering what will happen to those.

Then for the last step, the Tesseract recognition on the cropped picture of one room: is it advisable to use a grayscale image there? And can we feed Tesseract some kind of target list?

For us it is important to find the location of the room (x and y on the picture); that is the overall goal of the assignment.

Thanks again for any tips on this challenge.

Rutger

On Fri, Aug 21, 2015 at 11:25 AM, Allistair C <[email protected]> wrote:

> The way I would do this is use a rectangle-by-color extraction phase that
> produces all the cropped out colour rectangles with numbers and then
> perform ocr on each one which should be good success for the quality of
> text
>
> Sent from my iPhone
>
> On 21 Aug 2015, at 08:45, Rutger Rozendal <[email protected]> wrote:
>
> Dear People,
>
> We are using Tesseract to recognise room numbers on a floor plan of a ship
> deck.
> Attached to this email are two examples.
>
> We are trying different methods and have a mixture of results, let's say
> recognising between 20% and 70% of the room numbers.
>
> Because the images come in color, we are now wondering if results are
> better when we take out the colours upfront,
> so making them black and white or grayscale.
> Can we use Tesseract to do this color conversion with a certain profiling,
> or do we need to use an external program for that?
>
> Also we could maybe work with a search for specific patterns, as these
> room numbers most of the time consist of 4 digits.
>
> Any tips on the direction for a best configuration are helpful.
>
> Thanks in advance,
>
> Rutger
>
> <rci_hm_DECK09.jpg>
>
> <cel_rf_DECK08.jpg>

--
Drs. Ing. R.D. Rozendal
Noterik B.V.
Tel. +31-(0)20-5929966
Fax. +31-(0)20-5929969
Check out the demos of our tools: http://www.noterik.nl/video

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/tesseract-ocr. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJpvoacgqmtcsLrBz4uQHHYhwGAmmO1mu-y%3DKhPBr4fchiofWA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.

