Hi Shaun,

Why not just use an arbitrary font name, like mylabelfont33 or whatever? Tesseract doesn't do anything interesting with the font name; it's just a label.
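To make that concrete, here is a minimal sketch of scripting the naming with a made-up label. It assumes the Tesseract 3.x training conventions as I understand them: training files named lang.fontname.expN.tif (with a matching .box file) and a font_properties file with one line per font giving five 0/1 flags (italic, bold, fixed, serif, fraktur). The function names are hypothetical helpers, not part of Tesseract, and the character restriction on the label is my own conservative choice.

```python
import re


def training_names(lang, font_label, page=0):
    """Build training file names from an arbitrary font label.

    Assumes the Tesseract 3.x convention lang.fontname.expN.tif / .box.
    """
    # Dots separate the fields of the file name, so keep the label simple.
    # This restriction is an assumption made for safety, not a Tesseract rule.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", font_label):
        raise ValueError("font label should be letters, digits, '_' or '-'")
    base = f"{lang}.{font_label}.exp{page}"
    return {"image": base + ".tif", "box": base + ".box"}


def font_properties_line(font_label, italic=False, bold=False,
                         fixed=False, serif=False, fraktur=False):
    """Build one font_properties line: label followed by five 0/1 flags."""
    flags = (italic, bold, fixed, serif, fraktur)
    return font_label + " " + " ".join(str(int(flag)) for flag in flags)


if __name__ == "__main__":
    print(training_names("eng", "mylabelfont33"))
    print(font_properties_line("mylabelfont33"))
```

The label only has to be used consistently across the file names and font_properties; it does not need to match any real typeface.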
Or am I missing your question?

Nick

On Mon, Mar 24, 2014 at 06:14:25AM -0700, Shaun Farrell wrote:
> I'm working on a prototype to be able to OCR beverage labels and pull the
> description off them. The problem that I have is that the fonts can be all
> different and I may or may not know the font. I want to be able to script
> this as much as possible. Is there a way to train Tesseract in a way that
> you don't need to know the name of the font? Can I supply an image to train
> it myself without the font name? I have attached a couple of examples. One
> idea that I have is to automatically crop out the description text so that
> the OCR doesn't have to figure out where the text is.
>
> [Revolver][CigarCity]
>
> The first image (Revolver Brewing) does a pretty good job when I crop out the
> right-hand side description:
>
> A full-flavored bock finished with
> Northern Brewer and Saphir hops.
> Brewed with an abundance of
> Munich and caramel malts for a
> hearty biscuit and toffee choracter.
>
> The second image (Cigar City) not so much. I cropped out the middle
> description and this is what I got:
>
> WMNF 88.5Fm IS 3
> I1s'rener-supporreo
> communrru l'aDi0 s1'a11on
> TH3'l' cetesrares Cl.IlT|.Il'al
> DiVel’SiT9 am: is commmeb
> T0 GQUHIH9. Peace ano
> GCOn0miC JUSTICE. WMNF in
> Tampa Has Been Sel'VinG
> THE communrru since 1979,
> ano is Cel9Bl‘aTil1G THE
> 33]‘ D H|1|1|'Vel'Sal‘9 OF THe
> WMNF Tl‘0PiCal Hearwave.
>
> T0 Learn more asour WMNF,
> GO TO lUl‘I1I1F.0l' G.
>
> T|"0PiCal Heatwave WH9aT
> ate IS an American WHGHT
> Ale. Generousw HOPPGD
> UJi'I' H Kouaru HOPS Fl'0I'n
> New zealano. THE KOHHTU
> HOPS Pl‘0ViDe 3 very
> Tl‘0PiCal FLaV0f mar F1’
> perrecns WIT H THi$
> summer ate.
>
> I know this is because it's not sure of the font.
>
> Most common fonts work pretty well... But does anyone have any suggestions on
> how one might go about this?
>
> Cheers!
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

