Tom,

Has the bounding box detector been added to OCRopus? I downloaded the latest version and couldn't get the word coordinates, so I am asking for your help.
Thanks in advance,
Natraj

On Wednesday, November 5, 2008 2:18:24 AM UTC+5:30, Tom wrote:
>
> You can compute the page bounding box as the bounding box of all the text lines.
> That will probably give you a fairly reasonable page bounding box for most
> pages.
>
> We have separately developed another page bounding box detector that we
> will be incorporating into OCRopus over the next few months; that detector
> detects the page boundary directly.
>
> Tom
>
> On Fri, Oct 31, 2008 at 02:54, jimfunderburk wrote:
>>
>> I am a potential OCRopus user. Based on a lecture by Breuel at a
>> Sanskrit symposium in May 2008, and from what I've seen in the OCRopus
>> wiki, I suspect that OCRopus can solve the problem described below.
>> But for me it is a non-trivial task to get an Ubuntu computer, install
>> OCRopus, etc., so I am hoping that the experts of this group will
>> be able to say "Sure, OCRopus can do that!" before I proceed further.
>>
>> The project is to look up a word in scans of the pages of the Wilson
>> Sanskrit dictionary, and highlight on the scanned image of the
>> relevant page the part pertaining to the word.
>>
>> You can see the current state of this for the Wilson dictionary at
>> http://www.sanskrit-lexicon.uni-koeln.de/scans/WILScan/web/index.php
>> If you enter 'azva', the page for this word is retrieved, and the
>> part of the page containing the word is emphasized.
>> For this word, 'azva', the process is quite satisfactory.
>> However, if you try the word 'rAma' or 'sItA', for instance, you see
>> that the region highlighted is not quite right.
>> The main problem is that the position of the page within the whole
>> scanned image varies, due in part to vagaries of
>> the scanning process.
>>
>> Here is where I thought OCRopus might be useful: to
>> determine the pixel coordinates of the 'bounding rectangle' of
>> the text.
>> A table of such information for each page could be fed
>> into some other program, such as ImageMagick,
>> to automate the 'normalization' of the image within the page.
>>
>> Thanks for any suggestions.
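
Tom's suggestion in this thread (take the bounding box of the text lines as the page bounding box) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not OCRopus code: it assumes the recognizer output is available as hOCR, where each text line is an element of class "ocr_line" whose title attribute contains "bbox x0 y0 x1 y1" in pixels. The function names (text_line_boxes, page_box, crop_geometry) are hypothetical and chosen for illustration.

```python
import re
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, top-left origin

# Assumption: input is hOCR, with line elements like
#   <span class="ocr_line" title="bbox 120 85 1430 131">
LINE_TAG_RE = re.compile(r"<[^>]*\bocr_line\b[^>]*>")
BBOX_RE = re.compile(r"\bbbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


def text_line_boxes(hocr: str) -> List[Box]:
    """Return the bbox of every ocr_line element found in the hOCR text."""
    boxes: List[Box] = []
    for tag in LINE_TAG_RE.findall(hocr):
        match = BBOX_RE.search(tag)
        if match:
            x0, y0, x1, y1 = (int(v) for v in match.groups())
            boxes.append((x0, y0, x1, y1))
    return boxes


def page_box(boxes: List[Box]) -> Box:
    """Smallest rectangle that encloses all of the given line boxes."""
    if not boxes:
        raise ValueError("no text-line boxes found")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def crop_geometry(box: Box) -> str:
    """Format a box as WIDTHxHEIGHT+X+Y (ImageMagick-style geometry)."""
    x0, y0, x1, y1 = box
    return f"{x1 - x0}x{y1 - y0}+{x0}+{y0}"


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = (
        '<div class="ocr_page" title="bbox 0 0 1600 2400">'
        '<span class="ocr_line" title="bbox 120 85 1430 131">...</span>'
        '<span class="ocr_line" title="bbox 118 140 1402 190">...</span>'
        "</div>"
    )
    box = page_box(text_line_boxes(sample))
    print(box, crop_geometry(box))
```

Running this over the hOCR of every page would give the kind of per-page table Jim describes. The geometry string is in the WIDTHxHEIGHT+X+Y form that ImageMagick accepts for cropping, so it could be passed on to a separate normalization step. A stray speck recognized as a "line" would enlarge the box, so some filtering of very small boxes may be needed in practice.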
