Have you considered using a recent version of tesseract ;-) ? Versions 3.0.2 and 3.0.3 seem too old to me (I am not even aware that they were released). The current stable version is 3.02.02 and the next release will be 3.03.
Zdenko

On Tue, Mar 11, 2014 at 7:41 AM, <[email protected]> wrote:
> Just made a fresh install of version 3.0.3 and decided to test it out on
> one the image of this test page
> http://iupr1.cs.uni-kl.de/~tmb/ocropus-results/g1000p22/
>
> Basically, I got better accuracy with 3.0.2 than with 3.0.3 using the
> default english model.
>
> For the test, I'm using this image (single page)
> http://iupr1.cs.uni-kl.de/~tmb/ocropus-results/g1000p22/0002.nrm.png
>
> As you can see on the image below, there really isn't any skew or border.
> Also changed the dpi from 72 to 300
>
> <http://papyrus.jolome.com/300.png>
>
> When I run tesseract 3.0.3 on this image using various psm, the most
> accurate result is :
>
> =========
>
> POPULAR TALES
>
> 0'
>
> THE WEST HIGHLANDS.
>
> __._
>
> XVIII.
>
> THE CHEST.
> Mlanhanhy.
>
> BEFOREthhthmwnflnglndhewinhed
>
> bloehil son with nwifo bofomhonhould depart
> Hinnnnidhehadbeﬁorgofornwife;mdhognve
> himhnlfuhnndmdponndltogothcr. Homtfor-
> wudtholengthofuday,lndwhanthonightumoho
> mtinblhahlrytomyinit Homtdmtoa
> chmbarwithngoodﬁninfrontofhim;mdwhenhe
> hadgoflenmuhthomofthohouumtdownto
> hikme Hobldthomofflnhonnmojour-
> noyonwhichhom Thammofthehountold
> himhneednotgofuflha; “chasm-little
> homoppodbtohhdupingchmhugthtdnmm
> ofthohouuhndthmﬁnodmghmnndifhowonld
> undinthewindowofhilchmbuinthomoming,
> Mbowouldmonom-noflmooming
> hot-alt Thattheymulllihuchothmnnd
> hoouldnotdisﬁngniahomﬁomhoﬁot,but
>
> voun. n \-
>
> Eii‘
>
> (r.- if-
>
> ========
>
> In tesseract 3.0.2, I get a more accurate running the same command:
>
> POPULAR
> THE
> TALES
> WEST HIGHLANDS.
>
> THE CHEST.
> From In MecGeechy, Ieley.
> B‘i§°.:‘.‘%..:“;:.*::a: .r::..'..‘::"..%..:“.‘.‘..,.':r.18
> llie eon said he had better go for 5 wife ; and he gave
> him lmlf a hundred pounds to get her. He went for-
> wen! the length of is day, and when the night came he
> wentintoehoetelrytoeteyinit. Hewentdowntoe
> chamber with e good fire in front of him; and when he
> bed gotten meet, the men of the house went down to
> telk to him. He told the men of the house the jour-
> no] on which he was The men of the house told
> himheneednot go further; that therewee e little
> house opposite to his sleeping chsmber; that the men
> of the house had three fine daughters; end if he would
> stand in the window of his chamber in the morning,
> that he would see one after enother coming ‘to dreee
> herself. That they were ell like eech other, and that
> In could not one from the other, but that
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
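
One way to put a number on "better accuracy" when comparing the two outputs quoted in this thread is to score each OCR result against a hand-typed reference transcription. What follows is only a minimal sketch under stated assumptions, not something from the thread: it uses `difflib.SequenceMatcher` from the Python standard library, the helper name `ocr_similarity` is hypothetical, and the reference string is an assumed transcription typed for illustration. The two OCR strings are single lines copied from the outputs quoted above.

```python
from difflib import SequenceMatcher


def ocr_similarity(reference: str, ocr_text: str) -> float:
    """Return a 0..1 similarity ratio between a reference text and OCR output."""
    # Collapse runs of whitespace so line wrapping does not affect the score.
    ref = " ".join(reference.split())
    hyp = " ".join(ocr_text.split())
    # autojunk=False disables difflib's heuristic that ignores very frequent
    # characters in long sequences, which would distort scores on full pages.
    return SequenceMatcher(None, ref, hyp, autojunk=False).ratio()


if __name__ == "__main__":
    # Assumed reference: typed by hand for illustration, not taken from the thread.
    reference = "him half a hundred pounds to get her."
    # One line from each OCR output quoted in the thread.
    out_302 = "him lmlf a hundred pounds to get her."
    out_303 = "himhnlfuhnndmdponndltogothcr."
    print("3.0.2 line:", round(ocr_similarity(reference, out_302), 3))
    print("3.0.3 line:", round(ocr_similarity(reference, out_303), 3))
```

A ratio like this is a rough proxy rather than a proper character error rate, but it should be enough to check whether a change of version, page segmentation mode, or image resolution moves the result in the right direction on the same page.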

