I just tested again. I reinstalled 3.01 and tested it on a couple of images, then reinstalled 3.00 and tested it on a couple of images.
Tesseract 3.00 wins again. I'm certain that I am using the correct training data, and I have the cube data as well for 3.01. I cleared out all the files between installations, so 3.01 only has access to 3.01 data and 3.00 only has access to 3.00 data.

3.01:
LTARMER MEANWELL was at one time a very rich man. He owned large ï¬elds, and had ï¬ne ï¬ocks of sheep, and plenty of money.

3.00:
LTARMER MEANWELL was at one time a very rich man. He owned large fields, and had fine flocks of sheep, and plenty of money.

Note also that I'm on CentOS 6. Leptonica version 1.68. I also set the environment variable:

export TESSDATA_PREFIX=/usr/local/share/tessdata

It's the same on all images I have tested. Any ideas?

On Dec 14, 11:03 am, patrickq <[email protected]> wrote:
> I have had the opposite experience: Tess 3.01 beats 3.00 often - the
> reverse does happen but rarely.
>
> Note that Tess 3.01 will do WORSE if using Tess 3.00 trained data - is
> it possible you are not using the Tess 3.01 trained data?
>
> On Dec 13, 9:22 am, Alasdair <[email protected]> wrote:
>
> > For some reason Tesseract 3.01 is giving much poorer accuracy than
> > 3.00 on exactly the same images. This is the same whether using the
> > 3.01 English trained data or the 3.00 English trained data. (I would
> > expect it to at least be the same using the 3.00 trained data.)
>
> > Am I the only person experiencing this?
>
> > If so, what have I done wrong?
>
> > I'm executing it like this:
> > tesseract "test.tif" "testout" -l eng
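
To pin down exactly which words differ between the 3.00 and 3.01 outputs given earlier in this message, a word-level diff may help. This is only a rough sketch: the function name `differing_words` is made up for illustration, and the two sample strings are copied from the outputs above rather than read from real Tesseract output files.

```python
import difflib


def differing_words(reference: str, candidate: str) -> list[tuple[str, str]]:
    """Return (reference_words, candidate_words) pairs where the texts disagree."""
    ref_words = reference.split()
    cand_words = candidate.split()
    # autojunk=False keeps difflib from discarding frequent words on long inputs.
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append((" ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2])))
    return pairs


# Sample text taken from the two outputs in this message.
out_300 = "He owned large fields, and had fine flocks of sheep, and plenty of money."
out_301 = "He owned large ï¬elds, and had ï¬ne ï¬ocks of sheep, and plenty of money."

for old, new in differing_words(out_300, out_301):
    print(f"{old!r} -> {new!r}")
```

On a full page, the same function could be fed the contents of the two output text files, which would show whether the disagreements are limited to a few character sequences or spread across the whole text.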

