Hi,
I am experimenting with Tesseract-OCR (3.02) to extract text (without training) from a pool of sequence images (a sample is given below). The issue I am currently facing is that I get almost correct results through the GUIs (http://sourceforge.net/projects/tesseract-gui/ on Ubuntu and http://vietocr.sourceforge.net/ on Windows) but only 50% accuracy when I use Tess4J to run the OCR programmatically. Does anyone know the reason behind this? I need to get better results through the program.

<https://lh6.googleusercontent.com/-cNHxknh9iZc/U0LHgWnCz0I/AAAAAAAAARc/ujBaUVAHqPg/s1600/1.jpg>

*OCR result using GUI*

CLUSTAL 2.0.2 multiple sequence alxgnment
907307 wvmqsscwrsascmmwnznmwcqLmm:u.wmswr::Qn'1vQz-rrm>rm'wn:L1'ns 60
PC7306 ———ELERSCYW'FSRSG!iNfl\DADNYCRLEDAELWVTSWEEQK!‘VQ1-D-IIGPVNTWMGLHDQ 216
PC7307 DESWIOJVDGTDYRHNYICNWAVTQPDVMHGHELGGSECVEVQPDGRWIDDFCLQVYEWVC 120
P07306 uspwxwvuarm51‘crmwwzqmnwrcacLsssmczuatrnnsnwlnnvcgmavnwvc 276
PC7307 ex 122
PC7306 :— 277

*OCR result using Tess4J API (programmatic access)*

CLUSTAL 2.0.2 multxple sequence alignment
207307 .4 nnQGSCYWFSESGR7lWI\EAEKYC WINSVIEEQKFIVQHTMPFNTWIGLTD5 so
E07306 ———n.EnscYw1~'sI\ss1vm\D1\Dmc wAm,wvTsIvE=Q!<rvQx-n-IIL:1>vuTm4GLI-11:0 216
P0 7 3 0 7 .. AnNYIGWAVTQPDNWHGHELGGSIIDCVEVQPDGNIHIDDFC LQVY nwvc 12 0
P07306 ALJILJ » . nwxfiw 1. 276
P0730‘! ax 122
P137306 :— 277I

*Second Question:* Do I need any training to improve the OCR result? All the images I have use the Courier font (as shown in the attached image above). Moreover, I only need to extract alphabetic characters, no digits or special characters. Another important point: the alphabetic strings always come without spaces.
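Since I only need letters, one fallback I can think of is to post-filter whatever text comes back from the OCR call. Below is a minimal sketch in plain Java, assuming the recognized text is already available as a String; `SequenceFilter` and `lettersOnly` are hypothetical names of my own, not part of Tesseract or Tess4J. It only strips unwanted characters and does nothing for letters that were recognized as the wrong letter.

```java
// Hypothetical helper for illustration; not part of Tesseract or Tess4J.
final class SequenceFilter {
    private SequenceFilter() {}

    /** Returns the input with every character other than A-Z and a-z removed. */
    static String lettersOnly(String ocrText) {
        StringBuilder sb = new StringBuilder(ocrText.length());
        for (int i = 0; i < ocrText.length(); i++) {
            char c = ocrText.charAt(i);
            // Keep ASCII letters only; digits, spaces and punctuation are dropped.
            if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

It would be applied to a single sequence field rather than a whole row, because the row identifiers (for example P07306) contain digits that would otherwise be lost.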
I tried disabling the dictionary because I do not need it, but that did not help to improve my results. Any tip or technique that can help me improve my results programmatically will be highly appreciated.

Thanks

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

