Re: [tesseract-ocr] Does any parameter to control ocr region?

2017-06-06 Thread Duck
Because of my company's project, I can't change the version myself. But I changed pagesegmode to singleblock and that fixed it. Why? I thought singlechar was the better setting? -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from

Re: [tesseract-ocr] Re: How can I convert font data from ver 3.02 to 3.05

2017-06-06 Thread ShreeDevi Kumar
As far as I know, the traineddata files for 3.04 (also usable for 3.05) are github versions of the files posted on code.google.com for 3.02. So, I would think 3.02 traineddata files will work with 3.05 but newer files will not work with 3.02. Best is to give it a try and report your results.

[tesseract-ocr] Re: How can I convert font data from ver 3.02 to 3.05

2017-06-06 Thread RND Android
Sorry, I meant 3.02 to 3.05. Addition: Is there any way that I can use 3.02 font data for tesseract 3.05? On Wednesday, June 7, 2017 at 10:58:03 AM UTC+7, RND Android wrote: > Hi, I have some trained data file for several fonts which successfully used for tesseract ver 5.02, now my company

[tesseract-ocr] Re: use jTesseractEdit training but box edit is empty

2017-06-06 Thread Shaw Ryan
On Tuesday, June 6, 2017 at 10:49:47 PM UTC+8, Quan Nguyen wrote: > You may want to attach your TIFF/Box pair here so people can look and help. > On Monday, June 5, 2017 at 8:58:19 PM UTC-5, Shaw Ryan wrote: >> I have created a box file >> On Monday, June 5, 2017 at 11:24:36 PM UTC+8, Quan Nguyen wrote: >>> You'd need

[tesseract-ocr] Re: use jTesseractEdit training but box edit is empty

2017-06-06 Thread Shaw Ryan
Thank you. I have uploaded the box and TIFF files. Please help. On Monday, June 5, 2017 at 6:27:14 PM UTC+8, Shaw Ryan wrote: > How can I edit the data?

Re: [tesseract-ocr] Re: Store rotated pages

2017-06-06 Thread Thomas Klettke
Thanks - I've figured it out, and have a solution that works now:
- Input file is a multi-page TIFF.
- Some pages need to be rotated (90, 180, or 270 degrees - although other angles should work as well).
- Output is a multi-page searchable PDF
- Tools used:
  - Linux (Fedora

[tesseract-ocr] Image preprocessing of images clicked from camera

2017-06-06 Thread s4resolve
I have to recognize text from images. The problems I am facing are: 1. Sometimes the text and its background differ only slightly in pixel value once grayscale is applied, for example when the text is light blue and the background is dark blue. 2. How can I binarize such that I only have
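One common way to binarize a low-contrast grayscale image is Otsu's method, which picks the threshold that best separates the two pixel populations. The sketch below is a pure-Python illustration on a flat list of 0-255 gray values, not something proposed in the thread; the function names `otsu_threshold` and `binarize` are made up for this example, and a real pipeline would more likely use an image library.

```python
def otsu_threshold(gray_values):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1

    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight_low = 0      # number of pixels with value <= t
    sum_low = 0         # sum of those pixel values
    best_t = 0
    best_score = -1.0

    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue            # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor.
        score = weight_low * weight_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score = score
            best_t = t
    return best_t


def binarize(gray_values, threshold):
    """Map values above the threshold to 255 (white) and the rest to 0 (black)."""
    return [255 if v > threshold else 0 for v in gray_values]
```

For light text on a dark background, the result can be inverted before OCR if dark text on a white background is preferred.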

Re: [tesseract-ocr] Does any parameter to control ocr region?

2017-06-06 Thread ShreeDevi Kumar
Try the latest code from http://www.emgu.com/wiki/index.php/Version_History#Emgu.CV-3.2.0
I converted the bmp to png and tried it with command-line tesseract 4, and got the correct result.
$ tesseract I.png stdout --oem 1 --psm 6
D
$ tesseract I.png stdout --oem 0 --psm 6
D
The original .bmp also works.
$
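The two command-line runs above can be scripted for comparison. This is a minimal sketch that assumes a tesseract 4 binary on the PATH; the helper names `build_command` and `run_ocr` are invented for illustration. Page segmentation mode 6 is the single-block mode and 10 is the single-character mode, as listed by `tesseract --help-psm`.

```python
import shutil
import subprocess

# Page segmentation modes discussed in this thread.
PSM_SINGLE_BLOCK = 6   # assume a single uniform block of text
PSM_SINGLE_CHAR = 10   # treat the image as a single character


def build_command(image_path, oem, psm):
    """Return the argument list for one command-line OCR run printing to stdout."""
    return ["tesseract", image_path, "stdout", "--oem", str(oem), "--psm", str(psm)]


def run_ocr(image_path, oem=1, psm=PSM_SINGLE_BLOCK):
    """Run tesseract and return the recognized text, or None if the binary is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_command(image_path, oem, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `run_ocr("I.png", oem=0)` and `run_ocr("I.png", oem=1)` would repeat the legacy-engine and LSTM-engine runs shown in the message.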

[tesseract-ocr] Re: Image precprocessing before providing it to Tesseract

2017-06-06 Thread Ciaran McCormack
You'll probably have to use something like OpenCV. Generate a histogram of the image to find the most common color. Find contours and limit them to contours with 4 sides (i.e. rectangles). Color in each contour with the calculated color. However, an easier approach, instead of coloring in the rect
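The histogram and fill steps can be sketched without OpenCV. The example below is a pure-Python stand-in that works on a small grid of (r, g, b) tuples; it does not perform contour detection, so the rectangle coordinates are assumed to be known already, and the function names `most_common_color` and `fill_rect` are hypothetical.

```python
from collections import Counter


def most_common_color(pixels):
    """Return the most frequent color in a 2D grid of (r, g, b) tuples."""
    histogram = Counter(color for row in pixels for color in row)
    color, _count = histogram.most_common(1)[0]
    return color


def fill_rect(pixels, top, left, bottom, right, color):
    """Return a copy of the grid with rows [top, bottom) and columns [left, right) painted."""
    out = [list(row) for row in pixels]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = color
    return out
```

With OpenCV the same idea would operate on image arrays and on rectangles found by contour detection instead of hand-supplied coordinates.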

[tesseract-ocr] Does any parameter to control ocr region?

2017-06-06 Thread Duck
I need some help. The following picture shows my problem: it is always recognized as "I". I traced it for a while and found that the OCR engine segments the image again and takes out the middle area of the "D". I tried a lot of parameters, but I can't disable the segmentation process. Does anyone have any idea? Or the only

[tesseract-ocr] Re: use jTesseractEdit training but box edit is empty

2017-06-06 Thread Quan Nguyen
You may want to attach your TIFF/Box pair here so people can look and help. On Monday, June 5, 2017 at 8:58:19 PM UTC-5, Shaw Ryan wrote: > I have created a box file > On Monday, June 5, 2017 at 11:24:36 PM UTC+8, Quan Nguyen wrote: >> You'd need to provide the box file also. If you do not have one, you can

[tesseract-ocr] How to covert font training data file from ver 5.02 to 5.03?

2017-06-06 Thread RND Android
Hi, I have some trained data files for several fonts which were used successfully with tesseract ver 5.02. Now my company has upgraded tesseract to ver 5.03, so how can I convert those trained data fonts from ver 5.02 for use with ver 5.03? Please help.

[tesseract-ocr] Re: Some Clue on Generating Probablity scores for each character/word

2017-06-06 Thread Karuna Goyal
Can anyone help with getting the probability scores for all the similar characters of a word? I tried, but I only get the probability score for the highest-probability character. On Friday, September 28, 2007 at 3:21:34 PM UTC+5:30, Basu wrote: > Hi, > I am trying hard on generating some