confidence level of tesseract ocr output

2012-11-12 Thread Li Rong
Hi everyone, can tesseract-ocr output a confidence level for its results? I need not only the character but also a confidence value. Does anybody know? Thank you. -- You received this message because you are subscribed to the Google Groups tesseract-ocr group. To post to this group, send

empty Page

2012-11-12 Thread Tseason
Hello guys, I need your help. I am trying to train data for a new language, but when I make a box file for a BMP file, the message "Empty Page!!" appears. I don't know why this happens, and it has puzzled me for a long time. I found that if the character 1 appears in the image, the problem occurs, and if not, everything is fine. By the way, 1 here is the digit. --

Re: confidence level of tesseract ocr output

2012-11-12 Thread Quan Nguyen
If you build using the latest source (r806, http://code.google.com/p/tesseract-ocr/source/detail?r=806), you'll get the word confidence in the hOCR output. On Monday, November 12, 2012 1:54:34 AM UTC-6, lirong wrote: Hi everyone, can tesseract-ocr output a confidence level for its results? I
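For readers who want to pull the word confidence out of hOCR output, here is a minimal sketch. It assumes each word is a span with class ocrx_word whose title attribute carries an x_wconf value, as described by the hOCR format; the exact attribute layout produced by r806 is not shown in this thread, so the SAMPLE string and the helper name word_confidences are illustrative assumptions, not Tesseract API.

```python
import re
from html.parser import HTMLParser

# Assumed layout: title="bbox x0 y0 x1 y1; x_wconf NN" on each ocrx_word span.
WCONF = re.compile(r"x_wconf\s+(-?\d+(?:\.\d+)?)")


class WordConfParser(HTMLParser):
    """Collects (text, confidence) pairs from hOCR word spans."""

    def __init__(self):
        super().__init__()
        self.words = []    # list of (text, confidence or None)
        self._depth = 0    # > 0 while inside an ocrx_word span
        self._conf = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Nested markup (e.g. <strong>) inside a word span.
            self._depth += 1
            return
        attr = dict(attrs)
        if "ocrx_word" in (attr.get("class") or "").split():
            match = WCONF.search(attr.get("title") or "")
            self._conf = float(match.group(1)) if match else None
            self._chunks = []
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._chunks).strip()
                self.words.append((text, self._conf))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def word_confidences(hocr):
    """Return a list of (word, confidence) tuples found in an hOCR string."""
    parser = WordConfParser()
    parser.feed(hocr)
    return parser.words


# Hand-written sample in the assumed layout, not real Tesseract output.
SAMPLE = """<span class='ocr_line' title='bbox 10 10 200 40'>
<span class='ocrx_word' id='word_1' title='bbox 10 10 80 40; x_wconf 93'>Hello</span>
<span class='ocrx_word' id='word_2' title='bbox 90 10 200 40; x_wconf 71'><strong>world</strong></span>
</span>"""

if __name__ == "__main__":
    for word, conf in word_confidences(SAMPLE):
        print(word, conf)
```

Note that this gives one value per word, not per character as the original question asked; the thread snippet only mentions word confidence.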

Re: mftraining produces Missing font_properties

2012-11-12 Thread Quan Nguyen
The PowerShell script train.ps1 on the AddOns page can help automate the training process. http://code.google.com/p/tesseract-ocr/wiki/AddOns On Tuesday, May 17, 2011 2:08:53 AM UTC-5, Eyal wrote: Hi, I tried to train some letters when I ran mftraining with the parameters:

Re: tesseract-ocr does not very well on chinese

2012-11-12 Thread Daniel Bonniot de Ruisselet
Hi, On Friday, November 2, 2012 1:25:22 PM UTC+1, Rong Xiao wrote: Hi, I have tried tesseract-ocr on Chinese, but I found that it does well on only a few fonts. I want to know which fonts are included in chi_sim.traineddata. I'm also interested in this information (which fonts has it

Re: Newbie: Training tesseract

2012-11-12 Thread Mi Tran
Oh, sorry. I use Tesseract 3.02 on Windows 7 32-bit. The steps I followed:
1. Generate training images: eng.timesitalic.exp0.tif
2. Make box files: tesseract eng.timesitalic.exp0.tif eng.timesitalic.exp0 batch.nochop makebox
3. Bootstrap a new character set: tesseract

Re: empty Page

2012-11-12 Thread Mi Tran
What type of file is your image? The image must be a *.tif file.

Re: empty Page

2012-11-12 Thread zdenko podobny
On Mon, Nov 12, 2012 at 3:23 PM, Mi Tran nuon...@gmail.com wrote: What kind of your bmpFile? bmpFile must is *.tif This is not true - it can be any image type supported by Leptonica. -- Zdenko

Re: Newbie: Training tesseract

2012-11-12 Thread zdenko podobny
Also check the error messages: if you did not run shapeclustering, then mftraining should not produce any output (in version 3.02) ;-) It also looks like you forgot to rename the output files from the training tools! You need to follow the training wiki [1]! [1]

Counting pixels and dpi

2012-11-12 Thread chikev
I'd be grateful if someone could help me here. Here is my request to Zdenko and the reply. Could you perhaps help me understand, and then change the page, the meaning of: A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.)

Re: Counting pixels and dpi

2012-11-12 Thread Sven Pedersen
Measure the height of a lower case 'x' in your image using an image program, such as Gimp or the standard image viewer on your platform (such as Windows Paint or Mac Preview). If the height of a lower-case 'x' in your text is less than 20 pixels, you need to resize it or rescan your documents.
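The check above is simple arithmetic, sketched here for convenience. The 20-pixel threshold comes from the message; the function names and the idea of scaling the scan resolution by the same factor are illustrative assumptions, not part of Tesseract or its documentation.

```python
import math

MIN_X_HEIGHT_PX = 20  # threshold quoted in the message above


def scale_for_x_height(measured_px, target_px=MIN_X_HEIGHT_PX):
    """Return the resize factor needed so a lower-case 'x' reaches target_px.

    Returns 1.0 when the image is already large enough.
    """
    if measured_px <= 0:
        raise ValueError("measured x-height must be positive")
    return max(1.0, target_px / measured_px)


def rescan_dpi(current_dpi, measured_px, target_px=MIN_X_HEIGHT_PX):
    """Estimate the scan resolution that would give the target x-height.

    Assumes x-height in pixels grows linearly with scan resolution.
    """
    return math.ceil(current_dpi * scale_for_x_height(measured_px, target_px))


if __name__ == "__main__":
    # Hypothetical measurement: an 8-pixel 'x' in a 96 dpi image.
    print(scale_for_x_height(8))
    print(rescan_dpi(96, 8))
```

With the hypothetical 8-pixel measurement, the image would need to be enlarged 2.5 times, or rescanned at roughly 240 dpi.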

Re: empty Page

2012-11-12 Thread MiT
Is Leptonica a tool that supports training? On Monday, November 12, 2012 at 22:38:58 UTC+7, zdenop wrote: On Mon, Nov 12, 2012 at 3:23 PM, Mi Tran nuo...@gmail.com wrote: What kind of your bmpFile? bmpFile must is *.tif This is not true - it can be any image type

FreeOCR Problems (Does anyone here use this program?)

2012-11-12 Thread Random Terrain
Does anyone here use FreeOCR? I downloaded FreeOCR V3 a while ago and was happy with it. Then I saw that there was version 4.2, so I thought that would be even better, but Tesseract V3 that comes with it doesn't work as well as the version included with FreeOCR V3 (at least with the current

My Post Didn't Show Up

2012-11-12 Thread Random Terrain
Does it take a while to show up or is it lost in the 'pipes' of the Internet?

Re: Counting pixels and dpi

2012-11-12 Thread Kevin McCready
Cheers, that was easy! Many thanks. I wonder if Z will now change the FAQ to tell people to use an image program to do the measuring? Cheers, kevin1mccre...@gmail.com 32 Hawera Rd Kohimarama 1071 Auckland, New Zealand +64 (0)9 528 1174 home +64 (0)226 710 335 cell http://kmccready.wordpress.com

Re: tesseract-ocr does not very well on chinese

2012-11-12 Thread Zach Rothweiler
Your image appears to be only 96 dpi; try using an image at a higher dpi. On Sunday, November 4, 2012 9:00:51 PM UTC-5, Rong Xiao wrote: https://lh3.googleusercontent.com/-gwRhWSanaHo/UJcdfs8hiSI/ABQ/8jlKa2ZypFs/s1600/chi_test4.jpg such as this image. It's not very complex. On

Re: empty Page

2012-11-12 Thread Tseason
On Monday, November 12, 2012 at 10:23:26 PM UTC+8, MiT wrote: What kind of your bmpFile? bmpFile must is *.tif I have tried a *.tif file, but it didn't work. Any suggestions?

Re: My Post Didn't Show Up

2012-11-12 Thread zdenko podobny
As far as I know this list is moderated, i.e. your first post has to be approved by a moderator... But I am not familiar with the details (I am not a moderator ;-) ) -- Zdenko On Mon, Nov 12, 2012 at 8:27 PM, Random Terrain replayabil...@randomterrain.com wrote: Does it take a while to show up or is

Re: Newbie: Training tesseract

2012-11-12 Thread zdenko podobny
If you are serious about your training project, please invest the time to read the wiki (once again if necessary). It is there. -- Zdenko On Tue, Nov 13, 2012 at 1:20 AM, Mi Tran nuon...@gmail.com wrote: Thanks zdenop, I have run shapeclustering and read the training wiki. But it still has an error.

Re: empty Page

2012-11-12 Thread zdenko podobny
Leptonica is the library that handles images for Tesseract. -- Zdenko On Tue, Nov 13, 2012 at 1:36 AM, MiT nuon...@gmail.com wrote: Is Leptonica a tool that supports training? On Monday, November 12, 2012 at 22:38:58 UTC+7, zdenop wrote: On Mon, Nov 12, 2012 at 3:23 PM, Mi Tran