[tesseract-ocr] Tesseract confidence level sample code for java

2014-11-19 Thread bala
I would like to get sample code for reading the Tesseract confidence level. Is any sample code available? -- You received this message because you are subscribed to the Google Groups tesseract-ocr group. To unsubscribe from this group and stop receiving emails from it, send an email to

[tesseract-ocr] Re: Tesseract confidence level sample code for java

2014-11-19 Thread Allistair C
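    // Point the engine at the directory that contains the tessdata folder.
    baseApi.init(filesDir.getPath() + "/tesseract/", LANG);
    baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_BLOCK);
    baseApi.setImage(bmp);
    // Read the recognized text first, then the mean confidence for it.
    OCRResult result = new OCRResult(baseApi.getUTF8Text(), baseApi.meanConfidence());
    // Release native resources once the results have been copied out.
    baseApi.end();

Note: OCRResult is my own object for holding values.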
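The OCRResult class itself is not shown in the thread. As a minimal sketch of what such a holder might look like, assuming the confidence is an integer from 0 to 100 (which is what the tess-two wrapper's meanConfidence() is documented to return), something like the following could work. Every name inside the class, including isReliable, is hypothetical and not the poster's actual code.

```java
// Hypothetical value holder: the original poster's OCRResult is not shown,
// so the fields and methods here are assumptions made for illustration.
final class OCRResult {
    private final String text;
    private final int meanConfidence; // assumed range: 0..100

    OCRResult(String text, int meanConfidence) {
        if (meanConfidence < 0 || meanConfidence > 100) {
            throw new IllegalArgumentException(
                "confidence must be in 0..100: " + meanConfidence);
        }
        // Treat a null recognition result as empty text.
        this.text = (text == null) ? "" : text;
        this.meanConfidence = meanConfidence;
    }

    String getText() {
        return text;
    }

    int getMeanConfidence() {
        return meanConfidence;
    }

    // True when the mean confidence reaches the caller-chosen threshold.
    boolean isReliable(int threshold) {
        return meanConfidence >= threshold;
    }
}
```

A caller could then discard or re-scan low-confidence results, for example by checking result.isReliable(60) before using result.getText(). The threshold is application-specific and would need tuning against real images.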

[tesseract-ocr] What preprocessing operations are performed by Tesseract OCR?

2014-11-19 Thread Sirius Lee
I couldn't find detailed documentation, and I don't feel like browsing the source code. I don't want to redo Canny edge detection, for example, if the Tesseract engine already does it. I need to know its steps before doing mine.

Re: [tesseract-ocr] text2image infinite ScrollView: Waiting for server

2014-11-19 Thread Ryan Baumann
Hi all, Apologies for resurrecting an old thread, but I ran into the same issue and have documented a workaround in this blog post: http://ryanfb.github.io/etc/2014/11/19/installing_tesseract_training_tools_on_mac_os_x.html And the associated modified Homebrew formula:

Re: [tesseract-ocr] Need Help with extracting info from Invoice

2014-11-19 Thread Vinay Matam
Hi Art Rhyno, Thanks for your response. I will check it. On Wednesday, November 19, 2014 5:42:56 AM UTC+5:30, Art Rhyno wrote: Should I first crop the specific portions of the image into separate rectangles and then OCR them individually? Hi Vinay, If you are getting better OCR results

[tesseract-ocr] Re: Need Help with extracting info from Invoice

2014-11-19 Thread Vinay Matam
Thanks, Allistair, for replying. I have a wide variety of invoices that do not follow any single format. All of them contain the necessary fields I mentioned earlier in my post, but those fields may appear at different locations in the image. Our solution should be able to

Re: [tesseract-ocr] Covering ASCII Extended range.

2014-11-19 Thread Ryan Dev
I'm dealing with font subsets, and I generate one image per font, so there is no reading order, though I've seen Latin and CJK in the same font subset. If OSD only gives reading, orientation, and text order, it is not going to give me anything useful. Plus, I have the font, so I could get some