Re: Get italic info from Tesseract 3 command line?

2011-04-28 Thread Quan Nguyen
http://groups.google.com/group/tesseract-ocr/browse_thread/thread/2f408e3f9b054edb http://code.google.com/p/tesseract-ocr/issues/detail?id=377#c5 On Apr 28, 7:54 am, Nikse nikse...@gmail.com wrote: I can see that in baseapi.cpp in method GetHOCRText there seems to be support for italic in line

Re: jTessBoxEditor

2011-04-14 Thread Quan Nguyen
Version 0.2 Release: - Add a provision to set font for the Box Coordinates table - Incorporate a pangram into the Font dialog http://sourceforge.net/projects/vietocr/files/ On Apr 10, 8:01 am, Quan Nguyen nguyen...@gmail.com wrote: jTessBoxEditor is a box editor for Tesseract OCR data

Re: Font_properties.

2011-04-12 Thread Quan Nguyen
I think you can name your image eng.arialblack.exp0.tif and use arialblack as the fontname in font_properties. On Apr 8, 6:00 am, 78yrsold withblessi...@gmail.com wrote: According to wiki instruction = where fontname is a string naming the font (no spaces allowed!), and italic, bold, fixed,
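
A minimal sketch of the naming convention above, assuming the font_properties entry format described in the Tesseract 3 training wiki (fontname followed by 0/1 flags for italic, bold, fixed, serif, fraktur). The helper name font_properties_line is hypothetical, and marking Arial Black as bold is only an illustrative choice.

```python
import re


def font_properties_line(image_name, italic=0, bold=0, fixed=0, serif=0, fraktur=0):
    """Build one font_properties entry from a training image name.

    Assumes the lang.fontname.expN.tif naming pattern; the font name is
    taken from the second dot-separated field and must not contain spaces.
    """
    match = re.fullmatch(r"[^.]+\.([^.\s]+)\.exp\d+\.tif", image_name)
    if match is None:
        raise ValueError(f"unexpected training image name: {image_name!r}")
    fontname = match.group(1)
    flags = (italic, bold, fixed, serif, fraktur)
    # Each flag is written as 0 or 1, separated by single spaces.
    return " ".join([fontname, *(str(int(bool(flag))) for flag in flags)])


print(font_properties_line("eng.arialblack.exp0.tif", bold=1))
```

The printed line could then be appended to the font_properties file used during training.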

jTessBoxEditor

2011-04-10 Thread Quan Nguyen
jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

Re: how to dither this file properly for tesseract

2011-03-29 Thread Quan Nguyen
After simply rescaling the image to 300 DPI, I got a nearly perfect result. It is interesting to note that with English data, find was misclassified as find -- the dictionary could not get it right. The Windows Search Engine The search engine in Windows XP will automatically OCR a tiff image
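
As a rough illustration of the rescaling step, the sketch below computes the pixel size an image would need to reach 300 DPI. The function name size_for_target_dpi is hypothetical, and the 96 DPI source resolution in the example is an assumption (a common screen value), not something stated in the thread.

```python
def size_for_target_dpi(width, height, source_dpi, target_dpi=300):
    """Return the (width, height) in pixels after rescaling to target_dpi."""
    if source_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    # The scale factor is simply the ratio of the two resolutions.
    scale = target_dpi / source_dpi
    return round(width * scale), round(height * scale)


# Assumed example: a 1024x768 image captured at 96 DPI.
print(size_for_target_dpi(1024, 768, source_dpi=96))
```

The resulting size can be passed to whatever image tool performs the actual resampling.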

Re: Automate Tesseract 3.01 language data generation process

2011-03-28 Thread Quan Nguyen
For WinXP, you'd want to click on the "Download the Windows Management Framework Core for Windows XP and Windows Embedded package now" link. On Mar 27, 10:37 pm, Sriranga(78yrsold) withblessi...@gmail.com wrote: Sorry, I tried to download from Download Windows ph2.0 but instead of download it will

Re: Automate Tesseract 3.01 language data generation process

2011-03-28 Thread Quan Nguyen
].words_list.txt, [lang].frequent_words_list.txt, etc. # Automate Tesseract 3.01 language data pack generation process. @author: Quan Nguyen @date: 28 Mar 2011 The script file should be placed in the same directory as Tesseract's binary executables. Run PowerShell as Administrator and allow

Re: how to pass image directly to tesseract

2011-03-28 Thread Quan Nguyen
You certainly can use tessdll.dll of Tess 2.04. A 3.0x DLL is not yet available. On Mar 28, 11:53 am, zl2k kdsfin...@gmail.com wrote: hi, all My application will generate bunch of separate binarized characters and I need to feed the ocr engine for each of them. It will be very costly if save

Re: tesseract.exe has stopped working on win2008 r2

2011-03-26 Thread Quan Nguyen
The image appears to have been heavily compressed. OCRing the whole image did not yield anything. Doing it blockwise, I got some results, but they were not very accurate: Ch Juhe 24, 2@@9 the ACHP vctect ct: revisect teccmmehdettcns tcr mee_s1es-muhqes-t'ube[[e (NR/H~ ‘evictetnce ct tmmuhity’ requtrementstcr

Re: Use tesseract-ocr in android

2011-03-15 Thread Quan Nguyen
There's already an Android port, I believe. http://www.itwizard.ro/interfacing-cc-libraries-via-jni-example-tesseract-163.html On Mar 15, 6:08 am, jeni jeenaraju...@gmail.com wrote: How can I integrate tesseract-ocr http://code.google.com/p/tesseract-ocr/ in my Android..

Re: Tesseract 3.00

2011-03-13 Thread Quan Nguyen
There are some GUI frontends that you can use, such as VietOCR, which are available as Java and .NET apps. http://vietocr.sf.net On Mar 13, 6:13 pm, Onion onionzwie...@gmail.com wrote: Ok, thanks. That will be too complicated for me to use. Will have to uninstall it.

Re: VietOCR v2.0/3.1 VietOCR.NET v2.0 Releases

2011-03-06 Thread Quan Nguyen
VietOCR v2.0.1/v3.1.1 VietOCR.NET v2.0.1 Release * Fix a bug which hangs the program if x.DangAmbigs.txt contains entries starting with an equal symbol * Improve postprocessing performance by caching the word list used; reload only if changes * Fix a bug that crashes the program when

Re: Tesseract and Windows 7 64 Bit

2011-02-27 Thread Quan Nguyen
I have no problem running Tess 3.01 on my Win7 64-bit. I'm not sure if it needs Microsoft Visual C++ 2008 SP1, but here's the link: http://www.microsoft.com/downloads/details.aspx?FamilyID=ba9257ca-337f-4b40-8c14-157cfdffee4e&displaylang=en On Feb 26, 4:24 pm, andy_syme andy_s...@hotmail.com

Re: what am i missing? tesseract runs but no output

2011-02-18 Thread Quan Nguyen
I ran test_page.png through VietOCR 3.1 with Screenshot Mode enabled and got acceptable results back. Since it's a Java program, it certainly can run on OS X, provided that you build the Tess engine. And if Ghostscript is installed, VietOCR can read PDF too. On Feb 18, 10:54 am, Bob Kuo
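
For readers who want to turn a PDF into images for Tesseract by hand, here is a hedged sketch that builds a Ghostscript command line. It is not taken from VietOCR and may differ from what VietOCR does internally. The helper name ghostscript_command, the file names, and the choice of the png16m device at 300 DPI are assumptions; the executable is usually gs on Linux and OS X, while Windows builds use a different name such as gswin32c.

```python
def ghostscript_command(pdf_path, output_pattern, dpi=300, device="png16m", executable="gs"):
    """Build a Ghostscript argument list that rasterizes a PDF page by page.

    output_pattern should contain a %d-style placeholder (for example
    page-%03d.png) so that each page goes to its own file.
    """
    return [
        executable,
        "-dNOPAUSE",            # do not pause between pages
        "-dBATCH",              # exit when the input is processed
        "-dSAFER",              # restrict file access from the PDF
        f"-sDEVICE={device}",   # output image format
        f"-r{dpi}",             # rendering resolution
        f"-sOutputFile={output_pattern}",
        pdf_path,
    ]


# Hypothetical file names, shown only to illustrate the resulting command.
print(" ".join(ghostscript_command("scan.pdf", "page-%03d.png")))
```

The list can be handed to subprocess.run once Ghostscript is installed; the resulting images are then ordinary input for Tesseract.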

VietOCR v2.0/3.1 VietOCR.NET v2.0 Releases

2011-02-07 Thread Quan Nguyen
A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements: * Add support for spellcheck suggestion in context menu * Improve program accessibility and usability * Add support for downloading and installing language data packs and appropriate

Re: VietOCR v2.0/3.1 VietOCR.NET v2.0 Releases

2011-02-07 Thread Quan Nguyen
to let tesseract OCR work with pdf documents. Thank you very much in advance for your kind response. With Best Regards, Sochenda On Tue, Feb 8, 2011 at 7:56 AM, Quan Nguyen nguyen...@gmail.com wrote: A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following

Re: VietOCR v2.0/3.1 VietOCR.NET v2.0 Releases

2011-02-07 Thread Quan Nguyen
of tesseract into code using any language. I don't have long experience in coding, but I learn it fast. Thank you in advance for letting me know this. Best Regards, Sochenda On Tue, Feb 8, 2011 at 10:36 AM, Quan Nguyen nguyen...@gmail.com wrote: Sochenda, It uses Ghostscript

VietOCR v3.0 Release

2010-10-03 Thread Quan Nguyen
A Java GUI frontend for Tesseract OCR engine. The release includes the following updates: * Upgrade Tesseract OCR engine to 3.0 * Replace old format (2.0x) language data with new format (3.0) language data http://vietocr.sf.net

VietOCR v1.9 Release

2010-10-02 Thread Quan Nguyen
A Java GUI frontend for Tesseract OCR engine. The release includes the following fixes and improvements: * Integrate a Java binding for Hunspell library to provide spellchecking and spellcheck-as-you-type functionality. Include English and Vietnamese dictionaries * Add support for a custom

Re: VietOCR

2010-09-29 Thread Quan Nguyen
Go to Settings/Options and set the path to either /usr/bin, /usr/local/bin, or wherever your Tesseract binary executable resides. The path will be preset in the next release. On Sep 29, 12:24 am, Vino vinothini@gmail.com wrote: please tell me the way to run the VietOCR i am using ubuntu-8.04 how to set
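
To find out which of those directories actually holds the binary before entering it in the settings, something like the sketch below may help. The function name find_tesseract_dir is hypothetical; the default candidate directories are the two named above.

```python
import shutil


def find_tesseract_dir(candidate_dirs=("/usr/bin", "/usr/local/bin"), name="tesseract"):
    """Return the first candidate directory containing an executable `name`, or None."""
    for directory in candidate_dirs:
        # shutil.which checks that the file exists and is executable.
        if shutil.which(name, path=directory):
            return directory
    return None


print(find_tesseract_dir())
```

If it prints None, the binary lives elsewhere (or is not installed) and the path has to be located manually.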

Re: jtOCR

2010-09-27 Thread Quan Nguyen
That program is obsolete and no longer supported. Try VietOCR. And be sure to install tesseract-ocr first. http://vietocr.sf.net On Sep 27, 3:44 am, Vino vinothini@gmail.com wrote: When i execute the command java -jar jtOCR.jar in ubuntu-8.04 i got the following exception

Re: OCR of Screenshots

2010-09-09 Thread Quan Nguyen
by a factor of 2 without interpolation and then by a variable factor with interpolation to the needed size is a simple way to get some sharpening and some separation between characters. On Aug 31, 7:26 pm, Quan Nguyen nguyen...@gmail.com wrote: Hi Ian, I'm implementing a feature
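
The two-stage upscaling described in the reply above can be sketched in plain Python on a grayscale image stored as a list of rows. This is only an illustration: the function names are hypothetical, and bilinear interpolation is assumed for the second stage since the message does not say which interpolation was meant.

```python
def upscale_nearest_2x(pixels):
    """Double the image size by repeating each pixel (no interpolation)."""
    out = []
    for row in pixels:
        doubled = [value for value in row for _ in range(2)]
        out.append(doubled)
        out.append(list(doubled))
    return out


def resize_bilinear(pixels, new_width, new_height):
    """Resize to an arbitrary size using bilinear interpolation."""
    old_height, old_width = len(pixels), len(pixels[0])
    out = []
    for y in range(new_height):
        # Map the output row back to a fractional source row.
        src_y = y * (old_height - 1) / (new_height - 1) if new_height > 1 else 0.0
        y0 = int(src_y)
        y1 = min(y0 + 1, old_height - 1)
        fy = src_y - y0
        row = []
        for x in range(new_width):
            src_x = x * (old_width - 1) / (new_width - 1) if new_width > 1 else 0.0
            x0 = int(src_x)
            x1 = min(x0 + 1, old_width - 1)
            fx = src_x - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        out.append(row)
    return out


def two_stage_upscale(pixels, new_width, new_height):
    """First 2x without interpolation, then to the needed size with interpolation."""
    return resize_bilinear(upscale_nearest_2x(pixels), new_width, new_height)


for row in two_stage_upscale([[0, 255], [255, 0]], 6, 6):
    print(row)
```

In practice an imaging library would do both stages far faster; the sketch only shows the order of operations.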

Re: feature request: ability to scan teeny images

2010-09-06 Thread Quan Nguyen
I tried the image on VietOCR. Both Java and .NET versions, with the Screenshot Mode selected, are able to recognize it. On Sep 3, 12:15 pm, rogerdpack rogerpack2...@gmail.com wrote: Hi all. Noticed that tesseract seems unable to scan teeny images for letters, ex:

VietOCR v1.8 VietOCR.NET v1.8 Releases

2010-09-06 Thread Quan Nguyen
A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements: * Display image information * Add Screenshot Mode, which rescales low-resolution images to 300 DPI to be more suitable for OCR operations * Read output and error streams to

Re: OCR of Screenshots

2010-09-01 Thread Quan Nguyen
could visually confirm if this looks to be the case. Maybe you could upload a sample screengrab and explain what it gets right and which errors it gets (maybe by drawing on the image)? i. On 1 September 2010 03:26, Quan Nguyen nguyen...@gmail.com wrote: Hi Ian, I'm implementing a feature

Re: OCR of Screenshots

2010-08-31 Thread Quan Nguyen
it worked fine:http://ianozsvald.com/2010/05/17/extracting-keyword-text-from-screenc... What problems are you seeing when you try tesseract? Ian. On 30 August 2010 23:46, Quan Nguyen nguyen...@gmail.com wrote: I understand the resolutions of screenshots are typically inadequate for OCR

Re: Which revision of tesseract 3.0 for win7 64bit

2010-08-23 Thread Quan Nguyen
I am able to confirm Tesseract r454 with new Leptonica-1.66 binary ran w/o the problem that was reported in Issue 304. Well, with one little other problem, though: Could not open file, ./tessdata/eng.user-words I had to create an empty file with the name to get it to run. When I tried with -l
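
The workaround of creating an empty user-words file can be scripted. A minimal sketch, assuming the ./tessdata directory and the eng language code from the error message; the helper name ensure_user_words is hypothetical.

```python
from pathlib import Path


def ensure_user_words(tessdata_dir="./tessdata", lang="eng"):
    """Create an empty <lang>.user-words file if it does not exist yet."""
    path = Path(tessdata_dir) / f"{lang}.user-words"
    # Create the directory too, in case the function is pointed at a new location.
    path.parent.mkdir(parents=True, exist_ok=True)
    # touch() leaves an existing file's contents untouched.
    path.touch(exist_ok=True)
    return path
```

Calling ensure_user_words() from the directory that holds the Tesseract executable reproduces the manual step described above.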

Re: Tess4J - a Java wrapper for Tesseract OCR DLL

2010-08-23 Thread Quan Nguyen
Oh, thanks, James. I'd be happy to use the code to build the .so and avoid duplicate effort. I'll get in touch with you if more info is needed. Thank you, again. Quan On Aug 23, 2:44 am, James Le Cuirot ch...@aura-online.co.uk wrote: On Sun, 22 Aug 2010 19:35:26 -0700 (PDT) Quan Nguyen nguyen

Tess4J - a Java wrapper for Tesseract OCR DLL

2010-08-22 Thread Quan Nguyen
A JNA-based wrapper for Tesseract OCR DLL, the library provides optical character recognition (OCR) support for: * TIFF, JPEG, GIF, PNG, and BMP image formats * Multi-page TIFF images * PDF document format http://tess4j.sf.net
