Sorry, I still have difficulties trying to understand the issue reported by
you. Your TIFF image has 30 pages and 24 million colors, and the file is
400 MBytes in size? And what do you mean when saying "All of those are not
passing through from the page to the another page"?
Thank you.
Quan
Quan Nguyen wrote:
jTessBoxEditor is a Java box editor for Tesseract OCR data. It can read
images of common image formats, including multi-page TIFF. The
program requires JRE 6.0 or later.
Version 1.0 Beta integrates support for full automation of Tesseract
training. Please post your comments/feedback here. Thank
Training only involves getting the data it requires into a few appropriate
files and executing a few appropriate commands, no programming required.
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Take a look at the source training data for Vietnamese, which has many
diacritical
Training only involves getting the data it requires into a few appropriate
files and executing a few appropriate commands.
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Take a look at the source training data for Vietnamese, which has many
diacritical marks similar to your
Try bazaar pattern matching and see if you will have better results.
http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html
On Thursday, August 29, 2013 3:33:28 AM UTC-5, sam vara wrote:
this is my first OCR project . I am trying to feed an image that is
x...@gmail.com
Any example image?
On Wednesday, August 21, 2013 11:44:45 AM UTC-5, Morlock wrote:
Hello,
I'm using Tesseract v3.02.02.
I'm unable to get it to consistently recognize a number that contains a
decimal point. Tesseract is recognizing the digits. Tesseract is
recognizing the
For 10, it is PSM_SINGLE_CHAR.
http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccstruct/publictypes.h
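
For reference, a small Python sketch of the page segmentation modes. Only the value 10 = PSM_SINGLE_CHAR is confirmed by the reply above; the other names are listed as I recall them from publictypes.h in 3.02, so check them against the header. The helper psm_arguments is a made-up name, and -psm is the command-line switch as I remember it from the 3.0x series.

```python
from enum import IntEnum


class PageSegMode(IntEnum):
    """Page segmentation modes (assumed to mirror publictypes.h in 3.02)."""
    PSM_OSD_ONLY = 0
    PSM_AUTO_OSD = 1
    PSM_AUTO_ONLY = 2
    PSM_AUTO = 3
    PSM_SINGLE_COLUMN = 4
    PSM_SINGLE_BLOCK_VERT_TEXT = 5
    PSM_SINGLE_BLOCK = 6
    PSM_SINGLE_LINE = 7
    PSM_SINGLE_WORD = 8
    PSM_CIRCLE_WORD = 9
    PSM_SINGLE_CHAR = 10


def psm_arguments(mode):
    """Return the command-line arguments that select a segmentation mode."""
    return ["-psm", str(int(mode))]


print(PageSegMode(10).name)
print(psm_arguments(PageSegMode.PSM_SINGLE_CHAR))
```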
On Wednesday, July 17, 2013 5:28:33 PM UTC-5, Gabriel Paschoal Vicente
wrote:
Hi Guys,
I am integrating Tesseract into my C++ application.
When I run the command manually I got
I don't think there exists a way to merge the data files; however, in 3.02,
you can rename your trained data file and specify it with the standard one
to the -l option, such as: tesseract image output -l eng+eng1
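
A minimal Python sketch of building that command line, assuming the custom data file has been renamed to eng1.traineddata and sits next to eng.traineddata. The helper name build_ocr_command and the image/output names are only illustrative.

```python
def build_ocr_command(image, output_base, languages):
    """Build a tesseract command line that loads several language files.

    The language names are joined with '+' and passed to the -l option.
    """
    if not languages:
        raise ValueError("at least one language is required")
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


# Hypothetical names; eng1 stands for the renamed custom data file.
command = build_ocr_command("image.tif", "output", ["eng", "eng1"])
print(" ".join(command))
```

The list form can be handed to subprocess.run(command, check=True) once tesseract is on the PATH.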
On Wednesday, July 31, 2013 8:44:44 AM UTC-5, honey kansal wrote:
Hi,
I
Did you try to merge them into one box, as the training Wiki suggests?
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
On Sunday, August 4, 2013 1:59:10 AM UTC-5, mama wrote:
Sir,
I have found the box files for bangla language for tesseract version 2
from the site
*
You could have better results with 300-DPI, binary or grayscale images.
On Sunday, August 4, 2013 7:04:20 PM UTC-5, Zeulopes wrote:
Hello Guys!
I'm using tesseract API (version 3.02) to single character recognition in
English (eng.traineddata), with the following parameters:
The AddOns page does not list any native Windows box editor. Most Windows
systems nowadays come with .NET Framework installed; the same can be said
about Java. Do you have a problem using a .NET- or Java-based box editor?
On Saturday, August 3, 2013 3:58:40 AM UTC-5, n1101...@gmail.com wrote:
There's a .NET wrapper for Tesseract 3.02 at
https://github.com/charlesw/tesseract.
On Sunday, July 7, 2013 9:00:58 AM UTC-5, waleed Elerksosy wrote:
Hello,
First, I would like to thank you all for the effort to support us :)
How do I add tesseract3 to my VB.NET project in previous with
file, but for a few others it can't generate the box coordinates. Please, sir,
I have attached the file.
On Sat, May 4, 2013 at 7:38 PM, Quan Nguyen nguy...@gmail.com
wrote:
What Ubuntu and Java versions are installed on your machine? You probably
have a headless Java -- i.e., one
Put them in a file placed under the tessdata\configs folder and specify it as a
command-line option when you execute the tesseract command.
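
A rough Python sketch of that setup. The config name digits_only and the temporary folder are made up for the demo; tessedit_char_whitelist is just one example of a variable that can go into such a file, and the config name is assumed to go last on the command line.

```python
import os
import tempfile


def write_config(tessdata_dir, name, variables):
    """Write 'key value' lines to <tessdata_dir>/configs/<name>."""
    configs_dir = os.path.join(tessdata_dir, "configs")
    os.makedirs(configs_dir, exist_ok=True)
    path = os.path.join(configs_dir, name)
    with open(path, "w", encoding="utf-8") as handle:
        for key, value in variables.items():
            handle.write("%s %s\n" % (key, value))
    return path


def build_command(image, output_base, config_name):
    """Put the config file name after the image and output base."""
    return ["tesseract", image, output_base, config_name]


# Demo in a temporary directory standing in for the real tessdata folder.
demo_tessdata = tempfile.mkdtemp()
config_path = write_config(demo_tessdata, "digits_only",
                           {"tessedit_char_whitelist": "0123456789"})
print(config_path)
print(" ".join(build_command("image.tif", "output", "digits_only")))
```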
On Saturday, May 4, 2013 2:58:31 AM UTC-5, Sathish Kumar wrote:
On Sunday, 30 December 2012 03:06:24 UTC+5:30, 服部慎 wrote:
Hi. I am a Japanese Tesseract user.
Yes, it runs on Ubuntu. Just unzip and execute the run script. Be sure to have
Java installed first.
On Tuesday, April 23, 2013 12:17:21 AM UTC-5, mama wrote:
Sir,
Does it work on Ubuntu?
I couldn't find jTessBoxEditor for Ubuntu.
Thanks,
mama
On Monday, October 3, 2011 9:20:00 AM UTC+5:30, Quan Nguyen
Version 0.9 Release:
- Enhance Generate TIFF/Box functionality to allow for combining prepending
symbols in addition to appending
- Fix a bug that failed to persist changes to table in edit mode
- Find function now supports partial matches
- Fix a problem with table not scrolling along when row
Are you using a v1.x version, which uses the .traineddata format? What's the
datapath (TESSDATA_PREFIX) value? Would it work with eng?
On Wednesday, April 24, 2013 2:10:15 PM UTC-5, Fabio Ebner wrote:
Can someone help me??
I downloaded Tess4J and the Portuguese language data, put the
.tr are binary files; as such, you should use:
copy /b san.sanskrit2003.exp0*.tr san.sanskrit2003.exp2000.tr
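
For anyone not on Windows, a byte-for-byte concatenation along the lines of copy /b could be sketched in Python like this. The function name concatenate_binary is made up, and the files are joined in sorted name order, which is an assumption about the order you want.

```python
import glob
import shutil


def concatenate_binary(pattern, destination):
    """Append every file matching pattern to destination, byte for byte."""
    # Skip the destination itself in case the pattern happens to match it.
    sources = sorted(p for p in glob.glob(pattern) if p != destination)
    with open(destination, "wb") as out:
        for source in sources:
            with open(source, "rb") as part:
                shutil.copyfileobj(part, out)
    return sources
```

With the names from the command above, the call would be concatenate_binary("san.sanskrit2003.exp0*.tr", "san.sanskrit2003.exp2000.tr").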
--
--
You received this message because you are subscribed to the Google
Groups tesseract-ocr group.
To post to this group, send email to tesseract-ocr@googlegroups.com
To unsubscribe
Version 0.8 has been released with the following enhancements:
- Add row number header
- Char cell now editable
- Convert Unicode escape sequences where possible
- Find box now displays Unicode characters and allows search using Unicode
escape sequences
- Improve Generate TIFF/Box functionality:
I'd move the SetVariable statement after the Init.
On Friday, April 5, 2013 12:47:45 AM UTC-5, priya wrote:
hi
The code which I have used to find word-level confidence is given below,
but I need character-level confidence. Please let me know if you have any
clues or pointers regarding
The confidence values embedded in the hOCR output are at the word, not
character, level.
On Friday, April 5, 2013 1:52:19 AM UTC-5, satuon wrote:
I just found out that Tesseract also supports the hOCR format. But I'm not
sure if character-wise confidence levels are available even there. How
The first parameter to Init is the path to tessdata folder; the second
indicates the language.
On Thursday, April 4, 2013 9:48:31 AM UTC-5, Renato Forti wrote:
Hi all,
My language file is in:
/tesseract/ocr_default_engine/tessdata
for sample:
tesseract/ocr_default_engine/tessdata$ ls
Cross post.
http://stackoverflow.com/questions/15758031/fatal-error-failed-to-write-core-dump
On Tuesday, April 2, 2013 3:35:08 AM UTC-5, koushik kumar wrote:
HELLO!
I'm running the unit test for tessiterator from the tess4j distribution.
But while running the code in Eclipse, I got a
Show us your code.
On Thursday, April 4, 2013 4:00:11 AM UTC-5, priya wrote:
hi,
Does anyone know how to find character-level confidence? I tried the
save_blob_choices code but it gives only word-level confidence. Please
let me know if you have any pointers.
The English language data for 2.0x is at the bottom of this page:
http://code.google.com/p/tesseract-ocr/downloads/list?num=100&start=100
On Wednesday, April 3, 2013 9:18:35 PM UTC-5, Damiano Rodriguez wrote:
Hi all,
I have a very strange problem:
First of all: with visual studio 2010 and C#
No, you cannot run a batch of files like that with Tesseract; it has to be
a Tesseract invocation for each file.
Or you can use VietOCR http://vietocr.sf.net, a GUI frontend for
Tesseract that supports batch or bulk OCR.
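
If you prefer to script it, one invocation per file can be generated with a few lines of Python. This is only a sketch: build_batch_commands is a made-up helper, and using the image name minus its extension as the output base is my own convention.

```python
import os


def build_batch_commands(image_paths, language="eng"):
    """Return one tesseract command per image file."""
    commands = []
    for image in image_paths:
        # Output base is the image path without its extension.
        output_base, _ = os.path.splitext(image)
        commands.append(["tesseract", image, output_base, "-l", language])
    return commands


for command in build_batch_commands(["page1.tif", "page2.tif"]):
    print(" ".join(command))
    # To actually run it: subprocess.run(command, check=True)
```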
On Saturday, March 30, 2013 6:30:11 AM UTC-5, rollas...@gmail.com wrote:
tessnet2 is a Tesseract 2.04-based .NET wrapper, while you're using Tesseract
3.0x language data. They are not compatible.
On Tuesday, March 19, 2013 10:45:12 AM UTC-5, Micael Leal wrote:
Hello,
After installing tesseract
tessnet2 and *.traineddata files are not compatible.
On Tuesday, March 19, 2013 9:07:02 AM UTC-5, Micael Leal wrote:
Hello,
I try to implement tessnet2 but get some issues while compiling.
Bitmap image = new Bitmap(@"C:\Users\admin\AppData\Local\Temp\image.bmp");
tessnet2.Tesseract ocr = new
Use Tesseract 2.0x-version language data.
On Tuesday, March 19, 2013 7:49:40 AM UTC-5, Micael Leal wrote:
Hello,
I try to implement tesseract-ocr with my powerpoint program in order to
recognize pictures.
I can extract a picture in powerpoint, but I want to extract its content.
Inside
The distributed Tess4J-1.1-src.zip includes all the files you need.
Assuming you already have Ant and a 32-bit JDK 6 or 7 installed, open a
command prompt, cd to the Tess4J directory, and execute the unit tests
with the following command:
ant test
For your Java program to work, the JAR
It could mean the image does not meet the minimum requirements for OCR. Try
to rescale your screenshot to 300 DPI.
On Monday, February 18, 2013 2:13:04 PM UTC-6, Tommy Walsh wrote:
I haven't been able to find anything on this. I'm using Tessnet2 to take a
small screenshot and try to read the
be very helpful if I can get any suggestions here.
Thanks in advance.
On Friday, January 18, 2013 9:32:35 AM UTC+5:30, Quan Nguyen wrote:
Boxes look overlapping. You may want to space them out a bit more.
On Thursday, January 17, 2013 10:33:13 AM UTC-6, Tauqeer baig wrote:
I am trying
You would have better success with 1) rescaling the image to 300 DPI, 2)
sending the coordinates of each letter, and 3) using PSM 10.
On Monday, January 21, 2013 8:41:04 AM UTC-6, Luigi De Rosa wrote:
Hi to all,
i'm trying to recognize those big characters in this attached picture.
I tried in
JVM 64-bit cannot load Tesseract and Leptonica 32-bit DLLs. You would need
JVM 32-bit.
On Friday, January 18, 2013 8:11:56 AM UTC-6, Deniz Atak wrote:
Hi,
I am trying to run Tess4J in 64 JVM from Netbeans IDE and getting this
error:
Testcase:
Georgia_Bold
Georgia_Italic
Times_New_Roman
Times_New_Roman_Bold
Trebuchet_MS
Trebuchet_MS_Bold
URW_Bookman_L_Italic
Verdana
Verdana_Bold
[1] http://pastebin.com/0dV84hBa
Zdenko
On Wed, Jan 16, 2013 at 1:02 AM, Quan Nguyen nguy...@gmail.com
wrote:
I can shorten Times New
[fontname] is just a token. If it has spaces, simply remove the spaces.
On Sunday, January 13, 2013 9:01:34 PM UTC-6, gold snake wrote:
Thanks, the problem is fixed now, because the font_properties entry and the
[lang].[fontname].exp[num] on the command line must be the same.
But one thing I can't
Your filename does not seem to follow the naming convention
[lang].[fontname].exp[num].tif
(see TrainingTesseract3). And since your fontname is A, the content of
font_properties should be:
A 0 0 0 0 0
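
To make the convention concrete, here is a hedged Python sketch that checks a file name against [lang].[fontname].exp[num].tif and prints the matching font_properties entry. The function names and the sample file name eng.A.exp0.tif are illustrative; the five flags are italic, bold, fixed, serif and fraktur, in that order, as described on the TrainingTesseract3 wiki.

```python
import re

# [lang].[fontname].exp[num].tif, per the naming convention quoted above.
NAME_PATTERN = re.compile(r"^([^.]+)\.([^.]+)\.exp(\d+)\.tif$")


def parse_training_name(filename):
    """Split a training image name into (lang, fontname, num)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("not [lang].[fontname].exp[num].tif: %r" % filename)
    lang, fontname, num = match.groups()
    return lang, fontname, int(num)


def font_properties_line(fontname, italic=0, bold=0, fixed=0, serif=0, fraktur=0):
    """One font_properties entry: the font name followed by five 0/1 flags."""
    token = fontname.replace(" ", "")  # the name is a single token, so drop spaces
    return "%s %d %d %d %d %d" % (token, italic, bold, fixed, serif, fraktur)


lang, fontname, num = parse_training_name("eng.A.exp0.tif")
print(font_properties_line(fontname))
```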
On Saturday, January 12, 2013 2:15:09 AM UTC-6, gold snake wrote:
*the display error
The new .NET wrapper for Tesseract 3.02, which is still under development,
can be found at https://github.com/charlesw/tesseract.
, December 11, 2012 10:12:05 PM UTC-5, Quan Nguyen wrote:
Rescaling to 300 DPI will produce much better results for the images.
Also, VietOCR.NET 3.3x uses the .NET wrapper for Tesseract 3.0.1.
On Tuesday, December 4, 2012 10:18:55 AM UTC-6, eljainc wrote:
Quan,
Thank you very much for this information. I will give it a try.
Mike McWhinney
elja, Inc.
--
*From:* Quan Nguyen nguy
Check out the source of VietOCR.NET 2.0.4, which uses the same tessnet2
library.
http://sourceforge.net/projects/vietocr/files/vietocr.net/2.0.4/
Check out the source of VietOCR.NET 2.0.5, which uses the same tessnet2
library.
http://sourceforge.net/projects/vietocr/files/vietocr.net/2.0.5/
If you build using the latest source
(r806, http://code.google.com/p/tesseract-ocr/source/detail?r=806),
you'll get the word confidence in the hOCR output.
On Monday, November 12, 2012 1:54:34 AM UTC-6, lirong wrote:
Hi, everyone. Can tesseract-ocr output the confidence level of the result?
I
The PowerShell script train.ps1 on the AddOns page can help automate the
training process.
http://code.google.com/p/tesseract-ocr/wiki/AddOns
On Tuesday, May 17, 2011 2:08:53 AM UTC-5, Eyal wrote:
Hi,
I tried to train some letters when I ran mftraining with the
parameters:
Both Tesseract .exe and .dll execute without any problem on my Windows 8
Release Preview. I tried them via VietOCR program.
On Tuesday, November 6, 2012 8:08:06 AM UTC-8, zdenop wrote:
Hello,
Has anybody tried to use Tesseract 3.02.02 on Windows 8?
Can you share your experience (does it
A couple of places (readme and faq pages) still refer to the GUI section in
the AddOns. That section has now been moved to 3rdParty.
On Sunday, October 28, 2012 12:10:56 PM UTC-5, zdenop wrote:
Changed.
--
Zdenko
On Tue, Oct 23, 2012 at 3:11 PM, Nick White
The train.ps1 script has been updated for Tesseract 3.02 training.
http://vietocr.svn.sourceforge.net/viewvc/vietocr/jTessBoxEditor/trunk/tools/
On Sunday, March 27, 2011 12:21:11 PM UTC-5, Quan Nguyen wrote:
I created a PowerShell script to automate language data generation for
Tesseract
The script has been updated for Tesseract 3.02 training.
http://vietocr.svn.sourceforge.net/viewvc/vietocr/jTessBoxEditor/trunk/tools/
On Sunday, March 27, 2011 12:21:11 PM UTC-5, Quan Nguyen wrote:
I created a PowerShell script to automate language data generation for
Tesseract 3.01. Save
VietOCR 3.4 RC has been released. This incorporates the latest Tesseract
3.02 executable and library. Please help test. Any input or comment is
welcome.
http://sourceforge.net/projects/vietocr/files/vietocr/
Instead of concatenating the .tr files, you can merge all your images, if
they all have the same font style, into a multi-page TIFF and train with
that. You can use jTessBoxEditor
(http://vietocr.sourceforge.net/training.html) to merge images and
edit the box file.
On Monday, October 1, 2012
When Tesseract 3.02 is officially released, the author of tessdotnet will
update to it. Then we'll have multiple language support.
https://github.com/charlesw/tesseract-ocr-dotnet/issues/4
On Thursday, August 9, 2012 9:10:41 AM UTC-5, Alex C wrote:
Hi. Is there a Tesseract language pack for
Scaling up your images to 300 DPI will improve the results.
Or upgrade to Tesseract 3.01 .NET wrapper
(https://github.com/charlesw/tesseract-ocr-dotnet).
On Saturday, August 4, 2012 7:15:40 AM UTC-5, hugi wrote:
http://stackoverflow.com/questions/10815978/including-tess4j-to-a-java-project-as-library-in-eclipse
You can use an HTML parser to get the data you want.
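
The question is about Java, but the idea is the same in any language. Below is a sketch in Python with the standard html.parser module. It assumes the word spans carry class ocr_word or ocrx_word and a title attribute containing "bbox x0 y0 x1 y1"; the exact markup differs between Tesseract versions, so check it against your own hOCR output. HocrWordParser and find_word are made-up names.

```python
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for each word span in hOCR markup."""

    WORD_CLASSES = {"ocr_word", "ocrx_word"}  # assumed class names

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span being read, if any
        self._depth = 0     # nesting depth of spans inside that word
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._bbox is not None:
            self._depth += 1
            return
        attributes = dict(attrs)
        classes = set((attributes.get("class") or "").split())
        if not classes & self.WORD_CLASSES:
            return
        bbox = self._parse_bbox(attributes.get("title") or "")
        if bbox is not None:
            self._bbox, self._depth, self._text = bbox, 0, []

    def handle_endtag(self, tag):
        if tag != "span" or self._bbox is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        self.words.append(("".join(self._text).strip(), self._bbox))
        self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    @staticmethod
    def _parse_bbox(title):
        # The title holds ';'-separated properties such as "bbox 36 92 96 116".
        for prop in title.split(";"):
            parts = prop.split()
            if len(parts) == 5 and parts[0] == "bbox":
                return tuple(int(value) for value in parts[1:])
        return None


def find_word(hocr, target):
    """Return the bounding boxes of every word whose text equals target."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return [bbox for text, bbox in parser.words if text == target]
```

Calling find_word(hocr_string, "invoice") would then give a list of (x0, y0, x1, y1) tuples, one per occurrence of that word.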
On Thursday, May 3, 2012 2:01:56 AM UTC-5, harry asir wrote:
Hi all,
Can anybody suggest how to find a word and extract its coordinates
from hOCR (String) using Java? I am using Tess4J 1.0 Beta 2 and I
got hOCR output as a
the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
Could you please let me know where I am going wrong? Thanks!
Regards,
Kamal.
On Sunday, August 22, 2010 10:35:26 PM UTC-4, Quan Nguyen wrote:
A JNA-based wrapper for Tesseract OCR DLL, the library
23, 2012 at 7:07 PM, Quan Nguyen wrote:
all of the provided image processing functions are geared for Pix type,
not
raw image.
Why not just create a Pix from the raw image data? Leptonica has
pixCreateHeader(), pixSetResolution(), pixSetWpl(), pixSetData(), etc
[1] and various helper
.
Regards,
Harry John Asir
On Apr 24, 7:07 am, Quan Nguyen nguyen...@gmail.com wrote:
Execution for .exe and .dll+Java follows different paths: one calling
ProcessPage with Leptonica Pix image and one calling TesseractRect or
GetUTF8Text with raw image. It seems that Pix image get
:
# http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Please help me how to solve this issue.
Regards,
Harry John Asir
On Apr 19, 9:18 am, Quan Nguyen nguyen
for coloured images. With the test images
present in the Tess4J folder (the zipped one), OCR is working. Can you help me
with OCR for coloured images using Tess4J?
I am using Windows 7 PC.
Regards,
Harry John Asir
On Apr 17, 8:14 am, Quan Nguyen nguyen...@gmail.com wrote:
A JNA-based
A JNA-based wrapper for Tesseract OCR 3.02 DLL, the library provides
optical character recognition (OCR) support for:
* TIFF, JPEG, GIF, PNG, and BMP image formats
* Multi-page TIFF images
* PDF document format
This version is still in early beta development; as such, it has rough
Tessnet2 is .NET 2.0. Did you target your VS2010 solution for .NET
2.0?
VietOCR.NET 2.x, which uses the same wrapper, is VS2008-based and
works fine on Win7.
http://sourceforge.net/projects/vietocr/files/vietocr.net/
On Nov 12, 1:17 pm, Carlesmk carles.blasc...@gmail.com wrote:
Hi everybody,
Are you sure it does not accept Unicode characters? If that's the
case, you can convert Unicode characters to ASCII escaped sequences.
In JDK, there is a tool named native2ascii, which takes a text file
with specified encoding and produces an output file containing escaped
sequences.
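
If the JDK tool is not at hand, the conversion itself is simple enough to sketch in Python. This is an approximation of what native2ascii produces, not a drop-in replacement: the function name is made up, and details such as the case of the hex digits may differ from the real tool.

```python
def to_ascii_escapes(text):
    """Replace each non-ASCII character with Java-style \\uXXXX escapes."""
    pieces = []
    for char in text:
        if ord(char) < 128:
            pieces.append(char)
            continue
        # Encode as UTF-16 so characters outside the BMP become surrogate
        # pairs, which is how Java represents them.
        units = char.encode("utf-16-be")
        for i in range(0, len(units), 2):
            pieces.append("\\u%02x%02x" % (units[i], units[i + 1]))
    return "".join(pieces)


print(to_ascii_escapes("Vi\u1ec7t"))
```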
Anyhow, I
There's a jTessBoxEditor tool that can help in editing the boxes. It
can also generate training images (and boxes).
http://vietocr.sourceforge.net/training.html
On Oct 29, 4:02 am, merve t mervet2...@gmail.com wrote:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
i did
Try with PSM 8 or 10.
On Oct 24, 9:09 am, Giuseppe Menga me...@polito.it wrote:
That is interesting. I'm recognizing expiration dates from medicines, and I
found it convenient to repeat the date 3 or 4 times; it improves recognition.
Can someone explain the reason?
Giuseppe
-Messaggio
-Merged box will have a character value composed of all the
characters of the merging boxes
http://sourceforge.net/projects/vietocr/files/jTessBoxEditor/
On Oct 2, 10:50 pm, Quan Nguyen nguyen...@gmail.com wrote:
A box editor for Tesseract OCR data. This release includes the
following fixes
Tesseract does not support that feature out of the box -- it would
recognize all pages found in multi-page TIFF. You'll have to manually
extract a specific page and send it to Tesseract for recognition.
Have you tried a frontend, such as VietOCR? It supports reading
multi-page TIFF and lets the
They're not compatible. If you want Tess 3.0x, try
http://code.google.com/p/tesseractdotnet/ .
On Oct 13, 3:23 am, onur karali onurkar...@gmail.com wrote:
Hi,
I can build and use .net wrapper tessnet2 for tesseract version 2.04
successfully but build operation gives error about baseAPI.h could
.
display only rectangles. Please tell me the specification for the txt file
to be accepted by jTessBoxEditor.
Thanks in advance
MNS Rao
- Original Message -
From: Quan Nguyen nguyen...@gmail.com
To: tesseract-ocr tesseract-ocr@googlegroups.com
Sent: Friday, October 07, 2011 2:08 AM
with tesseract.exe
On Oct 5, 04:27, Quan Nguyen nguyen...@gmail.com wrote:
What's the error exactly? Does the image work with tesseract.exe?
On Oct 4, 5:02 am, Alessandro Latella alexla...@libero.it wrote:
Hi guys, I'm trying to run Tesseract in C#.
The program works well
I tried the text I received, using the Windows font Tunga on Win7 64-bit,
without any problem. I can't attach the output files here, so please
check your inbox.
On Oct 6, 6:52 am, mns_rao mns...@gmail.com wrote:
Generating TIFF/box for a Kannada text file is not working; for the Tunga
font only rectangles
What's the error exactly? Does the image work with tesseract.exe?
On Oct 4, 5:02 am, Alessandro Latella alexla...@libero.it wrote:
Hi guys, I'm trying to run Tesseract in C#.
The program works well for the English language:
'ocr.Init(@"C:\Program Files\Tesseract-OCR\tessdata", "eng", false);'
If I
Out of the box, Tess 3.0 supports multi-page TIFF. Did you try?
On Oct 4, 12:49 pm, LAPIII webpren...@gmail.com wrote:
Also, I'm using Linux Mint.
A box editor for Tesseract OCR data. This release includes the
following fixes and enhancements:
- Add a utility function which creates TIFF/Box pair suitable for
training with Tesseract
- Fix a bug which may clear out a modified box file when loading
another image
Please help test and post your
Does the tessdata folder have the required language data files?
On Sep 20, 3:51 am, Daniela21 dmari...@gmail.com wrote:
Hello,
I am trying to run the tessnet2 project based on these
VietOCR (Java version) does not feed the original image to Tesseract,
but rather it reads and then writes back out an uncompressed TIFF
file, rescaled to 300 DPI if instructed so, which is then sent to the
engine. I found this regurgitated image somehow has been more amenable
to Tesseract.
The
Hi Jon,
I tried your images with VietOCR, which makes the images more amenable
to Tesseract engine, and it produced fairly accurate results. I think
it could have been better if -density 300 had been used.
You can open PDF directly in VietOCR if GhostScript has been
installed.
Please try the latest beta versions, which incorporate the PSM fix.
On Sep 9, 1:42 am, Bonny esla...@gmail.com wrote:
Huh..
No attachment allowed.
In the meantime I tried VietOCR, but it doesn't recognize the two colons either.
There's a Windows PowerShell script in AddOns.
http://code.google.com/p/tesseract-ocr/wiki/AddOns
On Sep 7, 11:45 pm, haoest hao...@gmail.com wrote:
But without a batch file to build the .tr files, re-building all 32
fonts from command line would be terrifying.
Vish,
Tess4J does support multi-page PDF and multi-page TIFF. Substitute
with your PDF file in the unit test case and give it a try.
Regards,
Quan
On Jul 12, 1:20 am, Vish yava...@gmail.com wrote:
Gurus,
We are using Tesseract's Java library, called Tess4J, to convert PDF
files to text. It
tesseract.dll is x86, so make sure the Platform target in your project's
Build properties is also x86.
On Jul 9, 7:00 am, Sarel van der Merwe sfvdme...@gmail.com wrote:
I installed the redistribution pack.
1. Reboot and recompiled.
2. Still having the same problem.
Could not load file or assembly
Andreas,
Try adding a slash to the data path, such as:
string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata\";
I'm curious as to why you use an unsafe block in your code.
Quan
On Jul 6, 5:01 am, Andreas Reiff andire...@googlemail.com wrote:
I get an AccessViolationException, trying to
into this yet.
Best wishes,
Andreas
On 6 Jul., 14:05, Quan Nguyen nguyen...@gmail.com wrote:
Andreas,
Try adding a slash to the data path, such as:
string tessdataFolder = @"D:\Temp\IPoVnOCRer\IPoVn\Test\Tessdata\";
I'm curious as to why you use an unsafe block in your code.
Quan
Hunter,
I would grab the files from the project's svn. You can then build the
tesseract.dll from that.
I put in a couple of minor changes so that the recognize method would
accept an additional rectangular region parameter and return Unicode
string rather than UTF-8. Look at the project's Issues
Correct Subject line.
Version 3.1 Beta integrates the new tesseractdotnet .NET wrapper DLL
x86 (r42+). In contrast, version 3.0 uses command-line process to
invoke Tesseract.exe.
http://vietocr.sf.net
For more info about the wrapper, visit http://code.google.com/p/tesseractdotnet/
A correction: the name should be VietOCR.NET 3.1 Beta.
On Jul 3, 10:08 am, Quan Nguyen nguyen...@gmail.com wrote:
Version 3.1 Beta integrates the new tesseractdotnet .NET wrapper DLL
x86 (r42+). In contrast, version 3.0 uses command-line process to
invoke Tesseract.exe.
http://vietocr.sf.net
The resolution of your image is too low -- at 96 DPI, any OCR engine
would have a problem with it. After rescaling to 300 DPI, Tesseract was
able to recognize it.
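
As a quick illustration of what that rescaling means in pixels, here is a small Python helper. The name rescaled_size and the 640x480 example are mine; the actual resampling would be done in an image editor or an imaging library.

```python
def rescaled_size(width, height, source_dpi, target_dpi=300):
    """Pixel dimensions after resampling from source_dpi to target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    factor = target_dpi / source_dpi
    return round(width * factor), round(height * factor)


# A 640x480 screenshot at 96 DPI grows by a factor of 300/96 = 3.125.
print(rescaled_size(640, 480, 96))  # (2000, 1500)
```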
On Jun 17, 9:05 am, Felipe Coutinho felipelcouti...@gmail.com wrote:
Hello,
I'm a new tess user. I'm trying to test the tess with
A Java/.NET GUI frontend for Tesseract OCR engine. The releases
include the following fixes and improvements:
* Improve program usability, enabling image navigation and
manipulation with keyboard
* Fix an installation issue that prevented uninstalling previous
versions (.NET only)
* Fix an EOL
That's the problem -- you'd need an entry for every image file. The
following is excerpted from the TrainingTesseract3 wiki:
When running mftraining, each .tr filename must match an entry in the
font_properties file, or mftraining will abort.
If they are the same font, you can put them in a
Have you tried setting the environment variable TESSDATA_PREFIX?
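
When Tesseract is launched from another program, the variable can be set just for the child process. A Python sketch follows, under the assumption (as far as I know for 3.0x) that TESSDATA_PREFIX names the parent of the tessdata folder and should end with a path separator. The helper name and the example path are hypothetical.

```python
import os


def tesseract_environment(prefix, base_env=None):
    """Copy an environment and point TESSDATA_PREFIX at prefix."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: Tesseract 3.0x appends "tessdata/" to this value,
    # so make sure it ends with a separator.
    if not prefix.endswith(("/", "\\")):
        prefix += os.sep
    env["TESSDATA_PREFIX"] = prefix
    return env


env = tesseract_environment("/opt/myapp")
print(env["TESSDATA_PREFIX"])
```

The resulting dictionary can be passed as env= to subprocess.run when starting tesseract.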
On May 20, 1:47 pm, Daniel cogdeb...@gmail.com wrote:
I'm attempting to integrate Tesseract 3 with another stand-alone app,
but I'm running into a problem: Tesseract always looks for the
language files in \Program Files
Can you elaborate on the problems with those characters?
On May 20, 9:44 am, Holm Dressler velovity1...@googlemail.com wrote:
2. I clean up the box file with jTessBoxEditor.jar (still have
problems with special characters like the German ö,ä,ü ...)
Take a look at the source code of VietOCR.NET, which uses tessnet2
library.
http://vietocr.sf.net
On May 9, 10:08 am, Vignesh Raj vignesh...@greatminds.co.in wrote:
Hi. Am very new to this and I need some help on how to set up tessnet
for my .Net (c#) based application.
I have not done
Did you scan them correctly, with appropriate pixel resolution (~300
DPI) and monochrome/grayscale settings?
On May 9, 10:20 am, Giby_the_kid g.benjamin.le...@gmail.com wrote:
I've tested with the sample of text in the sources... it worked...
Now if I try with any other scanned document, I
The binary executable would be placed in /usr/bin and language data
in /usr/share/tesseract-ocr/tessdata.
On May 5, 8:54 pm, James McCartha slayer2...@gmail.com wrote:
I used the Synaptic manager and I'm using the newest version of Ubuntu.
Where would the subdirectory be located in Ubuntu?
Looks like you're running a Tesseract 2.0x version, which does not
support Oriental scripts. Download and install Tesseract 3.01, then try
training again.
On Apr 29, 7:09 am, Oleg Tikhonov olegtikho...@gmail.com wrote:
Here is a command and the error/message
$ tesseract.exe
Print screens are, in general, not adequate for training new
languages. You'd be better off using GIMP to produce your TIFF images.
Be sure to specify the language to bootstrap the new charset, such as:
$ tesseract.exe ../korean_training/kor.ariel.exp1.tif ../korean_training/kor.ariel.exp1 -l
You can try VietOCR, a frontend program which uses Tesseract engine to
perform OCR on multi-page TIFF or individual ones and appends the
output to previous results.
On Apr 28, 8:41 pm, faye stefan.der.pr...@googlemail.com wrote:
Is there an option to let tesseract write the output of several