Hi,

The issue still exists in the latest beta version of Tess4J as well.

If I convert the image to 300 DPI and run hOCR on the converted
image, can I still get the coordinates of the text as they correspond
to the original image? If yes, can you guide me on how to convert the
image to 300 DPI using Java?
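
One possible approach, given only as a minimal sketch: upsample the image by the ratio targetDpi / sourceDpi using plain java.awt, run OCR on the enlarged copy, and divide the hOCR bbox values (x0 y0 x1 y1) by the same ratio to get back to original pixels. The class and method names below (DpiRescale, rescale, toOriginal) are hypothetical and not part of Tess4J, and the 165 ppi source value is simply the figure quoted elsewhere in this thread.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of Tess4J): upsample before OCR, map boxes back after. */
final class DpiRescale {

    /** Ratio by which pixel dimensions grow when going from sourceDpi to targetDpi. */
    static double scaleFactor(double sourceDpi, double targetDpi) {
        return targetDpi / sourceDpi;
    }

    /** Returns a resampled RGB copy of src, enlarged by targetDpi / sourceDpi. */
    static BufferedImage rescale(BufferedImage src, double sourceDpi, double targetDpi) {
        double scale = scaleFactor(sourceDpi, targetDpi);
        int w = (int) Math.round(src.getWidth() * scale);
        int h = (int) Math.round(src.getHeight() * scale);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Bicubic interpolation is an assumption; other hints may suit some images better.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    /**
     * Maps an hOCR-style bbox (x0, y0, x1, y1), measured on the rescaled image,
     * back to a rectangle in the pixel space of the original image.
     */
    static Rectangle toOriginal(int x0, int y0, int x1, int y1, double scale) {
        int ox0 = (int) Math.round(x0 / scale);
        int oy0 = (int) Math.round(y0 / scale);
        int ox1 = (int) Math.round(x1 / scale);
        int oy1 = (int) Math.round(y1 / scale);
        return new Rectangle(ox0, oy0, ox1 - ox0, oy1 - oy0);
    }

    public static void main(String[] args) {
        // Blank stand-in for the 320 x 480 image discussed in this thread.
        BufferedImage original = new BufferedImage(320, 480, BufferedImage.TYPE_INT_RGB);
        BufferedImage scaled = rescale(original, 165.0, 300.0);
        System.out.println(scaled.getWidth() + " x " + scaled.getHeight());

        // Made-up bbox values, for illustration only.
        double scale = scaleFactor(165.0, 300.0);
        System.out.println(toOriginal(182, 364, 273, 400, scale));
    }
}
```

In real use the BufferedImage would come from ImageIO.read(...) and the rescaled copy would be the one handed to the OCR call. Because each corner is rounded separately, a mapped box can be off by a pixel relative to the original image.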

Note: The old Tess4J (which doesn't support hOCR) was able to extract
text from this image. Could you therefore look into the code so that it
handles any image resolution, if possible?

Thanks & Regards,
Harry John Asir

On Apr 25, 6:41 am, Quan Nguyen <[email protected]> wrote:
> Your image resolution is too low; it needs to be rescaled to a higher one, 300
> DPI, for instance.
>
> Please try with the latest beta version just uploaded today. Thanks.
>
>
>
> On Tuesday, April 24, 2012 1:21:32 AM UTC-5, harry asir wrote:
>
> > Hi,
>
> > I have sent the image and log files to your mail id.
>
> > Regards,
> > Harry John Asir
>
> > On Apr 24, 7:07 am, Quan Nguyen <[email protected]> wrote:
> > > > Execution for .exe and .dll+Java follows different paths: one calling
> > > > ProcessPage with a Leptonica Pix image and one calling TesseractRect or
> > > > GetUTF8Text with a raw image. It seems that the Pix image gets
> > > > thresholded before recognition and thus produces better and more
> > > > consistent results.
>
> > > > I'm trying to research what other function calls need to be made to
> > > > produce, for raw images, results similar to those for a Pix image. I
> > > > would rather use TessBaseAPI functions to perform this image processing
> > > > than do it in Java, but it seems that, for Tesseract 3.02, all of the
> > > > provided image processing functions are geared toward the Pix type, not
> > > > raw images.
>
> > > Meanwhile, can you attach your color image for testing?
>
> > > On Sunday, April 22, 2012 11:42:19 PM UTC-5, harry asir wrote:
>
> > > > Hi,
>
> > > > > When I run Tesseract.exe (not Tess4J) on my PC, it extracts the
> > > > > characters. That is, Tesseract.exe is working fine.
>
> > > > > But the issues mentioned in the above thread happen only in Tess4J.
>
> > > > > I have tried JRE versions 1.7, 1.6.0_20 and 1.6.0.14, and the issue
> > > > > mentioned in the above thread happens with all of them. Are
> > > > > these JRE versions OK to use, or do I need to use a different
> > > > > JRE version?
>
> > > > Image details:
> > > > 1. Coloured image
> > > > 2. 320 x 480 pixels, 3.5 inches (~165 ppi pixel density)
>
> > > > Please guide me to solve these issues.
>
> > > > Regards,
> > > > Harry John Asir
>
> > > > On Apr 20, 7:30 am, Quan Nguyen <[email protected]> wrote:
> > > > > Code? Pix? Does Tesseract itself work with those images?
>
> > > > > On Thursday, April 19, 2012 1:25:29 AM UTC-5, harry asir wrote:
>
> > > > > > Hi,
>
> > > > > > Thanks for your reply.
>
> > > > > > > With Windows 7, a crash happens in the JRE for any image other
> > > > > > > than the default image present in the tess4j folder (the zipped
> > > > > > > one). A similar crash happens with any image (including the
> > > > > > > default image present in the tess4j folder).
>
> > > > > > Logs:
>
> > > > > > #
> > > > > > # A fatal error has been detected by the Java Run Time
> > Environment:
> > > > > > # EXCEPTION_ACCESS_VIOLATION (0x0000005) at pc=0x72064673,
> > pid=6184,
> > > > > > tid=3628
> > > > > > #
> > > > > > # JRE version: 6.0_20-b02
> > > > > > # Java VM: Java HotSpot(TM) Client VM (16.3-b01 mixed mode,
> > sharing
> > > > > > windows-x86)
> > > > > > # problematic frame:
> > > > > > # C [MSVCR90.dll+0X24673]
> > > > > > #
> > > > > > # An error report file with more information is saved as:
> > > > > > # C:\Users\pcrmd\workspace\Tess4jNew\hs_err_pid6184.log
> > > > > > #
> > > > > > # If you would like to submit a bug report, please visit:
> > > > > > #  http://java.sun.com/webapps/bugreport/crash.jsp
> > > > > > # The crash happened outside the Java Virtual Machine in native
> > coe.
> > > > > > # See problematic frame for where to report the bug.
> > > > > > #
>
> > > > > > > Please help me solve this issue.
>
> > > > > > Regards,
> > > > > > Harry John Asir
>
> > > > > > On Apr 19, 9:18 am, Quan Nguyen <[email protected]> wrote:
> > > > > > > > Tess4J is just a simple wrapper around TessBaseAPI. It can do
> > > > > > > > only what Tesseract can. For color images, you will probably need
> > > > > > > > to apply thresholding to create binary images suitable for OCR
> > > > > > > > operations.
>
> > > > > > > On Wednesday, April 18, 2012 5:45:53 AM UTC-5, harry asir wrote:
>
> > > > > > > > > OCR is always failing for coloured images. With the test images
> > > > > > > > > present in the Tess4J folder (the zipped one), OCR works. Can
> > > > > > > > > you help me do OCR on coloured images using Tess4J?
> > > > > > > > > I am using a Windows 7 PC.
>
> > > > > > > > Regards,
> > > > > > > > Harry John Asir
>
> > > > > > > > On Apr 17, 8:14 am, Quan Nguyen <[email protected]> wrote:
> > > > > > > > > > A JNA-based wrapper for the Tesseract OCR 3.02 DLL, the library
> > > > > > > > > > provides optical character recognition (OCR) support for:
>
> > > > > > > > >     * TIFF, JPEG, GIF, PNG, and BMP image formats
> > > > > > > > >     * Multi-page TIFF images
> > > > > > > > >     * PDF document format
>
> > > > > > > > > > This version is still in early beta development; as such, it
> > > > > > > > > > has rough edges and has not undergone thorough testing. Any
> > > > > > > > > > feedback/comment/suggestion is welcome.
>
> > > > > > > > > > http://tess4j.sf.net

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
