Execution for the .exe and for the .dll+Java follows different paths: the former calls ProcessPage with a Leptonica Pix image, while the latter calls TesseractRect or GetUTF8Text with a raw image. It seems that the Pix image gets thresholded before recognition and thus produces better and more consistent results.
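As a stopgap on the Java side, the color image could be binarized before it is handed to Tess4J. The following is only a rough sketch, not what Tesseract does internally: the class name ColorToBinary, the method name binarize, and the fixed luminance cutoff are all my own assumptions for illustration, and a fixed cutoff is likely cruder than the thresholding applied on the Pix path.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of Tess4J or Tesseract): turns a color image
// into a 1-bit black-and-white image using a fixed luminance threshold.
final class ColorToBinary {

    private ColorToBinary() {
    }

    /**
     * Returns a TYPE_BYTE_BINARY copy of src in which every pixel whose
     * luminance is at or above threshold (0-255) becomes white and every
     * other pixel becomes black.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        int width = src.getWidth();
        int height = src.getHeight();
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;

                // Integer approximation of the usual luminance weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;

                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

Calling ColorToBinary.binarize(image, 128) gives a black-and-white image that can be passed to the OCR call in place of the original. The cutoff of 128 is an arbitrary starting point and will probably need tuning per image.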
I'm researching which other function calls are needed to get results for raw images comparable to those for Pix images. I would rather use TessBaseAPI functions for this image processing than do it in Java, but in Tesseract 3.02 all of the provided image-processing functions seem to be geared toward the Pix type, not raw images. Meanwhile, can you attach your color image for testing?

On Sunday, April 22, 2012 11:42:19 PM UTC-5, harry asir wrote:
>
> Hi,
>
> When I run Tesseract.exe (not Tess4J) on my PC, it extracts the
> characters. That is, Tesseract.exe works fine.
>
> But the issues mentioned in the above thread happen only in Tess4J.
>
> I have tried JRE versions 1.7, 1.6.0_20, and 1.6.0.14, and the issue
> mentioned in the above thread happens with all of them. Are these JRE
> versions OK to use, or do I need to use a different JRE version?
>
> Image details:
> 1. Coloured image
> 2. 320 x 480 pixels, 3.5 inches (~165 ppi pixel density)
>
> Please guide me to solve these issues.
>
> Regards,
> Harry John Asir
>
> On Apr 20, 7:30 am, Quan Nguyen <[email protected]> wrote:
> > Code? Pix? Does Tesseract itself work with those images?
> >
> > On Thursday, April 19, 2012 1:25:29 AM UTC-5, harry asir wrote:
> > >
> > > Hi,
> > >
> > > Thanks for your reply.
> > >
> > > With Windows 7, for any image other than the default image present in
> > > the tess4j folder (zipped one), a crash happens in the JRE. A similar
> > > crash happens with any image (including the default image present in
> > > the tess4j folder).
> > >
> > > Logs:
> > >
> > > #
> > > # A fatal error has been detected by the Java Run Time Environment:
> > > # EXCEPTION_ACCESS_VIOLATION (0x0000005) at pc=0x72064673, pid=6184, tid=3628
> > > #
> > > # JRE version: 6.0_20-b02
> > > # Java VM: Java HotSpot(TM) Client VM (16.3-b01 mixed mode, sharing windows-x86)
> > > # problematic frame:
> > > # C [MSVCR90.dll+0X24673]
> > > #
> > > # An error report file with more information is saved as:
> > > # C:\Users\pcrmd\workspace\Tess4jNew\hs_err_pid6184.log
> > > #
> > > # If you would like to submit a bug report, please visit:
> > > # http://java.sun.com/webapps/bugreport/crash.jsp
> > > # The crash happened outside the Java Virtual Machine in native code.
> > > # See problematic frame for where to report the bug.
> > > #
> > >
> > > Please help me solve this issue.
> > >
> > > Regards,
> > > Harry John Asir
> > >
> > > On Apr 19, 9:18 am, Quan Nguyen <[email protected]> wrote:
> > > > Tess4J is just a simple wrapper around TessBaseAPI. It can do only what
> > > > Tesseract can. For color images, you probably will need to apply
> > > > thresholding to create binary images suitable for OCR operations.
> > > >
> > > > On Wednesday, April 18, 2012 5:45:53 AM UTC-5, harry asir wrote:
> > > > >
> > > > > OCR is always failing for coloured images. With the test images
> > > > > present in the Tess4J folder (zipped one), OCR is working. Can you
> > > > > help me with doing OCR for coloured images using Tess4J?
> > > > > I am using a Windows 7 PC.
> > > > >
> > > > > Regards,
> > > > > Harry John Asir
> > > > >
> > > > > On Apr 17, 8:14 am, Quan Nguyen <[email protected]> wrote:
> > > > > > A JNA-based wrapper for Tesseract OCR 3.02 DLL, the library provides
> > > > > > optical character recognition (OCR) support for:
> > > > > >
> > > > > > * TIFF, JPEG, GIF, PNG, and BMP image formats
> > > > > > * Multi-page TIFF images
> > > > > > * PDF document format
> > > > > >
> > > > > > This version is still in early beta development; as such, it has rough
> > > > > > edges and not undergone thorough testing. Any feedback/comment/suggestion
> > > > > > is welcome.
> > > > > >
> > > > > > http://tess4j.sf.net

