Hi all,

I tried to use Tesseract to recognize non-printed text and found that it 
did not work well on it.

Here are the details of my test.


   - Tesseract version: v3.0.2
   - Environment: Windows 64-bit
   - Hardware: Intel(R) Core(TM) i3-3110M CPU @ 2.4GHz
   - Original image: a photo of the nameplate on the device. The attached 
   'Image of nameplate on the device.jpg' is the original image; the other 
   attached image files are segments cut from it that Tesseract did not 
   recognize well (it returned either nothing or the wrong text).
   - Key information for Tesseract:
      - Even after I set the variable "tessedit_char_whitelist" to 
      "0123456789abcdefghijklmnopqrstuvwxyz", there is still no improvement.
      - The language is English by default.
      - I read the wiki page "ImproveQuality 
      <https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality>" 
      and tried the "Image processing" suggestions, but nothing changed.
      - I also tried segmenting the image into single characters and setting 
      "psm" to 10, but the results were still poor.
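
For reference, the settings above correspond roughly to the sketch below. It 
is a minimal Python sketch, assuming the Tesseract 3.x command-line tool is 
installed as `tesseract`. The helper names (`write_whitelist_config`, 
`build_command`) and the config file name `nameplate_whitelist` are made up 
for illustration and are not part of Tesseract. As far as I understand, the 
3.0x command line takes the page segmentation mode via `-psm` and reads 
variables such as `tessedit_char_whitelist` from a config file given as the 
last argument.

```python
import os
import tempfile

# Whitelist from the test described above (digits and lowercase letters).
WHITELIST = "0123456789abcdefghijklmnopqrstuvwxyz"


def write_whitelist_config(directory, whitelist=WHITELIST):
    """Write a Tesseract config file ("variable value" per line).

    The file name is hypothetical; any name should do.
    """
    path = os.path.join(directory, "nameplate_whitelist")
    with open(path, "w") as f:
        f.write("tessedit_char_whitelist %s\n" % whitelist)
    return path


def build_command(image_path, output_base, config_path, psm=10, lang="eng"):
    """Build the argument list for a Tesseract 3.x run.

    psm=10 asks Tesseract to treat the image as a single character.
    """
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "-psm", str(psm),
        config_path,
    ]


if __name__ == "__main__":
    config = write_whitelist_config(tempfile.mkdtemp())
    # "single_char.jpg" is a placeholder for one segmented character image.
    cmd = build_command("single_char.jpg", "result", config)
    print(" ".join(cmd))
    # To actually run it (requires Tesseract to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

The recognized text would then be written to `result.txt`.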
   

Maybe this is because the text is non-printed, rather than a problem with 
Tesseract itself. Do I have to train new data and treat it as a new 
language? Or is there any optimization I could do to help?

Thanks,  

<https://lh3.googleusercontent.com/-V8chc6mYos4/WJWSBE2o8SI/AAAAAAAAAaE/plTvegKM7AcZQY6MFUMeP_OESv2MvdcbQCLcB/s1600/Image%2Bof%2Bnameplate%2Bon%2Bthe%2Bdevice.jpg>

<https://lh3.googleusercontent.com/-RrUELO9YKhI/WJWSFa2hneI/AAAAAAAAAaI/Y8jLaO7SwfEBkfVcnusBiRyyfLaz_kPswCLcB/s1600/time_year.jpg>

<https://lh3.googleusercontent.com/-oDQEgkt2OuU/WJWSHeF9IzI/AAAAAAAAAaM/Kq1jHhdN16oHKB2brNdvrKjNIFPgf7MgACLcB/s1600/connection_group_label.jpg>

<https://lh3.googleusercontent.com/-jbfOJcCW8Vs/WJWSJNzxPcI/AAAAAAAAAaQ/8NDGbPGYlAM31gJ16cnhQRq2nzGiwxWxACLcB/s1600/insulation_ac.jpg>

<https://lh3.googleusercontent.com/-cTP06tkNvtw/WJWSKjTmuLI/AAAAAAAAAaU/D9F8WPWnkDUftKwBpFkM0i_PWXNJVivRgCLcB/s1600/low_current_2.jpg>

<https://lh3.googleusercontent.com/-RIsUhdJoc3I/WJWSUF1wPlI/AAAAAAAAAaY/zO6_h0xk0fw6W1CAI6PS7SPBCf_8192_wCLcB/s1600/product_model.jpg>

