<https://lh3.googleusercontent.com/-1L3HIEP1HIc/WYAVPr3JopI/AAAAAAAAAAw/BLMCuDASK74TVIPxP6Dl3igHVXryWSnagCLcBGAs/s1600/yanghui_100_0.jpg>

Hello,

I'm trying to use Tesseract 4.0 to recognize simplified Chinese, with the 
following command-line arguments:
  argc = 13;
  argv[1] = "E:/数据库/yanghui_results/yanghui_100_0.jpg";
  argv[2] = "E:/sample/01";
  argv[3] = "-l";
  argv[4] = "chi_sim+eng";
  argv[5] = "-psm";
  argv[6] = "7";
  argv[7] = "--oem";
  argv[8] = "OEM_TESSERACT_LSTM_COMBINED";
  argv[9] = "--tessdata-dir";
  argv[10] = "../tessdata";
  argv[11] = "--user-words";
  argv[12] = "../tessdata/chi_sim.user-words";

I am using the chi_sim and eng traineddata files as the languages, but 
some specific symbols, such as '∠' (the angle sign), are not recognized 
correctly.


For example, the image shown above is the input to Tesseract 4.0, and the 
result is as follows:
如图, 在口ABCD中, 点E, F在AC上, 且乙ABE=乙CDF, 求证: BE=DF,

From the result, we can see that the '∠' symbol is recognized as '乙', the 
rhomboid symbol as '口', and the '.' period symbol as a ',' comma.

How can these specific symbols be recognized correctly with Tesseract 4.0? 
Can you help me?
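
To make the three confusions concrete, here is a minimal C++ sketch of a 
post-correction step applied to the recognized text. It is only a possible 
stopgap, not a fix for the recognition itself, and it is not part of 
Tesseract. The helper names (labels_follow, fix_geometry_symbols) are 
hypothetical, and the rules are assumptions: '乙' or '口' directly followed 
by an ASCII capital letter is treated as a misread symbol, '▱' is assumed 
to be the intended rhomboid glyph, and a trailing ',' is assumed to be a 
misread period.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// NOTE: the source file must be saved as UTF-8 and compiled with a UTF-8
// execution character set (the default for GCC/Clang; /utf-8 for MSVC).

// Hypothetical helper: true if `token` occurs at byte offset `pos` and the
// byte right after it is an ASCII capital letter (a point label like A, B).
static bool labels_follow(const std::string& text, std::size_t pos,
                          const std::string& token) {
    const std::size_t next = pos + token.size();
    return text.compare(pos, token.size(), token) == 0 &&
           next < text.size() &&
           std::isupper(static_cast<unsigned char>(text[next])) != 0;
}

// Hypothetical post-correction for the three confusions reported above.
std::string fix_geometry_symbols(const std::string& text) {
    const std::string yi = "乙";   // observed misread of the angle sign
    const std::string kou = "口";  // observed misread of the rhomboid symbol
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (labels_follow(text, i, yi)) {
            out += "∠";            // assumption: 乙 before point labels is ∠
            i += yi.size();
        } else if (labels_follow(text, i, kou)) {
            out += "▱";            // assumption: 口 before point labels is ▱
            i += kou.size();
        } else {
            out += text[i++];      // copy every other byte unchanged
        }
    }
    // Assumption: the sentence ends with a period, so a trailing ',' is a
    // misread '.'. Commas elsewhere in the text are left alone.
    if (!out.empty() && out.back() == ',') out.back() = '.';
    return out;
}
```

Calling fix_geometry_symbols on the result line above replaces '口' and 
'乙' only where point labels follow, so ordinary uses of those characters 
are left alone. The heuristics are tied to this kind of geometry text and 
would need review for other material.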
