Hi Zdenko,

I tried the command under two cmd window code pages (chcp 65001 and chcp 936) and got the same failure in both. The results are as follows:

[image: chcp936.png]
[image: chcp65001.png]
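A minimal Python sketch (not part of text2image) of how the font name from this thread is encoded under the two code pages tried above. It only shows that the same name becomes different byte sequences in GBK (code page 936) and UTF-8 (code page 65001). Whether that difference is what makes the font lookup fail is an assumption, not something confirmed in this thread. The helper name `encoded_bytes` is made up for illustration.

```python
# Font name copied from the thread.
FONT_NAME = "庞中华行书"


def encoded_bytes(name: str, codec: str) -> str:
    """Return the bytes of `name` under `codec` as space-separated hex."""
    return " ".join(f"{b:02x}" for b in name.encode(codec))


if __name__ == "__main__":
    # "gbk" is Python's codec for Windows code page 936;
    # "utf-8" corresponds to code page 65001.
    for label, codec in (("chcp 936 (GBK)", "gbk"), ("chcp 65001 (UTF-8)", "utf-8")):
        print(f"{label}: {encoded_bytes(FONT_NAME, codec)}")
```

Comparing the two hex strings with what the program actually receives on its command line may help narrow down whether the name is being altered before it reaches the font lookup.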
On Friday, November 9, 2018 at 5:03:00 AM UTC+8, zdenop wrote:
>
> What is the output of the command "chcp" (in the command line)?
>
> Zdenko
>
>
> On Wed, 7 Nov 2018 at 2:55, bruce <[email protected]> wrote:
>
>> Hi zdenop, thank you for your reply.
>> My environment is:
>> Windows 7 Professional 64-bit
>> Tesseract version:
>> https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.0.0.20181030.exe
>>
>> Test training text:
>> https://drive.google.com/open?id=1BfURsI_HdwaKeowZP0sa8L6GKWgIVDWJ
>>
>> Test fonts:
>> https://drive.google.com/open?id=1YZObeYWOzNZbkMTcrCNw3KVlYT7hn1Q6
>> https://drive.google.com/open?id=15C-v4ped8ssFGXW0pSKw6CMSQgW2s0WV
>>
>>
>> I tried all of the fonts with Chinese names. All of them gave the same error
>> message. The links above are just two of these fonts, which you can test.
>> I guess the --font parameter doesn't support Chinese characters?
>>
>> On Tuesday, November 6, 2018 at 6:11:00 PM UTC+8, zdenop wrote:
>>>
>>> Hello,
>>>
>>> Please see the bug report and suggested solution:
>>> https://github.com/tesseract-ocr/tesseract/issues/1252
>>>
>>> I guess the problem is in Pango, but we would like to test it. Are you able
>>> to create a simple test case (provide a small chi_sim.txt and share the font
>>> if possible) for this issue?
>>>
>>> Zdenko
>>>
>>>
>>> On Tue, 6 Nov 2018 at 10:56, bruce <[email protected]> wrote:
>>>
>>>> I used the following command to find the fonts I can use to train my
>>>> language.
>>>> *text2image.exe --text=chi_sim.txt --outputbase=chi_sim.庞中华行书.exp0
>>>> --fonts_dir=C:\Windows\Fonts --find_fonts*
>>>> and I got the following result:
>>>> Font MStiffHeiPRC failed with 414359 hits = 100.00%
>>>> Font MStiffHeiPRC failed with 414359 hits = 100.00%
>>>> Font MStiffHeiPRC failed with 414359 hits = 100.00%
>>>> Font MStiffHeiPRC failed with 414359 hits = 100.00%
>>>> Font MStream PRC failed with 414359 hits = 100.00%
>>>> Font MSung PRC failed with 414359 hits = 100.00%
>>>> Font MSung PRC failed with 414359 hits = 100.00%
>>>> 庞中华行书 Light : 414361 hits = 100.00%, raw = 3440 = 100.00%
>>>> Font 剑客毛笔行书 failed with 414357 hits = 100.00%
>>>> Font 可可漫雪体 failed with 414360 hits = 100.00%
>>>> Font 多米手写体 failed with 414253 hits = 99.97%
>>>> Font 字体中国-锐博体V1 failed with 414359 hits = 100.00%
>>>> Font 孙运和酷楷 failed with 414359 hits = 100.00%
>>>> Font 建刚静心楷 failed with 414359 hits = 100.00%
>>>> Font 张维镜手写楷书 Medium failed with 410014 hits = 98.95%
>>>> Font 徐金如硬笔行楷X failed with 413042 hits = 99.68%
>>>>
>>>>
>>>>
>>>> Then I used a command like this: *text2image.exe --text=chi_sim.txt
>>>> --outputbase=chi_sim.庞中华行书.exp0 --ptsize 36 --font "庞中华行书" --fonts_dir
>>>> C:\Windows\Fonts*
>>>> I got the following error:
>>>> Could not find font named '庞中华行书'.
>>>> Pango suggested font 'MingLiU'.
>>>> Please correct --font arg.
>>>>
>>>> Does text2image not support fonts with Chinese names? How can I use these
>>>> Chinese-named fonts?
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/a9a31397-9196-4923-aa79-43d151d534a1%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.

