Again, thank you for posting it before I did :) Anyway, do you know how I could get around this problem? Is there any trick that could help? Maybe using Git Bash or something?
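For what it's worth, here is a rough sketch of what a separator-aware join might look like in place of the hard-coded `script_dir + "/" + ...` quoted below. `JoinPath` is a hypothetical helper of my own, not something that exists in Tesseract, and I have not checked whether the "/" is really what makes the load fail.

```cpp
#include <string>

// Hypothetical helper (NOT part of Tesseract): joins a directory and a file
// name with exactly one separator, reusing the separator style that the
// directory string already uses.
std::string JoinPath(const std::string& dir, const std::string& name) {
  if (dir.empty()) return name;
  const char last = dir.back();
  // The directory already ends with a separator: append the name directly.
  if (last == '/' || last == '\\') return dir + name;
  // Use a backslash if the directory contains one, otherwise a forward slash.
  const char sep = dir.find('\\') != std::string::npos ? '\\' : '/';
  return dir + sep + name;
}
```

With something like this, the two lines in question would read `JoinPath(script_dir, script_name + ".unicharset")`, where `script_name` stands for the result of `get_script_from_script_id(s)`.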
On Friday, February 23, 2018 at 12:04:53 UTC+1, shree wrote:
>
> Please open this as an issue in the GitHub repo:
> https://github.com/tesseract-ocr/tesseract/issues
>
> > the "/" is added without taking care if the command is used on Windows
> > or Linux.
>
> Found a couple of places in that file where this is the case.
>
>     // Load the unicharset for the script if available.
>     string filename = script_dir + "/" +
>         unicharset->get_script_from_script_id(s) +
>         ".unicharset";
>
> and
>
>     // Load the xheights for the script if available.
>     string filename = script_dir + "/" +
>         unicharset.get_script_from_script_id(s) +
>         ".xheights";
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Fri, Feb 23, 2018 at 2:25 PM, Jehan <[email protected]> wrote:
>
>> I'm training Tesseract on Windows for a new font and everything went
>> pretty well until the set_unicharset_properties command step:
>>
>> set_unicharset_properties -U .\unicharset -O .\unicharset2 -F "C:\Windows\Fonts\Roman.tff" --script_dir='C:\Program Files (x86)\Tesseract-OCR\training'
>>
>>> Loaded unicharset of size 7 from file .\unicharset
>>> Setting unichar properties
>>> Other case c of C is not in unicharset
>>> Other case f of F is not in unicharset
>>> Setting script properties
>>> Failed to load script unicharset from:C:\Program Files (x86)\Tesseract-OCR\training/Latin.unicharset
>>> Warning: properties incomplete for index 3 = C
>>> Warning: properties incomplete for index 4 = 0
>>> Warning: properties incomplete for index 5 = 1
>>> Warning: properties incomplete for index 6 = F
>>> Writing unicharset to file .\unicharset2
>>
>> I've verified that Latin.unicharset is in the right directory.
>>
>> The problem (I'm pretty sure) is at the end of this line:
>>
>>> Failed to load script unicharset from:C:\Program Files (x86)\Tesseract-OCR\training/Latin.unicharset
>>
>> The thing is that the training software adds a "/" instead of a "\".
>> I've looked at unicharset_training_utils.cpp: at line 166, the "/"
>> is added without taking care if the command is used on Windows or Linux.
>>
>> Is there a solution for Windows to load Latin.unicharset even with this
>> "/"?
>> If not, what is the easiest solution?
>>
>> For information, my unicharset2 file looks like this:
>>
>>> 7
>>> NULL 0 Common 0
>>> Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
>>> |Broken|0|1 f 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # Broken
>>> C 5 0,255,0,255,0,0,0,0,0,0 Latin 3 0 3 C # C [43 ]A
>>> 0 8 0,255,0,255,0,0,0,0,0,0 Common 4 2 4 0 # 0 [30 ]0
>>> ...

-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/51e77998-357a-4bcd-a2f3-daec8eb4315a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

