Again, thank you for posting it earlier than me :)

Anyway, do you know how I could get around this problem? Is there any trick 
that could help me? Maybe using Git Bash or something?
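
In case it helps whoever picks this up, here is a minimal sketch of how the 
two concatenations quoted below could avoid the hard-coded "/". It assumes a 
C++17 compiler with std::filesystem, and ScriptFile is a hypothetical helper 
name, not something that exists in Tesseract. I have not verified that the 
separator is what actually makes the load fail, so take it as an illustration 
only.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of Tesseract): builds the path
// <script_dir><separator><script_name><extension>.
// Assumes C++17 so that std::filesystem is available.
std::filesystem::path ScriptFile(const std::string& script_dir,
                                 const std::string& script_name,
                                 const std::string& extension) {
  // operator/ joins the two parts with the platform's preferred
  // separator ('\' on Windows, '/' elsewhere) instead of a literal "/".
  return std::filesystem::path(script_dir) / (script_name + extension);
}
```

The call site would then become something like 
ScriptFile(script_dir, unicharset->get_script_from_script_id(s), 
".unicharset").string() in place of the manual string concatenation.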

On Friday, 23 February 2018 at 12:04:53 UTC+1, shree wrote:
>
> Please open this as an issue in the GitHub repo - 
> https://github.com/tesseract-ocr/tesseract/issues
>
> >  the "/" is added without taking care if the command is used on Windows 
> or Linux. 
>
> Found a couple of places in that file where this is the case.
>
>     // Load the unicharset for the script if available.
>     string filename = script_dir + "/" +
>                       unicharset->get_script_from_script_id(s) +
>                       ".unicharset";
>
> and
>
>     // Load the xheights for the script if available.
>     string filename = script_dir + "/" +
>                       unicharset.get_script_from_script_id(s) +
>                       ".xheights";
>
>
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Fri, Feb 23, 2018 at 2:25 PM, Jehan <jehanp...@gmail.com> wrote:
>
>> I'm training Tesseract on Windows for a new font and everything went 
>> pretty well until the set_unicharset_properties command step:
>>
>> set_unicharset_properties -U .\unicharset -O .\unicharset2 -F 
>> "C:\Windows\Fonts\Roman.tff" --script_dir='C:\Program Files 
>> (x86)\Tesseract-OCR\training'
>>
>> Loaded unicharset of size 7 from file .\unicharset
>>> Setting unichar properties
>>> Other case c of C is not in unicharset
>>> Other case f of F is not in unicharset
>>> Setting script properties
>>> Failed to load script unicharset from:C:\Program Files 
>>> (x86)\Tesseract-OCR\training/Latin.unicharset
>>> Warning: properties incomplete for index 3 = C
>>> Warning: properties incomplete for index 4 = 0
>>> Warning: properties incomplete for index 5 = 1
>>> Warning: properties incomplete for index 6 = F
>>> Writing unicharset to file .\unicharset2
>>
>>
>> I've verified that Latin.unicharset is in the right directory.
>>
>> The problem (I'm pretty sure) is at the end of this line:
>>
>> Failed to load script unicharset from:C:\Program Files 
>>> (x86)\Tesseract-OCR\training/Latin.unicharset
>>>
>>
>> The thing is that the training software adds a "/" instead of a "\".
>> I looked at unicharset_training_utils.cpp: at line 166, the "/" 
>> is added without taking care if the command is used on Windows or Linux.
>>
>> Is there a way on Windows to load Latin.unicharset even with this 
>> "/"?
>> If not, what is the easiest workaround?
>>
>> For information, my unicharset2 file looks like this:
>>
>>> 7
>>> NULL 0 Common 0
>>> Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 
>>> 6e 65 64 ]a
>>> |Broken|0|1 f 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # Broken
>>> C 5 0,255,0,255,0,0,0,0,0,0 Latin 3 0 3 C # C [43 ]A
>>> 0 8 0,255,0,255,0,0,0,0,0,0 Common 4 2 4 0 # 0 [30 ]0 
>>> ...
>>
>>  
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to tesseract-oc...@googlegroups.com.
>> To post to this group, send email to tesser...@googlegroups.com.
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/aa3a131c-51fe-42ea-9fba-336ef89737cd%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
