I didn't have any problems when I followed the instructions to add '±' to 
eng.traineddata. Is it because Chinese has many more characters?

On Thursday, June 13, 2019 at 4:04:45 PM UTC-4, Jingjing Lin wrote:
>
> Before the segmentation fault:
>
> src/training/tesstrain_utils.sh: line 72: 20849 Segmentation fault (core dumped) "${cmd}" "$@" 2>&1
>
>      20850 Done                    | tee -a ${LOG_FILE}
>
>
> it also shows:
>
> Error in pixCreateNoInit: pix_malloc fail for data
>
> Error in pixCreate: pixd not made
>
>
> On Thursday, June 13, 2019 at 3:47:13 PM UTC-4, Jingjing Lin wrote:
>>
>> I tried to create new training data for fine-tuning a few characters, 
>> using the command below:
>>
>> src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang chi_sim \
>>   --linedata_only \
>>   --noextract_font_properties --langdata_dir ../langdata \
>>   --tessdata_dir ./tessdata --output_dir ~/tesstutorial/train
>>
>>
>> It takes forever (I think it is actually stuck in Phase I: 
>> Generating training images), repeatedly rendering pages to the same .tif file:
>>
>> Rendered page 1285 to file /tmp/chi_sim-2019-06-13.rk6/chi_sim.AR_PL_UKai_CN.exp0.tif
>>
>> Rendered page 1286 to file /tmp/chi_sim-2019-06-13.rk6/chi_sim.AR_PL_UKai_CN.exp0.tif
>>
>> and sometimes gives the error below:
>>
>> src/training/tesstrain_utils.sh: line 72: 20849 Segmentation fault (core dumped) "${cmd}" "$@" 2>&1
>>
>>      20850 Done                    | tee -a ${LOG_FILE}
>>
>>
>>
>> What's the problem here?
>>
>
