Hi, no, unicharambigs is not used by LSTM files. It was only used in legacy mode.
I'm having similar problems with the Ancient Greek "best" traineddata: unfortunately it has been trained with some non-standard characters (ά έ ή ί ό ύ ώ, instead of ά έ ή ί ό ύ ώ). I tried fine-tuning grc.traineddata, but without much success, so for the time being I'm producing hOCR files, post-processing them, and then using hocr-pdf to create a searchable PDF.

Best,
andrea

On Monday, March 13, 2023 at 5:13:33 PM UTC+1 Isidore Paris wrote:

> Hi,
> I'm doing some frk text recognition, and in my results, I have a great
> number of " > ". Each one should be replaced by " ck ".
> I updated my frk.traineddata file (from tessdata_best repository) with a
> frk.unicharambigs file (I tried both formats v1 and v2) but absolutely
> nothing changed.
> I also tried the parameter " -c use_ambigs_for_adaption=1 " to see if
> maybe it was needed, but still nothing changed, not a single character (>
> and = and / are all still there).
>
> Here is the content of my v2 frk.unicharambigs file:
> v2
> > ck 1
> = - 1
> / - 1
>
> Does unicharambigs not work with LSTM files? Or did I miss some particular
> or special step?
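
For anyone wanting to try the hOCR post-processing route mentioned in the reply, here is a minimal sketch of the character-replacement step in Python. It is not andrea's actual script. It assumes the unwanted characters are the Greek Extended "oxia" code points (U+1F71 and friends) and the wanted ones are the basic Greek "tonos" code points (U+03AC and friends); if your case runs the other way, invert the mapping. The names `OXIA_TO_TONOS`, `fix_hocr_text` and `fix_hocr_file` are made up for illustration.

```python
from pathlib import Path

# Assumed mapping: Greek Extended vowels with oxia -> basic Greek vowels with tonos.
# Swap keys and values if your model emits the opposite set.
OXIA_TO_TONOS = {
    "\u1f71": "\u03ac",  # alpha
    "\u1f73": "\u03ad",  # epsilon
    "\u1f75": "\u03ae",  # eta
    "\u1f77": "\u03af",  # iota
    "\u1f79": "\u03cc",  # omicron
    "\u1f7b": "\u03cd",  # upsilon
    "\u1f7d": "\u03ce",  # omega
}
_TABLE = str.maketrans(OXIA_TO_TONOS)


def fix_hocr_text(hocr: str) -> str:
    """Replace the mapped characters everywhere in an hOCR string.

    A plain character translation is safe here only because these Greek
    letters never occur in the hOCR markup itself (tags, attributes).
    """
    return hocr.translate(_TABLE)


def fix_hocr_file(src: str, dst: str) -> None:
    """Read an hOCR file as UTF-8, fix the characters, write the result."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(fix_hocr_text(text), encoding="utf-8")
```

Run `fix_hocr_file` on each hOCR file before passing the directory to hocr-pdf. This trick does not carry over directly to the frk case in the original question, because ">" also appears in the hOCR markup (and as `&gt;` in text), so that replacement would need to be applied to the word text only, e.g. with an HTML parser.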

