If you are referring to Tesseract 4.00alpha with Leptonica 1.74.1, and you compiled them correctly and got the binaries needed for training with .lstmf files, then I recommend following the suggestion made by the Tesseract devs: once you have created an .lstmf file for a certain font (one that can be used for Arabic writing), get the official ara.traineddata file from GitHub and paste it into the tessdata folder, put the .lstmf file in the tesseract folder, and run the command

    tesseract text_image result_text -l ara --oem 1

(--oem 1 selects the LSTM engine.)

Which Arabic characters exactly are you trying to improve the accuracy for?
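If you want to script that command over several images, here is a minimal Python sketch. It assumes the tesseract binary is on your PATH and that ara.traineddata is already in the tessdata folder. The helper names build_tesseract_cmd and run_ocr and the file name page.png are made up for illustration; they are not part of Tesseract.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, output_base, lang="ara", oem=1):
    """Return the argument list for: tesseract IMAGE OUTBASE -l LANG --oem N."""
    return ["tesseract", image_path, output_base, "-l", lang, "--oem", str(oem)]


def run_ocr(image_path, output_base, lang="ara"):
    """Run Tesseract with the LSTM engine and return the path of the text file.

    By default Tesseract writes its result to <output_base>.txt.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    subprocess.run(build_tesseract_cmd(image_path, output_base, lang), check=True)
    return output_base + ".txt"


if __name__ == "__main__":
    # Only prints the command; call run_ocr("page.png", "result") to execute it.
    print(" ".join(build_tesseract_cmd("page.png", "result")))
```

Building the argument list separately from running it lets you check the exact command before Tesseract is invoked.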
On Saturday, April 8, 2017 at 11:52:25 AM UTC+3, Ahmad Moawad wrote:
> Hello All,
>
> I want to make training for Arabic language in Tesseract 4.0, and The
> result of this version is great but still need some tunning, so I got
> jTessBoxEditor 2.0 beta.
> I tried to modify the incorrect characters and build ara.traineddata.
> After copying the ara.traineddata to
> /usr/share/tesseract-ocr/4.00/tessdata, I got random characters when I run
> the tesseract on the image.
> So any suggestion of how making training for Version 4.0, I already know
> that that last version 3.0x cube doesn't included in 4.0 LSTM or waiting
> until Ray makes another updated ara.traineddata.
>
> ,Thanks.

