[tesseract-ocr] How to add desired character to existing language in tesseract 4.0

2017-01-09 Thread xunmi048
I found a desired-characters file in the source training data for chi_sim, so I guess that chi_sim.traineddata contains information about the desired characters. How can I add my own characters, such as ↑↓, to an already trained language?

[tesseract-ocr] Release of Tesseract 4.0 that has LSTM based OCR engine

2017-01-09 Thread hgupta
Hello all, Does anyone know the tentative release date of Tesseract 4.0, which has a new OCR engine based on LSTM neural networks? Thanks, Hari -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop

Re: [tesseract-ocr] Re: Tesseract v3.03 and norwegian language

2017-01-09 Thread Ludvig F Aarstad
I think I might stick with the postprocessing for now; there are too many oddities I need to learn to be able to compile it ;). Still, I think this project is awesome, and I might take it up a notch and try the same thing I am doing now, just using .NET code :)

Re: [tesseract-ocr] Re: Tesseract v3.03 and norwegian language

2017-01-09 Thread ShreeDevi Kumar
Actually, postprocessing with a replace for AE will be the best bet, as 4.0 is slower than the legacy Tesseract engine for Latin-based scripts. You can experiment with 4.0.0alpha. See https://github.com/tesseract-ocr/tesseract/wiki/Compiling and note that you will also need to compile the latest version of Leptonica
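
For illustration, the replace-based postprocessing suggested here could look roughly like the Python sketch below. The thread only mentions "AE", so the mapping table, the function name restore_ligatures, and the sample text are all assumptions made for the example and not part of Tesseract. A blind replace will also corrupt words that legitimately contain "ae", so a real script would probably need a word list or some other check.

```python
# Minimal sketch of replace-based postprocessing of OCR output.
# ASSUMPTION: the OCR engine emitted "AE"/"ae" where the Norwegian
# letter "Æ"/"æ" was expected. The mapping below is hypothetical.
REPLACEMENTS = {
    "AE": "Æ",
    "Ae": "Æ",
    "ae": "æ",
}


def restore_ligatures(text, replacements=None):
    """Return text with each misrecognized sequence replaced.

    This is a blind substring replace: it does not check whether the
    sequence is a genuine "ae" that should be left alone.
    """
    if replacements is None:
        replacements = REPLACEMENTS
    for wrong, right in replacements.items():
        text = text.replace(wrong, right)
    return text


if __name__ == "__main__":
    # Hypothetical OCR output for the words "BÆR og bær".
    print(restore_ligatures("BAER og baer"))
```

The function takes the recognized text as a plain string, so it can be run on whatever text file the OCR step writes out, independent of the Tesseract version used.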