I am trying to understand the architecture of the OCR pipeline in Tesseract 
v5.0.1, specifically *the preprocessing that happens before the LSTM network 
during inference and training*. 

I could only find these 7-year-old documentation notes (
https://github.com/tesseract-ocr/docs/tree/main/das_tutorial2016), and I am 
not sure whether they are still accurate. 

   1. Is the information I am looking for present anywhere in the online 
   documentation (https://tesseract-ocr.github.io/tessdoc/)? 
   2. Is there a way to turn off the page layout analysis and other 
   preprocessing that runs before the LSTM modules? 


