Hi all,

I'm looking to get better parallelism out of Tesseract in LSTM engine mode. 
I attempted to parallelize the loop over all words in RecogAllWordsPassN. 
To do this, I first created N instances of LSTMRecognizer, all loaded with 
the same data. Then I split the words in the page into N word sets, so 
that each set has approximately the same number of ROWs. Finally, I 
parallelized the loop on N cores by processing each word set with the 
corresponding LSTMRecognizer instance. The speed-up I got was about 1.8x 
for N=2, which is very nice, but the code itself is not stable. As far as I 
can tell, the PAGE_RES_IT structure provides mutable access to the page 
data, and I didn't synchronize mutations. This led to crashes due to (I'm 
guessing here) inconsistencies in the PAGE_RES.
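
To make the setup concrete, here is a minimal sketch of the partitioning scheme in plain C++. It is not Tesseract code: Word, Row, Recognizer, SplitRows and RecognizeRows are hypothetical stand-ins, and the "recognition" is a placeholder. It only shows rows being split into N balanced ranges, with one private recognizer per thread and each thread writing to its own result slots.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are Tesseract types.
struct Word { int id; };
struct Row { std::vector<Word> words; };
struct Recognizer {
  // Placeholder for the real per-word recognition work.
  std::string Recognize(const Word& w) const { return "w" + std::to_string(w.id); }
};

// Split [0, num_rows) into n contiguous ranges whose sizes differ by at most 1.
std::vector<std::pair<std::size_t, std::size_t>> SplitRows(std::size_t num_rows,
                                                           std::size_t n) {
  assert(n > 0);
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  std::size_t begin = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t len = num_rows / n + (i < num_rows % n ? 1 : 0);
    ranges.emplace_back(begin, begin + len);
    begin += len;
  }
  return ranges;
}

// Each thread reads its own rows and writes only to those rows' result slots,
// so no two threads ever touch the same element of `results`.
std::vector<std::vector<std::string>> RecognizeRows(const std::vector<Row>& rows,
                                                    std::size_t n) {
  std::vector<std::vector<std::string>> results(rows.size());
  std::vector<std::thread> threads;
  for (const auto& range : SplitRows(rows.size(), n)) {
    const std::size_t begin = range.first;
    const std::size_t end = range.second;
    threads.emplace_back([&rows, &results, begin, end] {
      Recognizer rec;  // private to this thread
      for (std::size_t r = begin; r < end; ++r)
        for (const Word& w : rows[r].words)
          results[r].push_back(rec.Recognize(w));
    });
  }
  for (auto& t : threads) t.join();
  return results;
}
```

The sketch is race-free only because the output is pre-sized and indexed by row. A shared structure mutated through an iterator does not have that property.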

Looking at PAGE_RES_IT, it looks like a rabbit hole of data 
interdependencies and I'm not sure if just slapping a scoped_lock in 
ReplaceCurrentWord and the rest of the mutators is enough to make 
multithreaded operation in RecogAllWordsPassN safe.

How would you advise me to continue? Do you think that mutexing PAGE_RES is 
enough to provide crash-free word-level parallelism? Or maybe I should 
duplicate the PAGE_RES structure, with each duplicate containing only part 
of the rows, process each PAGE_RES object in parallel and then somehow 
merge the results?
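
The second option, sketched in plain C++ under the same caveat: RowResult, Shard, ProcessShard and ProcessAndMerge are hypothetical, and the per-row work is a placeholder. Each shard is built in isolation by its own task, and the merge runs on a single thread after all tasks have finished, so no locking is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical per-row result; a stand-in for whatever a partial PAGE_RES holds.
struct RowResult {
  std::size_t row_index;
  std::string text;
};
using Shard = std::vector<RowResult>;

// Process rows [begin, end) in isolation; nothing here is shared between tasks.
Shard ProcessShard(std::size_t begin, std::size_t end) {
  Shard shard;
  for (std::size_t r = begin; r < end; ++r)
    shard.push_back({r, "row" + std::to_string(r)});  // placeholder for recognition
  return shard;
}

// Launch one task per shard, then concatenate the shards in row order.
std::vector<RowResult> ProcessAndMerge(std::size_t num_rows, std::size_t n) {
  assert(n > 0);
  std::vector<std::future<Shard>> futures;
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t begin = num_rows * i / n;
    const std::size_t end = num_rows * (i + 1) / n;
    futures.push_back(std::async(std::launch::async, ProcessShard, begin, end));
  }
  std::vector<RowResult> merged;
  for (auto& f : futures) {
    Shard shard = f.get();  // blocks until this shard is done
    merged.insert(merged.end(),
                  std::make_move_iterator(shard.begin()),
                  std::make_move_iterator(shard.end()));
  }
  return merged;
}
```

The hard part in practice is the merge step. This sketch assumes rows are independent, so concatenating shards in order is sufficient. Any cross-row state would need separate handling.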

Much obliged,
Stefan Dragnev.
