[tesseract-ocr] Is there a way to disable the thresholding (binarization) from command line or Tesseract C++ API?

epiphany27 Mon, 08 Oct 2018 15:19:07 -0700

Hi

I have been trying to figure out whether there is a way to disable the 
default thresholding done by leptonica during preprocessing. I think that 
in some cases, when the scan quality of the PDFs is good enough, the 
thresholding step ends up reducing OCR accuracy, and I suspect it is not 
needed in every case. The only way I have found to disable the 
thresholding step is to comment out the following lines in baseapi.cpp. 
Is there any other way?


/*if (!thresholder_->IsBinary()) {
  tesseract_->set_pix_thresholds(thresholder_->GetPixRectThresholds());
  tesseract_->set_pix_grey(thresholder_->GetPixRectGrey());
} else { */
  tesseract_->set_pix_thresholds(nullptr);
  tesseract_->set_pix_grey(nullptr);
//}
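
To make the effect of that edit explicit, here is a toy model of the 
control flow before and after the change. It is only a sketch: 
MockThresholder, MockEngine, SetAuxImagesOriginal and SetAuxImagesPatched 
are hypothetical stand-ins invented for illustration, not Tesseract or 
leptonica types, and the int pointers merely take the place of the real 
image pointers.

```cpp
// Hypothetical stand-in for the thresholder object (not a Tesseract type).
struct MockThresholder {
  bool binary = false;  // true when the input image is already 1 bpp
  int thresholds = 1;   // placeholder for the threshold image
  int grey = 2;         // placeholder for the grey image
  bool IsBinary() const { return binary; }
  const int* GetPixRectThresholds() const { return &thresholds; }
  const int* GetPixRectGrey() const { return &grey; }
};

// Hypothetical stand-in for the engine that receives the auxiliary images.
struct MockEngine {
  const int* pix_thresholds = nullptr;
  const int* pix_grey = nullptr;
  void set_pix_thresholds(const int* p) { pix_thresholds = p; }
  void set_pix_grey(const int* p) { pix_grey = p; }
};

// Control flow of the quoted lines before the edit: non-binary input
// passes the threshold and grey images on, binary input passes nullptr.
void SetAuxImagesOriginal(const MockThresholder& t, MockEngine& e) {
  if (!t.IsBinary()) {
    e.set_pix_thresholds(t.GetPixRectThresholds());
    e.set_pix_grey(t.GetPixRectGrey());
  } else {
    e.set_pix_thresholds(nullptr);
    e.set_pix_grey(nullptr);
  }
}

// Control flow after commenting out the if-branch: the engine always
// receives nullptr for both images, whatever the input type.
void SetAuxImagesPatched(const MockThresholder&, MockEngine& e) {
  e.set_pix_thresholds(nullptr);
  e.set_pix_grey(nullptr);
}
```

In this model, a greyscale input (binary == false) leaves both pointers 
set under SetAuxImagesOriginal and both null under SetAuxImagesPatched, 
which is the only difference the quoted edit introduces in these lines.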
