I get far better results on poor-quality scans with a local threshold. A very nice tool for finding the best algorithm for your case is http://imagej.net/Auto_Threshold in the Fiji distribution of ImageJ. I got awesome results with Phansalkar.
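For anyone who wants to try the idea outside ImageJ, here is a minimal sketch of a Phansalkar-style local threshold. It assumes the formula documented for ImageJ's Auto Local Threshold plugin, t = mean * (1 + p * exp(-q * mean) + k * (stdev / r - 1)), with the recommended constants k = 0.25, r = 0.5, p = 2, q = 10, and gray values scaled to [0, 1]. The function name, the square window, and the border handling are my own choices for illustration, and it is written in plain Python so it runs without dependencies; it is far too slow for real scans.

```python
import math


def phansalkar_binarize(gray, radius=7, k=0.25, r=0.5, p=2.0, q=10.0):
    """Binarize a 2D list of gray values in [0, 1] with a local threshold.

    Returns a 2D list where 1 marks pixels brighter than their local
    threshold (paper) and 0 marks the rest (ink).
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Square neighbourhood around (x, y), clipped at the image border.
            window = [
                gray[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            std = math.sqrt(var)
            # Threshold depends on the local mean and contrast, so it
            # adapts to shadows and uneven illumination.
            t = mean * (1 + p * math.exp(-q * mean) + k * (std / r - 1))
            out[y][x] = 1 if gray[y][x] > t else 0
    return out


# Tiny example: light background with one dark pixel in the middle.
image = [[0.8] * 5 for _ in range(5)]
image[2][2] = 0.1
result = phansalkar_binarize(image, radius=2)
```

Because the threshold is recomputed per pixel from its neighbourhood, a dark stroke in a shadowed corner is judged against the shadow rather than against the whole page, which is what a single global threshold cannot do.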
On Sat, Jan 26, 2019, 18:05 farhad khalafi <[email protected]> wrote:

> I believe Tesseract uses the Otsu algorithm to find a single threshold value.
> The threshold is used to set gray pixels to binary values depending on
> whether a gray level is below or above the threshold. I use Leptonica for
> my own pre-OCR image processing and the performance seems to be fine.
>
> On Sat, Jan 26, 2019 at 9:54 AM Scott Thibault <[email protected]>
> wrote:
>
>> I want to do my own pre-processing including binarization so I can remove
>> borders and other artifacts. However, the performance seems to be worse.
>> Perhaps Tesseract has better binarization? What algorithm does it use?
>>
>> --Scott
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/2aac2250-cbaa-4b7a-bef0-ca1acf13d5f4%40googlegroups.com
>> <https://groups.google.com/d/msgid/tesseract-ocr/2aac2250-cbaa-4b7a-bef0-ca1acf13d5f4%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
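To make the single-threshold idea from the quoted reply concrete, here is a minimal sketch of Otsu's method on 8-bit gray values. It is the textbook version (pick the gray level that maximizes the between-class variance of the histogram), not Tesseract's or Leptonica's actual code; the function names are made up for illustration.

```python
def otsu_threshold(pixels):
    """Return a global threshold t for 8-bit gray values (0..255).

    Pixels with value <= t form the dark class, pixels > t the light class.
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of gray values in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break     # light class empty, nothing left to split
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map each gray value to 0 (dark, <= t) or 1 (light, > t)."""
    return [1 if v > t else 0 for v in pixels]


# Example: a flat list of pixels with dark ink (10) and light paper (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = binarize(pixels, t)
```

One value of t is applied to the whole page here, which is why a global method tends to struggle on scans with shadows or uneven lighting.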

