Sanescan is yet another GUI scanning and OCR frontend that uses SANE as its backend.
The reason for its existence is that none of the current open-source scanning and OCR applications do OCR well enough for text selection to work reliably when viewing the produced PDF document. Many of the tools work well with simple cases like single-column book pages, but something like a multi-column newspaper page or an invoice with tables often results in multi-line selection taking characters from all over the page.

The work on the Sanescan project has already resulted in improvements in the Tesseract OCR engine itself (https://github.com/tesseract-ocr/tesseract/pull/3787), which demonstrates that Sanescan can be more than another frontend for Tesseract OCR and can actually improve the state of the art of the open-source scanning and OCR experience.

Currently Sanescan is of beta quality. The GUI application lacks polish, so it's not yet recommended to end users. This post is only meant to raise awareness among developers. However, the application already produces PDF documents that are, in certain areas, significantly better than those of all open-source alternatives, so the potential is there.

The code currently lives on my personal GitHub at https://github.com/p12tic/sanescan, but the plan is to move it somewhere under the SANE project (e.g. under the "frontend" group in SANE GitLab, https://gitlab.com/sane-project/frontend).

The short-term plan is to focus on usability and full feature parity with all other open-source OCR applications and then do a proper 1.0 release.

The long-term plan is to extract the full OCR processing pipeline and provide it as a library to third-party applications. Another long-term goal is to introduce additional features (such as film scanning) that would make Sanescan the primary open-source choice for all things related to scanning. The last goal in particular is still many months away.

This project has received a grant from the NLnet foundation and their NGI0 Discovery fund to improve open-source scanning and OCR capabilities. Huge thanks to them!

Regards,
Povilas Kanapickas