On 6 December 2010 15:39, Barry Drake <[email protected]> wrote:
> Hi there ....
>
> I rarely need OCR, but one of my slight disappointments is the lack of a
> really accurate OCR engine for Linux. I've tried all the ones that
> exist (that I've found so far), and apart from being a bit awkward to
> operate, no matter how much I vary the scan settings, I always end up
> doing a lot of corrections to the output.
>
> I've solved the problem by getting an old copy of 'TextBridge OCR' to
> work under Wine. It's one that came with a long dead scanner I had some
> years ago. The thing is, TextBridge produces far more accurate output
> with little or no messing about. It even drives the scanner through
> Twain (I was surprised and pleased by that).
>
> Is anyone out there getting real accuracy with a native Linux app?

I had a need to do some OCR recently and came across a project called
tesseract-ocr: http://code.google.com/p/tesseract-ocr/

It's based on HP code that dates from the mid-90s. I've only used it to
extract text from existing graphics but it seems to be very accurate.
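For anyone who wants to try it on an existing image, here is a rough
sketch of how it could be driven from a script. It assumes the
`tesseract` binary is installed and on the PATH, and that the usual
`tesseract <image> <outputbase> -l <lang>` command form applies to your
version. The function names, `scan.tif` and `ocr_output` are made up for
illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base="ocr_output", lang="eng"):
    """Run tesseract on an image file and return the recognised text.

    Tesseract appends ".txt" to the output base name itself, so the
    result is read back from <output_base>.txt.
    """
    # Assumption: the tesseract binary is installed and on the PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_image("scan.tif")` would then hand back the text as a
string. Older releases may only accept TIFF input, so converting the
image first may be needed.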
s/

--
My CV: http://bit.ly/sfgreenwood_cv
Linkedin: http://www.linkedin.com/in/simonfgreenwood
Twitter: @sfgreenwood
--
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-uk
https://wiki.ubuntu.com/UKTeam/
