Using tesseract-ocr from within AOO ?

2016-01-03 Thread Joost Andrae

Hi there,

I've just played with the open-source OCR engine 
https://code.google.com/p/tesseract-ocr/ and it seems to do its job 
very well when running OCR on scanned bitmaps.
As it comes with the Apache License 2.0 and is available as C++ source 
code, why not integrate its functionality into AOO, or build an 
extension that connects to it either through its API or through its 
command-line interface?
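
The command-line route could be sketched roughly as follows. This is only an illustration, assuming a `tesseract` binary is installed and on the PATH; the function names `build_tesseract_command` and `ocr_image` are hypothetical and not part of AOO or Tesseract. It relies on the basic invocation `tesseract <image> <outputbase> -l <lang>`, which writes the recognized text to `<outputbase>.txt`.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run the tesseract binary on an image and return the recognized text.

    Assumes tesseract is installed and the language data for `lang` exists.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_base = os.path.join(tmp_dir, "ocr_result")
        subprocess.run(
            build_tesseract_command(image_path, output_base, lang),
            check=True,  # raise CalledProcessError if tesseract fails
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # tesseract appends ".txt" to the output base by default.
        with open(output_base + ".txt", encoding="utf-8") as result:
            return result.read()
```

An extension or macro could call something like `ocr_image("scan.png")` and insert the returned string into the current Writer document; that part is left out here.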


From my perspective both projects would benefit...

Just my 2 EUR cents


Kind regards, Joost


-
To unsubscribe, e-mail: dev-unsubscr...@openoffice.apache.org
For additional commands, e-mail: dev-h...@openoffice.apache.org



Re: Using tesseract-ocr from within AOO ?

2016-01-03 Thread Rory O'Farrell
On Sun, 3 Jan 2016 14:28:22 +0100
Joost Andrae wrote:

> Hi there,
> 
> I've just played with the open-source OCR engine 
> https://code.google.com/p/tesseract-ocr/ and it seems to do its job 
> very well when running OCR on scanned bitmaps.
> As it comes with the Apache License 2.0 and is available as C++ source 
> code, why not integrate its functionality into AOO, or build an 
> extension that connects to it either through its API or through its 
> command-line interface?
> 
>  From my perspective both projects would benefit...
> 
> Just my 2 EUR cents
> 
> 
> Kind regards, Joost

In my experience it was very accurate, perhaps close to the best commercial 
products under Windows. I was undertaking a major OCR project (ebook 
preparation of two out-of-print 220-page books) and found that a combined 
scan-and-OCR application under Linux (Linux-Intelligent-Ocr-Solution) made 
more sense for a project of this size. I later fed the plain text files into 
OO Writer for detailed spellchecking and reformatting.

I doubt that full integration with OpenOffice would be a good idea; an 
extension might be possible, although I doubt its general usefulness will be 
worth the effort of writing it.

-- 
Rory O'Farrell 
