Thank you, Kåre and Klaas, for your replies. I have had some time to dig into this a bit more:
Kåre Särs <kare.s...@iki.fi>
> [snip]
>
> 1) Create a non-GUI Qt/KDE library that can take an (Q)image and generate
> output suitable for djvu/PDF/ODF. Maybe even generate djvu/PDF/ODF files.
>
> 2) Make a simple GUI around the library to test the functionality.
>
> 3) Add the ORC part to the KScan plugin ksaneplugin. (kdegraphics)
>
> 4) Create a Kipi-plugin for use in Gwenview,Digikam,....
>
> 5) Standalone document scanning application that is specialized for
> multipage scanning to PDF/djvu/ODT.
>
>
> I'm not familiar with the ocropus API, so I'm not sure how much work it
> would be. I'm not sure one GSOC would be enough for all 5 points ;)
>
> Regards,
> Kåre

First, I have just realized that gocr can report where the characters/words are located (see the gocr man page; I checked how "-f XML" works with a sample image, and it looks like what I need). So adding ocropus support right now would not be mandatory; it would be nice to have, but optional.

Second, and just FYI: I have a ~12-year-old scanner and I have tested both skanlite and kooka with it. skanlite worked fine, but kooka doesn't work _for_ _me_. Fortunately, I think I can still provide a djvu generator supporting OCR with kooka, even if I don't port it to libksane; see below.

About Kåre's task list: I think I would split the first item like this:

1a) Create a non-GUI Qt/KDE library able to open and generate djvu documents without a text layer. (libkdjvu)

1b) Create a non-GUI Qt/KDE library that can take an (Q)image and generate output suitable for djvu/PDF/ODF. (libkocr)

1c) Add support to the libkdjvu library to include the data retrieved with libkocr as a text layer.

Note that a djvu file may or may not have a text layer. Also note that getting text with OCR and creating djvu files that join various images/texts are very different jobs. Those are the reasons to split the first item like that.
That being said, let me add some other remarks and questions:

About my 1a): Perhaps I could reuse some code from okular; I'd need to investigate this further.

About my 1b): There is already some code in kooka that does something like that; see these classes: OcrGocrEngine, OcrEngine and KookaImage. So this task would mainly consist of hacking on OcrGocrEngine to make it produce output suitable for my new libkdjvu library (by processing the output of "gocr -f XML"), and taking all the kooka classes related to OCR and putting them together in a shared library (libkocr). It looks like most kooka files are licensed under GPLv2 only, with a couple of special exceptions; Klaas, could we please change that license to GPLv2 or later, with the same couple of special exceptions? See: http://techbase.kde.org/Projects/KDE_Relicensing

About 2) and 5): I'm open to other ideas, but right now I tend to think that both the "simple GUI" mentioned in 2) and the "Standalone document scanning application" mentioned in 5) would be a new tab in kooka that behaves as a djvu editor. I did a quick mockup; this GUI would be able to open existing djvu documents as well as create new ones: http://alioth.debian.org/~santa-guest/gsoc2012/mockup.png

About 3) and 4): If I create that libkocr library, this should be easy to do. However, I want to understand better how these plugins would work from a user's point of view. For instance, say I open a png file in gwenview and there is a menu item called "Process image with OCR" in the "Plugins" menu. What would happen if I click that item? Would it open a text editor with the OCR result, or what?
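To make the processing step in 1b) a bit more concrete, here is a minimal sketch in Python (not kooka code) of how character boxes from an OCR engine could be grouped into words and written out as a djvused-style hidden-text expression. Several things here are assumptions: the sample XML is hand-written in the style of "gocr -f XML" and its element and attribute names (page, line, box, space, x, y, dx, dy, value) are not taken from the gocr documentation; the input is assumed to measure y downwards from the top of the page, while the output measures it upwards from the bottom; and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: hand-written sample in the style of "gocr -f XML" output.
# The element/attribute names below are assumed, not verified against gocr.
SAMPLE_XML = """\
<page x="0" y="0" dx="200" dy="100">
  <line x="10" y="20" dx="60" dy="12">
    <box x="10" y="20" dx="8" dy="12" value="H"/>
    <box x="20" y="20" dx="8" dy="12" value="i"/>
    <space x="28" y="20" dx="6" dy="12"/>
    <box x="34" y="20" dx="8" dy="12" value="K"/>
    <box x="44" y="20" dx="8" dy="12" value="D"/>
    <box x="54" y="20" dx="8" dy="12" value="E"/>
  </line>
</page>
"""


def words_from_line(line):
    """Group consecutive <box> characters into words; other elements split words."""
    words, current = [], []
    for elem in line:
        if elem.tag == "box":
            current.append(elem)
        elif current:  # e.g. a <space> element ends the current word
            words.append(current)
            current = []
    if current:
        words.append(current)
    return words


def bbox(boxes, page_height):
    """Union bounding box of the given boxes, flipped to a bottom-left origin."""
    left = min(int(b.get("x")) for b in boxes)
    top = min(int(b.get("y")) for b in boxes)
    right = max(int(b.get("x")) + int(b.get("dx")) for b in boxes)
    bottom = max(int(b.get("y")) + int(b.get("dy")) for b in boxes)
    # ASSUMPTION: input y grows downwards; the output y grows upwards.
    return left, page_height - bottom, right, page_height - top


def escape(text):
    """Escape backslashes and double quotes for a quoted string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def text_layer(xml_text):
    """Build a (page (line (word ...))) expression from the OCR XML."""
    page = ET.fromstring(xml_text)
    width, height = int(page.get("dx")), int(page.get("dy"))
    out = ["(page 0 0 %d %d" % (width, height)]
    for line in page.iter("line"):
        words = words_from_line(line)
        if not words:
            continue
        all_boxes = [b for w in words for b in w]
        out.append(" (line %d %d %d %d" % bbox(all_boxes, height))
        for w in words:
            text = "".join(b.get("value") for b in w)
            out.append('  (word %d %d %d %d "%s")' % (*bbox(w, height), escape(text)))
        out[-1] += ")"  # close the line
    out[-1] += ")"  # close the page
    return "\n".join(out)


if __name__ == "__main__":
    print(text_layer(SAMPLE_XML))
```

The point of the sketch is only that the OCR side (1b) can hand over plain word rectangles plus text, and the djvu side (1c) can turn them into a text layer without knowing which OCR engine produced them.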
>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<