Virgiliu Craciun wrote:

> I do apologise, I did not mean somebody to write the code for us, I was
> wondering if anybody can point us to a freely available example on how
> to look for text inside (we're just clinical staff in a hospital trying
> desperately to sort our patients' files, it's nothing commercial).

Ah, no worries. I've had some ... frustrating ... experiences in the
past with commercial operations expecting free technical support just
because I've chosen to contribute my time and effort to the project for
free.

> (Doc: 17220080930.121655.008)'

> so I assume the string is retrievable.

Yes, that'll be pretty easy. You can just use PdfContentsTokenizer to
scan the content stream until you find the string of interest. It should
actually be pretty trivial, at least assuming you don't have to protect
against the possibility of other strings that coincidentally match the
same format.
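
To illustrate the matching step only, here is a minimal sketch in plain
C++ with no PoDoFo calls. It assumes you have already collected the string
operands from the content stream into a std::vector<std::string>; the
function name find_doc_id and the pattern are hypothetical, and the pattern
is guessed from the single sample "(Doc: 17220080930.121655.008)", so
adjust it if your real identifiers look different.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of PoDoFo.
// Scans strings already pulled out of a content stream and returns the
// first identifier of the assumed form "Doc: <digits>.<digits>.<digits>".
// Returns an empty string when nothing matches.
std::string find_doc_id(const std::vector<std::string>& strings)
{
    // Assumed format: three dot-separated digit groups after "Doc:".
    static const std::regex pattern(R"(Doc:\s*(\d+\.\d+\.\d+))");

    for (const std::string& s : strings) {
        std::smatch match;
        if (std::regex_search(s, match, pattern)) {
            return match[1].str();  // capture group 1 is the identifier itself
        }
    }
    return std::string();
}
```

This returns on the first match, so it does nothing to guard against
unrelated strings that happen to share the format.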

You will need to find the correct page of the PDF and, if the page has
multiple content streams, the right content stream. That's probably not
going to be hard either, though, and PoDoFo provides interfaces to
access the page tree & page contents very easily.

> I'm afraid 'podofotxtextract' is not in the 'tool' subfolder (there are
> 7 tools, but not this one (podofo-0.6.0.tar.gz, downloaded yesterday)).
> I guess it would really shed some light on how to do it...

As Dom noted, you will need to check out PoDoFo from Subversion to get
podofotxtextract. I hadn't realised it wasn't in 0.6.0, sorry.
There are instructions on how to get PoDoFo from svn on the PoDoFo website.

--
Craig Ringer
