Searching PDFs can be Very Hard.
1) The "text" might be nothing more than a bitmap image or some line art. You
said your PDFs are "searchable", so that's probably not an issue for you.
2) Text can be in any encoding. The character 'a' need not be represented by
the byte 0x61. It can be quite literally ANYTHING. It's USUALLY 0x61, but
you cannot guarantee it.
3) Text does not have to be drawn in logical order. All the characters in Font
A might be drawn first, followed by all those in Font B, etc. (that's a
real-world example, btw). It's also quite realistic to expect the first line of
column A to be drawn, followed by the first line of column B, followed by the
second lines, and so on.
And words can be split across drawing commands. You have to deduce the
location of "words" based on where all the characters are drawn (after figuring
out what those characters actually are). Word search is Hard.
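To make points 2 and 3 concrete, here is a rough sketch in plain Java (no
iText calls at all) of what a searcher has to do once some parser has handed it
individual characters: translate each raw code through its font's encoding,
bucket the characters into lines by their y position, order each line by x,
guess where the spaces are from the gaps, and only then search. Everything in
it is made up for illustration: the Glyph type, the encodings map, and the
lineHeight and spaceGap parameters are assumptions, not part of iText or of
any PDF library. The line bucketing is deliberately crude, and a phrase that
wraps onto a second line will not be found.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PdfSearchSketch {

    /** One drawn character as a parser might report it (hypothetical type). */
    static final class Glyph {
        final String font;
        final int code;
        final float x, y, width;

        Glyph(String font, int code, float x, float y, float width) {
            this.font = font;
            this.code = code;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    /** Point 2: translate a raw code through its font's encoding table. */
    static char decode(Glyph g, Map<String, Map<Integer, Character>> encodings) {
        Map<Integer, Character> table = encodings.get(g.font);
        Character c = (table == null) ? null : table.get(g.code);
        // Assumption: a code with no known mapping is treated as Latin-1.
        return (c != null) ? c : (char) g.code;
    }

    /**
     * Point 3: rebuild lines from positions, then search each line.
     * Returns {x, y} of the first matching character, or null if not found.
     */
    static float[] find(List<Glyph> glyphs,
                        Map<String, Map<Integer, Character>> encodings,
                        String phrase, float lineHeight, float spaceGap) {
        // PDF y grows upward, so visit the buckets from the highest y down.
        TreeMap<Integer, List<Glyph>> lines = new TreeMap<>(Collections.reverseOrder());
        for (Glyph g : glyphs) {
            int bucket = Math.round(g.y / lineHeight);
            lines.computeIfAbsent(bucket, k -> new ArrayList<>()).add(g);
        }
        for (List<Glyph> line : lines.values()) {
            line.sort(Comparator.comparingDouble(g -> g.x));
            StringBuilder text = new StringBuilder();
            // The glyph behind each character of text; null for an inserted space.
            List<Glyph> owner = new ArrayList<>();
            Glyph prev = null;
            for (Glyph g : line) {
                // A wide gap after the previous glyph is taken to be a word break.
                if (prev != null && g.x - (prev.x + prev.width) > spaceGap) {
                    text.append(' ');
                    owner.add(null);
                }
                text.append(decode(g, encodings));
                owner.add(g);
                prev = g;
            }
            int at = text.indexOf(phrase);
            if (at >= 0) {
                // Skip an inserted space if the phrase happens to start with one.
                while (owner.get(at) == null) {
                    at++;
                }
                Glyph first = owner.get(at);
                return new float[] { first.x, first.y };
            }
        }
        return null;
    }
}
```

Calling find(glyphs, encodings, "With kind regards,", lineHeight, spaceGap)
on one page's worth of glyphs would hand back the spot to anchor a stamp on,
provided the tolerances suit whatever program produced the PDF.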
Having said all that, iText can get you most of the way there... Even "all the
way" depending on how many different programs you have building your PDFs ("1"
is a great answer). Check out the source for
com.itextpdf.text.pdf.parser.SimpleTextExtractingPdfContentStreamProcessor. It
ignores text locations, but that information is available in displayText(). I
believe it also handles the encoding mess for you.
Good hunting.
--Mark Storer
Senior Software Engineer
Cardiff.com
import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;
> -----Original Message-----
> From: wotikar [mailto:[email protected]]
> Sent: Thursday, August 19, 2010 1:39 AM
> To: [email protected]
> Subject: [iText-questions] Find string searchable pdf
>
>
>
> Hello,
>
> I'm looking for a method to find a string in a searchable
> pdf with iTextSharp. I need to stamp a pdf with a signature,
> but this signature is not at the same location every time.
> Therefore I need to search for a string like "With kind
> regards," and at that position I need to add the signature.
>
> Anybody who can help me with this?
>
> Thanks in advance,
>
> René
> --
> View this message in context:
> http://itext-general.2136553.n4.nabble.com/Find-string-searchable-pdf-tp2330837p2330837.html
> Sent from the iText - General mailing list archive at Nabble.com.
>
> --------------------------------------------------------------
> ----------------
> This SF.net email is sponsored by
>
> Make an app they can't live without
> Enter the BlackBerry Developer Challenge
> http://p.sf.net/sfu/RIM-dev2dev
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> Buy the iText book: http://www.itextpdf.com/book/ Check the
> site with examples before you ask questions:
> http://www.1t3xt.info/examples/ You can also search the
> keywords list: http://1t3xt.info/tutorials/keywords/