Re: [PDFdev] Searching and Replacing text in PDF Doc Programmatically

Khalid Abdul Hai Fri, 12 Sep 2003 01:58:35 -0700

PDFdev is a service provided by PDFzone.com | http://www.pdfzone.com
_____________________________________________________________



--

Thanks Todd,

Is this a third-party tool, or is it freely available?
Do you mean that we should read the PDF directly?

Thanks,
Khalid
--------- Original Message ---------

DATE: Wed, 10 Sep 2003 10:41:57
From: Todd Kueny <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: 

>Recent versions of Think121's pdfExpress provide a feature apropos to 
>this discussion. pdfExpress supports a regular expression-like search 
>and replace function that operates on PDF content streams.  The regex 
>functionality has knowledge of the structure of PDF.
>
>       ?    - matches any single PDF operand
>       $    - matches any single PDF operator
>       ?$   - matches a single PDF operator together with its
>              associated operands
>       *, + - may follow ?, $ and ?$ to match zero or more or
>              one or more occurrences; e.g. ?$* matches zero or
>              more occurrences of any operator and its operands
>       (Jones) 12.45 Tj rg ...  - constant values (operands and
>              operators) that match exactly
>       _sX .. _eX - groups a match, like ( ) in ordinary regex
>       ${_sX} - the value of a group, like \1 in regex
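
To make the operand/operator distinction above concrete, here is a
minimal Python sketch of how a decoded content stream might be split
into operators and the operands that precede them. It is not pdfExpress
code and says nothing about how that product works internally. The names
TOKEN_RE, tokenize, is_operator and instructions are made up for this
illustration, and the tokenizer deliberately ignores comments, nested
parentheses, dictionaries and inline images.

```python
import re

# Simplified token pattern for a *decoded* content stream (an assumption:
# real streams are usually compressed and must be decoded first).
TOKEN_RE = re.compile(r"""
    \[(?:\((?:\\.|[^\\()])*\)|[^\[\]()])*\]  # array, kept as one operand
  | \((?:\\.|[^\\()])*\)                     # literal string such as (Jones)
  | <[0-9A-Fa-f\s]*>                         # hex string
  | /[^\s()<>\[\]/]*                         # name such as /F1
  | [+-]?(?:\d+\.?\d*|\.\d+)                 # number such as 12.45
  | [A-Za-z'"][A-Za-z0-9*]*                  # keyword such as Tj, rg, T*
""", re.VERBOSE)

def tokenize(stream):
    """Return the tokens of a content stream in order."""
    return [m.group(0) for m in TOKEN_RE.finditer(stream)]

def is_operator(tok):
    """Keywords are operators, except the operand keywords true/false/null."""
    if tok in ("true", "false", "null"):
        return False
    return tok[0].isalpha() or tok in ("'", '"')

def instructions(stream):
    """Group tokens into (operands, operator) pairs, one per operator."""
    out, operands = [], []
    for tok in tokenize(stream):
        if is_operator(tok):
            out.append((operands, tok))
            operands = []
        else:
            operands.append(tok)
    return out

pairs = instructions("BT /F1 12 Tf 72 720 Td (%%name%%) Tj ET")
for operands, operator in pairs:
    print(operator, operands)
```

In these terms, "?" would correspond to one entry in an operands list,
"$" to one operator, and "?$" to one whole (operands, operator) pair.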
>
>So you can say something like
>
>       find:  _s1 ?$* _e1 (%%name%%) Tj
>       replace: ${_s1} (BOB SMITH) Tj
>
>When applied against a PDF content stream this will find the first 
>instance of a Tj operator applied to the string (%%name%%) and replace 
>its operand with (BOB SMITH).  The _s1 .. _e1 markers group all the 
>operators and operands that occur before the Tj; the replacement emits 
>them again via ${_s1}, ahead of the new string and its Tj operator.
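
For readers without pdfExpress, the effect of the find/replace pair
above can be approximated in a few lines of Python. This is only a
sketch under stated assumptions, not the tool's implementation: the
content stream is assumed to be already decoded (uncompressed), the
placeholder is assumed to sit in a single literal string directly before
Tj, and the function name replace_first_tj is hypothetical. Text that is
split across TJ arrays, or stored in a font-specific encoding, would not
be found this way.

```python
import re

# Simplified token pattern; comments, nested parentheses and dictionaries
# are not handled (assumption: templates avoid them around the placeholder).
TOKEN_RE = re.compile(r"""
    \((?:\\.|[^\\()])*\)      # literal string
  | <[0-9A-Fa-f\s]*>          # hex string
  | /[^\s()<>\[\]/]*          # name
  | [+-]?(?:\d+\.?\d*|\.\d+)  # number
  | [\[\]]                    # array delimiter
  | [A-Za-z'"][A-Za-z0-9*]*   # operator keyword
""", re.VERBOSE)

def escape_pdf_string(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

def replace_first_tj(stream, old, new):
    """Replace the operand of the first `(old) Tj`; keep everything else."""
    prev = None
    for m in TOKEN_RE.finditer(stream):
        if m.group(0) == "Tj" and prev is not None and prev.group(0) == f"({old})":
            # Splice in the new string; the text before it (the part that
            # _s1 .. _e1 would capture) is carried over unchanged.
            new_operand = f"({escape_pdf_string(new)})"
            return stream[:prev.start()] + new_operand + stream[prev.end():]
        prev = m
    return stream  # placeholder not found: return the stream untouched

template = "BT /F1 12 Tf 72 720 Td (%%name%%) Tj 0 -14 Td (%%name%%) Tj ET"
result = replace_first_tj(template, "%%name%%", "BOB SMITH")
print(result)
```

Only the first occurrence changes, which mirrors the "first instance"
behaviour described above; calling the function again on the result
would replace the next one.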
>
>This gets used in commercial situations where PDF files are acting as 
>templates (the text to find is always the same) or where additional 
>workflow functions use the result of the find (which can be written to 
>an output file) to perform other operations, e.g., locating TAB pages 
>in a PDF by content and then replacing them with a tray pull command of 
>some sort.
>
>Obviously, this requires detailed knowledge about what you are trying 
>to do and about the structure of the PDFs you are working with.
>
>Todd
>  
>  
>
>
>To change your subscription:
>http://www.pdfzone.com/discussions/lists-pdfdev.html
>
>


