I currently use a macro in a Windows text editor to remove dates from bookmarks within a PDF file. It started off as a fairly simple regex but has gradually grown arms and legs as Word and Acrobat have moved to Unicode.

The form of each bookmark within the PDF (i.e. as displayed in BBEdit) is shown below.

/Title(˛ˇ·2·4·/·0·5·/·2·0·0·6· ·M·y· ·B·o·o·k·m·a·r·k)

Here the · character represents a null character, i.e. \x00.

The search and replace expression I used to strip out the date is as follows:

Find What:
/Title\(˛ˇ(\x00[0-9]){2}\x00/(\x00[0-9]){2}\x00/(\x00[0-9]){4}\x00

Replace with:
/Title\(˛ˇ

In other words, it just strips off the date.
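
For anyone who wants to try this outside the editor, here is a rough Python sketch of the same search and replace working on raw bytes. It assumes the ˛ˇ pair shown by BBEdit is the two-byte BOM (FE FF); the names DATE_RE and strip_date are made up for illustration.

```python
import re

# The expression from the post, written against raw bytes.
# Assumption: the "˛ˇ" shown in the editor is the BOM, bytes FE FF.
# Each date character is preceded by a null byte; the final \x00 is the
# null that precedes the space after the year.
DATE_RE = re.compile(
    rb"/Title\(\xfe\xff(\x00[0-9]){2}\x00/(\x00[0-9]){2}\x00/(\x00[0-9]){4}\x00"
)


def strip_date(data: bytes) -> bytes:
    """Replace the matched prefix with '/Title(' plus the BOM."""
    return DATE_RE.sub(b"/Title(\xfe\xff", data)
```

Note that this only removes the date; the null bytes in the rest of the title are left in place.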

With the current combination of Word/Acrobat I now need to remove the null characters from within the variable-length string that follows the date. Is there a way to accomplish this, either with a second search and replace or by enhancing the first search?

I can do it by repeatedly running a search and replace that removes one null character at a time, but this is laborious in the extreme. Would there be any way to automate this?

The only way I have found to reduce the number of steps is to:

1) Strip out the date as before, but now the BOM character too.

2) Select the section of the file that defines the bookmarks.

3) Zap gremlins within that section only.
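
The three steps above could perhaps be collapsed into a single pass. The following is only a sketch in Python, not a tested tool: it assumes the ˛ˇ pair is the BOM (FE FF), that titles contain no ")" or backslash escapes, and that every title character is a null byte followed by a single non-null byte. TITLE_RE and clean_titles are hypothetical names.

```python
import re

# Assumptions: BOM is FE FF; titles hold no ')' or backslash escapes;
# every character after the BOM is a null byte plus one non-null byte.
TITLE_RE = re.compile(
    rb"/Title\(\xfe\xff"
    rb"(?:\x00[0-9]){2}\x00/(?:\x00[0-9]){2}\x00/(?:\x00[0-9]){4}"  # the date
    rb"((?:\x00[^)\x00])*)\)"  # rest of the title, up to the closing ')'
)


def clean_titles(data: bytes) -> bytes:
    """Drop the BOM and date, then remove nulls inside each matched title."""

    def repl(match):
        # Nulls are removed only from the captured title text, so null
        # bytes elsewhere in the file are left alone.
        text = match.group(1).replace(b"\x00", b"")
        return b"/Title(" + text + b")"

    return TITLE_RE.sub(repl, data)
```

As with the original expression, the space that followed the date is kept at the start of the title. Titles that do not fit the assumed pattern are left unchanged.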

Any better ways?
--
Regards,

Steve Hodgson                        <mailto:[EMAIL PROTECTED]>


