I currently use a macro in a Windows text editor to remove dates
from bookmarks within a PDF file. It started off as a fairly
simple regex but gradually grew arms and legs as Word and
Acrobat moved to Unicode.
The form of each bookmark within the PDF (i.e. as displayed in
BBEdit) is shown below.
/Title(˛ˇ·2·4·/·0·5·/·2·0·0·6· ·M·y· ·B·o·o·k·m·a·r·k)
Here the · character is a null character, i.e. \x00, and ˛ˇ is
the Unicode byte order mark (BOM).
The search and replace expression I used to strip out the date
is as follows:
Find What:
/Title\(˛ˇ(\x00[0-9]){2}\x00/(\x00[0-9]){2}\x00/(\x00[0-9]){4}\x00
Replace with:
/Title\(˛ˇ
Basically just stripping off the date.
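
For anyone who wants to try the same substitution outside the
editor, here is a minimal Python sketch of it working on raw bytes.
It assumes that ˛ˇ is the byte pair 0xFE 0xFF seen through a
MacRoman view, and that the pattern also consumes the space after
the year so the remaining title stays aligned on two-byte
boundaries. The names DATE_RE and strip_date are hypothetical.

```python
import re

# Bytes pattern equivalent to the "Find What" expression.
DATE_RE = re.compile(
    rb"/Title\(\xfe\xff"       # "/Title(" followed by the BOM (shown as ˛ˇ)
    rb"(?:\x00[0-9]){2}\x00/"  # two null-prefixed digits, then null + "/"
    rb"(?:\x00[0-9]){2}\x00/"  # two null-prefixed digits, then null + "/"
    rb"(?:\x00[0-9]){4}\x00 "  # four null-prefixed digits, then null + space (assumed)
)

def strip_date(data: bytes) -> bytes:
    """Remove the leading date from every matching bookmark title."""
    return DATE_RE.sub(b"/Title(\xfe\xff", data)
```

The rest of the title is left exactly as it was, nulls included.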
With the current combination of Word/Acrobat I now need to
remove the null characters from within the variable length
string that follows the date. Is there a way to accomplish this
either with a second search and replace or by enhancing the
first search?
I can do it by repeatedly running a search and replace, removing
one null character at a time, but this is laborious in the
extreme. Would there be any way to automate this?
The only way I have found to reduce the number of steps is to:
1) Strip out the date as before, but now the BOM character too.
2) Select the section of the file that defines the bookmarks.
3) Zap gremlins within that section only.
Any better ways?
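
One possible way to do steps 1 and 3 in a single pass is to let a
replacement function clean up each title as it is matched, so no
selection is needed. The following is only a sketch in Python, with
some assumptions: ˛ˇ is the byte pair 0xFE 0xFF, the date is
followed by a single space, and the title contains only characters
whose high byte is null, with no parentheses or backslashes. Titles
that do not fit are left untouched. The names TITLE_RE and
strip_date_and_nulls are hypothetical.

```python
import re

TITLE_RE = re.compile(
    rb"/Title\(\xfe\xff"              # "/Title(" followed by the BOM
    rb"(?:\x00[0-9]){2}\x00/"         # day
    rb"(?:\x00[0-9]){2}\x00/"         # month
    rb"(?:\x00[0-9]){4}\x00 "         # year, then null + space (assumed)
    rb"((?:\x00[^\x00()\\])*)\)"      # remaining title as null-prefixed bytes, then ")"
)

def strip_date_and_nulls(data: bytes) -> bytes:
    """Drop the BOM and date, then remove the nulls inside each title."""
    def fix(match: re.Match) -> bytes:
        title = match.group(1).replace(b"\x00", b"")
        return b"/Title(" + title + b")"
    return TITLE_RE.sub(fix, data)

if __name__ == "__main__":
    sample = b"/Title(\xfe\xff" + "24/05/2006 My Bookmark".encode("utf-16-be") + b")"
    print(strip_date_and_nulls(sample))
```

Because the nulls are only removed inside the captured title, null
bytes elsewhere in the file are not affected.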
--
Regards,
Steve Hodgson <mailto:[EMAIL PROTECTED]>