At 2:25 PM -0600 3/21/09, Doug McNutt didst inscribe upon an electronic papyrus:
>Here's what the first few lines looks like when opened with BBEdit
>
>%FDF-1.2
>%¹³S"
>1 0 obj
><<
>/FDF
><<
>/Fields [
><<
>/V (¦ l i n e 1 4)
>/T (¦ f 1 _ 0 5 8 \( 0 \))
>>>
><<
>/V (¦ l i n e 1 5)
>/T (¦ f 1 _ 0 6 0 \( 0 \))
>>>
><<
>/V (¦ l i n e 6)
>/T (¦ f 1 _ 0 4 2 \( 0 \))
>It appears that the parentheses that are not escaped designate blocks
>that are encoded as UTF16. They begin with an FEFF code point which
>is surely a byte order mark. After that there are 16 bit entries the
>first byte of which is a null for every file I have looked at. The
>escaped parentheses are there because the author of the PDF used
>parentheses in his definitions of the form names. Note though that
>the backslash escape character is preceded by a null but the
>parenthesis following it is not.

Yeah, that's the tricky part. I was going to suggest extracting the
UTF16 parentheticals, but those un-prefixed parens complicate the
process. Then again, Perl is quite capable of parsing that (as opaque
bytes), so I guess my recommendation would be to have a script that
splits the .fdf into, say, .fdf-t and .fdf-u files, where the '-u' file
contains all the Unicode strings (minus the BOMs) and the '-t' file
contains the rest, with some unique placeholder where each string was,
so that the edited '-u' file can be recombined with it. Maybe use the
same script for splitting and recombining:

    fdf.pl foo.fdf
    fdf.pl foo.fdf-t foo.fdf-u   # check arg value for suffix, pass to 'if' block
    fdf foo.fdf-*                # if you handle globbing via shell-alias

Umm, are *all* the un-prefixed parens of the pattern shown? That is,
around an ASCII 0 and at the end of the string? (Sounds like a stylish
CSTR, hehe.) If so, that makes it easy -- just leave that part in the
'-t' file.

I take it the Unicode strings are the only part you want to edit in
BBEdit?

-W
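
P.S. For what it's worth, here is a rough sketch of the split/recombine idea, written in Python rather than Perl purely for illustration. It assumes the parenthesized strings contain no unescaped nested parens, that the only escapes present are \( \) and \\, and that every string of interest starts with the FEFF byte order mark. The function names and the @@FDFn@@ placeholder are made up for this example, so treat it as a starting point, not a finished tool.

```python
import re

BOM = b"\xfe\xff"  # UTF-16 big-endian byte order mark

# "(" ... ")" where the body is escaped bytes or anything except \ ( )
LITERAL = re.compile(rb"\((?:\\.|[^\\()])*\)", re.DOTALL)
UNESCAPE = re.compile(rb"\\([()\\])")
ESCAPE = re.compile(rb"([()\\])")
PLACEHOLDER = re.compile(rb"\(@@FDF(\d+)@@\)")  # hypothetical marker


def split_fdf(data):
    """Return (template_bytes, list_of_unicode_strings)."""
    strings = []

    def repl(match):
        # Undo the byte-level backslash escapes before decoding.
        raw = UNESCAPE.sub(rb"\1", match.group(0)[1:-1])
        if not raw.startswith(BOM):
            return match.group(0)  # plain string: leave it in the template
        strings.append(raw[2:].decode("utf-16-be"))
        return b"(@@FDF%d@@)" % (len(strings) - 1)

    return LITERAL.sub(repl, data), strings


def join_fdf(template, strings):
    """Put the (possibly edited) strings back into the template."""

    def repl(match):
        raw = BOM + strings[int(match.group(1))].encode("utf-16-be")
        # Re-escape at the byte level, which puts the backslash after the null.
        return b"(" + ESCAPE.sub(rb"\\\1", raw) + b")"

    return PLACEHOLDER.sub(repl, template)
```

The template would go in the '-t' file and the strings, one per line as UTF-8, in the '-u' file. Because escaping is done on bytes after encoding, a "(" in a field name comes back as null, backslash, paren, which matches the pattern Doug describes.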