Paul Tremblay wrote at Fri, 26 Jul 2002 19:55:46 +0200:

> Is there a quicker way to substitute an item in a line than reading the line
> in each time?
>
> I am writing a script to convert RTF to XML. One part of the script involves
> simple substitution, like this:
> 
> s/\\ldblquote /<rt_quote\/>/g;
> s/\\rdblquote /<lt_quote\/>/g;
> s/\\emdash /<em_dash\/>/g;
> s/\\rquote /<r_quote\/>/g;
> s/\\tab /<tab\/>/g;
> s/\\lquote /<l_quote\/>/g;
> 

One way is to match only the backslash-escaped (protected) words,
look them up in a hash, and
replace each one with its hash value:

my %xml = qw(
    ldblquote   rt_quote
    rdblquote   lt_quote
    emdash      em_dash
    rquote      r_quote
    tab         tab
    lquote      l_quote
);

s{\\(\w+) }{ my $tag = $xml{$1}; defined $tag ? "<$tag/>" : "\\$1 " }ge;

Note that I used the global modifier,
because I believe
that more than one protected word can occur
in a line of an RTF file.

There are probably several ways to speed up my regex:
- capture the whole match, so that "\\$1 " need not be rebuilt
- remove the assignment to the temporary variable and
  use $xml{$1} twice
  (perhaps useful if many backslashes stand in front of
   words that are not protected)
- ...
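
The two ideas in the list above could look roughly like the sketch below.
It is untested against real RTF; the sub names whole_capture and
double_lookup and the sample line are made up for illustration, and only
the lookup table comes from this message.

```perl
use strict;
use warnings;

# Same lookup table as in the message above.
my %xml = qw(
    ldblquote   rt_quote
    rdblquote   lt_quote
    emdash      em_dash
    rquote      r_quote
    tab         tab
    lquote      l_quote
);

# Idea 1 (hypothetical name): capture the whole match as $1, so the word
# itself becomes $2 and an unknown word is put back as $1 unchanged.
sub whole_capture {
    my ($line) = @_;
    $line =~ s{(\\(\w+) )}{ exists $xml{$2} ? "<$xml{$2}/>" : $1 }ge;
    return $line;
}

# Idea 2 (hypothetical name): no temporary variable; look the word up
# in %xml twice instead.
sub double_lookup {
    my ($line) = @_;
    $line =~ s{\\(\w+) }{ exists $xml{$1} ? "<$xml{$1}/>" : "\\$1 " }ge;
    return $line;
}

# Made-up sample line; \b is not in the table and must survive untouched.
my $sample = 'He said \ldblquote hi\rdblquote \tab \b now';
print whole_capture($sample), "\n";
print double_lookup($sample), "\n";
# Both print: He said <rt_quote/>hi<lt_quote/><tab/>\b now
```

Both variants use exists rather than a truth test, so a table entry with
an empty or "0" value would still count as a hit.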

I have proposed what I feel should be quick;
measuring the time is your job :D
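
If it helps, the timing could be done with the core Benchmark module,
along these lines. This is only a sketch: the sample line and the sub
names chained and hashed are my own assumptions, and the result will
depend heavily on what your real RTF lines look like.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my %xml = qw(
    ldblquote   rt_quote
    rdblquote   lt_quote
    emdash      em_dash
    rquote      r_quote
    tab         tab
    lquote      l_quote
);

# Made-up test data; substitute lines from a real RTF file here.
my $sample = 'He said \ldblquote hi\rdblquote \emdash and\tab more \b text ' x 20;

# The original chain of six substitutions.
sub chained {
    my ($line) = @_;
    for ($line) {            # alias $_ to $line for the s/// calls
        s/\\ldblquote /<rt_quote\/>/g;
        s/\\rdblquote /<lt_quote\/>/g;
        s/\\emdash /<em_dash\/>/g;
        s/\\rquote /<r_quote\/>/g;
        s/\\tab /<tab\/>/g;
        s/\\lquote /<l_quote\/>/g;
    }
    return $line;
}

# The single substitution with a hash lookup.
sub hashed {
    my ($line) = @_;
    $line =~ s{\\(\w+) }{ exists $xml{$1} ? "<$xml{$1}/>" : "\\$1 " }ge;
    return $line;
}

# Sanity check: timing is meaningless unless both give the same text.
die "outputs differ\n" unless chained($sample) eq hashed($sample);

# Run each candidate for at least one CPU second and print a comparison.
cmpthese(-1, {
    chained => sub { chained($sample) },
    hashed  => sub { hashed($sample) },
});
```

I have not included numbers on purpose, since they would only reflect my
made-up sample line.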


Best Wishes,
Janek

