Hi, Ralf,
Thanks for your reply.
I tried the command:
awk 'BEGIN { RS="\&"};{
gsub(/\[[\n]+\]/, "");print $0}' gawk_test > gawk_modi
But the content of gawk_modi is still the same as that of gawk_test:
<"Reports" [
]>
By the way, what's the difference between RS and FS?
Thanks for your help,
Rita
On Thu, Jun 18, 2009 at 3:49 PM, Ralf Wildenhues <[email protected]> wrote:
> Hello Rita,
>
> * shaledova wrote on Thu, Jun 18, 2009 at 05:26:44AM CEST:
> >
> > I tried to use gawk to perform some text conversions. But I could not
> > substitute new lines (\n) using gsub such as:
> > gsub(/\[[\n]*\]/, "");
> >
> > For example, if I have a file containing:
> > <"Week Report" [
> >
> >
> >
> > ]>
> >
> > I want to convert these lines to:
> > <"Week Report">
> >
> > What is wrong with the expression?
>
> The expression is ok, but gawk operates on each line in turn by default;
> more specifically, the implicit loop is over records, with RS being the
> record separator, which is a newline by default. With something like
> awk 'BEGIN { RS="X" }
> { gsub(/\[[\n]*\]/, ""); print }'
>
> you can get the above input to turn into
> <"Week Report" >
>
> (note also the space before the closing > that was not matched).
>
> Of course, this is a kludge and requires your input to not contain X;
> and you might have to adjust the output record separator ORS as well.
>
> However, when parsing nested structures, regular expressions are
> generally not the right tool. You might be better off writing a small
> state machine that reads the file line by line and just skips printing
> output when inside unwanted [ ] brackets.
>
> Hope that helps.
>
> Cheers,
> Ralf
>
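
The line-by-line state machine Ralf describes could look roughly like the
sketch below. It is only an illustration, not something posted in the
thread: the variable names depth and buf are made up, the file names
gawk_test and gawk_modi are reused from the message above, and stripping
the blanks in front of "[" is an assumption made so that the output comes
out as <"Week Report"> without the stray space. It counts bracket depth,
so nested [ ] pairs are skipped as well.

```sh
# Recreate the sample input from the thread (assumed content of gawk_test).
printf '%s\n' '<"Week Report" [' '' '' '' ']>' > gawk_test

awk '
{
    n = length($0)
    for (i = 1; i <= n; i++) {
        c = substr($0, i, 1)
        if (c == "[") {
            if (depth == 0)
                sub(/[ \t]+$/, "", buf)   # assumption: drop blanks before "["
            depth++                       # entering a (possibly nested) bracket
        } else if (c == "]" && depth > 0) {
            depth--                       # leaving a bracket
        } else if (depth == 0) {
            buf = buf c                   # outside brackets: keep the character
        }
    }
    if (depth == 0) {                     # line ended outside brackets: emit it
        print buf
        buf = ""
    }
}
END { if (buf != "") print buf }          # unterminated "[": flush kept text
' gawk_test > gawk_modi
```

Because nothing is printed while depth is greater than zero, the text
before "[" and the text after "]" end up on the same output line, and
neither RS nor ORS has to be changed.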