Dear John,

Many, many thanks for your quick answer.

I modified your script a bit, changing:
        $line .= $_ if /Id|To|From/;
        print $OUT "$id\t$line\n" if m!/Note!;

to:
$line .= $_ if m!<Note>! .. m!</Note>!;
print $OUT "$id\t$line\n" if m!</Note>!;


But a problem still persists in the output:
001 <Id>001</Id><To>Thomas</To><From>Joana</From><Message>foo</Message></Note>
002 <Id>002</Id><To>John</To><From>Paula</From><Message>foo</Message></Note>
003 <Id>003</Id><To>Andrew</To><From>Maria</From><Message>foo</Message></Note>

Note that there is no opening <Note> tag at the beginning.
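A possible explanation, offered as a sketch rather than a confirmed diagnosis: if the rest of John's script is unchanged, the line that captures the Id also resets $line to '', which would discard the "<Note>" that was appended one line earlier. The version below assumes that is the cause and resets $line at the opening tag instead. The in-memory input and the @records array are hypothetical additions so the example runs without any files.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Sample input copied from the original question, held in memory
# (an assumption for this sketch; the real script reads a file).
my $xml = <<'XML';
<NoteSet>
    <Note>
        <Id>001</Id>
        <To>Thomas</To>
        <From>Joana</From>
    </Note>
    <Note>
        <Id>002</Id>
        <To>John</To>
        <From>Paula</From>
    </Note>
    <Note>
        <Id>003</Id>
        <To>Andrew</To>
        <From>Maria</From>
    </Note>
</NoteSet>
XML

my @records;                        # hypothetical: collects the output lines
my ( $id, $line ) = ( '', '' );

open my $IN, '<', \$xml or die "Cannot open in-memory input: $!";
while ( <$IN> ) {
    if ( m!<Note>! .. m!</Note>! ) {
        $line = '' if m!<Note>!;          # start a fresh record at the opening tag
        $id   = $1 if m!<Id>(\d+)</Id>!;  # remember the Id without clearing $line
        s/\A\s+//;
        s/\s+\z//;
        tr/\t/ /s;
        $line .= $_;                      # keep every line between <Note> and </Note>
        push @records, "$id\t$line" if m!</Note>!;
        }
    }
close $IN;

print "$_\n" for @records;
```

With the three-record sample this should give one tab-separated line per record, each starting with its Id followed by the complete <Note>...</Note> element.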

Best, Andrej

John W. Krahn wrote:
Andrej Kastrin wrote:
Dear all,

Hello,

To pre-process my XML dataset I run a simple Perl script on it, which extracts the Id identifier from the XML data and pastes the whole XML record after it. For example, the input data looks like:

<NoteSet>
    <Note>
        <Id>001</Id>
        <To>Thomas</To>
        <From>Joana</From>
    </Note>
    <Note>
        <Id>002</Id>
        <To>John</To>
        <From>Paula</From>
    </Note>
    <Note>
        <Id>003</Id>
        <To>Andrew</To>
        <From>Maria</From>
    </Note>
</NoteSet>

and the desired output of the script should be:

001    <Note><Id>001</Id><To>Thomas</To><From>Joana</From></Note>
002    <Note><Id>002</Id><To>John</To><From>Paula</From></Note>
003    <Note><Id>003</Id><To>Andrew</To><From>Maria</From></Note>

This should do what you want:

#!/usr/bin/perl
use warnings;
use strict;

my $FNI = shift;
my $FNO = "$FNI.dat";

open my $OUT, '>', $FNO or die "Cannot open '$FNO' $!";
open my $IN,  '<', $FNI or die "Cannot open '$FNI' $!";

my ( $id, $line );
while ( <$IN> ) {
    if ( m!<Note>! .. m!</Note>! ) {
        ( $id, $line ) = ( $1, '' ) if m!<Id>(\d+)</Id>!;
        s/\A\s+//;
        s/\s+\z//;
        tr/\t/ /s;   # more efficient than s/\t+/ /g
        $line .= $_ if /Id|To|From/;
        print $OUT "$id\t$line\n" if m!/Note!;
        }
    }

close $IN;
close $OUT;



But I can't figure out why the script below omits the last record in the input dataset, e.g.:

Your second while loop is eating up the third record without outputting anything.



John
