Brad,

Here is a line-by-line description of what it does:

sub extractDelimited {
    my( $line ) = @_;
> save the first passed parameter in $line

    if( $line !~ /^"/ ) {
> if the line doesn't start with a double quote
        return( undef, $line );
> return an undefined value along with the original string
    }

    my $val;
    my $remainder;
> not used - can be removed

    $line =~ /[^\\]"/g;
> find the first instance of a double quote which is not directly
> preceded by a backslash. The backslash is used to introduce a
> metacharacter, so a pair of them is a regular backslash

    my $pos = pos($line);
> save the position in the line one character after the end of the
> pattern match from above. Returns undef if the pattern match fails

    print $pos, '/', length( $line ), "\n";
> debug statement - can be removed

    if( defined( $pos )) {
> found a match
        return( substr( $line, 0, $pos ), substr( $line, $pos ));
> return two strings: 1) the part of the line up to and including the
> double quote; 2) the remainder of the line
    }
    return( undef, $line );
> did not find a match, so return undef and the original string
}

The line in question:

    $line =~ /[^\\]"/g;

It searches for a double quote where the preceding character is not a backslash. The pattern is not anchored; it matches at the first place in the line where it can. The character class has to match a real character, so the opening quote at the very start of the line cannot be the quote that is found, and the match lands on the closing quote. Note that it takes two backslashes because the first one introduces a metacharacter and the second one says just make it a backslash. The sequence "\n" is used frequently in Perl as a newline (a line feed on Unix).

As the line is walked through:

$line - the variable containing the data to be used
=~    - binding operator: match the left operand against the pattern on the right
/     - start of pattern (the preceding "m" is not required if using "/")
[     - start of character set (find one character from the following set)
^     - negates the character set if first in the set (all but these characters)
\\    - a backslash (two are needed because a single backslash introduces a metacharacter)
]     - end of character set
"     - the double quote I'm looking for
/     - end of pattern
g     - global-match modifier. In scalar context it records where the match ended, and that is the value pos($line) returns, so it has to stay even though there is no repeated match as in Text::Balanced::extract_delimited

Neither $remainder nor $val is used.
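The behaviour of that pattern can be checked with a small sketch. This is not code from the thread: the helper name end_of_match is hypothetical, and it only runs the same pattern with and without the g modifier and reports pos() afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply the pattern from
# extractDelimited to a copy of the line and return pos() afterwards.
sub end_of_match {
    my ( $line, $use_g ) = @_;
    my $matched = $use_g
        ? scalar( $line =~ /[^\\]"/g )   # scalar-context //g records pos()
        : scalar( $line =~ /[^\\]"/ );   # a plain match leaves pos() undef
    return pos($line);
}

# end_of_match( '"abc" 123', 1 )  returns 5     (just past the closing quote)
# end_of_match( '"abc" 123', 0 )  returns undef (no g, so pos() is never set)
# end_of_match( '"a\"b" x',  1 )  returns 6     (the escaped quote is skipped)
# end_of_match( '"a\\\\" x', 1 )  returns undef (field ends in an escaped backslash)
```

The last case is an assumption about the data rather than something reported in the thread: if a field can end in an escaped backslash, the closing quote is directly preceded by a backslash and the pattern does not find it.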
I had originally set $val and $remainder to be the first and second return parameters, but changed to just send the data back directly.

Not knowing the level of Perl you are at, the "Programming Perl" book by O'Reilly is an excellent reference. I've got the second edition and chapter 2 goes into minute detail about regular expressions.

HTH,
Mark Vaughan
TTS Development
Comcast Cable Corporation
720.268.8591

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Brad O'Hara
Sent: Wednesday, January 23, 2008 7:15 AM
To: ARSperl User Discussion
Subject: Re: [Arsperl-users] Parse arx file

Mark,

AWESOME! Works great. If it is not too much trouble, could I ask how this works? I assume $remainder is not used :-) What does

    $line =~ /[^\\]"/g;

anchor to?

Thanks again!
Brad

On Tuesday 22 January 2008 5:41 pm, Vaughan, Mark wrote:
> Brad,
> I was able to successfully run this on Windows using ActiveState Perl.
>
> I did get a seg fault when running this on a Solaris 9 box using Perl
> 5.8.5. I followed through the code and
> Text::Balanced::extract_delimited, line 135, provides the seg fault.
> In translating the variables, here is the pattern match that is failing:
> $$textref =~ m/\G(\s*)((?:\"(?:[^\\\"]*(?:\\.[^\\\"]*)*)\"))/gc
>
> The $textref variable is a reference to the scalar holding the input
> line. Only guessing here, but I wonder if the pattern match is running
> out of memory.
>
> Here is a sample routine which gives the desired response instead of
> using Text::Balanced::extract_delimited.
> It runs fine on my Sun box using the input records you gave me:
>
> sub extractDelimited {
>     my( $line ) = @_;
>
>     if( $line !~ /^"/ ) {
>         return( undef, $line );
>     }
>
>     my $val;
>     my $remainder;
>
>     $line =~ /[^\\]"/g;
>     my $pos = pos($line);
>     print $pos, '/', length( $line ), "\n";
>     if( defined( $pos )) {
>         return( substr( $line, 0, $pos ), substr( $line, $pos ));
>     }
>     return( undef, $line );
> }
>
> Try calling this routine instead of Text::Balanced::extract_delimited.
> Also, the value in question is $ret[10], not $ret[9].
>
> HTH,
> Mark Vaughan
> TTS Development
> Comcast Cable Corporation
> 720.268.8591
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Brad O'Hara
> Sent: Sunday, January 20, 2008 6:08 AM
> To: ARSperl User Discussion
> Subject: Re: [Arsperl-users] Parse arx file
>
> Mark,
>
> Thanks for the help. Here is the code I am using that seg faults
> when I try it.
> #!/usr/bin/perl
> use Text::Balanced qw( extract_delimited );
>
> $line = `/usr/bin/tail -1 /home/arsrv/bin/perlmods/help.txt`;
>
> while ( length($line) ) {
>     if( substr($line,0,1) eq '"' ) {
>         ( $val, $line ) = extract_delimited( $line, '"' );
>         last if defined($line) && !defined($val);
>         $val =~ s/^"//;
>         $val =~ s/"$//;
>
>         $val =~ s/\\\\/\\/g;
>         $val =~ s/\\r//g;
>         $val =~ s/\\n/\n/g;
>         $val =~ s/\\"/"/g;
>     }
>     else {
>         $line =~ s/(\S+)//;
>         $val = $1;
>     }
>     push @ret, $val;
>     $line =~ s/\s*//;
> }
>
> print ">$ret[9]<\n";
> exit;
>
> I tried a new export with the same result. I placed a new arx
> file at the address I gave you.
> It contains the record before the problem record as well as headers.
>
> Thanks again,
>
> Brad
>
> On Jan 20, 2008, at 12:26 AM, Mark Vaughan wrote:
>
> > Brad,
> > I took a look at your file.
> >
> > The first data record reads okay.
> >
> > The second data record reads okay in that the routine does not die.
> > The problem is the tenth field (Scheduled Date, ID 536870921, type
> > DATE).
> > It is a 1,039,967 byte (after translation) character string. It is
> > not a date field by any means.
> >
> > You may wish to regenerate your ARX file.
> >
> > HTH,
> > Mark Vaughan
> > 303.471.9987 (home)
> > 303.601.4434 (mobile)
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Brad O'Hara
> > Sent: Saturday, January 19, 2008 6:28 PM
> > To: ARSperl User Discussion
> > Subject: Re: [Arsperl-users] Parse arx file
> >
> > Mark,
> >
> > It is a very large record. I placed it at
> > http://remedy.cns.ufl.edu/arsperl
> > Any help greatly appreciated.
> >
> > Brad
> >
> > On Jan 19, 2008, at 12:03 PM, Mark Vaughan wrote:
> >
> >> What does the record look like?
> >> Please send an ARX file with the header, one good record, and the
> >> record that does not process.
> >>
> >> Thanks,
> >> Mark Vaughan
> >> 303.471.9987 (home)
> >> 303.601.4434 (mobile)
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED]
> >> [mailto:[EMAIL PROTECTED] On Behalf Of Brad O'Hara
> >> Sent: Friday, January 18, 2008 8:29 AM
> >> To: ARSperl User Discussion
> >> Subject: Re: [Arsperl-users] Parse arx file
> >>
> >> Thilo,
> >>
> >> I've come upon a record that causes the following code to seg
> >> fault. Do you have any interest in taking a look?
> >>
> >> Thanks,
> >> Brad
> >>
> >> On Tuesday 13 November 2007 4:35 pm, Brad O'Hara wrote:
> >>> Thanks! Certainly something I can work with!
> >>>
> >>> Brad
> >>>
> >>> On Tuesday 13 November 2007 3:01 pm, Thilo Stapff wrote:
> >>>> Here's a subroutine that splits a line in ARX format to an array of
> >>>> values. It works for the basic data types such as character,
> >>>> integer etc.
> >>>> More complicated data like attachments, diary and currency would of
> >>>> course need some further processing.
> >>>>
> >>>>
> >>>> use Text::Balanced qw( extract_delimited );
> >>>>
> >>>>
> >>>> sub splitArxLine {
> >>>>     my( $line ) = @_;
> >>>>     my @ret;
> >>>>     my $val;
> >>>>
> >>>>     print "--------------------\n";
> >>>>     while( length($line) ){
> >>>>         if( substr($line,0,1) eq '"' ){
> >>>>             ( $val, $line ) = extract_delimited( $line, '"' );
> >>>>             last if defined($line) && !defined($val);
> >>>>             $val =~ s/^"//;
> >>>>             $val =~ s/"$//;
> >>>>
> >>>>             $val =~ s/\\\\/\\/g;
> >>>>             $val =~ s/\\r//g;
> >>>>             $val =~ s/\\n/\n/g;
> >>>>             $val =~ s/\\"/"/g;
> >>>>
> >>>>             print "<$val>\n";
> >>>>
> >>>>         }else{
> >>>>             $line =~ s/(\S+)//;
> >>>>             $val = $1;
> >>>>         }
> >>>>         push @ret, $val;
> >>>>         $line =~ s/\s*//;
> >>>>     }
> >>>>
> >>>>     return @ret;
> >>>> }
> >>>>
> >>>>
> >>>>
> >>>> Regards,
> >>>> Thilo
> >>>>
> >>>>
> >>>>
> >>>> Brad O'Hara wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> Anyone written a routine to parse a .arx file?
> >>>>>
> >>>>> Thanks,
> >>>>> Brad
> >>>>
> >>>>
> >>>>
> >>>> -------------------------------------------------------------------------
> >>>> This SF.net email is sponsored by: Splunk Inc.
> >>>> Still grepping through log files to find problems? Stop.
> >>>> Now Search log events and configuration files using AJAX and a
> >>>> browser.
> >>>> Download your FREE copy of Splunk now >> http://get.splunk.com/
> >>>> _______________________________________________
> >>>> Arsperl-users mailing list
> >>>> Arsperl-users@arsperl.org
> >>>> https://lists.sourceforge.net/lists/listinfo/arsperl-users
> >>>>
> >>>>
> >>>
> >>
> >> --
> >> Brad O'Hara                          E-mail: [EMAIL PROTECTED]
> >> IT Expert                            Voice: (352)392-2061
> >> Computing and Networking Services    Suncom: 622-2061
> >> University of Florida                Fax: (352)392-9440
> >>
> >> -------------------------------------------------------------------------
> >> This SF.net email is sponsored by: Microsoft
> >> Defy all challenges. Microsoft(R) Visual Studio 2008.
> >> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> >> _______________________________________________
> >> Arsperl-users mailing list
> >> Arsperl-users@arsperl.org
> >> https://lists.sourceforge.net/lists/listinfo/arsperl-users

--
Brad O'Hara                          E-mail: [EMAIL PROTECTED]
IT Expert                            Voice: (352)392-2061
Computing and Networking Services    Suncom: 622-2061
University of Florida                Fax: (352)392-9440
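
Pulling the thread together, here is a minimal sketch of the ARX line splitter without Text::Balanced. It is not code posted by anyone on the list: the names extract_quoted and split_arx_line are hypothetical, and the escape rules (\\, \", \n, and \r dropped) are assumed from the substitutions in Thilo's routine rather than taken from an ARX specification. The quoted field is scanned in small steps with simple patterns, on the assumption that this keeps very large fields away from the nested pattern that seg faulted.

```perl
use strict;
use warnings;

# Hypothetical stand-in for extract_delimited: returns (quoted field, rest),
# or (undef, original line) when there is no terminated leading "..." field.
sub extract_quoted {
    my ($line) = @_;
    return ( undef, $line ) unless substr( $line, 0, 1 ) eq '"';

    pos($line) = 1;                       # start just after the opening quote
    while (1) {
        $line =~ /\G[^"\\]+/gc;           # skip a run of ordinary characters
        next if $line =~ /\G\\./gcs;      # skip a backslash and the character it escapes
        last unless $line =~ /\G"/gc;     # not a closing quote: field is unterminated
        my $end = pos($line);
        return ( substr( $line, 0, $end ), substr( $line, $end ) );
    }
    return ( undef, $line );
}

# Hypothetical splitter modelled on splitArxLine from earlier in the thread.
sub split_arx_line {
    my ($line) = @_;
    my @ret;

    $line =~ s/^\s+//;
    while ( length $line ) {
        my $val;
        if ( substr( $line, 0, 1 ) eq '"' ) {
            ( $val, $line ) = extract_quoted($line);
            last unless defined $val;     # stop at an unterminated field
            $val = substr( $val, 1, -1 ); # drop the surrounding quotes
            # Unescape in one pass, so an escaped backslash followed by "n"
            # stays a backslash and an "n" instead of becoming a newline.
            $val =~ s/\\(.)/$1 eq 'n' ? "\n" : $1 eq 'r' ? '' : $1/ges;
        }
        else {
            $line =~ s/^(\S+)//;          # unquoted field: take the non-space run
            $val = $1;
        }
        push @ret, $val;
        $line =~ s/^\s+//;
    }
    return @ret;
}
```

Called as my @fields = split_arx_line($line), it returns the fields in order, as Thilo's routine does. The one-pass unescape is a deliberate difference from the four separate substitutions in the thread, which would turn an escaped backslash followed by "n" into a newline.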