I was only suggesting it for this one particular case, where you are already using READSEQ, know that you have the quoting, and have non-line-break carriage returns defeating READSEQ :) In general, I use READBLK and parse my own line breaks (legacy code that goes back to reading fixed blocks of data from tapes).
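For what it's worth, here is a rough sketch of the quote-counting idea, written in Python rather than UniBasic so it can be run anywhere. The function name join_quoted_lines and the sep parameter are made up for illustration; the list of lines stands in for successive READSEQ results. It assumes quotes inside a field are doubled (""), which does not change the odd/even test.

```python
VALUE_MARK = chr(253)  # the U2 value mark, CHAR(253)


def join_quoted_lines(lines, sep=VALUE_MARK):
    """Join physical lines into logical records using quote parity.

    `lines` stands in for successive READSEQ results (hypothetical input).
    A record is complete once it holds an even number of double quotes.
    """
    records = []
    pending = None
    for line in lines:
        # Append the new physical line to whatever is still open.
        pending = line if pending is None else pending + sep + line
        if pending.count('"') % 2 == 0:
            records.append(pending)
            pending = None
    if pending is not None:
        # Input ended inside a quoted field; keep what we have.
        records.append(pending)
    return records


# Sample data taken from Bill's original post.
sample = [
    '0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr',
    'San Diego, CA 92122",,,$150.00,,,,,',
    '0,4300,1CYN,John Randolph,Bones,,,1 Round Ct,,63031,"1 Round Ct',
    'San Diego, CA 92122",,,$150.00,,,,,',
]
records = join_quoted_lines(sample, sep=", ")
```

Passing sep=", " gives the two joined lines Bill was after; leaving the default puts a value mark where the embedded line break was.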
On May 26, 2011, at 12:36 PM, Bill Haskett wrote:
> Ed:
>
> Actually, it's pretty easy to parse a CSV file. One just has to do it one
> character at a time. The problem with the usual BASIC statements, READSEQ or
> REMOVE is they don't read the entire line, just the part up to the CR/LF.
> The second read reads the balance of the line (or until another CR and/or LF
> is encountered).
>
> That's why I had to figure out a way to ensure the multiple lines were joined
> together if, and only if, there was a CR/LF embedded in a quoted field.
>
> By the way, the wiki has some code to parse CSV files.
>
> Bill
>
> ------------------------------------------------------------------------
> ----- Original Message -----
> *From:* [email protected]
> *To:* U2 Users List <[email protected]>
> *Date:* 5/26/2011 6:48 AM
> *Subject:* Re: [U2] [UD] Extract a line with a CR and/or LF character in it.
>> just an idea I haven't thought about too deeply:
>> Use readseq to read a line, then use the COUNT() function to count the
>> quotes. If there are an odd number of quotes (mod(2)=1) then add a value
>> mark and read and append another line. Loop until you have an even number of
>> quotes (because there might be more than one "multivalued" field in the
>> record), at which point you have the entire line.
>>
>> On May 26, 2011, at 2:57 AM, Bill Haskett wrote:
>>
>>> I figured out how to do this. I read each line and use a subroutine to go
>>> through each character. It sets a variable 'QuoteOn' if we're in a quoted
>>> string. Obviously if the line ends while in a quoted string, the next line
>>> belongs to the current line. Man, what a pain this was! :-)
>>>
>>> Thanks for your thoughts and help.
>>>
>>> Bill
>>>
>>> ------------------------------------------------------------------------
>>> ----- Original Message -----
>>> *From:* [email protected]
>>> *To:* [email protected]
>>> *Date:* 5/25/2011 10:13 PM
>>> *Subject:* Re: [U2] [UD] Extract a line with a CR and/or LF character in it.
>>>> It's been a while - but I'm pretty sure that OSBREAD keeps the CR/LF as
>>>> part of the block (you may need to put NO CONVERT ON in the code). READSEQ
>>>> automatically ends at the CR/LF so you would have to "put the lines
>>>> together" if you were short fields.
>>>>
>>>> In both cases it would mean going through the block/line a character at a
>>>> time to parse out each field. Of course, to work with embedded quotes and
>>>> commas you pretty much have to any way. With READSEQ you know the line
>>>> ended on a CRLF - you just need to figure out if it's the end of the
>>>> record or not.
>>>>
>>>> Does that make more sense?
>>>>
>>>> Hht
>>>> Colin Alfke
>>>> Calgary, Canada
>>>>
>>>>
>>>>> From: wphaskett
>>>>>
>>>>> I guess that's my problem. I can't use OSBREAD because the Cr/Lf
>>>>> appears in different columns in the line. I can't guarantee where it
>>>>> shows up (or what character position). Using READSEQ doesn't work
>>>>> either because the line read by the statement is only a part of the
>>>>> entire line in the file! e.g.
>>>>>
>>>>> 0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr
>>>>> San Diego, CA 92122",,,$150.00,,,,,
>>>>> 0,4300,1CYN,John Randolph,Bones,,,1 Round Ct,,63031,"1 Round Ct
>>>>> San Diego, CA 92122",,,$150.00,,,,,
>>>>>
>>>>> ...when the lines should look like (only two lines):
>>>>>
>>>>> 0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr, San Diego, CA 92122",,,$150.00,,,,,
>>>>> 0,4300,1CYN,John Randolph,Bones,,,1 Round Ct,,63031,"1 Round Ct, San Diego, CA 92122",,,$150.00,,,,,
>>>>>
>>>>> There's no guarantee the field causing the problem will even have any
>>>>> data in it, so I can't append every 2nd line to the end of every 1st
>>>>> line. :-(
>>>>>
>>>>> Once I get the line I can deal with each character at a time. Any other
>>>>> ideas?
>>>>>
>>>>> As always, thanks.
>>>>>
>>>>> Bill
>>> _______________________________________________
>>> U2-Users mailing list
>>> [email protected]
>>> http://listserver.u2ug.org/mailman/listinfo/u2-users
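
For the read-a-block-and-find-your-own-line-breaks route discussed in the thread, a character-at-a-time scan with a quote flag might look roughly like the following. It is a Python sketch, not UniBasic, and makes several assumptions: the whole file is already in one string (as a block read would give you), split_records and embedded_sep are made-up names, and quote_on plays the role of the 'QuoteOn' variable Bill describes. It does not deal with a CR/LF pair split across two separate block reads.

```python
def split_records(block, embedded_sep=" "):
    """Split a raw text block into records, one character at a time.

    A CR, LF, or CR LF pair ends a record only when we are outside a
    quoted field; inside quotes it is replaced by `embedded_sep`.
    """
    records = []
    current = []
    quote_on = False  # True while inside a quoted field
    i = 0
    while i < len(block):
        ch = block[i]
        if ch == '"':
            # A doubled quote ("") toggles twice, leaving the state unchanged.
            quote_on = not quote_on
            current.append(ch)
        elif ch in "\r\n":
            # Treat CR LF as a single line break.
            if ch == "\r" and i + 1 < len(block) and block[i + 1] == "\n":
                i += 1
            if quote_on:
                current.append(embedded_sep)
            else:
                records.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        # Last record had no trailing line break.
        records.append("".join(current))
    return records


block = 'a,"x\r\ny",b\r\nc,d\r\n'
parsed = split_records(block)
```

Here parsed holds two records, with the line break inside the quoted field replaced by a space. Field splitting on commas can then reuse the same quote flag.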
