Re: [U2] [UD] Extract a line with a CR and/or LF character in it.

Ed Clark Thu, 26 May 2011 10:55:43 -0700

I was only suggesting it for this one particular case, where you were already 
using READSEQ, know that the fields are quoted, and have non-line-break carriage 
returns defeating READSEQ :) In general, I use READBLK and parse my own line 
breaks; that is legacy code that goes back to reading fixed blocks of data from 
tapes.
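
To make the quote-counting idea concrete, here is a minimal sketch. It is written in Python rather than UniBasic purely for illustration, and the function name `join_quoted_lines` and its `joiner` parameter are made up for this example. It assumes each element of `lines` is what one READSEQ-style read would return (no trailing CR/LF) and that quotes inside a field are doubled, so every complete record holds an even number of quote characters.

```python
def join_quoted_lines(lines, joiner=" "):
    """Merge physical lines into logical CSV records using quote parity.

    A physical line is appended to the pending record until the running
    count of double quotes is even, meaning every quoted field is closed.
    """
    records = []
    pending = None
    for line in lines:
        # Start a new record, or glue this line onto the unfinished one.
        pending = line if pending is None else pending + joiner + line
        if pending.count('"') % 2 == 0:
            records.append(pending)
            pending = None
    if pending is not None:
        # Input ended inside a quoted field: keep what was read.
        records.append(pending)
    return records


if __name__ == "__main__":
    sample = [
        '0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr',
        'San Diego, CA 92122",,,$150.00,,,,,',
    ]
    for record in join_quoted_lines(sample, joiner=", "):
        print(record)
```

Passing `joiner=chr(253)` would insert a value mark between the joined pieces instead of a space or comma.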

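For the block-reading approach, a rough sketch of parsing your own line breaks could look like the following. Again this is Python for illustration only; `split_records` and `quote_on` are hypothetical names (the flag mirrors the 'QuoteOn' variable mentioned further down the thread). It assumes the whole text is already in memory, so a real fixed-block reader would still have to carry a partial record over from one block to the next.

```python
def split_records(block):
    """Split raw CSV text into records, keeping CR/LF inside quotes as data."""
    records = []
    current = []
    quote_on = False  # True while the scanner is inside a quoted field
    i = 0
    while i < len(block):
        ch = block[i]
        if ch == '"':
            # A doubled quote toggles twice, so the state stays correct.
            quote_on = not quote_on
            current.append(ch)
        elif ch in "\r\n" and not quote_on:
            # Treat a CR immediately followed by LF as one line break.
            if ch == "\r" and block[i + 1:i + 2] == "\n":
                i += 1
            records.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    if current:
        # Last record had no trailing line break.
        records.append("".join(current))
    return records


if __name__ == "__main__":
    raw = 'a,"line one\r\nline two",b\r\nc,d\r\n'
    for record in split_records(raw):
        print(repr(record))
```

Unlike the quote-counting version, this keeps the embedded CR/LF in the field, so the caller decides what to replace it with.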

On May 26, 2011, at 12:36 PM, Bill Haskett wrote:

> Ed:
> 
> Actually, it's pretty easy to parse a CSV file.  One just has to do it one 
> character at a time.  The problem with the usual BASIC statements, READSEQ or 
> REMOVE, is that they don't read the entire line, just the part up to the 
> CR/LF.  The second read returns the balance of the line (or the part up to 
> the next CR and/or LF).
> 
> That's why I had to figure out a way to ensure the multiple lines were joined 
> together if, and only if, there was a CR/LF embedded in a quoted field.
> 
> By the way, the wiki has some code to parse CSV files.
> 
> Bill
> 
> ------------------------------------------------------------------------
> ----- Original Message -----
> *From:* [email protected]
> *To:* U2 Users List <[email protected]>
> *Date:* 5/26/2011 6:48 AM
> *Subject:* Re: [U2] [UD] Extract a line with a CR and/or LF character in it.
>> Just an idea I haven't thought about too deeply:
>> Use READSEQ to read a line, then use the COUNT() function to count the 
>> quotes. If there is an odd number of quotes (count mod 2 = 1), then add a 
>> value mark, read another line, and append it. Loop until you have an even 
>> number of quotes (because there might be more than one "multivalued" field 
>> in the record), at which point you have the entire line.
>> 
>> On May 26, 2011, at 2:57 AM, Bill Haskett wrote:
>> 
>>> I figured out how to do this.  I read each line and use a subroutine to go 
>>> through each character.  It sets a variable 'QuoteOn' if we're in a quoted 
>>> string.  Obviously if the line ends while in a quoted string, the next line 
>>> belongs to the current line.  Man, what a pain this was!  :-)
>>> 
>>> Thanks for your thoughts and help.
>>> 
>>> Bill
>>> 
>>> ------------------------------------------------------------------------
>>> ----- Original Message -----
>>> *From:* [email protected]
>>> *To:* [email protected]
>>> *Date:* 5/25/2011 10:13 PM
>>> *Subject:* Re: [U2] [UD] Extract a line with a CR and/or LF character in it.
>>>> It's been a while - but I'm pretty sure that OSBREAD keeps the CR/LF as 
>>>> part of the block (you may need to put NO CONVERT ON in the code). READSEQ 
>>>> automatically ends at the CR/LF so you would have to "put the lines 
>>>> together" if you were short fields.
>>>> 
>>>> In both cases it would mean going through the block/line a character at a 
>>>> time to parse out each field. Of course, to work with embedded quotes and 
>>>> commas you pretty much have to anyway. With READSEQ you know the line 
>>>> ended on a CR/LF - you just need to figure out if it's the end of the 
>>>> record or not.
>>>> 
>>>> Does that make more sense?
>>>> 
>>>> Hth
>>>> Colin Alfke
>>>> Calgary, Canada
>>>> 
>>>> 
>>>>> From: wphaskett
>>>>> 
>>>>> I guess that's my problem. I can't use OSBREAD because the CR/LF
>>>>> appears in different columns in the line. I can't guarantee where it
>>>>> shows up (or what character position). Using READSEQ doesn't work
>>>>> either because the line read by the statement is only a part of the
>>>>> entire line in the file! e.g.
>>>>> 
>>>>> 0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr
>>>>> San Diego, CA 92122",,,$150.00,,,,,
>>>>> 0,4300,1CYN,John Randolph,Bones,,,1 Round Ct,,63031,"1 Round Ct
>>>>> San Diego, CA 92122",,,$150.00,,,,,
>>>>> 
>>>>> ...when the lines should look like (only two lines):
>>>>> 
>>>>> 0,4300,1BEU,Robert,Smith,Julie,Smith,1 Lakewood Dr,,63031,"1 Lakewood Dr, San Diego, CA 92122",,,$150.00,,,,,
>>>>> 0,4300,1CYN,John Randolph,Bones,,,1 Round Ct,,63031,"1 Round Ct, San Diego, CA 92122",,,$150.00,,,,,
>>>>> 
>>>>> There's no guarantee the field causing the problem will even have any
>>>>> data in it, so I can't append every 2nd line to the end of every 1st
>>>>> line. :-(
>>>>> 
>>>>> Once I get the line I can deal with each character at a time. Any other
>>>>> ideas?
>>>>> 
>>>>> As always, thanks.
>>>>> 
>>>>> Bill
>>> _______________________________________________
>>> U2-Users mailing list
>>> [email protected]
>>> http://listserver.u2ug.org/mailman/listinfo/u2-users
