On Tue, Feb 10, 2004, Oleg Goldshmidt wrote about "Re: Regexps":
> In general, handling string literals with regexps is not trivial,
> because you need to take into account escaped ", as in
>
> "foo \"sna fu\" bar"
This may not be relevant for his situation. One situation in which I once
used a similar trick to the one I posted earlier was in breaking up a
"CSV" - comma separated values. In a CSV, the comma is the field
separator (rather than the space in the poster's question), so a record might
look like
one,two,three,four,five
Now, the convention is that if field 'two' is to be replaced by something
containing a comma, say '1,2,3', the field is quoted with double-quotes:
one,"1,2,3",three,four,five
And you're supposed to split this record up on commas that are not inside
quotes.
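A minimal sketch of such a split, in Python (the name split_record is mine,
just for illustration, and it assumes a record is a single line). It tracks
whether we are currently inside quotes and only splits on commas seen
outside them:

```python
def split_record(record):
    """Split one CSV record on commas that are not inside double quotes."""
    fields = []
    current = []
    in_quotes = False
    for ch in record:
        if ch == '"':
            in_quotes = not in_quotes  # every quote flips the state
        if ch == ',' and not in_quotes:
            # A comma outside quotes ends the current field.
            fields.append(''.join(current))
            current = []
        else:
            current.append(ch)
    fields.append(''.join(current))  # the last field has no trailing comma
    return fields


print(split_record('one,"1,2,3",three,four,five'))
# prints ['one', '"1,2,3"', 'three', 'four', 'five']
```

Note that this only splits; the surrounding quotes are still part of the
second field and have to be stripped separately.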
What happens if there are quotes in one of the fields? Each double-quote is
replaced by two of them. This keeps the number of quotes even (quote
parity), so exactly the same method of splitting on commas still works,
and the reverse transformation is easy.
For example,
one,"1,2,3",three,he said ""hello!"",five
or
one,"1,2,3",three,"he said ""hi, man!""",five
In the last line you know you shouldn't separate on the comma before 'man'
because it has an odd number of quotes before (or after) it. Nice and simple :)
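The doubling and its reverse could be sketched like this in Python. The
function names are hypothetical, and I'm assuming a field gets wrapped in
quotes whenever it contains a comma or a quote, as in the second example
above:

```python
def quote_field(field):
    """Encode one field: double any quotes, wrap in quotes if needed."""
    if ',' in field or '"' in field:
        return '"' + field.replace('"', '""') + '"'
    return field


def unquote_field(field):
    """Reverse transformation: strip wrapping quotes, undouble the rest."""
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        field = field[1:-1]
    return field.replace('""', '"')


fields = ['one', '1,2,3', 'three', 'he said "hi, man!"', 'five']
record = ','.join(quote_field(f) for f in fields)
print(record)
# prints one,"1,2,3",three,"he said ""hi, man!""",five

# Count the quotes before the comma that precedes 'man': an odd count
# means the comma is inside quotes and must not be split on.
comma = record.index(', man')
print(record.count('"', 0, comma))
# prints 5
```

The decoder also handles a field like the one in the first example, where
the quotes are doubled but the field itself is not wrapped.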
At least, that is what I remember. Sadly, the Wikipedia entry on CSV is
nonexistent, so I'm using my memory as the source ;)
Anyway, CSV is a simple record/field representation method, but it is very
rarely used in Unix (it is more common in the Windows world). Tab-separated
fields are, justifiably, much more common - they are easier to use and usually
enough (and if you need tabs, separate the fields with some other character).
--
Nadav Har'El | Tuesday, Feb 10 2004, 19 Shevat 5764
[EMAIL PROTECTED] |-----------------------------------------
Phone: +972-53-790466, ICQ 13349191 |A messy desk is a sign of a messy mind.
http://nadav.harel.org.il |An empty desk is a sign of an empty mind.