Hello:
        I'm trying to build a routine to split a string into fields on a 
specified delimiter.  The string format is pretty close to CSV, except that 
quoted substrings can appear within an unquoted string, escaped quotes can 
exist within quoted strings, and the delimiter might appear within a quoted 
string (as in CSV).  Specifically, it's for splitting recipient lists from 
e-mail "To:" headers.  For example:

"LastName, FirstName" <address>, "Name" <address>, <address>; FirstName LastName, address; "First \"nick\" Last" address

The above string should be split into:
        "LastName, FirstName" <address>
        "Name" <address>
        <address>
        FirstName LastName
        address
        "First \"nick\" Last" address

All unquoted surrounding whitespace should be removed.  So far I have this:

# modified from the Perl Cookbook
push(@list, $+)
    while $text =~ 
/\s*("[^\"\\]*(?:\\.[^\"\\]*)*"(?:\s+[^;,]+))\s*[;,]?\s*|\s*([^;,]+)\s*[;,]?\s*|[;,]\s*/g;
push(@list, undef) if (substr($text, -1, 1) =~ /[;,]/);


But the matches seem to be too greedy: each field keeps the trailing 
whitespace before its delimiter.  Can someone offer a better solution?
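
One possible direction, offered only as a sketch: instead of capturing whole 
fields with a single alternation, walk the string token by token with \G.  
Quoted strings, backslash-escaped pairs, and ordinary characters are appended 
to the current field, and an unquoted ',' or ';' closes it.  Whitespace is 
trimmed once per field after the split, so greediness stops mattering.  The 
name split_recipients is hypothetical.  The sketch assumes an unterminated 
quote should be treated as a literal character, and it returns empty fields as 
empty strings rather than undef.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the Perl Cookbook): split a recipient list on
# unquoted ',' or ';', leaving quoted substrings and backslash escapes intact.
sub split_recipients {
    my ($text) = @_;
    my @list;
    my $field = '';

    # Each iteration consumes exactly one token, starting where the last ended.
    while ($text =~ /\G(?: ( "(?:[^"\\]|\\.)*"   # a complete quoted string
                           | \\.                 # a backslash-escaped character
                           | [^;,]               # any other non-delimiter character
                           )
                         | [;,]                  # an unquoted delimiter
                         )/gsx) {
        if (defined $1) {
            $field .= $1;          # still inside the current field
        }
        else {
            push @list, $field;    # delimiter seen: close the field
            $field = '';
        }
    }
    push @list, $field;            # last field (empty after a trailing delimiter)

    # Trim surrounding whitespace; quoted text is protected by its own quotes.
    for (@list) {
        s/^\s+//;
        s/\s+$//;
    }
    return @list;
}
```

Usage would look something like this:

```perl
print "$_\n" for split_recipients($text);
```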

NOTE:  I want it to be as generic as possible, since I cannot expect the 
elements in the list to follow strict guidelines (there are too many broken 
programs out there and too many idiots!)

        Thanks!
        dZ.

-- 
Those whom the gods would destroy, they first make mad.
