Quoting is part of standard CSV, so Ostermiller will take care of it.
But you don't need to loop over every character.  Once you have your
array, you start combining when you find an item that begins with a
quote, and you stop combining when you find an item that ends with a
quote.  In my example, item 2 both starts and ends with a quote, so it
stands on its own.  Item 3 starts with a quote and item 5 ends with a
quote, which is why you combine 3-5.  The quotes in item 4 are escaped, but
you probably don't have any of those (at least your example didn't
illustrate any).
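
Here's a rough sketch of that combining step. It's in Python rather than CFML just to keep it short, and the function name combine_quoted is made up for illustration. It assumes there are no escaped quotes inside fields and that an opening quote comes immediately after the delimiter, which matches the sample rows you posted.

```python
def combine_quoted(line, delimiter=","):
    """Split on the delimiter, then rejoin the pieces of quoted fields.

    Assumption: no escaped quotes inside a field (not handled here).
    """
    fields = []
    pending = []  # pieces of a quoted field whose closing quote is not seen yet
    for item in line.split(delimiter):
        if pending:
            # We are inside a quoted field: keep collecting pieces.
            pending.append(item)
            if item.endswith('"'):
                # Closing quote found: rejoin and drop the outer quotes.
                fields.append(delimiter.join(pending)[1:-1])
                pending = []
        elif item.startswith('"'):
            if len(item) > 1 and item.endswith('"'):
                # The whole quoted field fits in one item.
                fields.append(item[1:-1])
            else:
                pending.append(item)
        else:
            fields.append(item)
    if pending:
        raise ValueError("unterminated quoted field")
    return fields


row = '3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25'
print(combine_quoted(row))
```

The loop runs once per array item, not once per character, so the only state you carry is the list of pieces for the field that is currently open.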

cheers,
barneyb

On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector
<vec...@mostdeadlygame.com> wrote:
>
> On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert <bboisv...@gmail.com> wrote:
>>
>> Use a CSV parsing library, rather than rolling your own.  They take
>> care of all that stuff for you.  I've used
>> http://ostermiller.org/utils/CSV.html in the past.
>
> I took a look at that and didn't see anything that would be of help
> since the data isn't in a standard format, as far as I can see.
>
>> If you really want to parse it yourself, you can use listToArray, and
>> then iterate over the array and combine items that are quoted.  For
>> example take this line:
>>
>> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"
>>
>> When you listToArray it, you'll get this:
>>
>> [
>>  '1',
>>  '"barney"',
>>  '"boisvert',
>>  ' \"crazy man\"',
>>  ' barney"',
>>  '"1234 Main Street',
>>  ' Apt 5"'
>> ]
>>
>> Note the position of the double quotes in the items.  What you need to
>> do is find items that START with a double quote, and then combine them
>> with the following elements until you find an element that ENDS with a
>> double quote.  You'll need to handle the case when the current item is
>> the whole string (in the case of the second element), and when the
>> ending quote is escaped (the fourth item).
>>
>> In this particular case you need to remove the quotes from item 2
>> (since it's a whole string to itself), and you need to combine items
>> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote).  The quotes
>> in item 4 are escaped, so they should be ignored for combination, and
>> then the backslashes should be removed after the fact.  Note that
>> you'll need to deal with double-escaping.  As you can see, it's a
>> mess, so I'd highly recommend the third-party library.  ;)
>
> See, the issue is that there are no escaped quotes to show it's part of
> the field.
>
> 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25
> 3270,5101650,Phillip Vector , 34 ," 3,161.00
> ",92.97,79.25,61.76,17.50
> 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25
>
> So I'm not sure how to determine the end quote. I suppose I can set a
> flag if I have started with a quote and unset the flag when I
> encounter another quote... But I was hoping I didn't have to loop over
> every character to do it.
>
> 

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:329319