Ok.. I see what you are saying... Thanks. :)

On Tue, Dec 22, 2009 at 9:26 AM, Barney Boisvert wrote:
>
> Quoting is part of standard CSV; Ostermiller will take care of it.
> But you don't need to loop over every character. Once you have your
> array, you start combining when you find an item that begins with a
> quote, and you stop combining when you find an item that ends with a
> quote. In my example, item 2 both starts and ends with a quote, so it
> stands on its own, and then item 3 starts with a quote and item 5 ends
> with a quote, which is why you combine 3-5. The quotes in item 4 are
> escaped, but you probably don't have any of those (at least your
> example didn't illustrate any).
>
> cheers,
> barneyb
>
> On Tue, Dec 22, 2009 at 9:21 AM, Phillip Vector wrote:
>>
>> On Tue, Dec 22, 2009 at 9:09 AM, Barney Boisvert wrote:
>>>
>>> Use a CSV parsing library, rather than rolling your own. They take
>>> care of all that stuff for you. I've used
>>> http://ostermiller.org/utils/CSV.html in the past.
>>
>> I took a look at that and didn't see anything that would be of help,
>> since the data isn't in a standard format as far as I can see.
>>
>>> If you really want to parse it yourself, you can use listToArray, and
>>> then iterate over the array and combine items that are quoted. For
>>> example, take this line:
>>>
>>> 1,"barney","boisvert, \"crazy man\", barney","1234 Main Street, Apt 5"
>>>
>>> When you listToArray it, you'll get this:
>>>
>>> [
>>>   '1',
>>>   '"barney"',
>>>   '"boisvert',
>>>   ' \"crazy man\"',
>>>   ' barney"',
>>>   '"1234 Main Street',
>>>   ' Apt 5"'
>>> ]
>>>
>>> Note the position of the double quotes in the items. What you need to
>>> do is find items that START with a double quote, and then combine them
>>> with the following elements until you find an element that ENDS with a
>>> double quote. You'll need to handle the case when the current item is
>>> the whole string (as with the second element), and when the
>>> ending quote is escaped (the fourth item).
>>>
>>> In this particular case you need to remove the quotes from item 2
>>> (since it's a whole string to itself), and you need to combine items
>>> 3, 4 and 5 (3 starts with a quote, 5 ends with a quote). The quotes
>>> in item 4 are escaped, so they should be ignored for combination, and
>>> then the backslashes should be removed after the fact. Note that
>>> you'll need to deal with double-escaping. As you can see, it's a
>>> mess, so I'd highly recommend the third-party library. ;)
>>
>> See, the issue is that there are no escaped quotes to show it's part
>> of the field.
>>
>> 3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25
>> 3270,5101650,Phillip Vector , 34 ," 3,161.00
>> ",92.97,79.25,61.76,17.50
>> 3270,5101650,"James P. Kardone JR., P.C. ", 0 , 0 ,0.00,0.00,9.25,-9.25
>>
>> So I'm not sure how to determine the end quote. I suppose I can set a
>> flag when I have started with a quote and unset the flag when I
>> encounter another quote... But I was hoping I didn't have to loop over
>> every character to do it.
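
For what it's worth, the split-then-combine approach Barney describes could be sketched roughly as below. This is Python rather than CFML, purely for illustration: `line.split(",")` stands in for listToArray, and the function names are made up. It assumes an opening quote sits directly after a comma, a closing quote directly before one, and that there are no escaped quotes (as in the sample rows). It also assumes the break in the second sample row is just mail-client wrapping.

```python
def strip_quotes(field):
    """Remove one pair of surrounding double quotes, if present."""
    if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
        return field[1:-1]
    return field


def split_quoted_line(line):
    """Split on commas, then rejoin the pieces of any quoted field.

    Assumption: no escaped quotes inside fields.
    """
    fields = []
    pending = None  # pieces of a quoted field still waiting for its end quote
    for item in line.split(","):
        if pending is None:
            whole = len(item) > 1 and item.endswith('"')
            if item.startswith('"') and not whole:
                pending = [item]  # field opens here and closes in a later item
            else:
                fields.append(item)  # unquoted, or quoted with no comma inside
        else:
            pending.append(item)
            if item.endswith('"'):
                fields.append(",".join(pending))  # put the commas back
                pending = None
    if pending is not None:
        fields.append(",".join(pending))  # unterminated quote: keep what we have
    return [strip_quotes(f) for f in fields]


rows = [
    '3270,5101650,"Dewey, Cheatum & Howe ", 0 , 0 ,0.00,0.00,9.25,-9.25',
    '3270,5101650,Phillip Vector , 34 ," 3,161.00 ",92.97,79.25,61.76,17.50',
]
for row in rows:
    print(split_quoted_line(row))
```

The loop runs over the comma-separated items, not over every character, and the only state it keeps is the list of pieces collected since the last opening quote. Whitespace inside fields is left untouched, so trimming would be a separate step.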

