Tim Daneliuk wrote:

> On 08/02/2017 10:05 AM, Daiyue Weng wrote:
>> Hi, I am trying to removing extra quotes from a large set of strings (a
>> list of strings), so for each original string, it looks like,
>>
>> """str_value1"",""str_value2"",""str_value3"",1,""str_value4"""
>>
>>
>> I like to remove the start and end quotes and extra pairs of quotes on
>> each string value, so the result will look like,
>>
>> "str_value1","str_value2","str_value3",1,"str_value4"
>
> <SNIP>
>
> This part can also be done fairly efficiently with sed:
>
> time cat hugequote.txt | sed 's/"""/"/g;s/""/"/g' >/dev/null
>
> real    0m2.660s
> user    0m2.635s
> sys     0m0.055s
>
> hugequote.txt is a file with 1M copies of your test string above in it.
>
> Run on a quad core i5 on FreeBSD 10.3-STABLE.

It looks like Python is fairly competitive:

$ wc -l hugequote.txt
1000000 hugequote.txt
$ cat unquote.py
import csv

with open("hugequote.txt") as instream:
    for field, in csv.reader(instream):
        print(field)

$ time python3 unquote.py > /dev/null

real    0m3.773s
user    0m3.665s
sys     0m0.082s
$ time cat hugequote.txt | sed 's/"""/"/g;s/""/"/g' > /dev/null

real    0m4.862s
user    0m4.721s
sys     0m0.330s

Run on ancient AMD hardware ;)
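
For anyone wondering why the csv approach works: each input line is a single quoted CSV field in which the inner quotes are escaped by doubling, so csv.reader returns a one-item row with the escaping already undone. Here is a minimal sketch on the sample string from the original post, using an in-memory buffer instead of hugequote.txt. The helper name unquote_lines is made up for illustration, and it assumes every line has exactly this one-field shape (a blank line would make the unpacking fail).

```python
import csv
import io

# Sample line from the original post: one quoted field with doubled inner quotes.
raw = '"""str_value1"",""str_value2"",""str_value3"",1,""str_value4"""'


def unquote_lines(lines):
    """Yield each line with the outer quotes dropped and "" collapsed to "."""
    for row in csv.reader(lines):
        # Assumption: every line parses as exactly one field.
        (field,) = row
        yield field


# io.StringIO stands in for the open file used in unquote.py.
result = list(unquote_lines(io.StringIO(raw + "\n")))
print(result[0])
# prints: "str_value1","str_value2","str_value3",1,"str_value4"

# The cleaned line is itself ordinary CSV, so a second pass splits the values.
values = next(csv.reader([result[0]]))
print(values)
# prints: ['str_value1', 'str_value2', 'str_value3', '1', 'str_value4']
```

Note that the second pass returns every value as a string, including the 1; converting it back to an int is left to the caller.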