Is there a fast way to parse a long list of date-time strings into DateTime
objects while staying flexible about the format they are in? I currently
find the format of the first string and then reuse it for the rest, but it
feels like this should be built in, or that there is a much smarter/faster
way to do it. I think there used to be a date() function that accepted
flexible input, but it is no longer in the Dates package?
julia> using Dates
julia> length(long_list_of_date_strs)
497338
julia> function find_matching_datetime_format(datestr)
           datestr = strip(datestr)
           formats = ["y-m-d H:M", "y/m/d H:M", "y m d H:M"] # Add more formats as needed...
           for ft in formats
               try
                   d = Dates.DateTime(datestr, ft)
                   return d, ft
               catch err
                   # This format did not match; try the next one.
               end
           end
           error("Cannot find a date format that matches: $(datestr)")
       end
find_matching_datetime_format (generic function with 1 method)
julia> parsedate(s) = find_matching_datetime_format(s)[1]
parsedate (generic function with 1 method)
julia> function parsedates(datestrings)
           d, ftstr = find_matching_datetime_format(datestrings[1])
           ft = Dates.DateFormat(ftstr)
           map(ds -> Dates.DateTime(ds, ft), datestrings)
       end
parsedates (generic function with 1 method)
julia> @time r1 = map(parsedate, long_list_of_date_strs[1:10000]);
elapsed time: 3.507318034 seconds (273707960 bytes allocated, 41.44% gc time)
julia> @time r2 = parsedates(long_list_of_date_strs[1:10000]);
elapsed time: 0.325861155 seconds (38314796 bytes allocated, 72.21% gc time)
julia> r1 == r2
true
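One variant I have been wondering about, for lists that might mix formats: build the DateFormat objects once and try each of them per string without relying on exceptions. This is only a sketch. It assumes a recent Julia in which `tryparse(DateTime, str, fmt)` returns `nothing` on failure, and the names `CANDIDATE_FORMATS` and `parse_flexible` are made up for illustration.

```julia
using Dates

# Candidate formats, compiled once up front (hypothetical list; extend as needed).
const CANDIDATE_FORMATS = [DateFormat(f) for f in ("y-m-d H:M", "y/m/d H:M", "y m d H:M")]

# Hypothetical helper: return the DateTime from the first format that parses.
# Assumes `tryparse(DateTime, str, fmt)` yields `nothing` when the format does not match.
function parse_flexible(s::AbstractString, formats=CANDIDATE_FORMATS)
    str = strip(s)
    for fmt in formats
        dt = tryparse(DateTime, str, fmt)
        dt === nothing || return dt
    end
    throw(ArgumentError("no candidate format matches: $str"))
end
```

It would then be used as `map(parse_flexible, long_list_of_date_strs)`. I have not timed this against the two versions above, so I cannot say how it compares.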
Thanks for any advice!