Hi, very often while doing maintenance work on a data warehouse that uses CSV files as input, I need to keep only the first line (the headers) and any line matching a given regexp, in order to save some time.
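To make the rule concrete, here is a minimal sketch on in-memory lines. The sample rows and the `header_and_matching` name are made up for illustration only; they are not part of ActiveWarehouse.

```ruby
# Keep the first line (the header) plus every line matching the regexp.
# Hypothetical helper name, used only for this illustration.
def header_and_matching(lines, regexp)
  lines.each_with_index
       .select { |line, index| index == 0 || line =~ regexp }
       .map(&:first)
end

# Hypothetical sample data standing in for a CSV file.
lines = [
  "id,type,name\n",
  "1,customer,Alice\n",
  "2,supplier,Bob\n",
  "3,Customer,Carol\n"
]

kept = header_and_matching(lines, /customer/i)
puts kept
```

With this sample, the header and the two customer rows are kept and the supplier row is dropped; the header is kept even when it does not match the regexp itself.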
Here's a little helper to do this:

    require 'fileutils'

    # Rewrites filename in place, keeping the header line and every line
    # matching regexp. The original content is left in filename + ".tmp".
    def keep_headers_and_matching_lines(filename, regexp)
      tempfilename = filename + ".tmp"
      FileUtils.mv(filename, tempfilename)
      File.open(tempfilename) do |input|
        File.open(filename, 'w') do |output|
          input.each_with_index do |line, index|
            # index 0 is the header row
            output << line if index == 0 || line =~ regexp
          end
        end
      end
    end

Typical use:

    preprocess {
      keep_headers_and_matching_lines('mydata.csv', /customer/i)
    }

(Sure, that can also be done with a grep call, and that would be faster as well.)

In case it's useful to someone else!

cheers,

Thibaut Barrère
--
[blog] http://evolvingworker.com - tools for a better day
[blog] http://blog.logeek.fr - about writing software
_______________________________________________
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss