Hi Ruby users,

You might be aware that I'm heading up the effort to create a system which parses the GMPTE ATCO-CIF bus data into a DB with an API. I'd really like some advice from you.
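For anyone who hasn't seen the format: judging by how my parser reads it, each line of a CIF file starts with a two-character record identity, and the fields sit at fixed character offsets after it. Here is a minimal sketch of that slicing for a ZL record. The sample line and the parse_zl helper are invented for illustration; they are not from the real data or from my repository.

```ruby
# Hypothetical helper (not in the repository): slices one ZL record
# using the same offsets and lengths as the import code.
def parse_zl(line)
  # String#[start, length] returns `length` characters beginning at `start`.
  return nil unless line[0, 2] == 'ZL'

  {
    reference:        line[2, 8],
    stop_list_number: line[10, 3],
    direction:        line[13, 1]
  }
end

# Invented sample line, built piece by piece so the offsets are easy to see.
sample = 'ZL' + 'SRV00001' + '001' + 'O'
p parse_zl(sample)
```

Every record type in the import is handled the same way, with only the identity and the offsets changing.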
I wrote the code below to parse CIF files and add the data within them to a database. There are over 500 CIF files, totalling over 200MB of data, which this code iterates through. When I ran it, the process took just under 7 hours on my MacBook Pro.

I really want it to get through these files much faster, so I'd like your advice on how to refactor it:

Is Dir.glob the fastest way to read a directory?
Is File.expand_path the fastest way to open a file?
Is File.readlines the fastest way to extract the lines of the file?
Is there a faster way to .save and .find from the models?
Could my loop and if statements be made to work quicker?

Any thoughts or advice are extremely welcome.

Also, the code is here if you want to meddle - http://github.com/flythecoop/ATCO-CIF-Parser-And-API - and I can provide you with a couple of CIF files (but I don't think I'm allowed to distribute them all).

def import_times
  @files = Dir.glob('public/cif/*')
  @files.each do |file|
    @path = File.expand_path(file)
    data = File.readlines(@path)
    stop_list = []
    data.each do |line|
      if line[0,2] == 'ZL'
        @service = Service.new
        @service.reference = line[2,8]
        @service.stop_list_number = line[10,3]
        @service.direction = line[13,1]
      end
      if line[0,2] == 'ZD'
        @service.term_start = line[2,8]
        @service.term_end = line[10,8]
        @service.days_of_operation = line[18,64]
      end
      if line[0,2] == 'ZS'
        @service.number = line[10,4]
        @service.description = line[14,50]
        @service.save!
        @bus_service = Service.find(:last)
      end
      if line[0,2] == 'ZA'
        if !stop_list.include?(line[3,12])
          @bus_stop = BusStop.new
          @bus_stop.ref_code = line[3,12]
          @bus_stop.name = line[15,48]
          @bus_stop.publicity_point = 1 if line[63,2] == 'P1'
          @bus_stop.working_point = 1 if line[63,2] == 'W1'
          @bus_stop.timing_point = 1 if line[63,2] == 'T1'
          @bus_stop.save!
          stop_list << line[3,12] unless stop_list.include?(line[3,12])
        end
      end
      if line[0,2] == 'QS'
        @journey_detail = JourneyDetail.new
        @journey_detail.service_id = @bus_service.id
        @journey_detail.operator = line[3,4]
        @journey_detail.journey_identifier = line[7,6]
        @journey_detail.monday = 1 if line[29,1] == '1'
        @journey_detail.tuesday = 1 if line[30,1] == '1'
        @journey_detail.wednesday = 1 if line[31,1] == '1'
        @journey_detail.thursday = 1 if line[32,1] == '1'
        @journey_detail.friday = 1 if line[33,1] == '1'
        @journey_detail.saturday = 1 if line[34,1] == '1'
        @journey_detail.sunday = 1 if line[35,1] == '1'
        @journey_detail.school_term = line[36,1]
        @journey_detail.bank_holidays = line[37,1]
      end
      if line[0,2] == 'ZJ'
        @journey_detail.journey_type = line[49,1]
      end
      if line[0,2] == 'ZN'
        @journey_detail.journey_note = line[7,72]
        @journey_detail.save!
        @detail = JourneyDetail.find(:last)
      end
      if line[0,2] == 'QO' && @previous_record_identity == 'ZJ'
        # if there is no Note Record (ZN) save the @journey_detail
        @journey_detail.save!
        @detail = JourneyDetail.find(:last)
      end
      if line[0,2] == 'QO'
        @journey_stop = JourneyStop.new
        @journey_stop.service_id = @bus_service.id
        @journey_stop.journey_detail_id = @detail.id
        @journey_stop.bus_stop_id = line[2,12]
        @journey_stop.departure = "#{line[14,2]}:#{line[16,2]}:00"
        @journey_stop.bay_number = line[18,3]
        @journey_stop.save!
        #logger.info("--- Journey Time: #...@journey_stop.inspect}")
      elsif line[0,2] == 'QI'
        @journey_stop = JourneyStop.new
        @journey_stop.service_id = @bus_service.id
        @journey_stop.journey_detail_id = @detail.id
        @journey_stop.bus_stop_id = line[2,12]
        @journey_stop.arrival = "#{line[14,2]}:#{line[16,2]}:00"
        @journey_stop.departure = "#{line[18,2]}:#{line[20,2]}:00"
        @journey_stop.bay_number = line[23,3]
        @journey_stop.save!
        #logger.info("--- Journey Time: #...@journey_stop.inspect}")
      elsif line[0,2] == 'QT'
        @journey_stop = JourneyStop.new
        @journey_stop.service_id = @bus_service.id
        @journey_stop.journey_detail_id = @detail.id
        @journey_stop.bus_stop_id = line[2,12]
        @journey_stop.arrival = "#{line[14,2]}:#{line[16,2]}:00"
        @journey_stop.bay_number = line[18,3]
        @journey_stop.save!
        #logger.info("--- Journey Time: #...@journey_stop.inspect}")
      end
      @previous_record_identity = line[0,2]
    end
  end
end

--
*www.bobop.co.uk*
07811 197374
web design, development, consultancy
