Re: [portland] Reformatting Data Files

Ethan Furman Fri, 28 Jan 2011 16:00:12 -0800

Rich Shepard wrote:

  Yes, this is correct. Instead of the organization above, I want each row
to be a complete record of location, date, parameter and value. Somewhat

analogous to transposing rows and columns in the spreadsheet, but that'snot

a solution.


  In the re-written file, there will be one row for each unique set of
location, date, parameter, and value. For this file, there are 2 locations,
22 dates, and 32 parameters (each with 1 value for that parameter on that
date at that location) or 1408 distinct rows.


Wow.

Here's an attempt -- not sure how general it will be, but hopefully itwill help.


====== input =============
:'JCM-10B':'JCM-10B':'JCM-10B':'JCM-20B':'JCM-20B':'JCM-20B'
:2004-08-16:2005-06-07::2006-06-12::2009-12-09
'Depth to Water':76.75:77.51:82.15:73.17:72.66:
'Clarity':6.35:9.75:4.75:3.26:4.07:7.69
'Salinity':0.38:0.69:0.16:0.79::0.05
'Temperature':41.55:34.87:42.73:40.80:45.14:
==========================

------ output ------------
::JCM-10B:2004-08-16:Depth to Water:76.75::
::JCM-10B:2004-08-16:Clarity:6.35::
::JCM-10B:2004-08-16:Salinity:0.38::
::JCM-10B:2004-08-16:Temperature:41.55::
::JCM-10B:2005-06-07:Depth to Water:77.51::
::JCM-10B:2005-06-07:Clarity:9.75::
::JCM-10B:2005-06-07:Salinity:0.69::
::JCM-10B:2005-06-07:Temperature:34.87::
::JCM-10B::Depth to Water:82.15::
::JCM-10B::Clarity:4.75::
::JCM-10B::Salinity:0.16::
::JCM-10B::Temperature:42.73::
::JCM-20B:2006-06-12:Depth to Water:73.17::
::JCM-20B:2006-06-12:Clarity:3.26::
::JCM-20B:2006-06-12:Salinity:0.79::
::JCM-20B:2006-06-12:Temperature:40.80::
::JCM-20B::Depth to Water:72.66::
::JCM-20B::Clarity:4.07::
::JCM-20B::Salinity:::
::JCM-20B::Temperature:45.14::
::JCM-20B:2009-12-09:Depth to Water:::
::JCM-20B:2009-12-09:Clarity:7.69::
::JCM-20B:2009-12-09:Salinity:0.05::
::JCM-20B:2009-12-09:Temperature:::
----------------------------

~~~~~~~~~~~~~~~~~~~~~ code ~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/usr/bin/env python
"""
This file converts the exported wq excel files to
have each row contain the location, date, parameter,
and value.
"""

import csv

reader = csv.reader(open("10b20b.txt","rb"),
                    delimiter=":",
                    quotechar="'")
output = csv.writer(open("out.txt","wb"),
                    delimiter=":",
                    quotechar="'")

# records are in columnar, not row, format

# read in all data into format
# [fieldname, [value1, value2, value3, ..., valueN]]
all_data = list()
for row in reader:
    row = list(row)
    # keep fieldname separate from values
    all_data.append((row[0], row[1:]))

# if fieldname is empty, assume header row
# once a fieldname/value row is found, stop
# looking for anymore header rows
header_rows = list()
for i, fieldname_row in enumerate(all_data):
    fieldname, row = fieldname_row
    if not fieldname:
        header_rows.append(row)
    else:
        break
# assume all remaining rows are the fields and values
# i will be the index of the first row with a fieldname
first_field_row = i

# assemble headers
for j, _ in enumerate(header_rows[0]):
    headers = list()
    for row in header_rows:
        headers.append(row[j])
    # now step through field:value rows
    for fieldname, row in all_data[first_field_row:]:
        output.writerow([None, None]
                        + headers
                        + [fieldname, row[j], None, None])
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Hope this helps!

~Ethan~
_______________________________________________
Portland mailing list
[email protected]
http://mail.python.org/mailman/listinfo/portland

Re: [portland] Reformatting Data Files

Reply via email to