[Tutor] Merging table-like files with overlapping values in one column

Kat Thu, 21 Aug 2008 03:07:27 -0700

Hi all,

I'm new to Python and trying to come up with an elegant way of tackling the 
following problem. Sorry for the lengthy description:


I have several input files where every line holds a space-separated pair of 
values, so each file is essentially a table with two columns. There are no 
duplicates among the first-column values within a file, but the values overlap 
when all files are considered. I'd like to merge the files into one, keyed on 
the first column, with the second-column values from all files placed side by 
side, like this:

First file:
bar 100
foo 90
yadda 22

Second file:
bar 78
yadda 120
ziggy 99

Combined file:
bar 100 78
foo 90 NONE
yadda 22 120
ziggy NONE 99

I'm considering several approaches. The first, brute-force way is to read in 
each file, parse it into lines, parse the lines into words, and write the 
second word to a new output file along with the first word. That seems 
awful. My second idea is to convert each file into a dictionary (since the 
first column's values are unique within each file), then create a combined 
dictionary that allows multiple values for each key, and output that. 
Does that sound reasonable? Is there another approach? I'm not asking for an 
implementation of course, just ideas for the design.
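
For illustration only, the dictionary idea might look roughly like the sketch 
below. The names read_table and merge_tables and the missing="NONE" argument 
are hypothetical, made up for this example. It assumes each line has exactly 
two whitespace-separated fields and that the output should be sorted by the 
first column, as in the sample above.

```python
def read_table(lines):
    """Build a dict from 'key value' lines (first column -> second column)."""
    table = {}
    for line in lines:
        parts = line.split()
        # Assumption: lines without exactly two fields (e.g. blank lines)
        # are skipped rather than treated as errors.
        if len(parts) == 2:
            key, value = parts
            table[key] = value
    return table


def merge_tables(tables, missing="NONE"):
    """Return one output line per key, with one value column per table."""
    # Union of the first-column values across all tables, in sorted order.
    keys = sorted(set().union(*tables))
    return [
        " ".join([key] + [table.get(key, missing) for table in tables])
        for key in keys
    ]


# The two sample files from the post, given here as lists of lines.
first = read_table(["bar 100", "foo 90", "yadda 22"])
second = read_table(["bar 78", "yadda 120", "ziggy 99"])

for row in merge_tables([first, second]):
    print(row)
# bar 100 78
# foo 90 NONE
# yadda 22 120
# ziggy NONE 99
```

Because read_table accepts any iterable of lines, an open file object could be 
passed in directly. Keeping the tables in a list preserves the column order of 
the input files.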

Thanks in advance.

Kat


      

