Carl Cerecke wrote:
> Nick Rout wrote:
>
>> I have a series of files with lines like:
>>
>> Proprietor
>> Bill Smith and Mary Smith
>>
>> Identifier 23B/874
>>
>> M5412345.6 Mortgage to XYZ Bank Limited
>>
>> each file will have an Identifier and a Proporietor, but not necessarily
>> a mortgage
>>
>> I want to process each file and end up with a file like this:
>>
>> Bill and Mary Smith,23B/874,M5412345.6, XYZ Bank Ltd
>> John & Joan Brown,23H/123,M123435.6,GHJ Lenders Ltd
>>
>> The trouble I am having is extracting the proprietor name, as it is on a
>> different line to the word "Proprietor"
>>
>> Also the Identifier line appears twice in each file and I only want the
>> first one.
>>
>> Can any one help me?
>
>
> -------------------------------------------------
> #!/usr/bin/env python
>
> import sys, string
>
> lines = sys.stdin.readlines()
>
> len = len(lines)
>
> str = ""
> for i in range(len):
>     line = lines[i][:-1]
>     #print line
>     if line == "Proprietor":
>         if str != "":
>             print str
>         str = ""
>         continue
>     if line == "":
>         continue
>     ls = string.split(line)
>     if ls[0] == "Identifier":
>         str = str + ","+ls[1]
>     elif ls[1] == "Mortgage":
>         str = str + ","+ls[0]
>         str = str + ","+string.join(ls[3:])
>     else:
>         str = line
> print str
> --------------------------------------------------
>
>
BTW, it reads from stdin. Put it in a file foo and type:

python foo < filename > filename.csv

Yay for the command line!

Cheers,
Carl
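
P.S. For anyone running this under Python 3, here is a rough sketch of the same idea. It is not a drop-in replacement for the script above. The function names parse_record and to_csv_line are made up for illustration, and it assumes each input holds exactly one record, that the proprietor name is the first non-blank line after "Proprietor", and that a mortgage line always reads "<number> Mortgage to <lender>". It keeps only the first Identifier line. It does not shorten names or turn "Limited" into "Ltd", and it does not quote fields, so a name containing a comma would need the csv module instead.

```python
def parse_record(text):
    """Return (proprietor, identifier, mortgage_no, lender) for one file's text.

    Fields that are not found are returned as empty strings.
    """
    proprietor = identifier = mortgage_no = lender = ""
    want_name = False  # True right after the "Proprietor" heading
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        words = line.split()
        if want_name:
            # First non-blank line after the heading is taken as the name.
            proprietor = line
            want_name = False
        elif line == "Proprietor":
            want_name = True
        elif words[0] == "Identifier" and len(words) > 1:
            if not identifier:  # keep only the first Identifier line
                identifier = words[1]
        elif len(words) > 3 and words[1] == "Mortgage":
            mortgage_no = words[0]
            lender = " ".join(words[3:])  # drop "<number> Mortgage to"
    return proprietor, identifier, mortgage_no, lender


def to_csv_line(record):
    """Join the fields with commas (no quoting; see the caveat above)."""
    return ",".join(record)


# Demo using the sample lines from the original question.
sample = """Proprietor
Bill Smith and Mary Smith

Identifier 23B/874

M5412345.6 Mortgage to XYZ Bank Limited

Identifier 23B/874
"""
print(to_csv_line(parse_record(sample)))
```

To build one CSV from many files, call parse_record on the contents of each file in turn and print one line per file.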
