syed zaidi wrote:
Dear Steve, Tutor doesn't allow attachment of huge files. I am attaching
the files I am taking as input, code and the output CSV file. I hope then
you would be able to help me. DOT keg files open in file viewer, you can
also view them in python. The CSV file is the desired output file.


There is no need to send four files when one will do. There is also no need to send a file thousands of lines long when a dozen or so lines should be sufficient.

It would also help if you told us what the fields in the file should be called. You are probably familiar with them, but we aren't.

Since I don't know what the fields are called, I'm going to just make up some names.

def parse_d_line(line):
    # Expects a line like this:
    # D    SBG_0147 aceE; xxx xxx\tK00163 xxx xxx [EC:1.2.4.1]
    a, b = line.split('\t')  # split on tab character
    c, d = a.split(';')
    letter, sbg_code, other_code = c.split()
    compound1 = d.strip()
    words = b.split()
    k_code = words[0]
    ec = words[-1]
    compound2 = " ".join(words[1:-1])
    return (letter, sbg_code, other_code, compound1, k_code, compound2, ec)


kegfile = open('something.keg')
# skip lines until a bare exclamation mark
for line in kegfile:
    if line.strip() == '!':
        break

# analyse D lines only, skipping all others
for line in kegfile:
    if line.startswith('D'):
        print(parse_d_line(line))
    elif line.strip() == '!':
        break  # stop processing
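
Since you say the desired output is a CSV file, here is a rough sketch of how the parsed tuples might be written out with the standard csv module instead of printed. The column names in HEADER are made up (same as my field names above), d_lines_to_csv is just a name I invented, and the sample lines are placeholders rather than real data from your file. The parsing logic is repeated only so the sketch runs on its own.

```python
import csv
import io

def parse_d_line(line):
    # Same splitting logic as parse_d_line above, repeated so this runs alone.
    a, b = line.split('\t')          # split on tab character
    c, d = a.split(';')
    letter, sbg_code, other_code = c.split()
    words = b.split()
    return (letter, sbg_code, other_code, d.strip(),
            words[0], " ".join(words[1:-1]), words[-1])

# Made-up column names; replace them with whatever the fields are really called.
HEADER = ['letter', 'sbg_code', 'other_code', 'compound1',
          'k_code', 'compound2', 'ec']

def d_lines_to_csv(lines, out):
    """Write one CSV row per D line found between the two bare '!' markers."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    lines = iter(lines)
    for line in lines:               # skip lines until a bare exclamation mark
        if line.strip() == '!':
            break
    for line in lines:               # same iterator, so this resumes after the '!'
        if line.strip() == '!':
            break                    # stop processing
        if line.startswith('D'):
            writer.writerow(parse_d_line(line))

# Placeholder input standing in for a .keg file; only the D line follows
# the format shown in the comment of parse_d_line above.
sample = [
    "#header line\n",
    "!\n",
    "A xxx\n",
    "D    SBG_0147 aceE; xxx xxx\tK00163 xxx xxx [EC:1.2.4.1]\n",
    "!\n",
]
buf = io.StringIO()
d_lines_to_csv(sample, buf)
print(buf.getvalue())
```

To write to a real file, pass an open file object instead of the StringIO, for example open('output.csv', 'w', newline=''), where the file name is again only an example.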


You will notice I don't use regular expressions in this.

    Some people, when confronted with a problem, think "I know,
    I'll use regular expressions." Now they have two problems.
    -- Jamie Zawinski




--
Steven
