Re: [Tutor] map one file and print it out following the sequence

Dave Angel Thu, 13 Oct 2011 08:45:43 -0700

On 10/13/2011 09:09 AM, lina wrote:

<snip>

I think your final version of sortfile() might look something like:

def sortfile(infilename=INFILENAME, outfilename=OUTFILENAME):
    infile = open(infilename, "r")
    intext = infile.readlines()
    outfile = open(outfilename, "w")
    for chainid in CHAINID:
        print("chain id = ", chainid)
        sortoneblock(chainid, intext, outfile)
    infile.close()
    outfile.close()


$ python3 map-to-itp.py
{'O4': '2', 'C19': '3', 'C21': '1'}
C
Traceback (most recent call last):
   File "map-to-itp.py", line 55, in <module>
     sortfile()
   File "map-to-itp.py", line 17, in sortfile
     sortoneblock(chainid,intext,OUTFILENAME)
   File "map-to-itp.py", line 29, in sortoneblock
     f.write(line[1].strip() for line in temp)
TypeError: must be str, not generator

When you see an error message that mentions a generator, it usually means you have a for-expression used as a value.
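
A minimal sketch that reproduces this kind of error, assuming an in-memory io.StringIO as a stand-in for the real output file (the sample lines are invented for illustration):

```python
import io

lines = ["alpha", "beta"]   # invented sample data
out = io.StringIO()         # in-memory stand-in for a real output file

# A for-expression written as the argument of a call builds a generator
# object. write() wants one string, so it refuses the generator.
try:
    out.write(line.strip() for line in lines)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```

Nothing is written in this case: the call fails before any text reaches the file.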

At your stage of learning you should probably be ignoring generators and list comprehensions, and just writing simple for loops. So you should replace the f.write with a loop.


        for item in temp:
            f.write(something + "\n")

One advantage is that you can easily stuff print() functions into the loop, to debug what's really happening. After you're sure it's right, it might be appropriate to use either a generator or a list comprehension.
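
As a rough sketch of both stages, the loop form and a later generator form can be compared side by side. The function names and sample items are hypothetical, and io.StringIO stands in for a real file. Note that the generator goes to writelines(), which accepts any iterable of strings, not to write():

```python
import io

def write_with_loop(items, f):
    """Plain loop: easy to debug by adding print() calls."""
    for item in items:
        # print("writing", repr(item))   # uncomment while debugging
        f.write(item + "\n")

def write_with_generator(items, f):
    """Same output, using a generator expression with writelines()."""
    f.writelines(item + "\n" for item in items)

items = ["first", "second"]          # invented sample data
loop_out = io.StringIO()
gen_out = io.StringIO()
write_with_loop(items, loop_out)
write_with_generator(items, gen_out)
```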

I don't know how to fix the writing issue.

can I write the different chainID one into the same OUTFILE?

Thanks, I attached the code I used below:

  #!/usr/bin/python3

import os.path

LINESTOSKIP=0
CHAINID="CDEFGHI"
INFILENAME="pdbone.pdb"
OUTFILENAME="sortedone.pdb"
DICTIONARYFILE="itpone.itp"
mapping={}
valuefromdict={}

def sortfile():
     intext=fetchonefiledata(INFILENAME)
     for chainid in CHAINID:
         print(chainid)
         sortoneblock(chainid,intext,OUTFILENAME)

One way to get all the output into one file is to create the file in sortfile(), and pass the file object. Look again at what I suggested for sortfile(). If you can open the file once, here, you won't have the overhead of constantly opening the same file that nobody closed, and you'll have the side benefit that the old contents of the file will be overwritten.
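
A small sketch of that pattern, under stated assumptions: write_block() is a hypothetical stand-in for sortoneblock(), the output goes to a temporary directory, and a with-statement is used here to close the file instead of an explicit close():

```python
import os
import tempfile

def write_block(chainid, outfile):
    # Hypothetical stand-in for sortoneblock(): it receives an already
    # open file object and only writes to it.
    outfile.write("block " + chainid + "\n")

def write_all(chainids, outfilename):
    # The file is opened exactly once; mode "w" discards old contents.
    with open(outfilename, "w") as outfile:
        for chainid in chainids:
            write_block(chainid, outfile)

path = os.path.join(tempfile.mkdtemp(), "sortedone.pdb")
write_all("CDE", path)
write_all("CDE", path)   # a second run replaces the first run's output

with open(path) as f:
    result = f.read()
```

All blocks end up in the one file, and running it twice does not double the output.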

Andreas' suggestion of using append would make more sense if you wanted the output to accumulate over multiple runs of the program. If you don't want the output file to be the history of all the runs, then you'll need to do one open(name, "w"), probably in sortfile(), and then you might as well pass the file object as I suggested.
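
The difference between the two modes can be seen with a short sketch. The file names and the run_once() helper are made up for illustration; each call plays the role of one run of the program:

```python
import os
import tempfile

def run_once(path, mode):
    # One simulated "run of the program": open, write one line, close.
    with open(path, mode) as f:
        f.write("run\n")

tmpdir = tempfile.mkdtemp()
w_path = os.path.join(tmpdir, "write_mode.txt")
a_path = os.path.join(tmpdir, "append_mode.txt")

for _ in range(3):
    run_once(w_path, "w")   # each run starts from an empty file
    run_once(a_path, "a")   # each run adds to what is already there

with open(w_path) as f:
    w_lines = f.readlines()
with open(a_path) as f:
    a_lines = f.readlines()
```

After three runs, the "w" file holds one line and the "a" file holds three.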



def sortoneblock(cID,TEXT,OUTFILE):

If you followed my suggestions for sortfile(), then the last parameter to this function would be outfile, and you could use outfile.write().

As Andreas says, don't use uppercase for non-constants.

     temp = []


        #this writes the cID to the output file, once per cID
        outfile.write(cID + "\n")

     for line in TEXT:
         blocks=line.strip().split()
         if len(blocks)== 11 and  blocks[3] == "CUR" and blocks[4] == cID and
blocks[2] in mapping.keys():


          if (len(blocks)== 11 and  blocks[3] == "CUR"
                and blocks[4] == cID and blocks[2] in mapping ):

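Having the .keys() in that test is redundant. "in" already knows how to look things up efficiently in a dictionary. In Python 2, .keys() builds a list first, so the test converts to a slow list before doing the slow lookup, which slows execution down quite a bit. In Python 3 it returns a view, so the cost is small, but it is still unnecessary.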

Also, if you put parentheses around the whole if clause, you can span it
across multiple lines without doing anything special.
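
A sketch of both points together. The mapping is the one printed in the traceback output earlier in this thread; the wanted() helper and the sample line are invented for illustration and are not real PDB data:

```python
mapping = {"O4": "2", "C19": "3", "C21": "1"}

def wanted(line, chainid):
    """Return True if the line belongs to this chain and is in mapping."""
    blocks = line.strip().split()
    # The outer parentheses let the condition continue on the next line.
    return (len(blocks) == 11 and blocks[3] == "CUR"
            and blocks[4] == chainid and blocks[2] in mapping)

# Invented 11-field line, just to exercise the test.
sample = "ATOM 1 C21 CUR C 1 0.0 0.0 0.0 1.00 0.00"

same_answer = ("C21" in mapping) == ("C21" in mapping.keys())
```

Testing a key against the dictionary itself gives the same answer as testing against mapping.keys(), so the shorter form loses nothing.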

             temp.append((mapping[blocks[2]],line))
     temp.sort()
     with open(OUTFILE,"w") as f:
         f.write(line[1].strip() for line in temp)

See the comment above about splitting this write into a loop. You are also going to have to decide what to write, as you have a tuple containing both an index number and a string in each item of temp. Probably you want to write the second item of the tuple. Combining these changes, you would have
       for index, line in temp:
           outfile.write(line + "\n")

Note that the following are equivalent:
       for item in temp:
            index, line = item
            outfile.write(line + "\n")

       for item in temp:
            outfile.write(item[1] + "\n")

But I like the first form, since it makes it clear what's been stored in temp. That sort of thing is important if you ever change it.
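
To check that the three forms really do write the same thing, here is a small sketch. The contents of temp are invented, and io.StringIO objects stand in for the output file:

```python
import io

# Invented (index, line) tuples, as sortoneblock() would collect them.
temp = [("2", "line for O4"), ("1", "line for C21"), ("3", "line for C19")]
temp.sort()   # tuples sort by their first item, then the second

out1, out2, out3 = io.StringIO(), io.StringIO(), io.StringIO()

# Form 1: unpack in the for statement itself.
for index, line in temp:
    out1.write(line + "\n")

# Form 2: unpack inside the loop body.
for item in temp:
    index, line = item
    out2.write(line + "\n")

# Form 3: index into the tuple.
for item in temp:
    out3.write(item[1] + "\n")
```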




def generatedictionary(dictfilename):
     text=fetchonefiledata(DICTIONARYFILE)
     for line in text:
         parts=line.strip().split()
         if len(parts)==8:
             mapping[parts[4]]=parts[0]
     print(mapping)



def fetchonefiledata(infilename):
     text=open(infilename).readlines()
     if os.path.splitext(infilename)[1]==".itp":
         return text
     if os.path.splitext(infilename)[1]==".pdb":
         return text[LINESTOSKIP:]
     infilename.close()


if __name__=="__main__":
     generatedictionary(DICTIONARYFILE)
     sortfile()

Final note: write() doesn't automatically append a newline, so I tend to add an explicit one in the write() itself. But if you start seeing double spacing, that's presumably because the line already had a newline in it. You could use rstrip() on it (my choice), or remove the + "\n" in the write() method.
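
A short sketch of the double-spacing problem and the rstrip() fix, using invented lines and io.StringIO in place of the real file. Keep in mind that rstrip() with no argument also removes trailing spaces and tabs, not only the newline:

```python
import io

# Lines from readlines() usually end in "\n"; the last one may not.
lines = ["has newline\n", "no newline"]

doubled = io.StringIO()
for line in lines:
    doubled.write(line + "\n")           # adds a second newline to line 1

clean = io.StringIO()
for line in lines:
    clean.write(line.rstrip() + "\n")    # exactly one newline per line
```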


--

DaveA