Re: how two join and arrange two files together
On Thu, Jul 23, 2009 at 12:22 AM, amr...@iisermohali.ac.in wrote:
> Hi,
>
> I have two large files:
>
> FileA
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C =
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35
> 23 ALA H = 8.78 N = CA = HA = C = 179.93
>
> and
>
> FileB
> 21 ALA helix (helix_alpha, helix2)
> 23 ALA helix (helix_alpha, helix3)
> 38 ALA helix (helix_alpha, helix3)
> ...
>
> I want to make another file that joins the two files in such a way that only the matching entries appear. Here 21 ALA and 23 ALA are in both files, so the output would be something like:
>
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35| 21 ALA helix (helix_alpha, helix2)
> 23 ALA H = 8.78 N = CA = HA = C = 179.93|23 ALA helix (helix_alpha, helix3)
>
> Further, I want to sort the lines of this joined file into other files based on the missing atom values. For 21 ALA, HA is not defined, so that line goes into a file for missing HA values; similarly, 23 ALA goes into the files for its missing N, CA and HA values.
>
> I tried to join the two files based on their matching entries with:
>
> from collections import defaultdict
>
> if __name__ == "__main__":
> ...     a = open("/home/amrita/alachems/chem100.txt")
> ...     c = open("/home/amrita/secstr/secstr100.txt")
> ...     def source(stream):
> ...         return (line.strip() for line in stream)
> ...
> ...     def merge(sources):
> ...         for m in merge([source(a),source(c)]):
> ...             print "|".join(c.ljust(10) for c in m)
> ...
>
> but it is not giving any output.

You never actually called any of your [expletive deleted] functions.

Slightly corrected version:

    from collections import defaultdict

    def source(stream):
        return (line.strip() for line in stream)

    def merge(sources):
        for m in sources:
            print "|".join(c.ljust(10) for c in m)

    if __name__ == "__main__":
        a = open("/home/amrita/alachems/chem100.txt")
        c = open("/home/amrita/secstr/secstr100.txt")
        merge([source(a), source(c)])

It's still not sophisticated enough to give the exact output you're looking for, but it is a step in the right direction.

You really should try asking someone from your CS Dept to help you. It would seriously take a couple of hours, at most.

- Chris
--
Still brandishing a cluestick in vain...
http://blog.rebertia.com
--
http://mail.python.org/mailman/listinfo/python-list
Re: how two join and arrange two files together
On Thu, 23 Jul 2009 12:52:15 +0530, amr...@iisermohali.ac.in wrote:
> Hi,
>
> I have two large files: FileA [...] and FileB [...]
>
> I want to make another file that joins the two files in such a way that only the matching entries appear [...] and further sort those lines into other files based on the missing atom values.
>
> I tried to join the two files based on their matching entries with:
> (snip)

I believe there are packages available for doing such things, mostly written in Perl. But if the aim is not to develop a suitable application but only to obtain the desired formatted file, then doing this with the help of something like Excel would be easiest. This is my opinion, based on my experience with bioinformatics programming.

Regards,
Jyoti
Re: how two join and arrange two files together
I tried to print the lines that have the C value missing with:

    import re
    expr = re.compile("C = None")
    f = open("/home/amrita/helix.dat")
    for line in f:
        if expr.search(line):
            print line

but it is not giving any output.

> Hi,
>
> I have two large files:
> (snip)

Thanks,
Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
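A likely reason the search above finds nothing: in the data shown in this thread, a missing value is simply blank after the equals sign (for example "HA = C = 179.35"), so the literal text "C = None" never occurs in the file. The following is only a minimal sketch, written for Python 3 rather than the Python 2 used in the thread. The names ATOM_FIELD, missing_atoms and group_by_missing are hypothetical, and the pattern assumes every field looks like an upper-case atom name, an equals sign, and an optional plain decimal number.

```python
import re
from collections import defaultdict

# Assumed field layout: upper-case atom name, "=", then an optional number.
ATOM_FIELD = re.compile(r"\b([A-Z]+)\s*=\s*([-+]?\d+(?:\.\d+)?)?")

def missing_atoms(line):
    """Return the atom names on this line whose value is blank."""
    return [name for name, value in ATOM_FIELD.findall(line) if not value]

def group_by_missing(lines):
    """Map each atom name to the lines on which that atom has no value."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        for name in missing_atoms(line):
            groups[name].append(line)
    return groups

# Sample rows copied from FileA in the original question.
sample = [
    "15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C =",
    "21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = C = 179.35",
    "23 ALA H = 8.78 N = CA = HA = C = 179.93",
]
groups = group_by_missing(sample)
for name in sorted(groups):
    print(name, "is missing on", len(groups[name]), "line(s)")
```

Each list in groups could then be written to its own output file, one per atom name. A line with several blank values, such as 23 ALA, lands in several groups, which matches the behaviour asked for in the original question.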
Re: how two join and arrange two files together
amr...@iisermohali.ac.in wrote:

[please keep the correspondence on the mailing list/newsgroup]

> It is working, sir, but my data are in files. When I replaced StringIO() with open("filename.txt"), it did not print the result properly. In one file I have data like:
>
> 33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24
> 38 ALA H = 8.29 N = 120.62 CA = 54.33 HA = 4.04 C = 178.95
> 8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18
> 21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = 4.05 C = 179.35
> 23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93
>
> and in the other:
>
> 8 ALA helix (helix_alpha, helix1)
> 21 ALA helix (helix_alpha, helix2)
> 23 ALA helix (helix_alpha, helix2)
> 33 ALA helix (helix_alpha, helix3)
> 38 ALA helix (helix_alpha, helix3)
> 49 ALA bend
>
> and it is giving the result:
>
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18|
> 23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93|23 ALA helix (helix_alpha, helix2)
> 38 ALA H = 8.29 N = 120.62 CA = 54.33 HA = 4.04 C = 178.95|38 ALA helix (helix_alpha, helix3)
> |49 ALA bend
> 8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39|8 ALA helix (helix_alpha, helix1)
>
> It is not printing the result for 33 and 21.

Hint: you have to adapt the key() function from

    def key(line):
        return line[:1]

to something that returns the same key for lines in the two files that belong together, and different keys for lines that don't.

Peter
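To spell out the hint: line[:1] is only the first character of the line, so 21 and 23 both get the key "2" and 33 and 38 both get "3", and the later line overwrites the earlier one. One possible adaptation is sketched below in Python 3 (the thread itself uses Python 2). It assumes every non-empty line starts with the residue number; residue_key and this simplified merge are illustrative names, not the code posted in the thread.

```python
from collections import defaultdict
from io import StringIO

def residue_key(line):
    """Key on the whole leading residue number, e.g. 21 for '21 ALA ...'."""
    return int(line.split()[0])

def merge(sources, keyfunc=residue_key, blank=""):
    """Yield one row per key, with one column per source (blank if absent)."""
    sources = list(sources)
    rows = defaultdict(lambda: [blank] * len(sources))
    for index, lines in enumerate(sources):
        for line in lines:
            line = line.strip()
            if line:  # skip empty lines, which carry no residue number
                rows[keyfunc(line)][index] = line
    for key in sorted(rows):  # integer keys, so 8 sorts before 15
        yield rows[key]

# Data copied from the message above; StringIO stands in for open(...).
chem = StringIO("""\
33 ALA H = 7.57 N = 121.52 CA = 55.58 HA = 3.89 C = 179.24
38 ALA H = 8.29 N = 120.62 CA = 54.33 HA = 4.04 C = 178.95
8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39
15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18
21 ALA H = 7.66 N = 123.58 CA = 54.33 HA = 4.05 C = 179.35
23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93
""")
secstr = StringIO("""\
8 ALA helix (helix_alpha, helix1)
21 ALA helix (helix_alpha, helix2)
23 ALA helix (helix_alpha, helix2)
33 ALA helix (helix_alpha, helix3)
38 ALA helix (helix_alpha, helix3)
49 ALA bend
""")

merged = list(merge([chem, secstr]))
for row in merged:
    print("|".join(row))

# Keep only residues present in both files, as the original question asked.
matched = [row for row in merged if all(row)]
```

Converting the key to an integer also makes the rows come out in numeric order; with string keys, "15" would sort before "8".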
Re: how two join and arrange two files together
On Sat, Jul 18, 2009 at 12:09 AM, amr...@iisermohali.ac.in wrote:
> Hi,
>
> I have two files with entries like:
>
> fileA
> 8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18
> 23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93
>
> and
>
> fileB
> ChainA: ALA8 -67.217297 -37.131330
> ChainA: ALA21 -69.822977 -48.871282
> ChainA: ALA23 -59.148095 -46.540043
> ChainA: ALA33 -65.459303 -43.269718
>
> I want to join these two files in such a way that the output file contains the columns of both files, with the entries for the same ALA position placed together. So the output file should look something like:
>
> fileC
> 8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39 ChainA: ALA8 -67.217297 -37.131330
> 15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18
>                                                             ChainA: ALA21 -69.822977 -48.871282
> 23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93 ChainA: ALA23 -59.148095 -46.540043
>                                                             ChainA: ALA33 -65.459303 -43.269718

This mailing list is not a collection of free code monkeys. Show us you've at least /tried/ to write this yourself, and tell us where you're running into problems or what error you're getting.

See also http://catb.org/esr/faqs/smart-questions.html

Additionally, you might consider asking on the Indian Python mailing list instead:
http://mail.python.org/mailman/listinfo/bangpypers

- Chris
Re: how two join and arrange two files together
I tried to join these two files together using:

    from itertools import izip
    from os.path import exists

    def parafiles(*files):
        vec = (open(f) for f in files if exists(f))
        data = izip(*vec)
        [f.close() for f in vec]
        return data

    for data in parafiles('/home/amrita/alachems/chem1.txt', '/home/amrita/secstr/secstr.txt'):
        print ' '.join(d.strip() for d in data)

It just joined the columns of the two files.

On Sat, Jul 18, 2009 at 12:09 AM, amr...@iisermohali.ac.in wrote:
> (snip)

Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
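The attempt above pairs the files purely by line number, whereas the requested fileC pairs them by residue position, and the two files write that position differently ("8 ALA ..." versus "ChainA: ALA8 ..."). The Python 3 sketch below shows one way to extract a common key from both layouts. The pattern and the names position_key and join_by_position are assumptions made for illustration, based only on the sample rows posted in this thread.

```python
import re

# Assumed layouts: a leading integer followed by a three-letter residue name
# ("8 ALA H = ..."), or the digits glued to the residue name ("ChainA: ALA8").
_POSITION = re.compile(r"^\s*(\d+)\s+[A-Z]{3}\b|\b[A-Z]{3}(\d+)\b")

def position_key(line):
    """Return the residue position as an int, or None if none is found."""
    match = _POSITION.search(line)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))

def join_by_position(file_a_lines, file_b_lines):
    """Return [left, right] pairs ordered by position; '' marks a gap."""
    by_position = {}
    for column, lines in enumerate((file_a_lines, file_b_lines)):
        for line in lines:
            key = position_key(line)
            if key is not None:
                pair = by_position.setdefault(key, ["", ""])
                pair[column] = " ".join(line.split())  # normalise spacing
    return [by_position[key] for key in sorted(by_position)]

file_a = [
    "8 ALA H = 7.85 N = 123.95 CA = 54.67 HA = 2.98 C = 179.39",
    "15 ALA H = 8.05 N = 119.31 CA = 52.18 HA = 4.52 C = 177.18",
    "23 ALA H = 8.78 N = 120.16 CA = 55.84 HA = 4.14 C = 179.93",
]
file_b = [
    "ChainA: ALA8    -67.217297    -37.131330",
    "ChainA: ALA21   -69.822977    -48.871282",
    "ChainA: ALA23   -59.148095    -46.540043",
    "ChainA: ALA33   -65.459303    -43.269718",
]

rows = join_by_position(file_a, file_b)
for left, right in rows:
    print("%-60s%s" % (left, right))
```

Unmatched rows keep an empty column on the other side, which gives roughly the staggered layout of the fileC example.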
Re: how two join and arrange two files together
amr...@iisermohali.ac.in wrote:
> I tried to join these two files together using:
>
> from itertools import izip
> from os.path import exists
>
> def parafiles(*files):
>     vec = (open(f) for f in files if exists(f))
>     data = izip(*vec)
>     [f.close() for f in vec]
>     return data
>
> for data in parafiles('/home/amrita/alachems/chem1.txt', '/home/amrita/secstr/secstr.txt'):
>     print ' '.join(d.strip() for d in data)

parafiles has a bug: vec is a generator and hence cannot be run twice. Therefore the odd [f.close()...] list comprehension (don't use a list comp if you don't care about the result!) has no effect.

If you change vec into a list you will be hit by another problem -- the for loop trying to operate on closed files. While the correct approach probably involves a contextlib.contextmanager, I recommend that you concentrate on your real problem and keep parafiles() simple:

    def parafiles(*files):
        open_files = (open(f) for f in files if exists(f))
        return izip(*open_files)

> It just joined the columns of the two files.

Can you make an effort to express clearly what you want, preferably with a simple and unambiguous example?

Please keep the line widths below the threshold of 78 characters to avoid messing it up on its way to the reader.

Peter
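The contextlib.contextmanager approach is mentioned above but not shown. A minimal sketch of what it might look like follows, in Python 3, where the built-in zip is already lazy and takes the place of izip. This is one possible reading of the suggestion, not code from the thread, and the demo files and their contents are made up.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def parafiles(*files):
    """Yield an iterator of parallel line tuples; close the files on exit."""
    # A list, not a generator, so the handles can be visited again to close.
    handles = [open(f) for f in files if os.path.exists(f)]
    try:
        yield zip(*handles)
    finally:
        for handle in handles:
            handle.close()

# Demo with two throwaway files (hypothetical contents).
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "8 ALA\n15 ALA\n"), ("b.txt", "helix\nbend\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as out:
        out.write(text)
    paths.append(path)

# The rows must be consumed inside the with block, while the files are open.
with parafiles(*paths) as rows:
    joined = [" ".join(part.strip() for part in row) for row in rows]
print(joined)
```

The files stay open for exactly the duration of the with block, which avoids both problems described above: the handles are closed, and the loop never reads from a closed file. In Python 3.3 and later, contextlib.ExitStack is another option that also copes with an open() call failing partway through.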
Re: how two join and arrange two files together
I want to join the columns of two different data files, but I want the entries to match (as in the example in my first mail, where the positions of ALA match). If an entry has no match, it should be printed as it is.

amr...@iisermohali.ac.in wrote:
> (snip)

Amrita Kumari
Research Fellow
IISER Mohali
Chandigarh
INDIA
Re: how two join and arrange two files together
amr...@iisermohali.ac.in wrote:
>> Can you make an effort to express clearly what you want, preferably with a simple and unambiguous example?
>
> I want to join the columns of two different data files, but I want the entries to match (as in the example in my first mail, where the positions of ALA match). If an entry has no match, it should be printed as it is.

Just say No if you mean it.

My best guess:

    from collections import defaultdict

    def merge(sources):
        blanks = [blank for items, blank, keyfunc in sources]
        d = defaultdict(lambda: blanks[:])
        for index, (items, blank, keyfunc) in enumerate(sources):
            for item in items:
                d[keyfunc(item)][index] = item
        for key in sorted(d):
            yield d[key]

    if __name__ == "__main__":
        from StringIO import StringIO

        a = StringIO("""\
    a alpha
    c beta
    d gamma
    """)
        b = StringIO("""\
    a one
    b two
    d three
    e four
    """)
        c = StringIO("""\
    a 111
    b 222
    f 333
    """)

        def key(line):
            return line[:1]

        def source(stream, blank="", key=key):
            return (line.strip() for line in stream), blank, key

        for m in merge([source(x) for x in [a,b,c]]):
            print "|".join(c.ljust(10) for c in m)