Python 3.0 slow file IO
I just recently learned Python; I'm using it mainly to process huge (<5 GB) txt files of ASCII information about DNA. I've decided to learn 3.0, but maybe I need to step back to 2.6? I'm getting exceedingly frustrated by the slow file I/O behaviour of Python 3.0.

I know that a bug report was submitted here: http://bugs.python.org/issue4533, and a solution was posted. However, I don't know how to apply this patch. I've searched the forums and tried:

C:\python30> patch -p0 < fileio_buffer.patch

but the patch command is not recognized. Any help on implementing this patch, or advice on moving back to the older version, is appreciated.

Kind regards,
Thomas
--
http://mail.python.org/mailman/listinfo/python-list
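Until the patch lands, one commonly suggested workaround for the slow text layer in Python 3.0 is to read the file in binary mode and work with bytes directly, which bypasses the (then largely pure-Python) text I/O stack. A minimal sketch, assuming an ASCII file and a hypothetical `count_lines` helper:

```python
def count_lines(path):
    """Count newline-terminated lines by reading raw bytes in large chunks."""
    total = 0
    with open(path, "rb") as f:  # binary mode avoids the slow text layer
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            total += chunk.count(b"\n")
    return total
```

For ASCII data like DNA records, individual lines can still be recovered with `bytes.decode("ascii")` once the bulk reading is done in binary.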
Re: Python 3.0 slow file IO
Thanks a lot for all the responses. I'll move back to Python 2.5 for compatibility with SciPy and some other third-party packages. I'll leave the compilation process for some other day; for now I'm a happy user. Maybe in the future I would like to contribute to the development process.
f.seek() unwanted output
I'm having trouble with a script that is printing the output of f.seek(), whereas the documentation says it has no output:

    file.seek(offset[, whence])
    Set the file's current position, like stdio's fseek. The whence argument is optional and defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file's end). There is no return value.

I have a file in memory. When I try f.seek(0) (or any other value), it gives me 0 as output. The following session illustrates my 'problem':

>>> for a in range(10):
...     f.seek(a)
...
0
1
2
3
4
5
6
7
8
9
>>>

I don't want Python to produce output when setting the file pointer. Any help would be appreciated.

Kind regards,
Thomas
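What the session above shows is interpreter echo, not print output: in Python 3, seek() returns the new absolute position, and the interactive prompt echoes any non-None expression result. In a script there is no output at all. A small sketch using an in-memory file to stand in for the poster's:

```python
import io

# A seekable in-memory file, standing in for the poster's real file object.
f = io.BytesIO(b"0123456789")

# In Python 3, seek() returns the new absolute position.  The interactive
# interpreter echoes non-None expression results, which is what the poster
# saw; when this same loop runs inside a script, nothing is printed.
positions = [f.seek(a) for a in range(10)]
```

In the REPL, discarding the result (e.g. `_ = f.seek(a)`) also suppresses the echo.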
Re: f.seek() unwanted output
Hi Tim, that works! Thanks a lot.

Thomas
Organize large DNA txt files
Dear fellow programmers,

I'm using Python scripts to organize some rather large datasets describing DNA variation. Information is read, processed, and written to a file in a sequential order, like this: 1+ 1- 2+ 2- etc. The files that I created contain positional information (nucleotide position) and some other info, like this:

file 1+:
1 73 0 1 0 0
1 76 1 0 0 0
1 77 0 1 0 0

file 1-:
1 74 0 0 6 0
1 78 0 0 4 0
1 89 0 0 0 2

Now the trick is that I want this:

file 1+ AND file 1-:
1 73 0 1 0 0
1 74 0 0 6 0
1 76 1 0 0 0
1 77 0 1 0 0
1 78 0 0 4 0
1 89 0 0 0 2

So the information should be sorted on position. Right now I've written some very complicated scripts that read a number of lines from file 1- and 1+ and then combine this output. The problem is of course that the running number of file 1- can be lower than that of 1+, resulting in an incorrect order. Since both files are too large to load into a dictionary at once (both are 100 MB+), I need some sort of an alternative that can quickly sort everything without crashing my PC. Your thoughts are appreciated.

Kind regards,
Thomas
Re: get rid of duplicate elements in list without set
You could use: B = sorted(set(A)). (Note that list(set(A)).sort() would not work here: list.sort() sorts in place and returns None, so B would end up as None.) Hope that helps. T
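A quick sketch of the one-liner above, with hypothetical sample data:

```python
A = [3, 1, 2, 3, 1, 2]
# sorted() returns a new sorted list, while list.sort() sorts in place
# and returns None -- so sorted(set(A)) is the safe one-liner here.
B = sorted(set(A))
# B == [1, 2, 3]
```

Note that going through set() discards the original ordering; if the first-seen order must be preserved, a different approach is needed.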
Re: Organize large DNA txt files
Thanks, this works great! I did not know that it is possible to iterate through the file lines with a while loop that's conditional on whether additional lines are present or not.
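The thread doesn't include the actual code, but the merge-by-position idea it describes can be sketched with heapq.merge, which lazily combines two already-sorted streams in constant memory (function and path names here are hypothetical):

```python
import heapq

def merge_by_position(path_plus, path_minus, out_path):
    """Merge two position-sorted files (lines like '1 73 0 1 0 0')
    into one file ordered by the integer in the second column."""
    def keyed(path):
        # Yield (position, line) pairs so heapq.merge compares positions.
        with open(path) as f:
            for line in f:
                yield int(line.split()[1]), line
    with open(out_path, "w") as out:
        # heapq.merge never loads a whole file: it advances each input
        # one line at a time, so 100 MB+ inputs are no problem.
        for _, line in heapq.merge(keyed(path_plus), keyed(path_minus)):
            out.write(line)
```

This only works because each input file is already sorted by position, which matches the sequential way the files were written.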