Python 3.0 slow file IO

2009-02-05 Thread thomasvang...@gmail.com
I recently learned Python and use it mainly to process huge (<5GB)
text files of ASCII information about DNA. I decided to learn
3.0, but maybe I need to step back to 2.6?

I'm getting exceedingly frustrated by the slow file IO behaviour of
Python 3.0. I know that a bug report was submitted here:
http://bugs.python.org/issue4533, and that a solution was posted.
However, I don't know how to apply this patch. I've searched the
forums and tried:
C:\python30> patch -p0 < fileio_buffer.patch
but the patch command is not recognized.

Any help on implementing this patch, or advice on moving back to the
older version is appreciated.
Kind regards,
Thomas
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python 3.0 slow file IO

2009-02-06 Thread thomasvang...@gmail.com
Thanks a lot for all the responses. I'll move back to Python 2.5 for
compatibility with SciPy and some other third-party packages.
I'll leave the compilation process for some other day. For now I'm a
happy user; maybe in the future I would like to contribute to the
development process.


f.seek() unwanted output

2009-01-05 Thread thomasvang...@gmail.com
I'm having trouble with a script that prints the return value of
f.seek(), whereas the documentation says it has no return value:


file.seek(offset[, whence])

Set the file’s current position, like stdio’s fseek. The whence
argument is optional and defaults to os.SEEK_SET or 0 (absolute file
positioning); other values are os.SEEK_CUR or 1 (seek relative to the
current position) and os.SEEK_END or 2 (seek relative to the file’s
end). There is no return value.
--

I have a file in memory.
When I try f.seek(0), or any other offset, it prints that offset
(here 0) as output.

The following script illustrates my 'problem':
>>> for a in range(10):
...     f.seek(a)
...
0
1
2
3
4
5
6
7
8
9
>>>

I don't want Python to produce output when setting the file pointer.
Any help would be appreciated.
Kind regards,
Thomas
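
A note on the behaviour described above, as a minimal sketch: in Python 3, seek() returns the new absolute position (the documentation quoted in the post describes the Python 2 file object, whose seek() returns None), and the interactive prompt echoes any non-None value of an expression statement. Binding the result to a name keeps the prompt quiet. The io.StringIO object and the names `_` and `pos` are stand-ins chosen for illustration, not taken from the original script.

```python
import io

# Stand-in for the poster's in-memory file (an assumption for this sketch).
f = io.StringIO("ACGTACGTAC")

for a in range(10):
    # Binding the return value to a name means the interactive
    # prompt has no bare expression value to echo.
    _ = f.seek(a)

# In Python 3 the return value is the new absolute position.
pos = f.seek(3)
print(pos, f.tell())
```

Run as a script rather than typed at the prompt, a bare f.seek(a) prints nothing either, because only the interactive interpreter echoes expression values.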


Re: f.seek() unwanted output

2009-01-06 Thread thomasvang...@gmail.com
Hi Tim,
It works! Thanks a lot.
Thomas


Organize large DNA txt files

2009-03-20 Thread thomasvang...@gmail.com
Dear Fellow programmers,

I'm using Python scripts to organize some rather large datasets
describing DNA variation. Information is read, processed and written
to a file in sequential order, like this:
1+
1-
2+
2-

etc. The files that I created contain positional information
(nucleotide position) and some other info, like this:

file 1+:

1   73  0   1   0   0
1   76  1   0   0   0
1   77  0   1   0   0

file 1-

1   74  0   0   6   0
1   78  0   0   4   0
1   89  0   0   0   2

Now the trick is that I want this:

File 1+ AND File 1-

1   73  0   1   0   0
1   74  0   0   6   0
1   76  1   0   0   0
1   77  0   1   0   0
1   78  0   0   4   0
1   89  0   0   0   2
---

So the information should be sorted by position. Right now I've
written some very complicated scripts that read a number of lines from
file 1- and 1+ and then combine this output. The problem is of course
that the running number of file 1- can be lower than that of 1+,
resulting in an incorrect order. Since both files are too large to load
into a dictionary at once (both are 100 MB+), I need some sort of
alternative that can quickly sort everything without crashing my PC.

Your thoughts are appreciated..
Kind regards,
Thomas
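
One possible approach to the problem above, sketched under assumptions: if each file is already sorted by position on its own, the two can be merged lazily without loading either into memory. This sketch uses the standard library's heapq.merge with a key function (available from Python 3.5, so not in the versions discussed in this thread). It assumes the second whitespace-separated column is the nucleotide position; the helper names position_key and merge_sorted are hypothetical, and io.StringIO objects stand in for the real files.

```python
import heapq
import io


def position_key(line):
    """Sort key: the nucleotide position, assumed to be the second column."""
    return int(line.split()[1])


def merge_sorted(plus_lines, minus_lines):
    """Lazily merge two line iterables that are each already sorted by position."""
    return heapq.merge(plus_lines, minus_lines, key=position_key)


# Sample rows copied from the post; StringIO stands in for the real files.
plus = io.StringIO(
    "1   73  0   1   0   0\n"
    "1   76  1   0   0   0\n"
    "1   77  0   1   0   0\n"
)
minus = io.StringIO(
    "1   74  0   0   6   0\n"
    "1   78  0   0   4   0\n"
    "1   89  0   0   0   2\n"
)

merged = list(merge_sorted(plus, minus))
for line in merged:
    print(line, end="")
```

With real files, the two open file objects can be passed in directly and each merged line written straight to the output file, so only one line per input is held in memory at a time.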




Re: get rid of duplicate elements in list without set

2009-03-20 Thread thomasvang...@gmail.com
You could use:
B = sorted(set(A))
Hope that helps.
T
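
A small sketch of de-duplicating a list, with and without set. Note that list.sort() sorts in place and returns None, so sorted() is used here to get a new list. The sample list A and the names B and C are made up for illustration.

```python
A = [3, 1, 2, 3, 1]

# With set: the original order is lost, so sort explicitly.
# sorted() returns a new list, unlike list.sort().
B = sorted(set(A))

# Without set: keep the first occurrence of each element, preserving order.
# The membership test on a list is linear, so this is quadratic overall.
C = []
for item in A:
    if item not in C:
        C.append(item)

print(B)  # [1, 2, 3]
print(C)  # [3, 1, 2]
```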


Re: Organize large DNA txt files

2009-03-20 Thread thomasvang...@gmail.com
Thanks,
This works great!
I did not know that it is possible to iterate through the file lines
with a while loop that's conditional on additional lines being
present or not.
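
The reply being thanked is not included here, so the following is only a guess at the kind of loop meant: a while loop that keeps going as long as readline() still returns a line (it returns an empty string at end of file). The function names position and merge_files are hypothetical, the position is assumed to be the second column, and the inputs are assumed to contain no blank lines.

```python
import io


def position(line):
    """Nucleotide position, assumed to be the second column."""
    return int(line.split()[1])


def merge_files(plus, minus, out):
    """Merge two position-sorted files line by line into out."""
    a = plus.readline()
    b = minus.readline()
    # readline() returns '' at end of file, which is falsy.
    while a and b:
        if position(a) <= position(b):
            out.write(a)
            a = plus.readline()
        else:
            out.write(b)
            b = minus.readline()
    # One input is exhausted; copy whatever remains of the other.
    while a:
        out.write(a)
        a = plus.readline()
    while b:
        out.write(b)
        b = minus.readline()


# StringIO objects stand in for the real files.
plus = io.StringIO("1 73 0 1 0 0\n1 76 1 0 0 0\n1 77 0 1 0 0\n")
minus = io.StringIO("1 74 0 0 6 0\n1 78 0 0 4 0\n1 89 0 0 0 2\n")
out = io.StringIO()
merge_files(plus, minus, out)
print(out.getvalue(), end="")
```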