Sean Reifschneider added the comment:
I'm closing this because the slow I/O issue is known and expected to be
resolved as part of the Python 3.0 development. The Windows problems
with missing lines should be opened as a separate issue.
--
nosy: +jafo
resolution: - duplicate
status:
Ben Beasley added the comment:
I ran Richard Christen's script from msg55784 on Ubuntu Feisty Fawn
(64-bit) with both Python 2.5.1 and Python 3.0a1 (for the latter, I had
to change xrange to range).
(2, 5, 1, 'final', 0)
2007-09-11 11:39:08
(500, 7.3925600051879883)
(1000,
Ben Beasley added the comment:
See the BDFL's comment in msg55828. I know Py3k text I/O is very slow;
it's written in Python and uses UTF-8
as the default encoding. We've got a Summer of Code student working on
accelerating this. (And if he doesn't finish we have another year to
work on it
New submission from christen:
On September 11, 2007, I downloaded Python 3k.
The good news:
Under Windows, Python 3k properly reads files larger than 4 GB (in
contrast to Python 2.5, which skips some lines; see below).
The bad news: Python 3k is very slow compared to Python 2.5; see the results below.
the code
Martin v. Löwis added the comment:
If you would like to help resolve the issue with the missing lines,
please submit a separate report for that. It is very difficult to track
unrelated bugs in a single tracker issue. It would help if you could
determine which lines are missing, e.g. by writing
christen added the comment:
Hi Martin
I could certainly do that, but how you get my huge files ? 5 Go of data
is quite big...
If you want to compute runtimes, it is better to not convert them to
local time. Instead, use the pattern
start = time.time()
...
print time.time()-start #
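A minimal sketch of that elapsed-time pattern in Python 3 syntax, applied to one pass over a text file. The helper name `time_line_read` is hypothetical and does not come from the thread; it only illustrates measuring a duration as a difference of two `time.time()` readings instead of printing local timestamps.

```python
import time


def time_line_read(path):
    """Return (line_count, elapsed_seconds) for one pass over a text file."""
    start = time.time()  # seconds since the epoch, taken before the work
    count = 0
    with open(path) as handle:
        for _ in handle:
            count += 1
    # The difference of two readings is the runtime in seconds.
    return count, time.time() - start
```

Calling `time_line_read(some_path)` under each interpreter gives two directly comparable numbers.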
Martin v. Löwis added the comment:
> I could certainly do that, but how you get my huge files ? 5 Go of data
> is quite big...
[not sure what that is] I did not mean to suggest that you attach such
a large file. Instead, just report that as a separate bug report, and be
prepared to answer
Stefan Sonnenberg-Carstens added the comment:
Perhaps this is an issue of line separation ?
Could you provide the output of wc -l on a *NIX box ?
And, could you try with this code:
import sys
print(sys.version_info)
import time
print (time.localtime())
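If no *NIX box is at hand, the line-separation theory could also be checked by counting terminator bytes in binary mode. This is only a sketch under that assumption: the function name `count_newline_bytes` and the chunk size are illustrative, not something proposed in the thread. The first number it returns is the count of `\n` bytes, which is what `wc -l` reports.

```python
def count_newline_bytes(path, chunk_size=1024 * 1024):
    """Return (lf_count, cr_count) for a file read as raw bytes."""
    lf = cr = 0
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte file never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            lf += chunk.count(b"\n")
            cr += chunk.count(b"\r")
    return lf, cr
```

A file with only Unix line endings gives `cr == 0`; a pure Windows file gives `cr == lf`; anything else suggests mixed or stray terminators.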
Stefan Sonnenberg-Carstens added the comment:
Sorry, this way:
import sys
print(sys.version_info)
import time
print (time.strftime('%Y-%m-%d %H:%M:%S'))
fichin=open(r'D:\pythons\16s\total_gb_161_16S.gb')
start = time.time()
for i,li in enumerate(fichin):
    if i%100==0 and i>0:
christen added the comment:
Hi Stefan
Calculations are underway
Both read and write do not work well with Py3k.
You can try the code below on your own machine:
fichout.write(str(i)+' '*59+'\n') #generates a big file
fichout.write(str(i)+'\n') #generate file 4Go
the big file is not
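For anyone wanting to reproduce this without the original data, here is a rough sketch built around the same numbered-line idea: write lines that start with their own index, then read them back and report the first index that is out of sequence. The function names are hypothetical and the 59-space padding simply mirrors the `write` call above; this is not the script from the thread.

```python
def write_numbered_lines(path, n_lines, pad=59):
    """Write n_lines lines, each starting with its own zero-based index."""
    with open(path, "w", newline="\n") as out:
        for i in range(n_lines):
            out.write(str(i) + " " * pad + "\n")


def first_out_of_sequence(path):
    """Return the first expected index that is not found, or None."""
    with open(path, newline="\n") as handle:
        for expected, line in enumerate(handle):
            # int() ignores the padding spaces and the trailing newline.
            if int(line) != expected:
                return expected
    return None
```

A skipped line shows up as the first index where the number read differs from the number expected. Lines missing only from the very end of the file would need a separate check of the total count.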