[issue1141] reading large files

2007-09-17 Thread Sean Reifschneider
Sean Reifschneider added the comment: I'm closing this because the slow I/O issue is known and expected to be resolved as part of the Python 3.0 development. The Windows problems with missing lines should be opened as a separate issue. -- nosy: +jafo resolution: - duplicate status:

[issue1141] reading large files

2007-09-11 Thread Ben Beasley
Ben Beasley added the comment: I ran Richard Christen's script from msg55784 on Ubuntu Feisty Fawn (64-bit) with both Python 2.5.1 and Python 3.0a1 (for the latter, I had to change xrange to range). (2, 5, 1, 'final', 0) 2007-09-11 11:39:08 (500, 7.3925600051879883) (1000,

[issue1141] reading large files

2007-09-11 Thread Ben Beasley
Ben Beasley added the comment: See the BDFL's comment in msg55828. I know Py3k text I/O is very slow; it's written in Python and uses UTF-8 as the default encoding. We've got a summer of code student working on accelerating this. (And if he doesn't finish we have another year to work on it

[issue1141] reading large files

2007-09-10 Thread christen
New submission from christen: September 11, 2007. I downloaded Py3k. The good news: under Windows, Python 3k properly reads files larger than 4 GB (in contrast to Python 2.5, which skips some lines; see below). The bad news: Py3k is very slow compared to Python 2.5; see the results below the code

[issue1141] reading large files

2007-09-10 Thread Martin v. Löwis
Martin v. Löwis added the comment: If you would like to help resolve the issue with the missing lines, please submit a separate report for that. It is very difficult to track unrelated bugs in a single tracker issue. It would help if you could determine which lines are missing, e.g. by writing

[issue1141] reading large files

2007-09-10 Thread christen
christen added the comment: Hi Martin, I could certainly do that, but how would you get my huge files? 5 Go of data is quite big... If you want to compute runtimes, it is better to not convert them to local time. Instead, use the pattern start = time.time() ... print time.time()-start #
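The timing pattern quoted in this message (take `time.time()` before the work, subtract afterwards) can be sketched as below. This is only an illustration, not code from the thread: the helper name `timed` and the summed range are assumptions made for the example, and `print` is written as a function so the sketch runs on Python 3.

```python
import time


def timed(func, *args):
    """Call func(*args) once and return (result, elapsed_seconds).

    `timed` is a hypothetical helper for this sketch; the thread only
    shows the bare start/stop pattern.
    """
    start = time.time()            # wall-clock seconds, as a float
    result = func(*args)
    elapsed = time.time() - start  # a duration, never converted to local time
    return result, elapsed


# Stand-in workload; the thread measures reading a large file instead.
result, elapsed = timed(sum, range(1000000))
print("sum=%d elapsed=%.4fs" % (result, elapsed))
```

On current Python versions `time.perf_counter()` is usually a better clock for measuring intervals; `time.time()` is kept here because it is what the message uses.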

[issue1141] reading large files

2007-09-10 Thread Martin v. Löwis
Martin v. Löwis added the comment: "I could certainly do that, but how you get my huge files ? 5 Go of data is quite big..." [not sure what that is] I did not mean to suggest that you attach such a large file. Instead, just report that as a separate bug report, and be prepared to answer

[issue1141] reading large files

2007-09-10 Thread Stefan Sonnenberg-Carstens
Stefan Sonnenberg-Carstens added the comment: Perhaps this is an issue of line separation? Could you provide the output of wc -l on a *NIX box? And could you try with this code: import sys print(sys.version_info) import time print (time.localtime())

[issue1141] reading large files

2007-09-10 Thread Stefan Sonnenberg-Carstens
Stefan Sonnenberg-Carstens added the comment: Sorry, this way: import sys print(sys.version_info) import time print (time.strftime('%Y-%m-%d %H:%M:%S')) fichin=open(r'D:\pythons\16s\total_gb_161_16S.gb') start = time.time() for i,li in enumerate(fichin): if i%100==0 and i>0:
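The snippet above is cut off after the `if`, so what the loop body does is not shown. A self-contained sketch of the same idea (enumerate the lines and record the elapsed time at a fixed interval) might look like this. The function name `read_with_progress`, the `every` parameter, and the small temporary file are assumptions for illustration. The sketch also assumes the condition is meant to skip line 0.

```python
import os
import tempfile
import time


def read_with_progress(path, every=100):
    """Read path line by line; return [(line_index, elapsed_seconds), ...].

    A checkpoint is recorded whenever the line index is a positive
    multiple of `every` (hypothetical parameter, not from the thread).
    """
    checkpoints = []
    start = time.time()
    with open(path) as fichin:
        for i, li in enumerate(fichin):
            if i % every == 0 and i > 0:
                checkpoints.append((i, time.time() - start))
    return checkpoints


# Small stand-in for the multi-gigabyte file discussed in the thread.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for n in range(1000):
        tmp.write(str(n) + "\n")
    sample_path = tmp.name

checkpoints = read_with_progress(sample_path, every=250)
os.remove(sample_path)
print(checkpoints)
```

Returning the checkpoints instead of printing inside the loop keeps the timing free of console I/O, which can otherwise dominate on small intervals.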

[issue1141] reading large files

2007-09-10 Thread christen
christen added the comment: Hi Stefan, calculations are underway. Both read and write do not work well with Py3k. You can try the code below on your own machine: fichout.write(str(i)+' '*59+'\n') #generates a big file fichout.write(str(i)+'\n') #generate file 4Go the big file is not
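Because the generator lines in this message write the line index at the start of every line, a file produced that way can be checked for skipped lines by comparing each number read with the one expected. The sketch below is an assumption about how such a check could be done, not the script from the thread. `write_numbered` and `find_gaps` are hypothetical names, and the files are kept tiny so the example runs quickly.

```python
import os
import tempfile


def write_numbered(path, numbers, pad=0):
    """Write one number per line, optionally followed by `pad` spaces."""
    with open(path, "w") as fichout:
        for i in numbers:
            fichout.write(str(i) + " " * pad + "\n")


def find_gaps(path):
    """Return (gaps, next_expected).

    Each gap is (first_missing_number, next_number_actually_read).
    Only one counter is held in memory, so file size does not matter.
    """
    gaps = []
    expected = 0
    with open(path) as fichin:
        for li in fichin:
            value = int(li.split()[0])   # leading number on the line
            if value != expected:
                gaps.append((expected, value))
            expected = value + 1
    return gaps, expected


workdir = tempfile.mkdtemp()
full_path = os.path.join(workdir, "full.txt")
holed_path = os.path.join(workdir, "holed.txt")

write_numbered(full_path, range(1000), pad=59)   # padded lines, as in the thread
write_numbered(holed_path, [0, 1, 2, 5, 6])      # simulates lines 3 and 4 being skipped

full_result = find_gaps(full_path)
holed_result = find_gaps(holed_path)
print(full_result)
print(holed_result)

os.remove(full_path)
os.remove(holed_path)
os.rmdir(workdir)
```

Run against a file written on one platform and read on another, the reported gaps would show which lines were skipped without anyone having to share the multi-gigabyte file itself.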