Colin Talbert wrote:
<snip>
You are correct. I had been trying numerous things to read this file, and I had deleted the code that I meant to paste here and rewrote it from memory incorrectly. The code I wrote should have been:

import bz2
input_file = bz2.BZ2File(r'C:\temp\planet-latest.osm.bz2', 'rb')
data = input_file.read()    # read everything BZ2File will decompress
len(data)

That call indeed returns only 900000.

The same number comes back if you sum the lengths of all the lines while iterating over the file:

import bz2

input_file = bz2.BZ2File(r'C:\temp\planet-latest.osm.bz2', 'rb')
total_length = 0
for line in input_file:         # iterate over the decompressed lines
    total_length += len(line)

print total_length

<snip>
It seems to me that for such a large file you'd have to use bz2.BZ2Decompressor. I have no experience with it, but its purpose is sequential decompression: decompressing data that is not all available in memory at the same time.
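
I haven't used it myself, but a minimal sketch based on the documented interface might look something like the following (the 1 MB chunk size, and starting a fresh decompressor whenever one bzip2 stream ends in case the file contains more than one stream, are assumptions on my part, not something stated anywhere in this thread):

import bz2

total = 0
input_file = open(r'C:\temp\planet-latest.osm.bz2', 'rb')
decomp = bz2.BZ2Decompressor()
data = input_file.read(1024 * 1024)          # one chunk of compressed data
while data:
    try:
        total += len(decomp.decompress(data))
    except EOFError:
        # The previous bzip2 stream ended exactly at a chunk boundary,
        # so this chunk begins a new stream.
        decomp = bz2.BZ2Decompressor()
        total += len(decomp.decompress(data))
    if decomp.unused_data:
        # A stream ended partway through this chunk; the leftover bytes
        # belong to the next stream, so hand them to a fresh decompressor.
        data = decomp.unused_data
        decomp = bz2.BZ2Decompressor()
    else:
        data = input_file.read(1024 * 1024)
input_file.close()

print total

The point of doing it this way is that only one chunk of compressed data is in memory at any one time, which matters for a file the size of planet-latest.osm.bz2.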

DaveA
