On Wed, 2 Jun 2010 07:40:33 am Colin Talbert wrote:

> I am also experiencing this same problem. (Also on a OSM bz2
> file). It appears to be working but then partway through reading a
> file it simple ends. I did track down that file length is always
> 900000 so it appears to be related to some sort of buffer constraint.

Without seeing your text file, and the code you use to read the text
file, there's no way of telling what is going on, but I can guess the
most likely causes:

(1) Your text file is actually only 900,000 bytes long, and so there's
    no problem at all.

(2) There's a bug in your code so that you stop reading after 900,000
    bytes.

(3) You're on Windows, and the text file contains an End-Of-File
    character ^Z after 900,000 bytes, and Windows supports that for
    backward compatibility with DOS.

And a distant (VERY distant) number 4, there's a bug in the
implementation of read() in Python which somehow nobody has noticed
before now.

As for your second issue, reading bz2 files:

> import bz2
>
> input_file = bz2.BZ2File(r"C:\temp\planet-latest.osm.bz2","r")

You're opening a binary file in text mode. I'm pretty sure that is not
going to work well. Try passing 'rb' as the mode instead.

> try:
>     all_data = input_file.read()
>     print str(len(all_data))

You don't need to call str() before calling print. print is perfectly
happy to operate on integers:

    print len(all_data)

will work.

-- 
Steven D'Aprano
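
One way to check causes (1) and (2) above is to count the decompressed bytes in fixed-size chunks rather than with a single read() call. The following is only a minimal sketch, written for Python 3 (so print is a function, unlike in the quoted code). The function name decompressed_length and the chunk_size parameter are hypothetical names chosen for illustration; they are not part of the bz2 module. In Python 3, bz2.BZ2File treats 'r' and 'rb' as equivalent, so the explicit 'rb' is there for clarity only.

```python
import bz2


def decompressed_length(path, chunk_size=1024 * 1024):
    """Return the number of decompressed bytes in the bz2 file at `path`.

    Hypothetical helper for illustration; reads in chunks so the whole
    file never has to fit in memory at once.
    """
    total = 0
    # 'rb' makes the binary mode explicit.
    with bz2.BZ2File(path, "rb") as input_file:
        while True:
            chunk = input_file.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            total += len(chunk)
    return total
```

Calling print(decompressed_length(r"C:\temp\planet-latest.osm.bz2")) and comparing the result with len(all_data) from the single read() would show whether the two ways of reading agree on the file length.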