Re: writing large files quickly
Bengt Richter wrote:
>>> How the heck does that make a 400 MB file that fast? It literally
>>> takes a second or two while every other solution takes at least
>>> 2 - 5 minutes. Awesome... thanks for the tip!!!
>>
>> Because it isn't really writing the zeros. You can make these files
>> all day long and not run out of disk space, because this kind of file
>> doesn't take very many blocks. The blocks that were never written are
>> virtual blocks, inasmuch as read() at that location will cause the
>> filesystem to return a block of NULs.
>
> I wonder if it will also write virtual blocks when it gets real zero
> blocks to write from a user, or even with file system copy utils?

I've seen this behaviour on big iron Unix systems, in a benchmark that
repeatedly copied data from a memory-mapped section to an output file.
But for the general case, I doubt that adding "is this block all
zeros?" or "does this block match something we recently wrote to
disk?" checks will speed things up, on average...

/F
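Whether a given file actually ended up sparse can be checked from
Python by comparing the file's logical size with the blocks the
filesystem allocated for it. A minimal sketch of such an experiment,
assuming a POSIX system where st_blocks counts 512-byte units (the
filenames are made up for illustration):

    import os

    def allocated(path):
        # On POSIX systems st_blocks is in 512-byte units, regardless
        # of the filesystem's own block size.
        return os.stat(path).st_blocks * 512

    # One megabyte of zeros written explicitly...
    f = file('written.bin', 'wb')
    f.write('\x00' * (1024 * 1024))
    f.close()

    # ...and the same logical size created by seeking past the end.
    f = file('seeked.bin', 'wb')
    f.seek(1024 * 1024 - 1)
    f.write('\x00')
    f.close()

    print 'written.bin:', allocated('written.bin'), 'bytes allocated'
    print 'seeked.bin: ', allocated('seeked.bin'), 'bytes allocated'

On a filesystem that detects neither case, both numbers come out around
a megabyte; on ext2 the seeked file should report only a few kilobytes.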
Re: writing large files quickly
Grant Edwards [EMAIL PROTECTED] writes:
> $ dd if=/dev/zero of=zeros bs=64k count=1024
> 1024+0 records in
> 1024+0 records out
> $ ls -l zeros
> -rw-r--r-- 1 grante users 67108864 Jan 28 14:49 zeros
> $ du -h zeros
> 65M     zeros
>
> In my book that's 64MB not 65MB, but that's an argument for another day.

You should be aware that the size that 'du' and 'ls -s' report includes
any indirect blocks needed to keep track of the data blocks of the
file. Thus, you get the amount of space that the file actually uses in
the file system, and that would become free if you removed it. That's
why it is larger than 64 Mbyte. And 'du' (at least GNU du) rounds
upwards when you use -h.

Try for instance:

    $ dd if=/dev/zero of=zeros bs=4k count=16367
    16367+0 records in
    16367+0 records out
    $ ls -ls zeros
    65536 -rw-rw-r-- 1 bellman bellman 67039232 Jan 29 13:57 zeros
    $ du -h zeros
    64M     zeros

    $ dd if=/dev/zero of=zeros bs=4k count=16368
    16368+0 records in
    16368+0 records out
    $ ls -ls zeros
    65540 -rw-rw-r-- 1 bellman bellman 67043328 Jan 29 13:58 zeros
    $ du -h zeros
    65M     zeros

(You can infer from the above that my file system has a block size of
4 Kbyte.)

--
Thomas Bellman, Lysator Computer Club, Linköping University, Sweden
bellman @ lysator.liu.se
"There are many causes worth dying for, but none worth killing for."
-- Gandhi
"Make Love -- Nicht Wahr!"
Re: writing large files quickly
Donn wrote:
>> How the heck does that make a 400 MB file that fast? It literally
>> takes a second or two while every other solution takes at least
>> 2 - 5 minutes. Awesome... thanks for the tip!!!
>
> Because it isn't really writing the zeros. You can make these files
> all day long and not run out of disk space, because this kind of file
> doesn't take very many blocks. The blocks that were never written are
> virtual blocks, inasmuch as read() at that location will cause the
> filesystem to return a block of NULs.

Under which operating system/file system? As far as I know this should
be file system dependent, at least under Linux, as the calls to open
and seek are served by the file system driver.

Jens
Re: writing large files quickly
Ivan wrote:
> Steven D'Aprano wrote:
>> Isn't this a file system specific solution though? Won't your file
>> system need to have support for sparse files, or else it won't work?
>
> Yes, but AFAIK the only modern (meaning: in wide use today) file
> system that doesn't have this support is FAT/FAT32.

I don't think ext2fs does this either. At least the du and df commands
tell something different.

Actually I'm not sure what this optimisation should give you anyway.
The only circumstance under which files with only zeroes are meaningful
is testing, and that's exactly when you don't want that optimisation.

On compressing filesystems such as NTFS you will get this behaviour as
a special case of compression, and compression makes more sense.

Jens
Re: writing large files quickly
Donn wrote:
> Because it isn't really writing the zeros. You can make these files
> all day long and not run out of disk space, because this kind of file
> doesn't take very many blocks. The blocks that were never written are
> virtual blocks, inasmuch as read() at that location will cause the
> filesystem to return a block of NULs.

Are you sure that's not just a case of asynchronous writing that can be
done in a particularly efficient way? df quite clearly tells me that
I'm running out of disk space on my ext2fs Linux when I dump it full of
zeroes.

Jens
Re: writing large files quickly
Jens Theisen wrote:
> Ivan wrote:
>> Yes, but AFAIK the only modern (meaning: in wide use today) file
>> system that doesn't have this support is FAT/FAT32.
>
> I don't think ext2fs does this either. At least the du and df
> commands tell something different.

ext2 is a reimplementation of BSD UFS, so it does. Here:

    f = file('bigfile', 'w')
    f.seek(1024*1024)
    f.write('a')

    $ l bigfile
    -rw-r--r-- 1 ivoras wheel 1048577 Jan 28 14:57 bigfile
    $ du bigfile
    8       bigfile

> Actually I'm not sure what this optimisation should give you anyway.
> The only circumstance under which files with only zeroes are
> meaningful is testing, and that's exactly when you don't want that
> optimisation.

I read somewhere that it has a use in database software, but the only
thing I can imagine for this is when using heap queues
(http://python.active-venture.com/lib/node162.html).
Re: writing large files quickly
Ivan wrote:
> ext2 is a reimplementation of BSD UFS, so it does. Here:
>
>     f = file('bigfile', 'w')
>     f.seek(1024*1024)
>     f.write('a')
>
>     $ l bigfile
>     -rw-r--r-- 1 ivoras wheel 1048577 Jan 28 14:57 bigfile
>     $ du bigfile
>     8       bigfile

Interesting:

    $ cp bigfile bigfile2
    $ cat bigfile > bigfile3
    $ du bigfile*
    8       bigfile2
    1032    bigfile3

So it's not consuming the 0's; it just doesn't store unwritten data.
And I can think of an application for that: an application might want
to write the beginning of a file at a later point, so this makes it
more efficient.

I wonder how other file systems behave.

> I read somewhere that it has a use in database software, but the only
> thing I can imagine for this is when using heap queues
> (http://python.active-venture.com/lib/node162.html).

That's an article about the heap data structure. Was it your intention
to link this?

Jens
Re: writing large files quickly
Jens Theisen wrote:
> Ivan wrote:
>> I read somewhere that it has a use in database software, but the
>> only thing I can imagine for this is when using heap queues
>> (http://python.active-venture.com/lib/node162.html).

I've used this feature eons ago where the file was essentially a single
large address space (memory mapped array) that was expected to never
fill all that full. I was tracking data from a year of (thousands? of)
students seeing Drill-and-practice questions from a huge database of
questions. The research criticism we got was that our analysis did not
rule out any kid seeing the same question more than once, and getting
practice that would improve performance w/o learning. I built a
bit-filter and copied tapes dropping any repeats seen by students. We
then just ran the same analysis we had on the raw data, and found no
significant difference.

The nice thing is that the file size grew over time, so (for a while) I
could run on the machine with other users. By the last block of tapes I
was sitting alone in the machine room at 3:00 AM on Sat mornings afraid
to so much as fire up an editor.

--
-Scott David Daniels
[EMAIL PROTECTED]
Re: writing large files quickly
[Jens Theisen]
> ...
> Actually I'm not sure what this optimisation should give you anyway.
> The only circumstance under which files with only zeroes are
> meaningful is testing, and that's exactly when you don't want that
> optimisation.

In most cases, a guarantee that reading uninitialized file data will
return zeroes is a security promise, not an optimization. C doesn't
require this behavior, but POSIX does.

On FAT/FAT32, if you create a file, seek to a large offset, write a
byte, then read the uninitialized data from offset 0 up to the byte
just written, you get back whatever happened to be sitting on disk at
the locations now reserved for the file. That can include passwords,
other peoples' email, etc -- anything whatsoever that may have been
written to disk at some time in the disk's history. Security weenies
get upset at stuff like that ;-)
Re: writing large files quickly
Jens Theisen wrote:
>     $ cp bigfile bigfile2
>     $ cat bigfile > bigfile3
>     $ du bigfile*
>     8       bigfile2
>     1032    bigfile3
>
> So it's not consuming the 0's; it just doesn't store unwritten data.
> And I

Very possibly cp understands sparse files and cat (doing what it's
meant to do) doesn't :)

>> I read somewhere that it has a use in database software, but the
>> only thing I can imagine for this is when using heap queues
>> (http://python.active-venture.com/lib/node162.html).
>
> That's an article about the heap data structure. Was it your
> intention to link this?

Yes. The idea is that in implementing such a structure, in which each
level is 2^x wide (x = level of the structure, and it's dependent on
the number of entries the structure must hold), most of the blocks
could exist and never be written to (i.e. they'd be empty). Using
sparse files would save space :)

(It has nothing to do with Python; I remembered the article so I linked
to it. The sparse-file issue is useful only when implementing heaps
directly on file or in an mmaped file.)
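For the heap-on-file case, the usual trick is to seek the file out to
its maximum logical size and then mmap it, so only the pages actually
touched ever get disk blocks. A rough sketch of the idea (the size and
filename are invented for illustration; it assumes a filesystem with
sparse-file support and enough address space for the mapping):

    import mmap

    SIZE = 256 * 1024 * 1024              # 256 MB of logical address space

    # Create a sparse file of the full logical size: one seek, one byte.
    f = open('heap.bin', 'w+b')
    f.seek(SIZE - 1)
    f.write('\x00')
    f.flush()

    # Map the whole file. Pages that are never written stay holes on disk.
    m = mmap.mmap(f.fileno(), SIZE)
    m[123456789] = 'x'                    # touching one spot allocates one block
    m.close()
    f.close()

Whether the untouched pages really stay unallocated is, as discussed
above, up to the filesystem.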
Re: writing large files quickly
On Fri, 27 Jan 2006 12:30:49 -0800, Donn Cave [EMAIL PROTECTED] wrote:
> In article [EMAIL PROTECTED], rbt [EMAIL PROTECTED] wrote:
>> Won't work!? It's absolutely fabulous! I just need something big,
>> quick and zeros work great. How the heck does that make a 400 MB
>> file that fast? It literally takes a second or two while every other
>> solution takes at least 2 - 5 minutes. Awesome... thanks for the
>> tip!!!
>
> Because it isn't really writing the zeros. You can make these files
> all day long and not run out of disk space, because this kind of file
> doesn't take very many blocks. The blocks that were never written are
> virtual blocks, inasmuch as read() at that location will cause the
> filesystem to return a block of NULs.

I wonder if it will also write virtual blocks when it gets real zero
blocks to write from a user, or even with file system copy utils?

Regards,
Bengt Richter
writing large files quickly
I've been doing some file system benchmarking. In the process, I need
to create a large file to copy around to various drives. I'm creating
the file like this:

    fd = file('large_file.bin', 'wb')
    for x in xrange(409600000):
        fd.write('0')
    fd.close()

This takes a few minutes to do. How can I speed up the process?

Thanks!
Re: writing large files quickly
One way to speed this up is to write larger strings:

    fd = file('large_file.bin', 'wb')
    for x in xrange(5120):
        fd.write('0' * 80000)
    fd.close()

However, I bet within an hour or so you will have a much better answer
or 10. =)
Re: writing large files quickly
rbt wrote:
> I've been doing some file system benchmarking. In the process, I need
> to create a large file to copy around to various drives. I'm creating
> the file like this:
>
>     fd = file('large_file.bin', 'wb')
>     for x in xrange(409600000):
>         fd.write('0')
>     fd.close()
>
> This takes a few minutes to do. How can I speed up the process?
> Thanks!

Untested, but this should be faster.

    block = '0' * 409600
    fd = file('large_file.bin', 'wb')
    for x in range(1000):
        fd.write('0')
    fd.close()
Re: writing large files quickly
> Untested, but this should be faster.
>
>     block = '0' * 409600
>     fd = file('large_file.bin', 'wb')
>     for x in range(1000):
>         fd.write('0')
>     fd.close()

Just checking...you mean fd.write(block), right? :) Otherwise, you end
up with just 1000 "0" characters in your file :)

Is there anything preventing one from just doing the following?

    fd.write('0' * 409600000)

It's one huge string for a very short time. It skips all the looping
and allows Python to pump the file out to the disk as fast as the OS
can handle it. (and sorta as fast as Python can generate this humongous
string)

-tkc
Re: writing large files quickly
Tim Chase [EMAIL PROTECTED] writes:
> Is there anything preventing one from just doing the following?
>
>     fd.write('0' * 409600000)
>
> It's one huge string for a very short time. It skips all the looping
> and allows Python to pump the file out to the disk as fast as the OS
> can handle it. (and sorta as fast as Python can generate this
> humongous string)

That's large enough that it might exceed your PC's memory and cause
swapping. Try strings of about 64k (65536).
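Putting that advice together, the loop might look something like this
(a sketch, not code from the thread; 6250 * 65536 happens to be exactly
409,600,000, so no remainder chunk is needed):

    chunk = '0' * 65536                    # 64 KB of ASCII zeros
    fd = file('large_file.bin', 'wb')
    for x in xrange(409600000 // 65536):   # 6250 full chunks
        fd.write(chunk)
    fd.close()

This writes every byte for real, so unlike the seek() trick below it
behaves the same on any filesystem.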
Re: writing large files quickly
On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
> I've been doing some file system benchmarking. In the process, I need
> to create a large file to copy around to various drives. I'm creating
> the file like this:
>
>     fd = file('large_file.bin', 'wb')
>     for x in xrange(409600000):
>         fd.write('0')
>     fd.close()
>
> This takes a few minutes to do. How can I speed up the process?

Don't write so much data.

    f = file('large_file.bin','wb')
    f.seek(409600000-1)
    f.write('\x00')
    f.close()

That should be almost instantaneous in that the time required for those
4 lines of code is negligible compared to interpreter startup and
shutdown.

--
Grant Edwards   grante at visi.com
Yow! These PRESERVES should be FORCE-FED to PENTAGON OFFICIALS!!
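A variation on the same trick, for what it's worth: on POSIX systems,
extending a file with truncate() has the same effect as seeking past
the end and writing, i.e. the new bytes read back as zeros and a
capable filesystem can leave them as a hole. A sketch under that
assumption (not something posted in the thread):

    f = file('large_file.bin', 'wb')
    f.truncate(409600000)   # extend to ~400 MB; the new bytes read as NULs
    f.close()

Python's docs call extending via truncate() platform-dependent, so the
seek() version above is the more portable spelling.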
Re: writing large files quickly
Oops. I did mean fd.write(block)

The only limit is available memory. I've used 1MB block sizes when I
did read/write tests. I was comparing NFS vs. local disk performance. I
know Python can do at least 100MB/sec.
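Measuring that sort of throughput is straightforward; a minimal sketch
along these lines (filename and sizes invented, and the result includes
OS write-caching, so treat the number accordingly):

    import time

    block = '0' * (1024 * 1024)        # 1 MB block
    fd = file('large_file.bin', 'wb')
    t0 = time.time()
    for x in xrange(400):              # 400 MB total
        fd.write(block)
    fd.close()
    elapsed = time.time() - t0
    print 'wrote 400 MB in %.2f seconds (%.1f MB/s)' % (elapsed, 400 / elapsed)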
Re: writing large files quickly
>> fd.write('0')
> [cut]
>     f = file('large_file.bin','wb')
>     f.seek(409600000-1)
>     f.write('\x00')

While a mindblowingly simple/elegant/fast solution (kudos!), the OP's
file ends up full of the character zero (ASCII 0x30), while your
solution ends up full of the NUL character (ASCII 0x00):

    [EMAIL PROTECTED]:~/temp$ xxd op.bin
    0000000: 3030 3030 3030 3030 3030                 0000000000
    [EMAIL PROTECTED]:~/temp$ xxd new.bin
    0000000: 0000 0000 0000 0000 0000                 ..........

(using only length 10 instead of 400 megs to save time and disk
space...)

-tkc
Re: writing large files quickly
On 2006-01-27, Tim Chase [EMAIL PROTECTED] wrote:
>>> fd.write('0')
>> [cut]
>>     f = file('large_file.bin','wb')
>>     f.seek(409600000-1)
>>     f.write('\x00')
>
> While a mindblowingly simple/elegant/fast solution (kudos!), the OP's
> file ends up full of the character zero (ASCII 0x30), while your
> solution ends up full of the NUL character (ASCII 0x00):

Oops. I missed the fact that he was writing 0x30 and not 0x00. Yes, the
"hole" in the file will read as 0x00 bytes. If the OP actually requires
that the file contain something other than 0x00 bytes, then my solution
won't work.

--
Grant Edwards   grante at visi.com
Yow! I want the presidency so bad I can already taste the hors
d'oeuvres.
Re: writing large files quickly
Grant Edwards wrote:
> On 2006-01-27, Tim Chase [EMAIL PROTECTED] wrote:
>>>> fd.write('0')
>>> [cut]
>>>     f = file('large_file.bin','wb')
>>>     f.seek(409600000-1)
>>>     f.write('\x00')
>>
>> While a mindblowingly simple/elegant/fast solution (kudos!), the
>> OP's file ends up full of the character zero (ASCII 0x30), while
>> your solution ends up full of the NUL character (ASCII 0x00):
>
> Oops. I missed the fact that he was writing 0x30 and not 0x00. Yes,
> the hole in the file will read as 0x00 bytes. If the OP actually
> requires that the file contain something other than 0x00 bytes, then
> my solution won't work.

Won't work!? It's absolutely fabulous! I just need something big, quick
and zeros work great. How the heck does that make a 400 MB file that
fast? It literally takes a second or two while every other solution
takes at least 2 - 5 minutes. Awesome... thanks for the tip!!!

Thanks to all for the advice... one can really learn things here :)
Re: writing large files quickly
Grant Edwards wrote:
> On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
>> I've been doing some file system benchmarking. In the process, I
>> need to create a large file to copy around to various drives. I'm
>> creating the file like this:
>>
>>     fd = file('large_file.bin', 'wb')
>>     for x in xrange(409600000):
>>         fd.write('0')
>>     fd.close()
>>
>> This takes a few minutes to do. How can I speed up the process?
>
> Don't write so much data.
>
>     f = file('large_file.bin','wb')
>     f.seek(409600000-1)
>     f.write('\x00')
>     f.close()

OK, I'm still trying to pick my jaw up off of the floor. One
question... how big of a file could this method create? 20GB, 30GB,
limit depends on filesystem, etc?

> That should be almost instantaneous in that the time required for
> those 4 lines of code is negligible compared to interpreter startup
> and shutdown.
Re: writing large files quickly
In article [EMAIL PROTECTED], rbt [EMAIL PROTECTED] wrote:
> Won't work!? It's absolutely fabulous! I just need something big,
> quick and zeros work great. How the heck does that make a 400 MB file
> that fast? It literally takes a second or two while every other
> solution takes at least 2 - 5 minutes. Awesome... thanks for the
> tip!!!

Because it isn't really writing the zeros. You can make these files all
day long and not run out of disk space, because this kind of file
doesn't take very many blocks. The blocks that were never written are
virtual blocks, inasmuch as read() at that location will cause the
filesystem to return a block of NULs.

Donn Cave, [EMAIL PROTECTED]
Re: writing large files quickly
Donn Cave wrote:
> In article [EMAIL PROTECTED], rbt [EMAIL PROTECTED] wrote:
>> Won't work!? It's absolutely fabulous! I just need something big,
>> quick and zeros work great. How the heck does that make a 400 MB
>> file that fast? It literally takes a second or two while every other
>> solution takes at least 2 - 5 minutes. Awesome... thanks for the
>> tip!!!
>
> Because it isn't really writing the zeros. You can make these files
> all day long and not run out of disk space, because this kind of file
> doesn't take very many blocks.

Hmmm... when I copy the file to a different drive, it takes up
409,600,000 bytes. Also, an md5 checksum on the generated file and on
copies placed on other drives are the same. It looks like a regular,
big file... I don't get it.

> The blocks that were never written are virtual blocks, inasmuch as
> read() at that location will cause the filesystem to return a block
> of NULs.
Re: writing large files quickly
rbt wrote:
> Hmmm... when I copy the file to a different drive, it takes up
> 409,600,000 bytes. Also, an md5 checksum on the generated file and on
> copies placed on other drives are the same. It looks like a regular,
> big file... I don't get it.

google(sparse files)

--
Robert Kern
[EMAIL PROTECTED]

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
Re: writing large files quickly
On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
>>>>> fd.write('0')
>>>> [cut]
>>>>     f = file('large_file.bin','wb')
>>>>     f.seek(409600000-1)
>>>>     f.write('\x00')
>>>
>>> While a mindblowingly simple/elegant/fast solution (kudos!), the
>>> OP's file ends up full of the character zero (ASCII 0x30), while
>>> your solution ends up full of the NUL character (ASCII 0x00):
>>
>> Oops. I missed the fact that he was writing 0x30 and not 0x00. Yes,
>> the hole in the file will read as 0x00 bytes. If the OP actually
>> requires that the file contain something other than 0x00 bytes, then
>> my solution won't work.
>
> Won't work!? It's absolutely fabulous! I just need something big,
> quick and zeros work great.

Then Bob's your uncle, eh?

> How the heck does that make a 400 MB file that fast?

Most of the file isn't really there, it's just a big hole in a sparse
array containing a single allocation block that contains the single
'0x00' byte that was written:

    $ ls -l large_file.bin
    -rw-r--r-- 1 grante users 409600000 Jan 27 15:02 large_file.bin
    $ du -h large_file.bin
    12K     large_file.bin

The filesystem code in the OS is written so that it returns '0x00'
bytes when you attempt to read data from the hole in the file. So, if
you open the file and start reading, you'll get 400MB of 0x00 bytes
before you get an EOF return. But the file really only takes up a
couple chunks of disk space, and chunks are usually on the order of
4KB.

--
Grant Edwards   grante at visi.com
Yow! Is this an out-take from the BRADY BUNCH?
Re: writing large files quickly
On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
> Hmmm... when I copy the file to a different drive, it takes up
> 409,600,000 bytes. Also, an md5 checksum on the generated file and on
> copies placed on other drives are the same. It looks like a regular,
> big file... I don't get it.

Because the filesystem code keeps track of where you are in that 400MB
stream, and returns 0x00 anytime you're reading from a hole. The cp
program and the md5sum just open the file and start read()ing. The
filesystem code returns 0x00 bytes for all of the read positions that
are in the hole, just like Don said:

> The blocks that were never written are virtual blocks, inasmuch as
> read() at that location will cause the filesystem to return a block
> of NULs.

--
Grant Edwards   grante at visi.com
Yow! They collapsed... like nuns in the street... they had no teen
appeal!
Re: writing large files quickly
Grant Edwards wrote:
> Because the filesystem code keeps track of where you are in that
> 400MB stream, and returns 0x00 anytime you're reading from a hole.
> The cp program and the md5sum just open the file and start read()ing.
> The filesystem code returns 0x00 bytes for all of the read positions
> that are in the hole, just like Don said:

And, this file is of course useless for FS benchmarking, since you're
barely reading data from disk at all. You'll just be testing the FS's
handling of sparse files. I suggest you go for one of the suggestions
with larger block sizes. That's probably your best bet.

Regards,

Erik Brandstadmoen
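If the goal is a benchmark file that no filesystem can cheat on
(neither with holes nor with compression), filling it with random bytes
is one option. A sketch, assuming Python 2.4+ for os.urandom and
accepting that generating the data itself takes some time:

    import os

    fd = file('large_file.bin', 'wb')
    for x in xrange(400):
        fd.write(os.urandom(1024 * 1024))   # 1 MB of incompressible data
    fd.close()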
Re: writing large files quickly
On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
> OK, I'm still trying to pick my jaw up off of the floor. One
> question... how big of a file could this method create? 20GB, 30GB,
> limit depends on filesystem, etc?

Right. Back in the day, the old libc and ext2 code had a 2GB file size
limit at one point (it used a signed 32 bit value to keep track of file
size/position). That was back when a 1GB drive was something to brag
about, so it wasn't a big deal for most people.

I think everything has large file support enabled by default now, so
the limit is 2^63 for most modern filesystems -- that's the limit of
the file size you can create using the seek() trick. The limit for
actual on-disk bytes may not be that large.

Here's a good link: http://www.suse.de/~aj/linux_lfs.html

--
Grant Edwards   grante at visi.com
Yow! I'm GLAD I remembered to XEROX all my UNDERSHIRTS!!
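That's easy to try: the seek() trick can create a file whose logical
size far exceeds the physical disk, as long as the filesystem's limits
allow it. A sketch (the terabyte figure is arbitrary; on a filesystem
or platform without large file support this raises IOError or
OverflowError instead):

    import os

    f = file('huge.bin', 'wb')
    f.seek(2**40 - 1)                   # just under 1 TB
    f.write('\x00')
    f.close()

    print os.path.getsize('huge.bin')   # 1099511627776, yet almost no disk used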
Re: writing large files quickly
Grant Edwards wrote:
> On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
>> Hmmm... when I copy the file to a different drive, it takes up
>> 409,600,000 bytes. Also, an md5 checksum on the generated file and
>> on copies placed on other drives are the same. It looks like a
>> regular, big file... I don't get it.
>
> Because the filesystem code keeps track of where you are in that
> 400MB stream, and returns 0x00 anytime you're reading from a hole.
> The cp program and the md5sum just open the file and start read()ing.
> The filesystem code returns 0x00 bytes for all of the read positions
> that are in the hole, just like Don said:

OK I finally get it. It's too good to be true :) I'm going back to
using _real_ files... files that don't look as if they are there but
aren't.

BTW, the file 'size' and 'size on disk' were identical on win 2003.
That's a bit deceptive. According to the NTFS docs, they should be
drastically different... 'size on disk' should be like 64K or
something.

> The blocks that were never written are virtual blocks, inasmuch as
> read() at that location will cause the filesystem to return a block
> of NULs.
Re: writing large files quickly
On 2006-01-27, Erik Andreas Brandstadmoen [EMAIL PROTECTED] wrote:
> Grant Edwards wrote:
>> Because the filesystem code keeps track of where you are in that
>> 400MB stream, and returns 0x00 anytime you're reading from a hole.
>> The cp program and the md5sum just open the file and start
>> read()ing. The filesystem code returns 0x00 bytes for all of the
>> read positions that are in the hole, just like Don said:
>
> And, this file is of course useless for FS benchmarking, since you're
> barely reading data from disk at all.

Quite right. Copying such a sparse file is probably only really testing
the write performance of the filesystem containing the destination
file.

> You'll just be testing the FS's handling of sparse files.

Which may be a useful thing to know, but I rather doubt it.

> I suggest you go for one of the suggestions with larger block sizes.
> That's probably your best bet.

--
Grant Edwards   grante at visi.com
Yow! Now I'm concentrating on a specific tank battle toward the end of
World War II!
Re: writing large files quickly
On 2006-01-27, rbt [EMAIL PROTECTED] wrote:
> OK I finally get it. It's too good to be true :)

Sorry about that. I should have paid closer attention to what you were
going to do with the file.

> I'm going back to using _real_ files... files that don't look as if
> they are there but aren't.
>
> BTW, the file 'size' and 'size on disk' were identical on win 2003.
> That's a bit deceptive.

What?! Windows lying to the user? I don't believe it!

> According to the NTFS docs, they should be drastically different...
> 'size on disk' should be like 64K or something.

Probably.

--
Grant Edwards   grante at visi.com
Yow! Where's th' DAFFY DUCK EXHIBIT??
Re: writing large files quickly
On Fri, 27 Jan 2006 12:30:49 -0800, Donn Cave wrote:
> In article [EMAIL PROTECTED], rbt [EMAIL PROTECTED] wrote:
>> Won't work!? It's absolutely fabulous! I just need something big,
>> quick and zeros work great. How the heck does that make a 400 MB
>> file that fast? It literally takes a second or two while every other
>> solution takes at least 2 - 5 minutes. Awesome... thanks for the
>> tip!!!
>
> Because it isn't really writing the zeros. You can make these files
> all day long and not run out of disk space, because this kind of file
> doesn't take very many blocks. The blocks that were never written are
> virtual blocks, inasmuch as read() at that location will cause the
> filesystem to return a block of NULs.

Isn't this a file system specific solution though? Won't your file
system need to have support for sparse files, or else it won't work?

Here is another possible solution if you are running Linux: farm the
real work out to some C code optimised for writing blocks to the disk:

    # untested and, it goes without saying, untimed
    import os
    os.system("dd if=/dev/zero of=largefile.bin bs=64K count=16384")

That should make a 1GB file as fast as possible. If you have lots and
lots of memory, you could try upping the block size (bs=...).

--
Steven.
Re: writing large files quickly
Steven D'Aprano wrote:
> Isn't this a file system specific solution though? Won't your file
> system need to have support for sparse files, or else it won't work?

Yes, but AFAIK the only modern (meaning: in wide use today) file system
that doesn't have this support is FAT/FAT32.
Re: writing large files quickly
On 2006-01-27, Steven D'Aprano [EMAIL PROTECTED] wrote:
>> Because it isn't really writing the zeros. You can make these files
>> all day long and not run out of disk space, because this kind of
>> file doesn't take very many blocks. The blocks that were never
>> written are virtual blocks, inasmuch as read() at that location will
>> cause the filesystem to return a block of NULs.
>
> Isn't this a file system specific solution though? Won't your file
> system need to have support for sparse files, or else it won't work?

If your fs doesn't support sparse files, then you'll end up with a file
that really does have 400MB of 0x00 bytes in it. Which is what the OP
really needed in the first place.

> Here is another possible solution if you are running Linux: farm the
> real work out to some C code optimised for writing blocks to the
> disk:
>
>     # untested and, it goes without saying, untimed
>     import os
>     os.system("dd if=/dev/zero of=largefile.bin bs=64K count=16384")
>
> That should make a 1GB file as fast as possible. If you have lots and
> lots of memory, you could try upping the block size (bs=...).

I agree. That probably is the optimal solution for Unix boxes. I messed
around with something like that once, and block sizes bigger than 64k
didn't make much difference.

--
Grant Edwards   grante at visi.com
Yow! As President I have to go vacuum my coin collection!