Re: Q: sparse file creation in existing data?
Hi!

> It could be used as a replacement for the truncate code, because then
> truncate is simply a special case of punch, namely punch(0, end).

I do not think so. Truncate leaves you with a file size of 0, while punch
should leave you with the file size of the original file.

								Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Q: sparse file creation in existing data?
Phil writes:
> though looking and grepping through the sources I couldn't find a way
> (via fcntl() or whatever) to allow an existing file to get holes.
>
> What I'd like to do is something like
>
>	fh = open( ... , O_RDWR);
>	lseek(fh, position, SEEK_SET);	/* position is a multiple of the fs block size */
>	fcntl(fh, F_MAKESPARSE, 16384);
>
> to create a 16kB hole in a file. If the underlying fs doesn't support
> holes, I'd get ENOSYS or something.

Peter Braam <[EMAIL PROTECTED]> implemented such a syscall, and support
for it in ext2, in the 2.2 kernel. It is called "sys_punch" (punching
holes in a file). I'm not sure how applicable the patch is in the 2.4
world (probably not at all, unfortunately).

I did a port of this code to 2.2 ext3 a long time ago, but have not kept
it updated. I'm not sure it was 100% race free (Al would probably say
not), but it worked well enough while I was using it.

The basic premise is that it writes zeros to the partial blocks at the
beginning and end of the punched region, and makes holes of any
intervening blocks. It did NOT check whether a partial block was entirely
zero-filled and turn it into a hole as well (although that would be a
possible feature). It could be used as a replacement for the truncate
code, because then truncate is simply a special case of punch, namely
punch(0, end).

> What I'd like to use that for:
>
> I imagine having a file on hd (e.g. tar) and not enough space to
> decompress. So with SOME space at least I'd open the file and stream its
> data to tar, and after each few kB read I'd free some space - so this
> could eventually succeed.
>
> I also thought about simply reversing the file data - so I'd read off
> the end of the file and truncate() downwards - but that would mean
> reversing the whole file, which could take some time on creation and
> would solve only this specific case.

I'm not sure I would agree with your application, but I do agree that
there are some uses for it.
In Peter's case he used it for implementing a caching HFS storage system,
but he also wanted to use it for InterMezzo (a caching network filesystem)
to do several things:

- Delete entries from a transaction log in a way that actually reduces the
  physical size of the log. The log itself could always be appended to at
  the end (given 64-bit file sizes), but transactions/blocks would be
  punched from the beginning.
- Delete blocks from the beginning/middle of large files cached on the
  client. This is useful if the file is too large to fit into the cache,
  or if you are doing some sort of LRU replacement of blocks in the cache.

I don't think any of this has been implemented in InterMezzo yet.

Cheers, Andreas
--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/        -- Dogbert
Re: Q: sparse file creation in existing data?
On Friday 29 June 2001 14:55, Ph. Marek wrote:
> Hmmm, on second thought ... But I'd like it better to have a fcntl for
> hole-making :-) Maybe I'll implement this myself.

A far superior interface would be:

	ssize_t sys_clear(unsigned int fd, size_t count)

A stub implementation would just write zeroes. You would need a generic
way of determining whether holes are supported for a particular file -
this is where an fcntl would be appropriate. It would also be nice to
know this before opening/creating a file, perhaps by fcntl'ing the
directory.

But don't expect to have a real, hole-creating implementation any time
soon. Taming the truncate races is hard enough as it is with a single
boundary at the end of a file. Taking care of multiple boundaries inside
the file is far, far harder. Talk to Al Viro or the ext3 team if you want
the whole ugly story.

--
Daniel
Re: Q: sparse file creation in existing data?
>For your specific problem I'd suggest the following approach: write a new
>filter prog, a kind of 'destructive cat' command.
>
>Open the file for read-modify (non-destructive). Let it read some blocks
>(the number controllable on the command line) from the beginning and pipe
>them to stdout. Then read the same amount (well, block-aligned) from the
>end of the file, copy it over the beginning, and truncate the file
>accordingly. After some time, reading and truncating will meet in the
>middle of the file; then you read the blocks back in, in reverse
>(truncating after each read).
>
>It shouldn't be too difficult to write, and would result in a
>multipurpose command-line utility.
>
>It does some extra disk I/O, but I don't think it will be too bad, thanks
>to the kernel's disk buffers and read-ahead. Theoretically the kernel
>could detect that you are just moving file blocks to different places and
>do this in a 'zero-copy' fashion by just moving some inode entries
>around, but I doubt it is that smart.
>
>Drawback: if you choose the read block size of 'destructive cat' too
>large (hmm... you might want to read some data from the beginning, some
>from the end, copy it over the beginning, truncate, and only then pipe to
>stdout: that should eliminate this buffer/diskspace overrun), or the
>archive is compressed and doesn't fit uncompressed, the backup is
>destroyed. This is by far not as safe as any backup tool should be.
>
>A major advantage, however, is that this tool will run without a kernel
>patch, on any Unix, on any filesystem, even those without holes (I think
>truncate works in general - maybe not on Samba?).

This idea hadn't even occurred to me - though it sounds very
I/O-intensive. Of course, if you look at it from the right angle, you'd

- read the first half,
- read the second half,
- write the second half over the first,
- and then read the first half again.

That's two times the amount of I/O. And since I'd be using tar, which has
to write the data out anyway, it's only about one third more I/O than
necessary.
Hmmm, on second thought ... I'd still like it better to have an fcntl for
hole-making :-) Maybe I'll implement this myself.

Thanks,
Phil
Re: Q: sparse file creation in existing data?
On Fri, 29 Jun 2001, Ph. Marek wrote:
> Hi everybody,
>
> though looking and grepping through the sources I couldn't find a way
> (via fcntl() or whatever) to allow an existing file to get holes.

Indeed, I don't think there is any such syscall.

> I found that cp has a parameter --sparse (or suchlike) - but strace
> shows it doing an open(,O_TRUNC), which has a bit of impact on previous
> filedata :-)

Holes are not made by specific syscalls, just by not writing data to
certain places. In principle, the write syscall could check whether it is
just writing zeroes and create a hole instead. Of course, this check
would be much slower than just writing the data to disk. Probably a
'write_zeroes' syscall to create a hole (where possible) would be better.

While I see your purpose and the advantage, this is a pretty specific
request and doesn't look like standard Unix semantics. I doubt you'll get
people to add such a syscall (maybe if you implement it yourself, people
will accept it for inclusion, but I doubt even that).

For your specific problem I'd suggest the following approach: write a new
filter prog, a kind of 'destructive cat' command.

Open the file for read-modify (non-destructive). Let it read some blocks
(the number controllable on the command line) from the beginning and pipe
them to stdout. Then read the same amount (well, block-aligned) from the
end of the file, copy it over the beginning, and truncate the file
accordingly. After some time, reading and truncating will meet in the
middle of the file; then you read the blocks back in, in reverse
(truncating after each read).

It shouldn't be too difficult to write, and would result in a
multipurpose command-line utility.

It does some extra disk I/O, but I don't think it will be too bad, thanks
to the kernel's disk buffers and read-ahead. Theoretically the kernel
could detect that you are just moving file blocks to different places and
do this in a 'zero-copy' fashion by just moving some inode entries
around, but I doubt it is that smart.
Drawback: if you choose the read block size of 'destructive cat' too
large (hmm... you might want to read some data from the beginning, some
from the end, copy it over the beginning, truncate, and only then pipe to
stdout: that should eliminate this buffer/diskspace overrun), or the
archive is compressed and doesn't fit uncompressed, the backup is
destroyed. This is by far not as safe as any backup tool should be.

A major advantage, however, is that this tool will run without a kernel
patch, on any Unix, on any filesystem, even those without holes (I think
truncate works in general - maybe not on Samba?).

Just my $0.02 (you asked for ideas, didn't you?)

Michael.
--
Michael Weller: [EMAIL PROTECTED], [EMAIL PROTECTED], or even
[EMAIL PROTECTED] If you encounter an eowmob account on any machine in
the net, it's very likely it's me.
Q: sparse file creation in existing data?
Hi everybody,

though looking and grepping through the sources, I couldn't find a way
(via fcntl() or whatever) to allow an existing file to get holes.

I found that cp has a parameter --sparse (or suchlike) - but strace shows
it doing an open(,O_TRUNC), which has a bit of impact on previous
filedata :-)

What I'd like to do is something like

	fh = open( ... , O_RDWR);
	lseek(fh, position, SEEK_SET);	/* position is a multiple of the fs block size */
	fcntl(fh, F_MAKESPARSE, 16384);

to create a 16kB hole in a file. If the underlying fs doesn't support
holes, I'd get ENOSYS or something.

What I'd like to use that for: I imagine having a file on hd (e.g. a tar
archive) and not enough space to decompress it. So with SOME space at
least, I'd open the file and stream its data to tar, and after each few
kB read I'd free some space - so this could eventually succeed.

I also thought about simply reversing the file data - so I'd read off
the end of the file and truncate() downwards - but that would mean
reversing the whole file, which could take some time on creation and
would solve only this specific case.

Ideas, anyone?

Regards,
Phil