Re: identifying sparse files and get rid of them trick available?

Otto Moerbeek Sat, 10 Nov 2007 00:22:23 -0800

On Fri, Nov 09, 2007 at 03:47:10PM -0500, Daniel Ouellet wrote:

> Ted Unangst wrote:
> >On 11/9/07, Daniel Ouellet <[EMAIL PROTECTED]> wrote:
> >>Just for example, a badly sparse source file doesn't really have its
> >>disk blocks allocated yet, but when it is copied over via scp or rsync
> >>it will actually use that space on the destination servers. All the
> >>servers are identical (or supposed to be, anyway), but the copies are
> >>running out of space at times during the copy process. While copying,
> >>it may easily use twice the amount of space, sadly filling up the
> >>destinations; then the sync process stops, making the distribution of
> >>the load unusable. I need to increase the capacity, yes, except that
> >>it will take me time to do so.
> >
> >so what are you going to do when you find these sparse files?
> 
> So far, when I find them (not all of them, but the huge space-wasting
> ones), I delete them and replace them with the original one, or even with the


I am confused by what you say. A sparse file does NOT waste space, it
REDUCES disk usage, compared to a non-sparse (dense?) file with the
same contents. 

> one copy using rsync -S back to the original reduces its size by 1/2 and

If the size is reduced, it is not the same file. Please be more
accurate in your description. A file's size is not the same as its
disk usage.
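
To make the distinction concrete, here is a minimal sketch in Python
(the language and the helper names size_and_usage and make_sparse_file
are illustrative choices, not anything from this thread). It creates a
file that is mostly a hole and reports its size next to the space it
actually occupies. It assumes st_blocks is counted in 512-byte units,
as on the BSDs and Linux; how much is really allocated depends on the
filesystem.

```python
import os
import tempfile


def size_and_usage(path):
    """Return (apparent size, allocated bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks is counted in 512-byte units, as on the BSDs
    # and Linux. The field is not available on every platform.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, hole_bytes):
    """Create a file with a hole of hole_bytes followed by one data byte."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # skip ahead without writing anything
        f.write(b"x")       # only this byte needs a real disk block


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.bin")
        make_sparse_file(path, 10 * 1024 * 1024)
        size, used = size_and_usage(path)
        print("size: %d bytes, disk usage: %d bytes" % (size, used))
```

On a filesystem that supports holes, the reported disk usage should be
far smaller than the size. A copy made by a tool that writes the zeros
out keeps the same size, but its disk usage grows to roughly match it.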

> more at times. So, yes, very inefficient, but manageable anyway. It's
> a band-aid for now, if you want. Don't get me wrong. Sparse files cause
> no problem whatsoever when they stay on the same system. It's when you
> need to move them around servers, especially across Internet-connected
> locations, and keep them in sync as much as possible in as short a
> time as possible that it becomes unmanageable. That's really the
> issue at hand. Not that sparse files are bad in any way. Keeping them
> in sync across multiple systems is, however.

You cannot blame sparse files for that. If the same file were not
sparse, your problem would be at least as big.

        -Otto


> 
> I was looking to see if there was a more intelligent way to do it. (;>
> Like finding the ones above some level of sparseness, let's say 25%,
> and then compacting them at the source to be non-sparse again, or
> something similar. It doesn't need to be done for every single one, even
> if that might be a good thing in special cases, not all obviously.
> 
> The problem is that some customers end up running out of space and I
> really didn't know, plus the huge factor of wasted bandwidth and
> filling up their connections transferring empty files, if you like, and
> taking much longer in sync time than it otherwise would if you sync
> as is.
> 
> Still is an interesting problem after I found out what it really was.
> 
> I hope it explained the issue somewhat better.
> 
> Thanks for the feedback nevertheless.
> 
> Daniel
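
The idea quoted above, finding files beyond some level of sparseness
such as 25%, can be sketched as follows. This is only an illustration:
the names sparse_fraction and find_sparse_files are hypothetical, the
25% default is taken from the quoted text, and st_blocks is assumed to
be in 512-byte units. A filesystem that compresses data or allocates
blocks lazily may report numbers that make a dense file look sparse, so
treat the output as a list of candidates.

```python
import os
import stat


def sparse_fraction(size, allocated):
    """Fraction of the file size that is not backed by disk blocks."""
    if size <= 0:
        return 0.0
    # Allocation is rounded up to whole blocks, so it can exceed the size;
    # clamp at zero in that case.
    return max(0.0, 1.0 - allocated / size)


def find_sparse_files(root, threshold=0.25):
    """Return (path, fraction) for regular files at or above threshold."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, and so on
            # Assumption: st_blocks is counted in 512-byte units.
            frac = sparse_fraction(st.st_size, st.st_blocks * 512)
            if frac >= threshold:
                hits.append((path, frac))
    return hits
```

Calling find_sparse_files("/some/dir") returns the candidate paths with
their estimated sparse fraction. What to do with them afterwards (for
example, copying with rsync -S as mentioned earlier in the thread) is a
separate decision.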
