On Fri, Nov 09, 2007 at 03:47:10PM -0500, Daniel Ouellet wrote:
> Ted Unangst wrote:
> >On 11/9/07, Daniel Ouellet <[EMAIL PROTECTED]> wrote:
> >>Just for example, a source file that is badly sparse doesn't really have
> >>disk blocks allocated yet, but when copied over via scp or rsync it will
> >>actually use that space on the destination servers. All the servers are
> >>identical (or supposed to be, anyway), but the copies are running out of
> >>space at times during the copy process. While copying, a file may easily
> >>use twice the amount of space, sadly filling up the destinations; then the
> >>sync process stops, making the distribution of the load unusable. I need
> >>to increase the capacity, yes, except that it will take me time to do so.
> >
> >so what are you going to do when you find these sparse files?
>
> So far, when I find them (not all of them, just the ones wasting huge
> amounts of space), I delete them and replace them with the original
> one, or even with the
I am confused by what you say. A sparse file does NOT waste space, it
REDUCES disk usage, compared to a non-sparse (dense?) file with the
same contents.
> one copy using rsync -S back to the original reduces its size by 1/2 and
If the size is reduced, it is not the same file. Please be more
accurate in your description. A file's size is not the same as its
disk usage.
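To see the difference, compare st_size with st_blocks. A minimal Python
sketch, assuming a POSIX system where st_blocks counts 512-byte units;
the function name is made up for illustration:

```python
import os

def size_and_disk_usage(path):
    """Return (apparent size in bytes, bytes actually allocated on disk)."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on POSIX systems.
    # It is not available on Windows.
    return st.st_size, st.st_blocks * 512
```

For a sparse file the second number is smaller than the first; for a
dense file it is about the same or slightly larger.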
> more at times. So, yes, very inefficient, but manageable anyway. It's
> a plaster for now, if you like. Don't get me wrong: sparse files cause no
> problem whatsoever when they stay on the same system. It's when you
> need to move them between servers, especially across Internet-connected
> locations, and keep them in sync as much as possible in as
> short a time as possible that it becomes unmanageable. That's really the
> issue at hand. Not that sparse files are bad in any way; keeping them
> in sync across multiple systems is, however.
You cannot blame sparse files for that. If the same file were not
sparse, your problem would be at least as big.
-Otto
>
> I was looking for a more intelligent way to do it. (;> Like
> finding those above some level of sparseness, let's say 25%, and then
> compacting them at the source to be non-sparse again, or something
> similar. It doesn't need to be done for every single one, even if that
> might be a good thing in special cases, not all obviously.
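A rough way to find such files, sketched in Python. It assumes a POSIX
system where st_blocks counts 512-byte units; find_sparse_files and
min_hole_fraction are made-up names, and on a compressing filesystem
dense files may be reported too:

```python
import os
import stat

def find_sparse_files(root, min_hole_fraction=0.25):
    """Yield (path, size, allocated, hole_fraction) for regular files under
    root whose unallocated share of the apparent size is at least
    min_hole_fraction."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode) or st.st_size == 0:
                continue  # only non-empty regular files can have holes
            # Assumption: st_blocks is in 512-byte units.
            allocated = st.st_blocks * 512
            # Can be negative for dense files because of block rounding.
            hole_fraction = 1.0 - allocated / st.st_size
            if hole_fraction >= min_hole_fraction:
                yield path, st.st_size, allocated, hole_fraction
```

Calling find_sparse_files("/some/dir") lists files that are at least 25%
holes. Copying those with rsync -S keeps the holes on the destination.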
>
> The problem is that some customers end up running out of space and I
> really didn't know, plus the huge factor of wasted bandwidth:
> filling up their connections transferring empty files, if you like, and,
> if you sync them as is, taking much longer to sync than they otherwise
> would.
>
> It's still an interesting problem now that I've found out what it really is.
>
> I hope this explains the issue somewhat better.
>
> Thanks for the feedback nevertheless.
>
> Daniel