On Jul 14, 2014, at 4:37 PM, Konstantin Olchanski <[email protected]> wrote:

> On Mon, Jul 14, 2014 at 04:33:03PM -0500, Kevin K wrote:
>> I guess I don't understand the part about how files can be different sizes 
>> on different filesystems.
>> 
>> They can obviously use up more or less disk space on different filesystems.  
>> For instance, a FAT disk with 32KB clusters will use up a minimum of 32KB 
>> even for a 10 byte file.  While NTFS will probably put the 10 bytes in the 
>> directory entry or use up a maximum of 4KB for 4KB clusters.
>> 
>> But I don't see why rsync would care about the unused data.  It should just 
>> sync the 10 bytes accessible.  I'm ignoring alternate streams here.
> 
> 
> This is the usual confusion between the "st_size" and "st_blocks" entries in 
> "struct stat" returned by lstat() and co.

Is what I was missing the complexity of files that, for example, may be sparse?

I was thinking of the case where, when you do an ls -l, you normally get a size 
in bytes.  Depending on your options (ls -s, for example), you can also get the 
size in allocated blocks, which is what du reports.
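
To make the two numbers concrete, here is a minimal sketch in Python that reads 
both fields from lstat().  It assumes a POSIX system where st_blocks is 
available and counts 512-byte units (true on Linux, not guaranteed everywhere); 
the function name is made up for illustration.

```python
import os
import tempfile

def apparent_and_allocated(path):
    """Return (st_size, allocated bytes) for path without following symlinks."""
    st = os.lstat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    # Write a 10-byte file and compare what "ls -l" and "du" would look at.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789")
        name = f.name
    size, allocated = apparent_and_allocated(name)
    print("st_size:", size, "allocated:", allocated)
    os.unlink(name)
```

The first number is what ls -l shows; the second is roughly what du adds up, 
and it depends on the filesystem's block or cluster size.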

So, if I'm not going off the deep end, a quick determination of whether a file 
is different probably has to check both values.  ls -l may show 1000000 bytes, 
but if the file is sparse, most of it may be nulls with no on-disk storage 
allocated.  If the block count changes, even on the same filesystem, something 
may have changed and data may have to be synced.  And with different cluster 
sizes, the block counts will normally differ even for identical files.
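
The sparse case can be sketched like this, again in Python and again assuming 
POSIX st_blocks in 512-byte units.  The helper names are hypothetical, and 
whether the file really ends up sparse depends on the filesystem, so treat the 
check as a heuristic rather than a definitive test.

```python
import os
import tempfile

def make_sparse(path, size=1_000_000):
    """Create a file whose st_size is `size` without writing any data."""
    with open(path, "wb") as f:
        # Extending with truncate() leaves a hole where the filesystem
        # supports sparse files; otherwise the space is simply allocated.
        f.truncate(size)

def looks_sparse(path):
    """Heuristic: fewer bytes allocated than the apparent size."""
    st = os.lstat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    # Filesystems that store tiny files inline can also trip this check.
    return st.st_blocks * 512 < st.st_size

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
    make_sparse(path)
    st = os.lstat(path)
    print("st_size:", st.st_size, "st_blocks:", st.st_blocks)
    print("looks sparse:", looks_sparse(path))
    os.unlink(path)
```

The byte size stays at 1000000 either way; only the block count shows whether 
storage was actually allocated.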
