There is a difference between the length of a file and the amount of space
that it takes up on disk.

Disk space is allocated in chunks (called clusters or blocks, depending on
who you are talking to), and the chunk size often differs from one
filesystem to the next. If you are working on a filesystem with a 1024-byte
block size, any non-empty file less than 1024 bytes in length will still
consume 1024 bytes of storage. In the worst case, 1024 files of one byte
each will consume 1MB of disk space (there will also be space taken up for
the index/inode table, but let's ignore that for now).
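
To put numbers on that rounding, here is a minimal sketch in Python. It
assumes a filesystem that hands out whole blocks and stores nothing inline;
the name allocated_bytes is made up for illustration.

```python
def allocated_bytes(length, block_size=1024):
    """Space consumed by a file of `length` bytes when storage is handed
    out in whole blocks (inode/index overhead is ignored)."""
    blocks = -(-length // block_size)  # ceiling division
    return blocks * block_size

print(allocated_bytes(1))         # 1024
print(allocated_bytes(1024))      # 1024
print(allocated_bytes(1025))      # 2048
print(1024 * allocated_bytes(1))  # 1048576, i.e. 1MB for 1KB of real data
```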

I think FAT32 typically uses anything from 4KB to 32KB clusters depending on
the volume size, and Ext3 uses 1, 2 or 4KB blocks depending on the actual
filesystem size.

Other problems can arise if the Ext3 original had any hard-linked files:
because FAT doesn't support links, each of these has to be stored as a
separate full copy on the FAT filesystem. Some of the original files might
also be 'sparse' (i.e. they have large sections of zeros that were never
actually allocated on disk, due to how they were created), and a plain copy
fills those holes in with real blocks.
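
If you want to see whether the source tree contains any such files, something
like the following rough sketch would list them. The function name
find_copy_hazards is hypothetical, and the sparse test is only a heuristic:
it assumes st_blocks is counted in 512-byte units (as on Linux) and can also
flag files on filesystems that compress data or store small files inline.

```python
import os
import stat

def find_copy_hazards(root):
    """Return (hard_linked, sparse) lists of regular files under root."""
    hard_linked, sparse = [], []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, sockets, ...
            if st.st_nlink > 1:
                hard_linked.append(path)  # more than one name for this inode
            if st.st_blocks * 512 < st.st_size:
                sparse.append(path)  # fewer bytes allocated than the length
    return hard_linked, sparse
```

Run it against the Ext3 side before copying; anything it reports is likely to
take up more room on the FAT side than it did originally.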

So when you check "disk space used" you have to differentiate between "sum
of the sizes of each file" and "total disk space used by the filesystem".

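Under Linux, 'du' *estimates* usage by walking the directory tree and adding
up the blocks allocated to each file it finds (GNU du's --apparent-size
option sums the file lengths instead), while 'df' reports the filesystem's
own allocation counters, so the two can often disagree.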
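
The three different numbers can be pulled out side by side with a short
Python sketch. The names tree_usage and report are made up for this example,
and it again assumes st_blocks is in 512-byte units; unlike du it ignores the
space taken by the directories themselves, so don't expect the figures to
match du exactly.

```python
import os
import shutil

def tree_usage(root):
    """Return (apparent_bytes, allocated_bytes) for the files under root."""
    apparent = allocated = 0
    seen = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # count each hard-linked inode only once
            seen.add(key)
            apparent += st.st_size           # "sum of the sizes of each file"
            allocated += st.st_blocks * 512  # blocks actually allocated
    return apparent, allocated

def report(root):
    apparent, allocated = tree_usage(root)
    fs = shutil.disk_usage(root)  # whole-filesystem view, like df
    return (f"sum of file lengths: {apparent}\n"
            f"blocks allocated:    {allocated}\n"
            f"filesystem used:     {fs.used}")
```

Printing report() for the mount point of each copy should show where the
totals start to diverge.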

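If you want to assess the validity of your copied files, checksums are the
way to go. Tools like rsync verify each file they transfer with a checksum
(and 'rsync -c' compares already-existing files by checksum as well),
whereas cp does no verification at all.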
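
If the copy has already been made, the check can be done after the fact. This
is only a sketch of the idea, not what rsync does internally: the names
file_digest and compare_trees are invented here, SHA-256 is an arbitrary
choice of checksum, and only files present in the source tree are examined.

```python
import hashlib
import os

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to keep memory use low."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(src, dst):
    """Return relative paths that are missing from dst or differ in content."""
    problems = []
    for dirpath, _dirs, files in os.walk(src):
        for name in files:
            src_path = os.path.join(dirpath, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            if not os.path.isfile(dst_path):
                problems.append(rel)  # never made it across
            elif file_digest(src_path) != file_digest(dst_path):
                problems.append(rel)  # contents differ
    return problems
```

An empty list back from compare_trees() means every source file has a
byte-identical counterpart, whatever du and df say about the space used.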

-jim