> > Partially, but I've been working on USS on z/OS and OpenVMS Bacula
> > clients, where the filesystems are block oriented rather than byte
> > oriented. One *can* obtain a precise dataset size in bytes, but the
> > cost is reading the entire file to determine where the file data
> > actually ends, which is very expensive on terabyte- or
> > petabyte-scale datasets.
>
> Bacula *must* read the entire file in order to back it up, so the
> exact byte count is known with no extra cost.

Umm, not on the OSes I mentioned. If you fstat the file or read the
directory inode with Unix compatibility on, the underlying OS reads the
file once to determine the actual file size in bytes, so that it can
fill in the file stat structure and stay compatible with the assumption
that files are streams of bytes. You then get to read it again to get
the actual data blocks. Reading a 20 TB file twice is nontrivial. I
have LOTS of files that large, and a few that will grow to exabyte
scale in the not too distant future.

> > It also doesn't really take sparse files or structured files (like
> > VMS indexed datasets or VSAM data spaces) into account very well, so
> > if this proposal is added to the "standard" Bacula database
> > structure, you will encounter problems when you deal with these
> > platforms (or anything more complicated than a simple sequential
> > file).
>
> as Kern mentions, Bacula already has code which expects a simple byte
> count to be sufficient to describe the size of an object. as a Unix
> person, it is very hard for me to imagine a data object whose byte
> count can not be summed up.

See above. It's not that we can't get a byte count, but that there are
systems where it's very expensive to get that byte count and very cheap
to get the number of allocation units used, along with the size of the
allocation unit. If the allocation unit size happens to be 1 byte (as
it is on most Unix and Windows systems), you lose nothing and it's not
a problem. If the allocation unit size is > 1, you win big by skipping
the additional read of the file needed to report the size in the stat
structure.

But I think a bit differently than most folks using Bacula, so I'll be
interested to see what John says.

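To make the allocation-unit idea above concrete, here is a minimal
Python sketch of a size record that stores the number of allocation
units and the unit size, and carries an exact byte count only when one
is already known. The names (ObjectSize, alloc_units, unit_size,
exact_bytes) are hypothetical and are not part of Bacula's catalog
schema or any existing API; treat this as an illustration under those
assumptions, not a proposed implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectSize:
    """Hypothetical size record: allocation units plus unit size."""

    alloc_units: int                   # number of allocation units in use
    unit_size: int = 1                 # bytes per unit; 1 means byte-oriented
    exact_bytes: Optional[int] = None  # set only if known without an extra read

    def allocated_bytes(self) -> int:
        # Cheap to compute: no pass over the file data is required.
        return self.alloc_units * self.unit_size

    def is_exact(self) -> bool:
        # Exact when the unit is one byte or an exact count was supplied.
        return self.unit_size == 1 or self.exact_bytes is not None

    def best_known_bytes(self) -> int:
        # Prefer the exact count; otherwise fall back to the allocated size,
        # which is an upper bound on the data length.
        if self.exact_bytes is not None:
            return self.exact_bytes
        return self.allocated_bytes()


if __name__ == "__main__":
    block_file = ObjectSize(alloc_units=10, unit_size=4096)
    byte_file = ObjectSize(alloc_units=12345)
    print(block_file.best_known_bytes(), block_file.is_exact())
    print(byte_file.best_known_bytes(), byte_file.is_exact())
```

With unit_size = 1 the record degenerates to a plain byte count, so
byte-oriented platforms lose nothing. On block-oriented platforms the
exact count can be filled in later, for example after the backup itself
has read the data.
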
John -- your rock, sir...8-)

_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel