Hello Daniel,

On Fri, Mar 02, 2018 at 10:08:57PM -0500, Daniel Gall wrote:
> POSIX requires that
> applications that don't handle UID/GIDs greater than the originally
> specified 64k should aggregate high UID/GIDs to 65534. I didn't think
> we wanted to allocate arrays the size of the expanded UID/GID range.

If we continue with this feature, I think IDs above 65K must be
supported - they are supported in many common OSes. Some GNU systems
already have UIDs > 145000 and GIDs > 78000.

The implementation should probably not be a pre-allocated array, but
rather a hash or a tree (both exist as gnulib modules).

> Returning to the logic behind my feature request(s) I work with
> tolerably large file systems (5-30PiB) and it is untenable to use the
> normal Unix approach of piping commands together if only due to time.

I don't think "piping" is the bottleneck (certainly not the tiny awk
script). The issue is file system access, by "du" (with your patch) or
"find" (with my example).

Based on cursory observation, the command
    find -printf '%u %g %s %D %i\n'
performs a single stat syscall per file - that's as efficient as it gets.

You can try it on your system with:
    strace -e trace=file find -type f -printf "%u %g %s %D %i\n"

> [...] I understand that my use case
> is not the concern of the majority of users, but still I think storage
> density is growing faster than I/O throughput / latency even in
> consumer hardware and that these features could save administrators
> and users nontrivial amounts of time for relatively little complexity
> cost in the du source.

I agree with the density-vs-latency point, but when dealing with
large-scale file systems (many PB, in your case) there are additional
optimizations that should be performed at the filesystem level, such as
dedicated metadata servers, a large metadata cache, etc. Such
optimizations will typically be much more effective than anything a
user-level program improvement can achieve.

As for "non trivial amounts of time" - can you provide some measurements
of 'du' with your patch vs 'find' to demonstrate that it is indeed a
non-trivial amount? This would give more weight towards accepting the
patch.

regards,
 - assaf
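
P.S. For reference, here is a minimal sketch of the kind of per-owner
aggregation that can sit behind the find command above. It is only an
illustration, not necessarily the script from earlier in this thread:
the function name sum_by_owner is made up, and it sums apparent sizes
(%s, in bytes) rather than allocated blocks as du would. awk arrays are
hash tables, so sparse or large UIDs/GIDs cost nothing extra.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of any patch).
# Reads lines of the form "user group size device inode", as printed by
#   find -type f -printf '%u %g %s %D %i\n'
# and prints the total apparent size per user and per group.
sum_by_owner() {
    awk '
        # Count each (device, inode) pair once, so that hard links
        # to the same file are not added twice.
        seen[$4, $5]++ { next }

        # Associative arrays keyed by owner: no pre-allocated ID range.
        { user[$1] += $3; group[$2] += $3 }

        END {
            # awk numbers are doubles: sums are exact up to 2^53 bytes.
            for (u in user)  printf "user %s %.0f\n",  u, user[u]
            for (g in group) printf "group %s %.0f\n", g, group[g]
        }
    '
}

# Usage (output order of "for ... in" is unspecified, hence the sort):
#   find /some/path -type f -printf '%u %g %s %D %i\n' | sum_by_owner | sort
```

All of the work here is a single pass over the find output, so any
time difference against a patched du should come from the file system
access itself and not from the pipe.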
