Berny,

Thank you for your comprehensive response and guidance.  I'm more than
happy to address the technical items after a decision is made on the
feature request itself.  Believe it or not I have a reason, however
naive, for those constants which I know ought to be abstracted but
don't know where to do it in coreutils.  POSIX requires that
applications that don't handle UID/GIDs greater than the originally
specified 64k should aggregate high UID/GIDs to 65534.  I didn't think
we wanted to allocate arrays the size of the expanded UID/GID range.
If you (the coreutils team) think otherwise then I can get rid of that
handling code.  Regarding the output formatting I was trying to keep
the original intent of du producing one line of output per output item
- it also let me minimize the complexity of my changes to du which I
assumed correctly would be a key concern.

Returning to the logic behind my feature request(s) I work with
tolerably large file systems (5-30PiB) and it is untenable to use the
normal Unix approach of piping commands together if only due to time.
Here we have a case where du is literally throwing costly (in time to
acquire) file metadata away needlessly.  I understand that my use case
is not the concern of the majority of users, but still I think storage
density is growing faster than I/O throughtput / latency even in
consumer hardware and that these features could save administrators
and users nontrivial amounts of time for relatively little complexity
cost in the du source.

Thank you.

Dan.



On Fri, Mar 2, 2018 at 12:43 PM, Bernhard Voelker
<m...@bernhard-voelker.de> wrote:
> On 03/02/2018 06:22 AM, Daniel Gall wrote:
>> Have I done something incorrectly?  How or does this feature request
>> proceed from here?
>>
>> Thank you for considering it.
>
> From my point of view, I'm not too enthusiastic about to get this feature
> into 'du': while the need for such accounting might be warranted, I'm
> not convinced that 'du' is the right place for it to be added.
> The bar for adding new features to the GNU coreutils is quite high:
> it has to be valuable, easy to maintain, consider compatibilities,
> only add more complexity in very special cases, etc.
> The patch would add quote some bloat to du.c.
> Did you consider combining other tools like find(1) etc. to get the
> job done? This would follow more the UNIX philosophy.
>
> Technically, there are a couple of things which don't conform to our
> quality rules:
>
> a) there are too many magic numbers in too many places: 65534, 65535, 65536,
> and often used with <, <= and alike operators.  This is hard to read and
> therefore problematic to maintain.
> I'm wondering if using a hash array wouldn't be better to use.
>
> b) the source does not compile (in maintainer mode: -Werror):
> src/du.c: In function 'print_size':
> src/du.c:465:27: error: format '%Ld' expects argument of type 'long long 
> int', but argument 2 has type 'long long unsigned int' [-Werror=format=]
>                printf (" %Ld:", (long long unsigned int)i);
>                          ~~^    ~~~~~~~~~~~~~~~~~~~~~~~~~
>                          %lld
> cc1: all warnings being treated as errors
>
> c) it does not pass "make syntax-check", actually it violates 8 rules:
> maint.mk: found useless "if" before "free" above
> make: *** [maint.mk:326: sc_avoid_if_before_free] Error 1
> maint.mk: you have modified old NEWS
> make: *** [maint.mk:1075: sc_immutable_NEWS] Error 1
> maint.mk: line(s) with more than 80 characters; reindent
> make: *** [cfg.mk:333: sc_long_lines] Error 1
> maint.mk: TAB in indentation; use only spaces
> make: *** [cfg.mk:461: sc_prohibit_tab_based_indentation] Error 1
> maint.mk: the above files lack a space-before-open-paren
> make: *** [cfg.mk:703: sc_space_before_open_paren] Error 1
> make[1]: *** [/home/voelkerb/coreutils/tight-scope.mk:42: _gl_tight_scope] 
> Error 1
> make: *** [maint.mk:1582: sc_tight_scope] Error 1
> maint.mk: help2man requires at least two spaces between an option and its 
> description
> make: *** [maint.mk:759: sc_two_space_separator_in_usage] Error 1
>
> d) tests are missing.
>
> e) the documentation is really sparse.
> It should at least be loosely described how the groups information
> it printed.
>
> f) IMO the output format is questionable, e.g. in my /tmp with
> only 3 groups, the output already looks quite odd:
>
>   $ src/du -gshxc /tmp 2>/dev/null
>   12M Groups, root:724K, users:12K, berny:11M, ecs:4.0K /tmp
>   12M Groups, root:724K, users:12K, berny:11M, ecs:4.0K total
>
> What happens with 10, or 100 groups?
> Furthermore, the output of the above example looks really
> odd - not to say ugly - without the -s option; try it.
>
> So after all, I think this patch still is currently quite far away from
> being in a shape for being pushed.  Nevertheless, it might be a good
> basis for further discussions.
>
> Have a nice day,
> Berny
>

Reply via email to