On 22/07/18 08:12, Paul Eggert wrote:
> Pádraig Brady wrote:
>> I've also attached an alternative patch for df (in your name).
>
> That still has problems, since it can generate improperly-encoded strings in
> UTF-8 locales (if the inputs are improperly encoded), and can replace parts of
> multibyte characters with '?' in non-UTF-8 locales. Please try the attached
> patch instead, which attempts to address these issues. This is more along the
> lines that Bruno suggested, except it doesn't use mbsiter as I figured it was
> simpler overall just to use mbrtowc directly for this one thing.

I haven't time to review this now, but my intent was only to avoid \n etc.,
which might cause issues for programs that parse the output of df
line by line. This subset of control characters is safe to identify.
It seems problematic to start eliding improperly encoded mount points,
for example, rather than just outputting what's there.

Also, just incrementing width++ for each wide character doesn't seem right,
though again I've not tested it.

cheers,
Pádraig
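
To illustrate the width concern, here is a minimal sketch, not taken from
either patch, of counting display columns with mbrtowc and wcwidth rather
than adding one per character. The helper name display_width is hypothetical,
and two choices in it are assumptions rather than df's actual behaviour:
an invalid byte is passed through and counted as one column, and a
nonprintable character is counted as one column, as if shown as '?'.

```c
#define _XOPEN_SOURCE 700   /* for wcwidth() */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the number of display columns that the
   multibyte string S occupies in the current locale.  */
static size_t
display_width (const char *s)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t remaining = strlen (s);
  size_t width = 0;

  while (remaining > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, remaining, &state);

      if (n == 0)
        break;                  /* Defensive: should not occur before the NUL.  */

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated sequence.  Assumption: the byte is
             output as-is and takes one column.  Reset the shift state
             and resume at the next byte.  */
          memset (&state, 0, sizeof state);
          n = 1;
          width++;
        }
      else
        {
          int w = wcwidth (wc);
          /* Assumption: a nonprintable character (w < 0) ends up as a
             one-column '?'.  Otherwise use the real column count, which
             may be 0 (combining) or 2 (wide), not always 1.  */
          width += w < 0 ? 1 : (size_t) w;
        }

      s += n;
      remaining -= n;
    }

  return width;
}
```

In a UTF-8 locale (after setlocale (LC_ALL, "")), a mount point containing
East Asian wide characters would typically come out at two columns per
character here, whereas width++ would count one.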