On 28/04/15 15:42, Eric Blake wrote:
> On 04/28/2015 03:59 AM, Pádraig Brady wrote:
>
>>>> The output from du shall consist of the amount of space allocated
>>>> to a file and the name of the file, in the following format:
>>>>
>>>> "%d %s\n", <size>, <pathname>
>>>>
>>>> Instead, GNU du uses "%d\t%s\n", i.e., a tab character as delimiter,
>>>> even if POSIXLY_CORRECT is set.
>>>>
>>>> Do I read POSIX right?
>>>
>>> No, the space stands for any (positive) amount of white space.
>>>
>>> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
>>>
>>> Andreas.
>>>
>>
>> Thanks for pointing that out Andreas.
>> So a ' ' in a format implies any amount of blank chars.
>
> Correct.
>
>> So we could separate the du columns with spaces rather than tab,
>
> Yes, I'd prefer that we did that. It is much easier to guarantee
> alignment when using spaces and completely avoiding whatever tab stops
> people have set up.
>
>> though that would almost definitely introduce a compatibility issue,
>> and would be inconsistent with Solaris and FreeBSD at least.
>
> POSIX is already clear that anyone parsing for literal tabs is broken
> when trying to parse du output. The only safe way to parse du output is
> to break on all whitespace (the way awk already does). I'm 70-30 in
> favor of changing to spaces.

What about file names with leading whitespace? Their output lines
could no longer be split unambiguously if we didn't use a single tab.

I don't think the gain is enough to break compat, given the greater
alignment control possible with expand(1), numfmt(1), etc.

I just checked an old wrapper script for du that I use, and see that
it would be broken, for example:

http://www.pixelbeat.org/scripts/dutop

cheers,
Pádraig.
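
To illustrate the leading-whitespace concern, here is a minimal sketch
in Python. It is not taken from dutop; the function names and the
sample lines are made up for illustration, and the space-separated line
is a hypothetical output format, not something du prints today.

```python
def split_on_tab(line):
    """Split a du-style line at the first tab; the rest is the file name."""
    size, name = line.rstrip("\n").split("\t", 1)
    return int(size), name


def split_on_whitespace(line):
    """Split on the first run of whitespace, similar to awk's default."""
    size, name = line.rstrip("\n").split(None, 1)
    return int(size), name


if __name__ == "__main__":
    # Hypothetical file whose name starts with two spaces.
    tab_line = "4\t  leading.txt\n"
    # The same entry if the delimiter were a single space instead.
    space_line = "4   leading.txt\n"

    print(split_on_tab(tab_line))           # (4, '  leading.txt')
    print(split_on_whitespace(tab_line))    # (4, 'leading.txt')
    print(split_on_whitespace(space_line))  # (4, 'leading.txt')
```

With a single tab as delimiter the leading spaces of the name survive.
Once the delimiter is itself whitespace of unspecified width, a parser
cannot tell where the delimiter ends and the name begins.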
