Hi, Berny. You've beaten me to the punch. I have another message draft in progress about join. I'll send it later.
I haven't compared join with cut and awk, but it does the job.

On Wed, Jul 19, 2017 at 17:08 Bernhard Voelker <m...@bernhard-voelker.de> wrote:

> On 07/19/2017 07:43 PM, Kaz Kylheku (Coreutils) wrote:
> > It is nontrivial code. For instance if we look at how the function
> > cut_bytes works in the implementation, what it's doing is simply
> > doing a getchar() from the stream, and querying a data structure
> > to determine whether the byte should be printed or not.
> > (That data structure consists of a pointer which marches through
> > field range descriptors in parallel with going through the data.)
> >
> > cut_fields is more complicated due to the delimiting of fields,
> > but essentially the same overall approach.
> >
> > Basically, printing of fields that isn't sorted and de-duplicated
> > is a rewrite of all parts of the utility other than command
> > line processing and the printing of usage help text.
> >
> > It's like two different programs in one, sharing a minimal
> > skeleton.
>
> +1
>
> Another point: it is already documented that cut(1) output is
> never good for reordering:
>
>   http://git.sv.gnu.org/cgit/coreutils.git/tree/doc/coreutils.texi?id=545f181f4e#n5938
>
>   Note @command{awk} supports more sophisticated field processing,
>   like reordering fields, and handling fields aligned with blank
>   characters.
>   By default @command{awk} uses (and discards) runs of blank characters
>   to separate fields, and ignores leading and trailing blanks.
>   @example
>   @verbatim
>   awk '{print $2}'      # print the second field
>   awk '{print $(NF-1)}' # print the penultimate field
>   awk '{print $2,$1}'   # reorder the first two fields
>   @end verbatim
>   @end example
>   Note while @command{cut} accepts field specifications in
>   arbitrary order, output is always in the order encountered in the file.
>
> and even more: it suggests to use join:
>
>   In the unlikely event that @command{awk} is unavailable,
>   one can use the @command{join} command, to process blank
>   characters as @command{awk} does above.
>   @example
>   @verbatim
>   join -a1 -o 1.2 - /dev/null      # print the second field
>   join -a1 -o 1.2,1.1 - /dev/null  # reorder the first two fields
>   @end verbatim
>   @end example
>
> Is this sufficient?
>
> Have a nice day,
> Berny
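
For anyone who wants to try the three approaches from the quoted documentation side by side, here is a minimal sketch. The sample line 'a b c' and the variable names are made up for illustration; the cut, awk, and join invocations are the ones discussed above, with cut given an explicit space delimiter since its default delimiter is TAB.

```shell
# Hypothetical sample input: one line with three space-separated fields.
line='a b c'

# cut accepts the list "2,1", but still emits fields in input order.
cut_out=$(printf '%s\n' "$line" | cut -d ' ' -f 2,1)

# awk prints the first two fields in swapped order.
awk_out=$(printf '%s\n' "$line" | awk '{print $2,$1}')

# join also swaps them; /dev/null serves as an empty second file,
# and -a1 makes join print the (unpairable) lines from stdin.
join_out=$(printf '%s\n' "$line" | join -a1 -o 1.2,1.1 - /dev/null)

printf 'cut:  %s\nawk:  %s\njoin: %s\n' "$cut_out" "$awk_out" "$join_out"
```

With GNU coreutils I would expect cut to print the fields in their original order and the other two to print them swapped, which is the behaviour the documentation describes.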