bug#14224: Feature request for the `cut`: record delimiter

Pádraig Brady Wed, 17 Apr 2013 18:14:41 -0700

On 04/17/2013 02:26 PM, George Brink wrote:
> Hello,
> 
> I have a task of extracting several "fields" from the text file. The
> standard `cut` tool could be a perfect tool for a job, but...
> In my file the '\n' character is a legal symbol inside fields and therefore
> the text file uses other symbol for record-separator. And the `cut` has a
> hard-coded '\n' for record separator (I just checked the source from the
> coreutils-8.21 package).


The patch would be simple but not without compatibility cost.
I.E. scripts using this would immediately become incompatible
with any systems without this feature.

So you'd like something like tac -s, --separator
However cut -s is taken, so we'd have to avoid the short -s at least.
Also tac -s takes a string rather than a character, so
that gives some extra credence (and complexity) to that option there.

Also related would be to support the -z, --zero-terminated option.
join, sort and uniq all have this option to use NUL as the record separator,
however they're all closely related sort dependent utilities
and we're trying to unify options between them.

If it is just a character you want to separate on,
then you can always use tr to convert before processing,
albeit with associated data copying overhead.

SEP=^
tr "$SEP"'\n' '\n'"$SEP" | cut ... | tr "$SEP"'\n' '\n'"$SEP"

So given that cut is not special here among the text filters,
and there is a workaround available, I'm 60:40 against
adding this feature.

thanks,
Pádraig.

bug#14224: Feature request for the `cut`: record delimiter

Reply via email to