Hi!

According to Issue 8 Draft 3, XCU, split, OPTIONS:
114711  −l line_count Specify the number of lines in each resulting file piece. The line_count argument is
114712     an unsigned decimal integer. The default is 1 000. If the input does not end with a
114713     <newline>, the partial line shall be included in the last output file.

This matches verbatim POSIX 1003.2a (Draft 8), Section 5: User
Portability Utilities Option, 5.25 split ‒ Split files into pieces,
5.25.3 Options:
5162  −l line_count
5163  −line_count (Obsolescent.)
5164  Specify the number of lines in each resulting file piece.  The
5165  line_count argument is an unsigned decimal integer. The default
5166  is 1000. If the input does not end with a <newline>, the partial
5167  line shall be included in the last output file.

That text was carried into Issue 4 (Issues 2 and 3 carried the SysVr1
manual, which was silent about this).

This appears to be invented from whole cloth, since /no/ historical
implementation does this (shown below, in order: V7, SysVr4, SysVr1,
4.3BSD-Tahoe, 4.4BSD-Lite2, coreutils):
  $ rm -f xa?; { seq 10; printf abc; } | ./split.v7 -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$ rm -f xa?; { seq 10; printf abc; } | ./split.sysvr4 -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$ rm -f xa?; { seq 10; printf abc; } | ./split.sysvr1 -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$ rm -f xa?; { seq 10; printf abc; } | CSRG/split.4.3tahoe -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$ rm -f xa?; { seq 10; printf abc; } | CSRG/split.4.4 -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$ rm -f xa?; { seq 10; printf abc; } | /bin/split -10; head xa[ab]
  ==> xaa <==
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  
  ==> xab <==
  abc$
(i.e. the partial line is treated as a line and is allowed to start xab,
      or, indeed, each output file is terminated as soon as it receives
      line_count lines).

The illumos gate and NetBSD agree with this historical interpretation.

The only systems I found off-hand that did actually put abc in xaa were
FreeBSD and OpenBSD. I'd argue this is wrong, since ‒ even though they
do match the standard ‒ the standard doesn't match historical practice.
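
To repeat the check above against some other split, something like the
following sketch might do. partial_line_home is a hypothetical helper name of
mine, not anything from the standard or an implementation, and it assumes
mktemp -d is available. It feeds ten complete lines plus an unterminated "abc"
to split -l 10 and reports which file the partial line ended up in.

```shell
# Hypothetical helper, not part of any split implementation.
# Prints the name of the output file that receives a trailing partial line.
# $1: split command to test (default: split); use an absolute path,
#     since the check runs inside a scratch directory.
partial_line_home() {
    split_cmd=${1:-split}
    tmp=$(mktemp -d) || return 1    # mktemp -d is assumed to be available
    (
        cd "$tmp" || exit 1
        # Ten complete lines, then "abc" with no terminating <newline>.
        { printf '%s\n' 1 2 3 4 5 6 7 8 9 10; printf abc; } |
            "$split_cmd" -l 10 || exit 1
        [ -e xaa ] || exit 1
        if [ -e xab ]; then
            echo xab    # historical: the partial line starts a new file
        else
            echo xaa    # standard text: the partial line joins the last file
        fi
    )
    status=$?
    rm -rf "$tmp"
    return "$status"
}

partial_line_home
```

An answer of xab corresponds to the historical behaviour in the transcripts
above, and xaa to the reading FreeBSD and OpenBSD take.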

Best,
наб
