Hi Padraig,

> I do think --number is more general than --chunk as it allows you to specify
> only 1 number to get the behaviour described above. Also I notice that
> FreeBSDs split recently got a '-n chunk_count' option, so it would be good
> to maintain compat with that if possible.

I read the FreeBSD source. It's interesting that Berkeley gave the
copyright to the UC Regents, who just skyrocketed my tuition. Anyhow...

More on topic, their --number option is actually quite trivial: they
compute size = st_size/n and proceed as if it were --bytes=size. In a
sense, this chunks option can be seen as an extension of their --number
option.

I think what I'll end up doing is this: implement their --number option,
outputting the chunks to files. Then extend it to support --number=n/tot,
which outputs to stdout. For delineation by newlines, I'll call it
something like --number-lines=n, which outputs all chunks to files with
split's cwrite, and what I have now becomes --number-lines=n/tot, which
extracts a single chunk to stdout.

> We also need to decide how to select between text and binary modes for
> --number.
> Note reading from non seekable input complicates things.
> For binary data I don't see how one could support --number.

So under this scheme it'd be up to the user whether to use --number or
--number-lines. --number of course supports binary, since it's byte
delineation rather than line delineation.

Lastly, I tested using this with sorting. As expected, it's not faster.
This was run on gcc14; rand is a million-line ASCII file generated by
gensort. Like I said, I'll try to implement the same concept internally
within sort, so we're free of the pipe overhead, and see how that goes.

c...@gcc14:~/testing$ time ./sortgl --threads=8 rand > /dev/null

real 0m1.820s
user 0m5.236s
sys  0m0.168s

c...@gcc14:~/testing$ time sort -m <(./split -c1,8 rand | sort) \
    <(./split -c2,8 rand | sort) <(./split -c3,8 rand | sort) \
    <(./split -c4,8 rand | sort) <(./split -c5,8 rand | sort) \
    <(./split -c6,8 rand | sort) <(./split -c7,8 rand | sort) \
    <(./split -c8,8 rand | sort) > /dev/null

real 0m2.198s
user 0m5.324s
sys  0m0.440s

And lastly, you guys probably won't hear back from me for a couple of
weeks on anything. It's the end of the quarter at UCLA, and that means
fun projects and even more fun finals.
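
P.S. For reference, the chunk arithmetic discussed earlier in this mail
could be sketched roughly as follows. This is only an illustration in
Python, not the split patch itself. The names byte_chunk_bounds and
line_chunk are made up for the example, and two behaviours are
assumptions on my part: that the last chunk absorbs the remainder of
st_size/n, and that line-delineated chunks move each byte boundary
forward to the next line start.

```python
def byte_chunk_bounds(total_size, k, n):
    """Return (start, end) byte offsets of chunk k of n, with k 1-based.

    Mirrors size = st_size / n; the last chunk is assumed to absorb
    whatever remainder the integer division leaves.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    size = total_size // n
    start = (k - 1) * size
    end = total_size if k == n else k * size
    return start, end


def line_chunk(data, k, n):
    """Return chunk k of n of `data` (bytes), cut only at line starts.

    Each byte boundary is moved forward to the first line start at or
    after it, so every line lands in exactly one chunk.
    """
    def next_line_start(pos):
        if pos <= 0:
            return 0
        if pos >= len(data):
            return len(data)
        # pos is a line start iff data[pos - 1] is a newline, so search
        # for the first newline at index >= pos - 1.
        nl = data.find(b"\n", pos - 1)
        return len(data) if nl == -1 else nl + 1

    start, end = byte_chunk_bounds(len(data), k, n)
    return data[next_line_start(start):next_line_start(end)]


if __name__ == "__main__":
    sample = b"aa\nbbbb\nc\ndddddd\n"
    for k in range(1, 4):
        print(k, line_chunk(sample, k, 3))
```

Concatenating the chunks for k = 1..n gives back the input, which is the
property the sort -m pipeline relies on. A line longer than one chunk
simply leaves the following chunk(s) empty.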