On 02/06/2013 10:09 PM, Assaf Gordon wrote:
> Hello,
>
> Attached is a patch that gives 'csplit' the ability to split files by the
> content of a field.
> A typical usage is:
>
>   ## the "@1" pattern means "start a new file when field 1 changes"
>   $ printf "A\nA\nB\nB\nB\nC\n" | csplit - @1 '{*}'
>   $ wc -l xx*
>   2 xx00
>   3 xx01
>   1 xx02
>   6 total
>   $ head xx*
>   ==> xx00 <==
>   A
>   A
>
>   ==> xx01 <==
>   B
>   B
>   B
>
>   ==> xx02 <==
>   C
>
> This is just a proof of concept, and the pattern specification can be
> changed (I think "@N" doesn't conflict with any existing pattern).
>
> The same can probably be achieved using other programs (awk comes to mind),
> but it won't be as simple and clean (with all of csplit's output features).
>
> Let me know if you're willing to consider such an addition.
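
For comparison, here is a rough sketch of what the awk route mentioned above might look like. It assumes whitespace-separated fields and borrows csplit's default xxNN output names; the function name split_by_field1 is made up for illustration. It offers none of csplit's prefix, suffix or keep-files options, which is the gap the patch is aiming at.

```shell
#!/bin/sh
# Sketch only: start a new output file whenever field 1 changes.
# split_by_field1 is a hypothetical helper name, not part of any tool.
# Output names (xx00, xx01, ...) imitate csplit's defaults.
split_by_field1() {
    awk '
        NR == 1 || $1 != prev {          # first line, or the key changed
            if (out != "") close(out)    # release the previous file
            out = sprintf("xx%02d", n++) # n starts out as 0
        }
        { print > out; prev = $1 }       # append line to the current file
    '
}

# Same input as in the quoted example; files are written to the
# current directory.
printf 'A\nA\nB\nB\nB\nC\n' | split_by_field1
wc -l xx*
```

Unlike csplit, this does not report the size of each piece and does not clean up after itself on error.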

Yes, such a feature is useful, though maybe in conjunction with uniq:
http://lists.gnu.org/archive/html/coreutils/2011-03/msg00000.html

So basically the proposal there is to support --suppress-matched
so that you could then do:

  uniq -w1 --unique=separated --all-repeated=separated |
  csplit --suppress-matched - '/^$/' '{*}'

The caveat with that, though, is that uniq would benefit from
better field selection, which is also on the TODO list.

cheers,
Pádraig.
