Hi,

> On 25.01.2017 at 16:10, Erik Brinkman <[email protected]> wrote:
>
> I had completely overlooked tee. The original use case was to split a csv
> by column, in which case a four column csv gets pretty verbose:
>
> source_command | tee >(cut -d, -f1 >file1) | tee >(cut -d, -f2 > file2) |
> tee >(cut -d, -f3 > file3) > >(cut -d, -f4 > file4)

I assume the number of columns in the csv is fixed, i.e. 4 in your case,
so `split` can distribute them to individual files:

$ source_command | tr ',' '\n' | split -n r/4

This works like a `vsplit`, the opposite of `paste`.

-- Reuti


> However, I haven't needed to do something like this that frequently, and it
> seems like the added complexity to cut is probably not worth it. Thanks for
> the suggestion.
>
> Erik
>
> On Wed, Jan 25, 2017 at 4:42 AM Pádraig Brady <[email protected]> wrote:
>
>> On 25/01/17 01:13, Erik Brinkman wrote:
>>> It'd be nice if cut allowed writing to several files. I'm not sure what the
>>> appropriate syntax for something like this would be, but I could see a
>>> command looking something like:
>>>
>>> cut -f 1,2 filename1 -f 3-5 filename2
>>>
>>> or maybe
>>>
>>> cut -f 1,2:filename1:3-5:filename2
>>>
>>> I don't think the first syntax is POSIX, and it's definitely not backwards
>>> compatible. The second might work, but is pretty ugly. I couldn't find
>>> anything related to this in the archive or in the rejected feature
>>> requests. Some alternatives with downsides:
>>>
>>> - Save the buffer and use cut repeatedly on that. The downside is it
>>>   requires the buffer to be saved.
>>> - I managed to throw together an awk script that could be tailored to do
>>>   similar things. This writes column 1 to file 1, etc. for all of the
>>>   listed files:
>>>
>>> awk 'BEGIN { NUM = ARGC; if ( ARGC > 2 ) ARGC = 2 } { for ( I = 2; I < NUM; ++I) { print $(I - 1) > ARGV[I] } }' input_file column_1 column_2
>>>
>>> The nice part is that this works with subprocesses without saving
>>> entire intermediate buffers:
>>>
>>> paste <(seq 1 10) <(seq 11 20) | awk 'BEGIN { NUM = ARGC; if ( ARGC > 2 ) ARGC = 2 } { for ( I = 2; I < NUM; ++I) { print $(I - 1) > ARGV[I] } }' - >(paste -sd+ | bc) >(paste -sd+ | bc)
>>>
>>> However, this is ugly, pretty manual, and doesn't support ranges very
>>> easily.
>>>
>>> It seems plausible that cut source could be modified to store a field /
>>> character list for each file, open up all of them, and write characters /
>>> fields out on the fly as it normally does with stdout. I'm happy to
>>> implement this myself and patch, but I'm uncertain if the coreutils team
>>> views this as an appropriate addition, and if so, what a proper syntax
>>> would look like. It seems like since this is modifying the field spec of
>>> cut, it could potentially have ramifications for other field specifications
>>> in coreutils, although I can't think of any that relate to writing, so it
>>> may not matter.
>>
>> Would tee suffice for this use case?
>>
>> source_command | tee >(cut -f1,2 >file1) > >(cut -d' ' -f3,5 >file2)
>>
>> thanks,
>> Pádraig
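
For reference, the `tr | split -n r/4` suggestion above can be tried on a small made-up CSV. This is only a sketch under a few assumptions: GNU `split` is available (the `-n r/N` round-robin mode is a GNU extension), every row has exactly 4 fields, and no field contains a quoted comma or an embedded newline. The `vsplit_demo` directory, the sample rows, and the `col_` prefix are invented for the example.

```shell
# Sketch only: the sample rows, the vsplit_demo directory, and the col_
# prefix are made up for illustration.
mkdir -p vsplit_demo

# A small 4-column CSV without quoted fields or embedded commas.
printf '%s\n' 'a1,b1,c1,d1' 'a2,b2,c2,d2' 'a3,b3,c3,d3' > vsplit_demo/input.csv

# Put one field per line, then deal the lines out round-robin into 4 files.
# "-" reads stdin; the prefix replaces the default "x", so the output
# files are col_aa, col_ab, col_ac and col_ad.
tr ',' '\n' < vsplit_demo/input.csv | split -n r/4 - vsplit_demo/col_

# Each file now holds one column; col_aa is the first.
cat vsplit_demo/col_aa

# paste is the inverse: it glues the columns back into the original rows.
paste -d, vsplit_demo/col_aa vsplit_demo/col_ab vsplit_demo/col_ac \
    vsplit_demo/col_ad > vsplit_demo/rejoined.csv
cmp vsplit_demo/input.csv vsplit_demo/rejoined.csv && echo "round trip OK"
```

A row with a missing or extra field would shift every later field into the wrong file, so this only fits input where the column count really is fixed.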
