I'm not aware of anything that will do an in-place merge, but have you considered compressing the files first, to free up enough space that you could manage it without doing it in place? I've seen plain ASCII text compress at 6:1 or better, but it will depend on the specific data.
Basically something roughly like this (verify the syntax first):

  # compress the individual files
  for i in file1 file2; do gzip --verbose "$i"; done

  # merge them, dropping duplicate lines, into a new compressed file
  zcat file1.gz file2.gz | sort -u | gzip > file3.gz

Not sure if that will save you enough space to get it done without the merge being in-place, but it's worth considering.
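One caveat: sort will spill its working set to temporary files while it runs, and that eats disk too. If you have GNU coreutils, you can point those temp files at another filesystem and have sort compress them as it goes. A rough sketch, assuming GNU sort and a hypothetical /mnt/scratch directory with some free space:

  # -T redirects sort's temp files to another disk (hypothetical path),
  # and --compress-program gzips them, so the merge needs very little
  # extra space on the full filesystem
  zcat file1.gz file2.gz | sort -u -T /mnt/scratch --compress-program=gzip | gzip > file3.gz

Either way, I'd sanity-check the result before deleting the originals, e.g. compare "zcat file3.gz | wc -l" against the combined line counts of the inputs.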
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu

On 01/02/2014 10:12 AM, S. Dale Morrey wrote:
> I have a 15 GB file and a 2 GB file. Both are sourced from somewhat
> similar data so there could be quite a lot of overlap; both are in the
> same format, i.e. plaintext CSV, 1 entry per line.
>
> I'd like to read the 2 GB file and add any entries that are present in
> it but missing in the 15 GB file, basically merging the 2 files. Space
> is at a premium, so I'd prefer it be an in-place merge onto the 15 GB
> file. Even if a temp file is made, it would still be best if the end
> result was a single 17 GB file.
>
> I don't want to reinvent the wheel. The time to perform the operation
> is irrelevant, but I'd greatly prefer there not be any dupes. Is there
> a bash command that could facilitate the activity?