On 03/08/2016 05:57 PM, Criggie wrote:
I'm sure this can be optimised, but how's this for some dirty hackery?
server.dc home $ `cat _fdupes-2016-03-08.txt | xargs -n 3 |
    awk '{ print "rm -f " $2, $3 " ; ln " $1, $2 " ; ln " $1, $3 }'`
Clue 1
The input file contains the output of fdupes, which listed only triples of
identical files. There were no sets of two or four.
A better version would split the input on blank lines and loop over
files 2 to N of each set.
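
The blank-line idea can be sketched with awk's paragraph mode. This is only a sketch under assumptions: gen_relink is a made-up name, the input is fdupes-style (one path per line, sets separated by blank lines), and paths contain no single quotes or newlines. It also swaps the rm/ln pair for ln -f, which replaces the duplicate in one step. It only prints commands and executes nothing.

```shell
#!/bin/sh
# gen_relink (hypothetical helper): read fdupes-style output on stdin and
# PRINT one "ln -f" command per duplicate. Nothing is executed here.
# Assumption: paths contain no single quotes or newlines.
gen_relink() {
    # RS="" puts awk in paragraph mode: each blank-line-separated set is
    # one record. FS="\n" makes each path in the set one field.
    awk 'BEGIN { RS = ""; FS = "\n" }
         {
             # Field 1 is the copy to keep; fields 2..NF become links to it.
             for (i = 2; i <= NF; i++)
                 printf "ln -f -- '\''%s'\'' '\''%s'\''\n", $1, $i
         }'
}
```

Usage would be something like gen_relink < _fdupes-2016-03-08.txt to review the commands, and the same piped to sh to run them. Sets of any size work, and a set with a single file produces no output.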
Clue 2 before
server.dc home $ ll ./dir?/abc
-rw-rw-r-- 1 root nagios 76047612 Sep 9 22:39 ./dir1/abc
-rw-rw-r-- 1 root nagios 76047612 Sep 9 22:39 ./dir2/abc
-rw-r--r-- 1 root root 76047612 Oct 28 02:52 ./dir3/abc
Clue 3 after
server.dc home $ ll ./dir?/abc
-rw-rw-r-- 3 root nagios 76047612 Sep 9 22:39 ./dir1/abc
-rw-rw-r-- 3 root nagios 76047612 Sep 9 22:39 ./dir2/abc
-rw-rw-r-- 3 root nagios 76047612 Sep 9 22:39 ./dir3/abc
I think the nifty thing was -n 3 for xargs. I was unaware it could do that.
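
For anyone else who hasn't met it, a minimal illustration of what -n 3 does: with no command given, xargs runs echo, so each group of up to three items comes out on its own line. group_by_three is just an illustrative name.

```shell
#!/bin/sh
# group_by_three (illustrative name): print stdin items three per line.
# With no command given, xargs runs echo on each batch of arguments.
group_by_three() {
    xargs -n 3
}

printf '%s\n' a b c d e f | group_by_three
# prints:
#   a b c
#   d e f

# Caveat: xargs ignores blank lines, so a set of two followed by a set
# of three is regrouped across the boundary.
printf 'a\nb\n\nc\nd\ne\n' | group_by_three
# prints:
#   a b c
#   d e
```

That caveat is why the trick is only safe when every set has exactly three members. By default xargs also splits on spaces inside file names.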
Answer
This machine has a lot of largish files triplicated on the disk. Since I
can't convert it to a filesystem with deduplication, this deleted 2/3 of
the files, and hard linked them back into place.
The script merely prints shell commands, which the backticks then
execute. So testing it is just a matter of running the pipeline without
the backticks.
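
A small sketch of the dry-run versus execute split, with a stand-in generator (gen is hypothetical, not the real pipeline). One thing worth checking before relying on backticks for this: the output of a command substitution is word-split but not re-parsed as shell syntax, so a ";" in it reaches the first command as a plain argument. Piping the generated lines to sh avoids that.

```shell
#!/bin/sh
# gen (hypothetical): stands in for the fdupes | xargs | awk pipeline and
# prints one generated command line.
gen() {
    printf '%s\n' 'echo first ; echo second'
}

gen         # dry run: prints   echo first ; echo second
gen | sh    # execute: sh parses ";" as a separator, prints first, then second
`gen`       # backticks: ";" is just an argument, prints   first ; echo second
```

With rm -f as the first word, the backtick form would hand every later word to rm as a file name, so the pipe-to-sh form looks like the safer way to execute reviewed output.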
So the mount in question went from 355GB in use to 170GB, or 93% to 45%
usage.
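
Since the fdupes listing is a snapshot, a more cautious take on the delete-and-hard-link step is to re-check the contents just before linking. This is a sketch only: relink_if_identical is a made-up helper, and it assumes bash (for local and the -ef test) and that both files sit on the same filesystem.

```shell
#!/bin/bash
# relink_if_identical KEEP DUP (hypothetical helper): replace DUP with a
# hard link to KEEP, but only if the contents are still identical.
relink_if_identical() {
    local keep=$1 dup=$2
    [ "$keep" -ef "$dup" ] && return 0      # already the same inode
    cmp -s -- "$keep" "$dup" || return 1    # contents differ: leave DUP alone
    ln -f -- "$keep" "$dup"                 # replace DUP with a link to KEEP
}
```

For example, relink_if_identical ./dir1/abc ./dir3/abc. After linking, all names share one inode, so the owner, mode and timestamp are those of the kept file. That matches the "after" listing, where dir3/abc picked up root nagios and -rw-rw-r--.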
I'd add a --no-run-if-empty to the xargs, just to be paranoid. Can't see
the need for backticks either (in modern bashes they are better written
as $(...), which nests and quotes more cleanly; both forms still run the
command in a subshell)
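
A quick sketch of both points, assuming GNU xargs (--no-run-if-empty, short form -r, is a GNU extension). run_if_any is an illustrative name and /srv/dir1/abc is a made-up path.

```shell
#!/bin/sh
# run_if_any (illustrative name): echo the stdin items, but skip the run
# entirely when stdin is empty. Assumes GNU xargs.
run_if_any() {
    xargs --no-run-if-empty echo ran:
}

printf '' | xargs echo ran:      # GNU xargs still runs echo once: "ran:"
printf '' | run_if_any           # prints nothing
printf 'a b\n' | run_if_any      # prints "ran: a b"

# $(...) nests without the backslash escaping that backticks need.
parent=$(basename "$(dirname /srv/dir1/abc)")
echo "$parent"                   # prints "dir1"
```

Without the flag, an empty fdupes file would still send one blank line through to awk, which would then print an rm/ln command line with no file names in it.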
(actually I'd just buy a bigger disk)
Steve
--
Steve Holdoway BSc(Hons) MIITP
http://www.greengecko.co.nz
Linkedin: http://www.linkedin.com/in/steveholdoway
Skype: sholdowa