Proposed enhancement to diff:
Running diff on two very different files can take a very long time
and use a lot of memory.
diff -q uses the same algorithm even though the status is
known at the first difference.
I propose ending the comparison at the first difference if
diff is invoked with -q and without -w, -i, or -b.
The changes pass the regression tests and all the tests I've tried.
I believe the changes are not machine dependent.
I invite criticism and counterexamples.
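To make the idea concrete, here is a minimal standalone sketch of "stop at the first difference". It is not the code from the patch: streams_differ() is a hypothetical helper written only for illustration, and the 8192-byte buffer size is an arbitrary choice.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of diff(1).
 * Returns 0 if both streams have identical contents, 1 if they
 * differ, and -1 on a read error.  Reading stops at the first
 * block that differs, so the cost is bounded by the position of
 * the first difference rather than by the file sizes.
 */
int
streams_differ(FILE *f1, FILE *f2)
{
	char buf1[8192], buf2[8192];
	size_t n1, n2;

	for (;;) {
		n1 = fread(buf1, 1, sizeof(buf1), f1);
		n2 = fread(buf2, 1, sizeof(buf2), f2);
		if (ferror(f1) || ferror(f2))
			return -1;
		if (n1 != n2 || memcmp(buf1, buf2, n1) != 0)
			return 1;	/* first difference: stop here */
		if (n1 == 0)
			return 0;	/* both streams reached EOF together */
	}
}
```

This only works when a byte-for-byte mismatch really means "differ", which is why the proposal excludes -w, -i, and -b: with those options two files can differ byte-wise and still compare equal.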
Example:
$ ls -l trash.120403 trash.120711
-rw------- 1 gwes users 249686538 Apr 3 2012 trash.120403
-rw-r--r-- 1 gwes users 142356923 Jul 11 2012 trash.120711
$ time diff -q trash.120403 trash.120711
diff:
1m51.52s real 1m47.66s user 0m2.46s system
top output:
load averages: 1.02, 0.91, 0.58 xxxx.oat.com 15:41:54
49 processes: 47 idle, 2 on processor
CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU1 states: 98.4% user, 0.0% nice, 1.6% system, 0.0% interrupt, 0.0% idle
Memory: Real: 403M/785M act/tot Free: 796M Cache: 312M Swap: 0K/1248M
PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
18740 gwes 57 0 362M 333M onproc/1 biowait 1:05 95.61% diff
$ time work/newdiff/diff -q trash.120403 trash.120711
Files trash.120403 and trash.120711 differ
0m0.00s real 0m0.00s user 0m0.00s system
The code changes:
$ diff -u diff.h work/newdiff/diff.h
--- diff.h Thu May 15 16:29:15 2014
+++ work/newdiff/diff.h Thu May 15 15:57:30 2014
@@ -64,6 +64,10 @@
#define D_PROTOTYPE 0x080 /* Display C function prototype */
#define D_EXPANDTABS 0x100 /* Expand tabs to spaces */
#define D_IGNOREBLANKS 0x200 /* Ignore white space changes */
+ /* test for possible return at first difference */
+#define CANBRIEFRETURN(flags) (((flags) & (D_FOLDBLANKS | D_IGNORECASE \
+ | D_IGNOREBLANKS \
+ )) == 0)
/*
* Status values for print_status() and diffreg() return values
$ diff -u diffreg.c work/newdiff/diffreg.c
--- diffreg.c Thu May 15 16:29:15 2014
+++ work/newdiff/diffreg.c Thu May 15 16:31:19 2014
@@ -366,6 +366,15 @@
status |= 1;
goto closem;
}
+ if ((diff_format == D_BRIEF) && CANBRIEFRETURN(flags)) {
+ anychange = 1;
+ if (flags & D_HEADER) {
+ diff_output("%s %s %s\n",
+ diffargs, file1, file2);
+ flags &= ~D_HEADER;
+ }
+ goto closem;
+ }
if (lflag) {
/* redirect stdout to pr */
int pfd[2];
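For anyone who wants to check the macro in isolation, here is a small sketch. Only the D_IGNOREBLANKS value is taken from the diff.h context shown above; the values for D_HEADER, D_FOLDBLANKS, and D_IGNORECASE are my assumptions and should be checked against the real header. can_brief_return() is a hypothetical wrapper that exists only so the macro can be called from a test.

```c
#include <stdio.h>

#define D_HEADER	0x001	/* assumed value, check diff.h */
#define D_FOLDBLANKS	0x010	/* assumed value, check diff.h */
#define D_IGNORECASE	0x040	/* assumed value, check diff.h */
#define D_IGNOREBLANKS	0x200	/* value from the diff.h hunk above */

/* Same test as in the patch: true when no blank/case folding is requested. */
#define CANBRIEFRETURN(flags) (((flags) & (D_FOLDBLANKS | D_IGNORECASE \
					| D_IGNOREBLANKS)) == 0)

/* Hypothetical wrapper around the macro, for testing only. */
int
can_brief_return(int flags)
{
	return CANBRIEFRETURN(flags);
}
```

Flags unrelated to folding, such as D_HEADER, do not block the early return; any one of the three folding flags does.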