Junio C Hamano <[email protected]> writes:
> Looking at the implementation of get_ws_cleaned_string() that is the
> workhorse of emitted_symbol_cmp_no_ws(), it seems to be doing wrong
> things for various "ignore whitespace" options (i.e. there is only
> one implementation, while "git diff" family takes things like
> "ignore space change", "ignore all whitespace", etc.), though.
This probably deserves a bit more illustration of how I envision the
code should evolve.
In the longer term, I would prefer to see emitted_symbol_cmp_no_ws()
to go and instead emitted_symbol_cmp() to take the diff options so
that it can change the behaviour of the comparison function based on
the -w/-b/--ignore-space-at-eol/etc. settings. And compare two strings
in place.
For that, you may need a helper function that takes a pointer to a
character pointer, picks the next byte that matters while advancing
the pointer, and returns that byte. The emitted_symbol_cmp(a, b)
which is not used for real comparison (i.e. ordering to see if a
sorts earlier than b) but for equivalence (i.e. considering various
whitespace-ignoring settings, does a and b matfch?) may become
something like:
int
emitted_symbol_eqv(struct emitted_diff_symbol *a,
struct emitted_diff_symbol *b,
const void *keydata) {
struct diff_options *diffopt = keydata;
const char *ap = a->line;
const char *bp = b->line;
while (1) {
int ca, cb;
ca = next_byte(&ap, diffopt);
cb = next_byte(&bp, diffopt);
if (ca != cb)
return 1; /* differs */
if (!ca)
return 0;
};
}
where next_byte() may look like:
static int
next_byte(const char **cp, struct diff_options *diffopt)
{
int retval;
again:
retval = **cp;
if (!retval)
return retval; /* end of string */
if (!isspace(retval)) {
(*cp)++; /* advance */
return retval;
}
switch (ignore_space_type(diffopt)) {
case NOT_IGNORING:
(*cp)++; /* advance */
return retval;
case IGNORE_SPACE_CHANGE:
while (**cp && isspace(**cp))
(*cp)++; /* squash consecutive spaces */
return ' '; /* normalize spaces with a SP */
case IGNORE_ALL_SPACE:
(*cp)++; /* advance */
goto again;
... other cases here ...
}
}