I wrote:
> Seems like we have two choices:
> 1. Remove the leakproof marking on texteq()/textne().  This
> would, unfortunately, be catastrophic for performance in
> a lot of scenarios where leakproofness matters.
> 2. Fix text_cmp to actually be leakproof, whereupon we might
> as well mark all the remaining btree comparison operators
> leakproof.
> I think #2 is pretty obviously the superior answer, if we
> can do it.

After burrowing down further, it's visibly the case that
text_cmp and varstr_cmp don't leak in the sense of actually
reporting any part of their input strings.  What they do do,
in some code paths, is things like

        ereport(ERROR,
                (errmsg("could not convert string to UTF-16: error code %lu",
                        GetLastError())));

So this seems to mostly be a question of interpretation:
does an error like this represent enough of an information
leak to violate the promises of leakproofness?  I'm not sure
that this error is really capable of disclosing any information
about the input strings.  (Note that this error occurs only if
we failed to convert UTF8 to UTF16, which probably could only
happen if the input isn't valid UTF8, which would mean a failure
of encoding checking somewhere up the line.)

There are also various pallocs and such that could fail, which
perhaps could be argued to disclose the lengths of the input
strings.  But that hazard exists already inside PG_GETARG_TEXT_PP;
if you want to complain about that, then functions like byteaeq()
aren't legitimately leakproof either.

On the whole I'm prepared to say that these aren't leakproofness
violations, but it would depend a lot on how strict you want to be.

                        regards, tom lane


Reply via email to