I wrote:
> Seems like we have two choices:
>
> 1. Remove the leakproof marking on texteq()/textne(). This
> would, unfortunately, be catastrophic for performance in
> a lot of scenarios where leakproofness matters.
>
> 2. Fix text_cmp to actually be leakproof, whereupon we might
> as well mark all the remaining btree comparison operators
> leakproof.
>
> I think #2 is pretty obviously the superior answer, if we
> can do it.

After burrowing down further, it's visibly the case that
text_cmp and varstr_cmp don't leak in the sense of actually
reporting any part of their input strings. What they do do,
in some code paths, is things like

    ereport(ERROR,
            (errmsg("could not convert string to UTF-16: error code %lu",
                    GetLastError())));

So this seems to mostly be a question of interpretation:
does an error like this represent enough of an information
leak to violate the promises of leakproofness?

I'm not sure that this error is really capable of disclosing
any information about the input strings. (Note that this error
occurs only if we failed to convert UTF8 to UTF16, which probably
could only happen if the input isn't valid UTF8, which would mean
a failure of encoding checking somewhere up the line.)

There are also various pallocs and such that could fail, which
perhaps could be argued to disclose the lengths of the input
strings. But that hazard exists already inside PG_GETARG_TEXT_PP;
if you want to complain about that, then functions like byteaeq()
aren't legitimately leakproof either.

On the whole I'm prepared to say that these aren't leakproofness
violations, but it would depend a lot on how strict you want to be.

regards, tom lane