Hi,

> I want to make sure we don't regress in the case where the unique
> index has more bloat than the non-unique index. I experimented with a
> small dataset where the unique index grew to 4.5 GB due to bloat and
> observed that the patched was about 20% slower (a couple of seconds at
> this scale, but in production with larger tables, more concurrent
> activity, more bloat and limited memory, I would expect the gap to be
> wider). Would you be able to provide some additional data points on
> this before we proceed further? That would help confirm the heuristic
> holds up across different bloat conditions.

I agree that choosing a bloated unique index could lead to worse performance than a non-bloated non-unique index in certain cases. However, I think there are many cases where the larger, more bloated index would perform significantly better than a less bloated index with worse selectivity. Even without exact numbers, it should be clear that the example from my original email is one of those cases. Choosing an index based on size alone, without taking other statistics into account, would likely lead to many wrong decisions.

Moreover, I would argue that even if choosing a unique index led to worse performance, it should not be considered a regression. Today, index selection is essentially random, so there is no guarantee about which index is chosen. I'd argue that a savvy user must assume that the worst index is chosen when reasoning about performance. In addition, given that this patch will likely only be applied in a new major version, any existing stability in the index ordering used for selection would change during the dump and restore anyway.

I'd also reiterate that this is a small, incremental improvement that I found would be helpful in a few situations I have observed in user workloads. I don't think it excludes future optimizations that take more factors into account, such as size or bloat, but those must be considered in combination with other statistics. I'd be interested in looking into and helping with the development of those features in the future.

Best,
Ethan
SDE, Amazon Web Services
