On Fri, Jun 15, 2018 at 2:36 PM, Robert Haas <robertmh...@gmail.com> wrote:
> Yes, retail index deletion is essential for the delete-marking
> approach that is proposed for zheap.
Makes sense. I don't know that much about zheap. I'm sure that retail
index tuple deletion is really important in general, though. The Gray &
Reuter book treats unique keys as a basic assumption, as do other
authoritative reference works and papers. Other database systems
probably make unique indexes simply use the user-visible attributes as
unique values, but appending heap TID as a unique-ifier is probably a
reasonably common design for secondary indexes (it would also be nice
if we could simply not store duplicates for unique indexes, rather than
using heap TID). I generally have a very high opinion of the nbtree
code, but this seems like a problem that ought to be fixed. There's a
rough sketch of what I mean by the TID tiebreaker at the end of this
mail.

I've convinced myself that I basically have the right idea with this
patch, because the classic L&Y invariants have all been tested with an
enhanced amcheck run against all indexes in a regression test database.
There was other stress-testing, too. The remaining problems are
fixable, but I need some guidance.

> It could also be extremely useful in some workloads with the regular
> heap. If the indexes are large -- say, 100GB -- and the number of
> tuples that vacuum needs to kill is small -- say, 5 -- scanning them
> all to remove the references to those tuples is really inefficient.
> If we had retail index deletion, then we could make a cost-based
> decision about which approach to use in a particular case.

I remember talking to Andres about this in a bar 3 years ago. I can
imagine variations of pruning that do some amount of this when there
are lots of duplicates. Perhaps something like InnoDB's purge threads,
which do things like in-place deletes of secondary indexes after an
updating (or deleting) xact commits. I believe that that mechanism
targets secondary indexes specifically, and that it operates quite
eagerly.

> Can you enumerate some of the technical obstacles that you see?

The #1 technical obstacle is that I simply don't know where I should
try to take this patch, given that it probably needs to be tied to some
much bigger project, such as zheap. I have an open mind, though, and
intend to help if I can. I'm not really sure what the #2 and #3
problems are, because I'd need to be able to see a few steps ahead to
be sure. Maybe #2 is that I'm doing something wonky to avoid breaking
duplicate checking for unique indexes. (The way that unique index
duplicate checking has always worked [1] is perhaps questionable,
though.)

> I think it would be helpful if you could talk more about these
> regressions (and the wins).

I think that the performance regressions are due to the fact that, when
you have a huge number of duplicates today, it's useful to be able to
claim space for further duplicates from almost any of the multiple leaf
pages that contain or have contained duplicates. I'd hoped that the
increased temporal locality that the patch gets would more than make up
for that. As far as I can tell, the problem is that the temporal
locality doesn't help enough. I saw that performance was somewhat
improved with extreme Zipf distribution contention, but it went the
other way with less extreme contention.

The details are not that fresh in my mind, since I shelved this patch
for a while following limited performance testing. The code could
certainly use more performance testing, and more general polishing. I'm
not strongly motivated to do that right now, because I don't quite see
a clear path to making this patch useful. But, as I said, I have an
open mind about what the next step should be.
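To spell out what "heap TID as a unique-ifier" means at the level of an
individual comparison, it amounts to something like the sketch below.
This is only an illustration of the general idea, not code from the
patch; compare_user_attrs() is a made-up helper, while
ItemPointerCompare() is the routine that already exists in itemptr.c:

#include "postgres.h"
#include "access/itup.h"
#include "storage/itemptr.h"

/*
 * Illustrative sketch only -- not from the patch.  Compare two index
 * tuples on their user-visible attributes first, then fall back to
 * the heap TID as a trailing tiebreaker, so that every key in the
 * index is distinct.
 */
static int
compare_with_tid_tiebreaker(IndexTuple a, ItemPointer a_tid,
                            IndexTuple b, ItemPointer b_tid)
{
    int         cmp;

    cmp = compare_user_attrs(a, b);     /* hypothetical helper */
    if (cmp != 0)
        return cmp;

    /* All user-visible attributes are equal -- break the tie on TID */
    return ItemPointerCompare(a_tid, b_tid);
}

The point of making the TID a true part of the key is that a descent of
the tree for retail deletion can go straight to the one leaf page that
could hold the tuple, rather than wading through however many pages of
equal-keyed duplicates happen to exist.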
[1] https://wiki.postgresql.org/wiki/Key_normalization#Avoiding_unnecessary_unique_index_enforcement

--
Peter Geoghegan