Tom Lane wrote:
Both of these pages say up front that they are considering read-only data. So one of the questions that has to be answered (and the submitters have been entirely mum about) is exactly how bad is the update performance? If it's really awful that's going to constrain the use cases quite a lot, whereas merely "a bit slower than btree" wouldn't be such a problem. In any case, arguing that other DBs find it's a win will cut no ice with me. See adjacent discussion about hash indexes --- those *ought* to be a win, but they aren't in Postgres, for reasons that are still guesses. The translation gap between other DBs' experience and ours can be large.
Notwithstanding that, I have a couple of non-postgres data points / anecdotes on this.
Back in my days as an Ingres DBA in the mid 90s, our fairly highly tuned system used hash organised tables only for small fairly static lookup-type tables (state codes, postcodes, etc). Everything that was more dynamic was done with btree indexed tables.
A few years later, I was producing very large tables of email addresses using BDB. I quickly stopped using hash tables when I found that the reorganisation penalty was huge. Switching to btree worked just fine, with no sudden performance blip. This might not be directly relevant, but clearly the bucket size is.
I guess what we need to demonstrate is that the better hash performance will actually persist to a scale where it is actually worth it - surely for very small tables the index method won't matter much anyway.
cheers andrew ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match