Dear PostgreSQL Hackers:
After following the hackers mailing list for quite a while,
I am going to start investigating what will need to be done
to improve hash index performance. Below are the pieces of
this project that I am currently considering:
1. Characterize the current hash index implementation against
the BTree index, with a focus on space utilization and
lookup performance over a collection of test data. This
will give a baseline against which to evaluate the impact
of changes. I do not plan to benchmark hash index creation at
first, since my initial focus will be on lookup performance.
2. Evaluate the performance of different hash index implementations
and/or changes to the current implementation. My current plan is
to keep the implementation as simple as possible and still provide
the desired performance. Several hash index suggestions deal with
changing the layout of the keys on a page to improve lookup
performance, including reducing the bucket size to a fraction of
a page or storing only the hash value on the page instead of
the index value itself. My goal in this phase is to produce one
or more versions with better performance than the current BTree index.
3. Look at build time and concurrency issues, adding further
tests to the test bed to cover them.
4. Repeat as needed.
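
To make the "hash value only" idea from item 2 concrete, here is a
rough toy sketch in Python. It is not PostgreSQL code and not a
proposed page format. The names HashOnlyBucket, ToyHashIndex and
hash_key are hypothetical, CRC32 merely stands in for a real hash
function, and a Python list stands in for the heap. It only shows that
a bucket holding sorted (hash code, row id) pairs can be probed by
binary search, and that every candidate must then be rechecked against
the heap because different keys can share a hash code.

```python
import bisect
import zlib


def hash_key(key):
    # Stand-in hash function (assumption): CRC32 of the UTF-8 bytes.
    return zlib.crc32(key.encode("utf-8"))


class HashOnlyBucket:
    """Toy bucket holding (hash_code, row_id) pairs sorted by hash code."""

    def __init__(self):
        self.entries = []

    def insert(self, hash_code, row_id):
        bisect.insort(self.entries, (hash_code, row_id))

    def candidates(self, hash_code):
        # Row ids are non-negative, so (hash_code, -1) sorts just before
        # the first entry carrying this hash code.
        i = bisect.bisect_left(self.entries, (hash_code, -1))
        found = []
        while i < len(self.entries) and self.entries[i][0] == hash_code:
            found.append(self.entries[i][1])
            i += 1
        return found


class ToyHashIndex:
    """Toy index over a list 'heap'; the row id is the list position."""

    def __init__(self, heap, n_buckets=8):
        self.heap = heap
        self.buckets = [HashOnlyBucket() for _ in range(n_buckets)]
        for row_id, key in enumerate(heap):
            h = hash_key(key)
            self.buckets[h % n_buckets].insert(h, row_id)

    def lookup(self, key):
        h = hash_key(key)
        bucket = self.buckets[h % len(self.buckets)]
        # The key itself is not stored in the bucket, so recheck each
        # candidate against the heap to discard hash collisions.
        return [rid for rid in bucket.candidates(h) if self.heap[rid] == key]


heap = ["apple", "banana", "apple", "cherry"]
index = ToyHashIndex(heap)
print(index.lookup("apple"))
print(index.lookup("durian"))
```

The trade-off this illustrates is smaller, fixed-width index entries
in exchange for an extra heap visit to confirm each match.
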
This is the rough plan. Does anyone see anything critical that
is missing at this point? Please send me any suggestions for test
data and various performance test ideas, since I will be working
on that first.
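
As a starting point for the baseline in item 1, the comparison could
be scripted roughly as follows. This is only a sketch under stated
assumptions: the helper baseline_steps and the table and column names
"words" and "word" are hypothetical, and the script just prints SQL to
run by hand (for example through psql) rather than connecting to a
server. It builds one index type at a time, reads its on-disk size
with pg_relation_size, runs one equality lookup under EXPLAIN ANALYZE,
and drops the index before trying the other type.

```python
def baseline_steps(table, column, probe_literal):
    """Return SQL statements comparing a hash and a btree index on one column."""
    steps = []
    for method in ("hash", "btree"):
        # Hypothetical naming scheme for the temporary test index.
        index = f"{table}_{column}_{method}_idx"
        steps += [
            f"CREATE INDEX {index} ON {table} USING {method} ({column});",
            f"SELECT pg_relation_size('{index}');",
            f"EXPLAIN ANALYZE SELECT * FROM {table} "
            f"WHERE {column} = {probe_literal};",
            f"DROP INDEX {index};",
        ]
    return steps


for statement in baseline_steps("words", "word", "'example'"):
    print(statement)
```

The EXPLAIN ANALYZE output should be checked to confirm that the index
under test was actually used, and each lookup should be repeated
several times so that caching effects do not dominate the timings.
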