Re: Use simplehash.h instead of dynahash in SMgr

2022-10-23 Thread Andres Freund
Hi, On 2021-04-25 03:58:38 +1200, David Rowley wrote: > Currently, we use dynahash hash tables to store the SMgrRelation so we > can perform fast lookups by RelFileNodeBackend. However, I had in mind > that a simplehash table might perform better. So I tried it... > The test case was basically

Re: Use simplehash.h instead of dynahash in SMgr

2021-12-02 Thread Michael Paquier
On Tue, Oct 05, 2021 at 11:07:48AM +1300, David Rowley wrote: > I think I'd much rather talk about the concerns here than just > withdraw this. Even if what I have today just serves as something to > aid discussion. Hmm. This last update was two months ago, and the patch does not apply anymore.

Re: Use simplehash.h instead of dynahash in SMgr

2021-10-06 Thread Yura Sokolov
Good day, David and all. В Вт, 05/10/2021 в 11:07 +1300, David Rowley пишет: > On Mon, 4 Oct 2021 at 20:37, Jaime Casanova > wrote: > > Based on your comments I will mark this patch as withdrawn at midday > > of > > my monday unless someone objects to that. > > I really think we need a hash

Re: Use simplehash.h instead of dynahash in SMgr

2021-10-04 Thread David Rowley
On Mon, 4 Oct 2021 at 20:37, Jaime Casanova wrote: > Based on your comments I will mark this patch as withdrawn at midday of > my monday unless someone objects to that. I really think we need a hash table implementation that's faster than dynahash and supports stable pointers to elements

Re: Use simplehash.h instead of dynahash in SMgr

2021-10-04 Thread Jaime Casanova
On Mon, Sep 27, 2021 at 04:30:25PM +1300, David Rowley wrote: > On Fri, 24 Sept 2021 at 20:26, Jaime Casanova > wrote: > > Are you planning to work on this in this CF? > > This is marked as "Ready for committer" but it doesn't apply anymore. > > I've attached an updated patch. Since this patch

Re: Use simplehash.h instead of dynahash in SMgr

2021-09-26 Thread David Rowley
On Fri, 24 Sept 2021 at 20:26, Jaime Casanova wrote: > Are you planning to work on this in this CF? > This is marked as "Ready for committer" but it doesn't apply anymore. I've attached an updated patch. Since this patch is pretty different from the one that was marked as ready for committer,

Re: Use simplehash.h instead of dynahash in SMgr

2021-09-24 Thread Jaime Casanova
On Tue, Jun 22, 2021 at 02:15:26AM +1200, David Rowley wrote: [...] > > I've come up with a new hash table implementation that I've called > generichash. It works similarly to simplehash in regards to the Hi David, Are you planning to work on this in this CF? This is marked as "Ready for

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-30 Thread David Rowley
On Thu, 1 Jul 2021 at 13:00, Thomas Munro wrote: > > On Wed, Jun 30, 2021 at 11:14 PM David Rowley wrote: > > 1) Since I really need 8-byte buckets in the hash table to make this > > as fast as possible, I want to use the array index for the hash status > > and that means changing the simplehash

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-30 Thread Thomas Munro
On Wed, Jun 30, 2021 at 11:14 PM David Rowley wrote: > On Wed, 23 Jun 2021 at 12:17, Thomas Munro wrote: > Thanks for taking an interest in this. I started looking at your idea > and I've now changed my mind from just not liking it to thinking that > the whole idea is just completely horrible

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-30 Thread David Rowley
On Wed, 23 Jun 2021 at 12:17, Thomas Munro wrote: > > David and I discussed this a bit off-list, and I just wanted to share > how I understand the idea so far in case it helps someone else. There > are essentially three subcomponents working together: Thanks for taking an interest in this. I

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-22 Thread Thomas Munro
On Tue, Jun 22, 2021 at 6:51 PM David Rowley wrote: > I guess we could also ask ourselves how many join algorithms we need. David and I discussed this a bit off-list, and I just wanted to share how I understand the idea so far in case it helps someone else. There are essentially three

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread Thomas Munro
On Tue, Jun 22, 2021 at 1:55 PM David Rowley wrote: > On Tue, 22 Jun 2021 at 03:43, Tom Lane wrote: > > I kind of wonder if we really need four different hash table > > implementations (this being the third "generic" one, plus hash join > > has its own, and I may have forgotten others). Should

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread David Rowley
On Tue, 22 Jun 2021 at 03:43, Tom Lane wrote: > I kind of wonder if we really need four different hash table > implementations (this being the third "generic" one, plus hash join > has its own, and I may have forgotten others). Should we instead > think about revising simplehash to gain the

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread David Rowley
On Tue, 22 Jun 2021 at 02:53, Robert Haas wrote: > At the risk of kibitzing the least-important detail of this proposal, > I'm not very happy with the names of our hash implementations. > simplehash is not especially simple, and dynahash is not particularly > dynamic, especially now that the main

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread Tom Lane
Robert Haas writes: > On Mon, Jun 21, 2021 at 10:15 AM David Rowley wrote: >> I've come up with a new hash table implementation that I've called >> generichash. > At the risk of kibitzing the least-important detail of this proposal, > I'm not very happy with the names of our hash

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread Robert Haas
On Mon, Jun 21, 2021 at 10:15 AM David Rowley wrote: > I've come up with a new hash table implementation that I've called > generichash. At the risk of kibitzing the least-important detail of this proposal, I'm not very happy with the names of our hash implementations. simplehash is not

Re: Use simplehash.h instead of dynahash in SMgr

2021-06-21 Thread David Rowley
I'd been thinking of this patch again. When testing with simplehash, I found that the width of the hash bucket type was fairly critical for getting good performance from simplehash.h. With simplehash.h I didn't manage to narrow this any more than 16 bytes. I needed to store the 32-bit hash value

Re: Use simplehash.h instead of dynahash in SMgr

2021-05-07 Thread Yura Sokolov
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:not tested I believe it is ready for committer. The new status of this

RE: Use simplehash.h instead of dynahash in SMgr

2021-05-05 Thread Jakub Wartak
Hey David, > I think you'd have to batch by filenode and transaction in that case. Each > batch might be pretty small on a typical OLTP workload, so it might not help > much there, or it might hinder. True, it is very workload dependent (I was chasing mainly INSERTs multiValues,

Re: Use simplehash.h instead of dynahash in SMgr

2021-05-05 Thread David Rowley
Hi Jakub, On Wed, 5 May 2021 at 20:16, Jakub Wartak wrote: > I might be a little bit out of the loop, but as Alvaro stated - Thomas did > plenty of excellent job related to recovery performance in that thread. In my > humble opinion and if I'm not mistaken (I'm speculating here) it might be >

RE: Use simplehash.h instead of dynahash in SMgr

2021-05-05 Thread Jakub Wartak
Hi David, Alvaro, -hackers > Hi David, > > You're probably aware of this, but just to make it explicit: Jakub Wartak was > testing performance of recovery, and one of the bottlenecks he found in > some of his cases was dynahash as used by SMgr. It seems quite possible > that this work would

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-30 Thread Alvaro Herrera
Hi David, You're probably aware of this, but just to make it explicit: Jakub Wartak was testing performance of recovery, and one of the bottlenecks he found in some of his cases was dynahash as used by SMgr. It seems quite possible that this work would benefit some of his test workloads. He last

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-29 Thread David Rowley
I've attached an updated patch. I forgot to call SH_ENTRY_CLEANUP, when it's defined during SH_RESET. I also tided up a couple of comments and change the code to use pg_rotate_right32(.., 31) instead of adding a new function for pg_rotate_left32 and calling that to shift left 1 bit. David

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-28 Thread David Rowley
On Thu, 29 Apr 2021 at 12:30, Yura Sokolov wrote: > > David Rowley писал 2021-04-29 02:51: > > Another line of thought for making it go faster would be to do > > something like get rid of the hash status field from SMgrEntry. That > > could be either coded into a single bit we'd borrow from the

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-28 Thread Yura Sokolov
David Rowley писал 2021-04-29 02:51: On Thu, 29 Apr 2021 at 00:28, Yura Sokolov wrote: The best result is with just: return (uint32)rnode->node.relNode; ie, relNode could be taken without mixing at all. relNode is unique inside single database, and almost unique among whole cluster

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-28 Thread David Rowley
On Thu, 29 Apr 2021 at 00:28, Yura Sokolov wrote: > The best result is with just: > > return (uint32)rnode->node.relNode; > > ie, relNode could be taken without mixing at all. > relNode is unique inside single database, and almost unique among whole > cluster > since it is Oid. I admit to

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-28 Thread Yura Sokolov
David Rowley писал 2021-04-26 09:43: I tried that and it got a median result of 113.795 seconds over 14 runs with this recovery benchmark test. LOG: size: 4096, members: 2032, filled: 0.496094, total chain: 1014, max chain: 6, avg chain: 0.499016, total_collisions: 428, max_collisions: 3,

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi, On 2021-04-26 22:44:13 +0300, Yura Sokolov wrote: > Even for Robin Hood hashing 0.9 fill factor is too high. It leads to too > much movements on insertion/deletion and longer average collision chain. That's true for modification heavy cases - but most hash tables in PG, including the smgr

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Yura Sokolov
Andres Freund писал 2021-04-26 21:46: Hi, On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote: It is quite interesting result. Simplehash being open-addressing with linear probing is friendly for cpu cache. I'd recommend to define SH_FILLFACTOR with value lower than default (0.9). I believe 0.75

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread Andres Freund
Hi, On 2021-04-25 01:27:24 +0300, Yura Sokolov wrote: > It is quite interesting result. Simplehash being open-addressing with > linear probing is friendly for cpu cache. I'd recommend to define > SH_FILLFACTOR with value lower than default (0.9). I believe 0.75 is > suitable most for such kind of

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-26 Thread David Rowley
On Mon, 26 Apr 2021 at 05:03, Yura Sokolov wrote: > If your test so sensitive to hash function speed, then I'd suggest > to try something even simpler: > > static inline uint32 > relfilenodebackend_hash(RelFileNodeBackend *rnode) > { > uint32 h = 0; > #define step(x) h ^=

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-25 Thread Yura Sokolov
David Rowley писал 2021-04-25 16:36: On Sun, 25 Apr 2021 at 18:48, Yura Sokolov wrote: If you read comments in SH_START_ITERATE, you'll see: * Search for the first empty element. As deletions during iterations are * supported, we want to start/end at an element that cannot be affected

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-25 Thread David Rowley
On Sun, 25 Apr 2021 at 18:48, Yura Sokolov wrote: > If you read comments in SH_START_ITERATE, you'll see: > >* Search for the first empty element. As deletions during iterations > are >* supported, we want to start/end at an element that cannot be > affected >* by elements being

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-25 Thread Yura Sokolov
David Rowley писал 2021-04-25 05:23: Thanks for having a look at this. "On Sun, 25 Apr 2021 at 10:27, Yura Sokolov wrote: It is quite interesting result. Simplehash being open-addressing with linear probing is friendly for cpu cache. I'd recommend to define SH_FILLFACTOR with value lower

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-24 Thread David Rowley
Thanks for having a look at this. "On Sun, 25 Apr 2021 at 10:27, Yura Sokolov wrote: > > It is quite interesting result. Simplehash being open-addressing with > linear probing is friendly for cpu cache. I'd recommend to define > SH_FILLFACTOR with value lower than default (0.9). I believe 0.75

Re: Use simplehash.h instead of dynahash in SMgr

2021-04-24 Thread Yura Sokolov
David Rowley писал 2021-04-24 18:58: Hackers, Last year, when working on making compactify_tuples() go faster for 19c60ad69, I did quite a bit of benchmarking of the recovery process. The next thing that was slow after compactify_tuples() was the hash lookups done in smgropen(). Currently, we

Use simplehash.h instead of dynahash in SMgr

2021-04-24 Thread David Rowley
Hackers, Last year, when working on making compactify_tuples() go faster for 19c60ad69, I did quite a bit of benchmarking of the recovery process. The next thing that was slow after compactify_tuples() was the hash lookups done in smgropen(). Currently, we use dynahash hash tables to store the