Jeff Davis wrote:
On Wed, 2007-05-02 at 23:59 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote:
On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote:
What should be the maximum size of this hash table?
Good question. And also, how do you remove entries from it?

I guess the size should somehow be related to the number of backends. Each backend will realistically be doing just one, or at most two, seq scans at a time. It also depends on the number of large tables in the databases, but we don't have that information easily available. How about using just NBackends? That should be plenty, but wasting a few hundred bytes of memory won't hurt anyone.
One entry per relation, not per backend, is my current design.
Umm, you naturally have just one entry per relation, but we were talking about how many entries the table needs to hold. Your patch had a hard-coded value of 1000, which is quite arbitrary.

I think I'm missing something from your statement "I guess the size
should somehow be related to number of backends." If there is only one
entry per relation, why do more backends require a larger hash table?

The hash table keeps track of ongoing seq scans. That's presumably related to the number of backends; I can't imagine a plan where a backend is executing more than one seq scan at a time, so that gives us an upper bound of NBackends simultaneous seq scans in the system.

Sure, if you only have, say, 5 tables that are large enough that we try to keep track of them, there can't be more than 5 entries in the table. But we don't know how many such tables there are at startup time.

A hard coded constant in the range of 10 - 100 might be just fine in practice.
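To make the sizing discussion concrete, here is a minimal sketch (not the actual patch) of what such a fixed-capacity table could look like: an open-addressing hash mapping a relation's OID to the block number most recently reported by a seq scan on it, with a hard-coded capacity in the suggested 10-100 range. All names and the exact layout here are hypothetical.

```c
/* Sketch only: fixed-capacity hash of ongoing seq scan positions,
 * keyed by relation OID.  Names are hypothetical, not from the patch. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCAN_TABLE_SIZE 64          /* small constant, as discussed above */

typedef uint32_t Oid;
typedef uint32_t BlockNumber;
#define InvalidOid ((Oid) 0)

typedef struct ScanEntry
{
    Oid         relid;              /* InvalidOid means the slot is free */
    BlockNumber last_block;         /* last block reported for this rel */
} ScanEntry;

static ScanEntry scan_table[SCAN_TABLE_SIZE];

/* Insert or update the entry for relid, using linear probing. */
static void
scan_report(Oid relid, BlockNumber blk)
{
    unsigned    h = relid % SCAN_TABLE_SIZE;

    for (unsigned i = 0; i < SCAN_TABLE_SIZE; i++)
    {
        ScanEntry  *e = &scan_table[(h + i) % SCAN_TABLE_SIZE];

        if (e->relid == relid || e->relid == InvalidOid)
        {
            e->relid = relid;
            e->last_block = blk;
            return;
        }
    }
    /* Table full: drop the hint; sync scan is only an optimization. */
}

/* Return true and set *blk if some scan on relid has reported a position. */
static bool
scan_lookup(Oid relid, BlockNumber *blk)
{
    unsigned    h = relid % SCAN_TABLE_SIZE;

    for (unsigned i = 0; i < SCAN_TABLE_SIZE; i++)
    {
        ScanEntry  *e = &scan_table[(h + i) % SCAN_TABLE_SIZE];

        if (e->relid == relid)
        {
            *blk = e->last_block;
            return true;
        }
        if (e->relid == InvalidOid)
            return false;
    }
    return false;
}
```

Note this sketch never deletes entries, which is exactly the open question in the thread: with no deletion, the table can only fill up, so the capacity has to be large enough for the working set of large tables.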

If I just step back for a second to consider a simpler design:

We will most likely have no more than a few relations larger than the
minimum threshold for Sync Scanning in any database being scanned.

Note that the list would be shared with all databases in the cluster. Your point is still valid, though.

What if I just make an LRU that occupies a fixed size? Any reads or
updates can start at the MRU item and work backward, and any evictions
can happen at the LRU item.
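A minimal sketch of that simpler design, under the same assumptions (hypothetical names, a plain array ordered from MRU at index 0 to LRU at the end): reads and updates search from the front, and inserting a new relation into a full list overwrites the last slot.

```c
/* Sketch only: fixed-size LRU of seq scan positions kept as an array
 * ordered MRU-first.  Names are hypothetical, not from the patch. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LRU_SIZE 100                /* arbitrary, as in the discussion */

typedef uint32_t Oid;
typedef uint32_t BlockNumber;

typedef struct LruEntry
{
    Oid         relid;
    BlockNumber last_block;
} LruEntry;

static LruEntry lru[LRU_SIZE];
static int      lru_used = 0;

/* Report a scan position: move the entry to the MRU slot (index 0),
 * evicting the least recently used entry if the list is full. */
static void
lru_report(Oid relid, BlockNumber blk)
{
    int         i;

    for (i = 0; i < lru_used; i++)
        if (lru[i].relid == relid)
            break;

    if (i == lru_used)              /* not found: new slot, or evict */
    {
        if (lru_used < LRU_SIZE)
            lru_used++;
        else
            i = LRU_SIZE - 1;       /* overwrite the LRU entry */
    }

    /* Shift entries 0..i-1 down one slot, then place this one in front. */
    memmove(&lru[1], &lru[0], i * sizeof(LruEntry));
    lru[0].relid = relid;
    lru[0].last_block = blk;
}

/* Walk from the MRU end; with ~100 entries this short scan is only
 * needed at scan startup, as the thread notes. */
static bool
lru_lookup(Oid relid, BlockNumber *blk)
{
    for (int i = 0; i < lru_used; i++)
    {
        if (lru[i].relid == relid)
        {
            *blk = lru[i].last_block;
            return true;
        }
    }
    return false;
}
```

Eviction falls out for free here: dropping the LRU entry when inserting a new one is the only removal path, which matches the later observation that there is no other convenient moment to remove an entry.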

Does a hash table really save us cycles if we keep the list small, say,
100 items? Looking at all 100 items would only be required perhaps on a
scan startup. I don't have a good sense of scale here; is following 50
pointers while holding a lock a significant source of contention?

Hmm. If you only need to scan through it when you start the scan, I suppose it would be ok.

100 is of course arbitrary, and that could grow or shrink automatically
at runtime.

Except that it can't shrink, because we don't have any convenient moment to remove an entry from the list, other than dropping the least recently used entry when inserting a new one. And since it's in shared memory, the allocated size is fixed at startup, so we can't grow it either.

  Heikki Linnakangas
