On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
> Jeff Davis wrote:
> > What should be the maximum size of this hash table?
> Good question. And also, how do you remove entries from it?
> I guess the size should somehow be related to number of backends. Each
> backend will realistically be doing just one, or at most two, seq scans
> at a time.
> It also depends on the number of large tables in the databases, but we
> don't have that information easily available. How about using just
> NBackends? That should be plenty, but wasting a few hundred bytes of
> memory won't hurt anyone.
One entry per relation, not per backend, is my current design.
> I think you're going to need an LRU list and counter of used entries in
> addition to the hash table, and when all entries are in use, remove the
> least recently used one.
> The thing to keep an eye on is that it doesn't add too much overhead or
> lock contention in the typical case when there's no concurrent scans.
> For the locking, use a LWLock.
Ok. What would be the potential lock contention in the case of no
concurrent scans?
Also, is it easy to determine the space used by a dynahash with N
entries? I haven't looked at the dynahash code yet, so perhaps this will
become clear once I do.
> No, not the segment. RelFileNode consists of tablespace oid, database
> oid and relation oid. You can find it in scan->rs_rd->rd_node. The
> segmentation works at a lower level.
Ok, will do.
> Hmm. Should we care then? CFQ is the default on Linux, and an average
> sysadmin is unlikely to change it.
Keep in mind that concurrent sequential scans under CFQ *already*
perform very poorly. I think that alone is an interesting fact, somewhat
independent of Sync Scans.
> - when ReadBuffer is called, let the caller know if the read did
> physical I/O.
> - when the previous ReadBuffer didn't result in physical I/O, assume
> that we're not the pack leader. If the next buffer isn't already in
> cache, wait a few milliseconds before initiating the read, giving the
> pack leader a chance to do it instead.
> Needs testing, of course..
An interesting idea. Of the proposals for maintaining a "pack leader",
that's the one I like most. It's very similar to what the Linux
anticipatory scheduler does for us.
> >> 4. It fails regression tests. You get an assertion failure on the portal
> >> test. I believe that changing the direction of a scan isn't handled
> >> properly; it's probably pretty easy to fix.
> > I will examine the code more carefully. As a first guess, is it possible
> > that test is failing because of the non-deterministic order in which
> > tuples are returned?
> No, it's an assertion failure, not just different output than expected.
> But it's probably quite simple to fix..
Ok, I'll find and correct it then.