On Thu, 2007-05-03 at 19:27 +0100, Heikki Linnakangas wrote:
> I understand that the data structure keeps track of relations being
> scanned, with one entry per such relation. I think that's very good
> design, and I'm not suggesting to change that.
> But what's the right size for that? We don't know how many large tables
> there is in the database, so we can't use that. A hard-coded constant
> maybe? What would it be? I figured we could use NBackends, because there
> can't realistically be more than one large seq scan per backend going on
> at any point in time. That's an upper bound, in any real world
> application there's far less than that.
Ok, I understand you now. We'll choose the maximum size of the
table/list as the max_connections on startup.
> > Part of the reason I did this is because, with a Sync Scan, it seems
> > like there may be possible reasons to do multiple seq scans concurrently
> > in the same backend. This is completely contrived, but consider:
> > SELECT * FROM bigtable UNION ALL SELECT * FROM bigtable;
> > I'm not trying to argue for such a plan to exist right now, but if there
> > is a possibility that we'd want to create such plans in the future, I
> > think it's better to track by relation rather than by backend.
> Those two seqscans would be executed one after each other, not in
> parallel, so you would still have just one seq scan at any given point
> in time.
Right, I was just throwing out a hypothetical situation (if highly
contrived) in which it may help to do multiple scans at once. I think
this can be safely ignored for now though, since the planner would never
generate such a plan, and I can't think of a reasonable use case.
> >>> 100 is of course arbitrary, and that could grow or shrink automatically
> >>> at runtime.
> >> Except that it can't shrink, because we don't have any convenient moment
> >> to remove an entry from the list, other than dropping the least recently
> >> used entry out of the list when inserting a new one. And since it's in
> >> shared mem, the allocated size is fixed at boot up, so we can't grow it
> >> either.
> > It can shrink when a new scan looks through the list and passes by
> > entries that don't need to be in the list.
> How do you identify an entry that doesn't need to be there?
Some type of clock-like algorithm where a counter was decremented when
an element is passed up and incremented when it's chosen might work.
It's probably easier to just use a hash table though.
> > I don't want to over-complicate things, so if you think having a hash
> > table is a better, simpler, or more readable design, I'll go with that
> > instead of the list-only approach. The only reason I suggested the list-
> > only approach was to see if the hash table was really gaining anything
> > for us. In practice, it will be quite rare that more than 10 different
> > big relations are being scanned at once, and I don't know how much of a
> > gain a hash table is when there are (in all but exceptional cases) only
> > a few entries.
> How about implementing the list-only approach first, since you're going
> to need the LRU list anyway, and we can add consider adding the hash
> table later?
Sounds good. Will do.
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings