On Tue, Feb 27, 2007 at 05:38:39PM +0900, ITAGAKI Takahiro wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> > > If we do UPDATE a tuple, the original page containing the tuple is marked
> > > as HIGH and the new page where the updated tuple is placed is marked as
> > > LOW.
> > Don't you mean UNFROZEN?
> No, the new tuples are marked as LOW. I intend to use UNFROZEN and FROZEN
> pages to mean "all tuples in the page are visible to all transactions" for
> index-only scans in the future.
Ahh, ok. Makes sense, though I tend to agree with others that it's
better to leave that off for now, or at least do the initial patch
without it.
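For concreteness, here is a minimal sketch of how those four per-page DSM
states could be packed two bits per page. The names and the 2-bit layout
are my assumptions for illustration, not identifiers from the actual patch:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical 2-bit-per-page dead space map states. */
typedef enum {
    DSM_FROZEN   = 0,  /* all tuples frozen and visible to everyone   */
    DSM_UNFROZEN = 1,  /* all tuples visible, but not yet frozen      */
    DSM_LOW      = 2,  /* few dead tuples (e.g. page with new tuple)  */
    DSM_HIGH     = 3   /* many dead tuples (e.g. page with old version) */
} DsmState;

/* Four page states fit in each byte of the map. */
static void
dsm_set(uint8_t *map, size_t page, DsmState s)
{
    size_t   byte  = page / 4;
    unsigned shift = (page % 4) * 2;

    map[byte] = (uint8_t) ((map[byte] & ~(3u << shift)) |
                           ((unsigned) s << shift));
}

static DsmState
dsm_get(const uint8_t *map, size_t page)
{
    return (DsmState) ((map[page / 4] >> ((page % 4) * 2)) & 3u);
}
```

With that layout, an UPDATE would dsm_set() the original page to DSM_HIGH
and the page receiving the new tuple to DSM_LOW.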
> > What makes it more important to mark the original page as HIGH instead
> > of LOW, like the page with the new tuple? The description of the states
> > indicates that there would likely be a lot more dead tuples in a HIGH
> > page than in a LOW page.
> > Perhaps it would be better to have the bgwriter take a look at how many
> > dead tuples (or how much space the dead tuples account for) when it
> > writes a page out and adjust the DSM at that time.
> Yeah, I feel it is worth optimizing, too. One question: how do we treat
> dirty pages written by backends rather than by the bgwriter? If we want
> to add some work in the bgwriter, do we also need to make the bgwriter
> write almost all of the dirty pages?
IMO yes, we want the bgwriter to be the only process that's normally
writing pages out. How close we are to that, I don't know...
> > > * Aggressive freezing
> > > We will freeze tuples in dirty pages using OldestXmin but FreezeLimit.
> > Do you mean using OldestXmin instead of FreezeLimit?
> Yes, we will use OldestXmin as the threshold to freeze tuples in
> dirty pages or pages that have some dead tuples. Otherwise, many
> UNFROZEN pages will still remain after vacuum, and they will cost us
> in the next vacuum preventing XID wraparound.
Another good idea. If it's not too invasive I'd love to see that as a
stand-alone patch so that we know it can get in.
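The cutoff choice described above can be sketched as a single decision: pages
that are already dirty (or already hold dead tuples) get the aggressive
OldestXmin cutoff, everything else keeps the normal FreezeLimit. The function
and parameter names here are illustrative assumptions, not the patch's API:

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

/* Hypothetical sketch: pick the freeze cutoff for one heap page.
 * Freezing a dirty page costs no extra write, so we can afford the
 * aggressive OldestXmin cutoff there; clean pages keep FreezeLimit
 * to avoid dirtying them just to freeze. */
static TransactionId
freeze_cutoff(bool page_is_dirty, bool page_has_dead_tuples,
              TransactionId oldest_xmin, TransactionId freeze_limit)
{
    if (page_is_dirty || page_has_dead_tuples)
        return oldest_xmin;     /* aggressive: freeze everything we can */
    return freeze_limit;        /* normal: leave the page untouched     */
}
```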
> > > I'm thinking of changing them into 2 new parameters. We will allocate
> > > memory for DSM that can hold all of estimated_database_size, and for
> > > FSM 50% or so of that size. Is this reasonable?
> > I don't think so, at least not until we get data from the field about
> > what's typical. If the DSM is tracking every page in the cluster then
> > I'd expect the FSM to be closer to 10% or 20% of that, anyway.
> I'd like to give some kind of logical meaning to max_fsm_pages
> and max_dsm_pages. For DSM, max_dsm_pages should represent the
> whole database size. On the other hand, what meaning does
> max_fsm_pages have? (estimated_updatable_size ?)
At some point it might make sense to convert the FSM into a bitmap; that
way everything just scales with database size.
In the meantime, I'm not sure if it makes sense to tie the FSM size to
the DSM size, since each FSM page requires 48x the storage of a DSM
page. I think there's also a lot of cases where FSM size will not scale
the same way DSM size will, such as when there's historical data in the
database.
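To put numbers on that 48x figure: it implies one 6-byte FSM slot (48 bits)
per tracked heap page versus a single DSM bit per page. A back-of-envelope
sketch, with both constants being my assumptions for illustration:

```c
#include <stddef.h>

/* Assumed costs implied by the 48x figure: 6 bytes per FSM slot,
 * 1 bit per DSM page. */
enum { FSM_BYTES_PER_SLOT = 6 };

static size_t
fsm_bytes(size_t heap_pages)
{
    return heap_pages * FSM_BYTES_PER_SLOT;
}

static size_t
dsm_bytes(size_t heap_pages)
{
    return (heap_pages + 7) / 8;    /* one bit per page, rounded up */
}
```

Under those assumptions, a 10 GB database at 8 kB pages is 1,310,720 heap
pages: about 160 kB of DSM, but about 7.5 MB of FSM if it tracked every
page, which is why the FSM wants to cover only a fraction of the cluster.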
That raises another question... what happens when we run out of DSM
space?
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)