On Wed, Sep 14, 2016 at 10:53 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> One thing not quite clear to me is how do we create the bitmap
> representation starting from the array representation in midflight
> without using twice as much memory transiently.  Are we going to write
> the array to a temp file, free the array memory, then fill the bitmap by
> reading the array from disk?

We could do that. Or maybe compress the TID array once half of m_w_m
(maintenance_work_mem) has been consumed, and do this repeatedly with the
remaining memory. For example, if we start with 1GB of memory, we decide to
compress at 512MB. Say that results in a 300MB bitmap. We then continue to
accumulate TIDs and do another round of fold-up when another 350MB is
consumed.

I think we should maintain a per-offset count of dead tuples to choose the
optimal bitmap size (which would need an overflow region). We can also track
how many blocks or block ranges have at least one dead tuple, to know whether
it's worthwhile to have some sort of indirection. Together, these can tell us
how much compression can be achieved and allow us to choose the best
representation.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
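
As a rough illustration of the per-offset counting idea above, the sketch below picks a per-block bitmap width from a histogram of dead-tuple offsets. It is not PostgreSQL code: the function name choose_bitmap_bits, the max_offset default, and the cost model (every block with at least one dead tuple gets a fixed-width bitmap rounded up to whole bytes, and every offset beyond that width is stored as a plain 6-byte TID in an overflow array) are all assumptions made for the example.

```python
from collections import Counter

TID_BYTES = 6  # a heap TID is a 4-byte block number plus a 2-byte offset


def choose_bitmap_bits(dead_tids, max_offset=291):
    """Return (width_in_bits, total_bytes) minimizing the assumed cost model.

    dead_tids: iterable of (block, offset) pairs, with offsets starting at 1.
    A width of 0 means "keep the plain TID array".
    max_offset is a hypothetical upper bound on offsets worth considering.
    """
    per_offset = Counter()   # per-offset count of dead tuples
    blocks = set()           # blocks with at least one dead tuple
    for block, offset in dead_tids:
        per_offset[offset] += 1
        blocks.add(block)

    total = sum(per_offset.values())
    best_bits, best_cost = 0, total * TID_BYTES  # baseline: plain array
    covered = 0  # dead tuples whose offset fits inside the bitmap
    for width in range(1, max_offset + 1):
        covered += per_offset[width]
        bitmap_bytes = len(blocks) * ((width + 7) // 8)
        overflow_bytes = (total - covered) * TID_BYTES
        cost = bitmap_bytes + overflow_bytes
        if cost < best_cost:
            best_bits, best_cost = width, cost
    return best_bits, best_cost


if __name__ == "__main__":
    # 100 blocks with dead tuples at offsets 1..8, plus one outlier at offset 200.
    tids = [(b, o) for b in range(100) for o in range(1, 9)] + [(0, 200)]
    print(choose_bitmap_bits(tids))
```

With this input the outlier at offset 200 is cheaper to keep in the overflow region than to widen every block's bitmap, which is the kind of trade-off the per-offset counts are meant to expose.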