Hi Thomas,
On Friday, December 28, 2018 6:43 AM Thomas Munro
<[email protected]> wrote:
> [...]if you have ideas about the validity of the assumptions, the reason it
> breaks initdb, or any other aspect of this approach (or alternatives), please
> don't let me stop you, and of course please feel free to submit this, an
> improved version or an alternative proposal [...]
Sure, thanks. I'd like to try working on the idea. I also took a look at the
code, and I hope you don't mind if I ask for clarification
(explanations/advice/opinions) on the following, since I don't have much
Postgres experience yet.
(1) I noticed that you store a "relsize_change_counter" in MdSharedData. Is my
understanding below correct?
The intention is to cache the relation size per backend (lock-free), using
relsize_change_counter (or eventually an array of them) to skip the lseek
syscall when the counter has not changed.
In _mdnblocks(), if the counter has not changed, the cached relation size
(nblocks) is used and the call to FileSize() (which uses lseek to get the size)
is skipped. That means that in current Postgres master, the lseek syscall
(through FileSize()) is issued every time we want to get the relation size
(nblocks).
On the other hand, the simplest method I could think of that might also work is
to cache the file size (nblocks) only in shared memory, not in the backend
process, since both nblocks and relsize_change_counter are uint32 anyway. If
relsize_change_counter can be changed without a lock, then nblocks can be
changed without a lock too, is that right? In that case, nblocks could be read
directly from shared memory. Would the relation size then still need to be
cached in the backend?
(2) Is the MdSharedData temporary or permanent in shared memory?
From the patch:
    typedef struct MdSharedData
    {
        /* XXX could have an array of these, and use rel OID % nelements? */
        pg_atomic_uint32 relsize_change_counter;
    } MdSharedData;
    static MdSharedData *MdShared;
What I intend to have is a permanent hash table that keeps the file size (and
eventually, in future development, table addresses) in shared memory for
faster access by backend processes. The idea is to keep track of the
unallocated blocks, based on how much the relation has been extended or
truncated. Memory for this hash table will be dynamically allocated.
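To illustrate only the lookup side of that idea, here is a toy, single-process
sketch. It is a simplification under loud assumptions: the table is
fixed-size rather than dynamically allocated, it has no locking or atomics,
entries are never deleted, and the key "rel_id" and all "demo_" names are
hypothetical. It does not use any Postgres shared-memory API.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TABLE_SLOTS 64

/* One entry per relation; rel_id == 0 marks an empty slot. */
typedef struct
{
    uint32_t rel_id;
    uint32_t nblocks;
} DemoRelSizeEntry;

typedef struct
{
    DemoRelSizeEntry slots[DEMO_TABLE_SLOTS];
} DemoRelSizeTable;

/* Linear probing: returns the entry for rel_id, or the first empty slot. */
static DemoRelSizeEntry *
demo_find_slot(DemoRelSizeTable *table, uint32_t rel_id)
{
    uint32_t start = rel_id % DEMO_TABLE_SLOTS;
    uint32_t i;

    assert(rel_id != 0);
    for (i = 0; i < DEMO_TABLE_SLOTS; i++)
    {
        DemoRelSizeEntry *e = &table->slots[(start + i) % DEMO_TABLE_SLOTS];

        if (e->rel_id == rel_id || e->rel_id == 0)
            return e;
    }
    return 0;                   /* table is full */
}

/* Records the size after an extend or truncate; returns 0 if the table is full. */
static int
demo_set_nblocks(DemoRelSizeTable *table, uint32_t rel_id, uint32_t nblocks)
{
    DemoRelSizeEntry *e = demo_find_slot(table, rel_id);

    if (e == 0)
        return 0;
    e->rel_id = rel_id;
    e->nblocks = nblocks;
    return 1;
}

/* Looks up the size; returns 0 if the relation has no entry yet. */
static int
demo_get_nblocks(DemoRelSizeTable *table, uint32_t rel_id, uint32_t *nblocks)
{
    DemoRelSizeEntry *e = demo_find_slot(table, rel_id);

    if (e == 0 || e->rel_id == 0)
        return 0;
    *nblocks = e->nblocks;
    return 1;
}
```

A miss in demo_get_nblocks() would be the point where a backend falls back to
the existing lseek path and then stores the result.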
Thanks,
Kirk Jamison