More review:

+ * For large objects, we just stick all of the allocations in fullness class
+ * 0. Since we can just return the space directly to the free page manager,
+ * we don't really need them on a list at all, except that if someone wants
+ * to bulk release everything allocated using this BlockAreaContext, we
+ * have no other way of finding them.

This comment is out-of-date.

+               /*
+                * If this is the only span, and there is no active
span, then maybe
+                * we should probably move this span to fullness class
1.  (Otherwise
+                * if you allocate exactly all the objects in the only
span, it moves
+                * to class 3, then you free them all, it moves to 2,
and then is
+                * given back, leaving no active span).
+                */

"maybe we should probably" seems to have one more doubt-expressing
word than it needs.

+               if (size_class == DSA_SCLASS_SPAN_LARGE)
+                       /* Large object frees give back segments
aggressively already. */
+                       continue;

We generally use braces in this kind of case.

+                * Search the fullness class 1 only.  That is where we
expect to find

extra "the"

+               /* Call for effect (we don't need the result). */
+               get_segment_by_index(area, index);
+       return area->segment_maps[index].mapped_address + offset;

It isn't guaranteed that area->segment_maps[index].mapped_address will
be non-NULL on return from get_segment_by_index, and then this
function will return a completely bogus pointer to the caller.  I
think you should probably elog() instead.

+               elog(ERROR, "dsa: can't attach to area handle %u", handle);

Avoid ":" in elog messages.   You don't really need to - and it isn't
project style to - tag these with "dsa:"; that's what \errverbose or
\set VERBOSITY verbose is for.  In this particular case, I might just
adopt the formulation from parallel.c:


                                 errmsg("could not map dynamic shared
memory segment")));

+                       elog(FATAL,
+                                "dsa couldn't find run of pages:
fpm_largest out of sync");

Here I'd go with "dsa could not find %u free pages".

+               elog(ERROR, "dsa_pin: area already pinned");

"dsa_area already pinned"

+               elog(ERROR, "dsa_unpin: area not pinned");

"dsa_area not pinned"

+               if (segment == NULL)
+                       elog(ERROR, "dsa: can't attach to segment");

As above.

+static dsa_segment_map *
+get_segment_by_index(dsa_area *area, dsa_segment_index index)
+       if (unlikely(area->segment_maps[index].mapped_address == NULL))
+       {
+               dsm_handle      handle;
+               dsm_segment *segment;
+               dsa_segment_map *segment_map;
+               handle = area->control->segment_handles[index];

Don't you need to acquire the lock for this?

+               /* Check all currently mapped segments to find what's
been freed. */
+               for (i = 0; i <= area->high_segment_index; ++i)
+               {
+                       if (area->segment_maps[i].header != NULL &&
+                               area->segment_maps[i].header->freed)
+                       {
+                               dsm_detach(area->segment_maps[i].segment);
+                               area->segment_maps[i].segment = NULL;
+                               area->segment_maps[i].header = NULL;
+                               area->segment_maps[i].mapped_address = NULL;
+                       }
+               }
+               area->freed_segment_counter = freed_segment_counter;

And this?

+ * Release a DSA area that was produced by dsa_create_in_place or
+ * dsa_attach_in_place.  It is preferable to use one of the 'dsa_on_XXX'
+ * callbacks so that this is managed automatically, because failure to release
+ * an area created in-place leaks its segments permanently.
+ */
+dsa_release_in_place(void *place)
+       decrement_reference_count((dsa_area_control *) place);

Since this seems to be the only caller of decrement_reference_count,
you could just put the logic here.  The contract for this function is
also a bit unclear from the header comment.  I initially thought that
it was your intention that this should be called from every process
that has either created or attached the segment.  But that doesn't
seem like it will work, because decrement_reference_count calls
dsm_unpin_segment on every segment, and a segment can only be pinned
once, so everybody else would fail.  So maybe the idea is that ANY ONE
process has to call dsa_release_in_place.  But then that could lead to
failures in other backends inside get_segment_by_index(), because
segments they don't have mapped might already be gone.  OK, third try:
maybe the idea is that the LAST process out has to call
dsa_release_in_place().  But how do the various cooperating processes
know which one that is?

I've also realized another thing that's not so good about this:
superblocks are 64kB, so allocating 64kB of initial space probably
just wastes most of it.  I think we want to either allocate just
enough space to hold the control information, or else that much space
plus space for at least a few superblocks.  I'm inclined to go the
first way, because it seems a bit overenthusiastic to allocate 256kB
or 512kB just on the off chance we might need it.  On the other hand,
including a few bytes in the control segment so that we don't need to
allocate 1MB segment that we might not need sounds pretty sharp.
Maybe DSA can expose an API that returns the number of bytes that will
be needed for the control structure, and then the caller can arrange
for that space to be available during the Estimate phase...

Robert Haas
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (
To make changes to your subscription:

Reply via email to