Hi,

On 10/16/2015 08:00 PM, Jinyu Zhang wrote:

Update the patch_brin_optimze_mem according to your comment.

I've reviewed the last version of the patch, and I do have a few comments. I haven't done any performance testing at this point, as the machine I'm using for that is chewing on something else at the moment.

1) bringetbitmap(PG_FUNCTION_ARGS)

+ /*
+  * Estimate the approximate size of btup and allocate memory for btup.
+  */
+ btupInitSize = bdesc->bd_tupdesc->natts * 16;
+ btup = palloc(btupInitSize);

What is the reasoning behind this estimate? I mean, this only accounts for the values, not the NULL bitmap or BrinTuple fields. Maybe it does not really matter much, but I'd like to see some brief description in the docs, please.

This also interacts with our memory allocator in a rather annoying way, because palloc() always allocates memory in 2^k chunks, but sadly the estimate ignores that. So for natts=3 we get btupInitSize=48, but internally allocate 64B (and ignore the top 16B).


2) bringetbitmap(PG_FUNCTION_ARGS)

    if (tup)
    {
        if(size <= btupInitSize)
            memcpy(btup, tup, size);
        else
            btup = brin_copy_tuple(tup, size);
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Is there any particular reason why not to update btupInitSize to the size value? I mean, if the estimate is too low, then we'll call brin_copy_tuple() for all BRIN tuples, no?

In other words, I think the code should do this:

    if (tup)
    {
        if(size <= btupInitSize)
            memcpy(btup, tup, size);
        else
        {
            btup = brin_copy_tuple(tup, size);
            brinInitSize = size;
        }
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Another option might be the usual "doubling" strategy, i.e. do something like this instead:

    if (tup)
    {
        while (btupInitSize < size)
                    btupInitSize *= 2;

        btup = repalloc(btup, btupInitSize);
        memcpy(btup, tup, size);

        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }

Possibly with only doing the repalloc when actually needed.


3) brin_new_memtuple(...)

The changes in this function seem wrong to me. Not only you've removed the code that initialized all the bv_* attributes, you're also using palloc() so there could easily be random data. This might be OK for our code as we call brin_memtuple_initialize() right after this function, and that seems to do the init, but AFAIK brin_new_memtuple() is a part of our public API at it's in a header file. And these changes surely break the contract documented in the comment above the function:

 * Create a new BrinMemTuple from scratch, and initialize it to an empty
 * state.

So I think this is wrong and needs to be reverted.

It's possible that we'll immediately reset the attributes, but that's negligible cost - we expect to do brin_memtuple_initialize() many times for the tuple, so it should not make any difference.

Another thing is that the three arrays (bt_values, bt_allnulls and bt_hasnulls) may be allocated as a single chunk and then sliced.

    char * ptr = palloc0(space for all three arrays);
    bt_values = ptr;
    bt_allnulls = ptr + (size of bt_values)
    bt_bt_hasnulls = ptr + (size of bt_values + bt_allnulls)

which is essentially what the original code did with bv_values. This reduces palloc() overhead and fragmentation.


4) brin_memtuple_initialize(...)

It's possible that the new code needs to reset more fields, but I don't really see why it should mess with dtuple->bt_columns[i].bv_values for example.


5) brin_deform_tuple(..)

The comment before the function probably should explain the purpose of the new parameter (that it can either be NULL or an existing tuple).


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to