Re: [HACKERS] 0/NULL/NIL assignments in build_*_rel()

2017-02-07 Thread Robert Haas
On Mon, Feb 6, 2017 at 12:04 AM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> Both build_simple_rel() and build_join_rel() allocate RelOptInfo using
>> makeNode(), which returned a zeroed out memory. The functions then
>> assign values like false, NULL, 0 or NIL which essentially contain
>> zero valued bytes. This looks like needless work. So, we are spending
>> some CPU cycles unnecessarily in those assignments. That may not be
>> much time wasted, but whenever someone adds a field to RelOptInfo,
>> those functions need to be updated with possibly a zero value
>> assignment. That looks like an unnecessary maintenance burden. Should
>> we just drop all those zero value assignments from there?
>
> I'd vote for not.  The general programming style in the backend is to
> ignore the fact that makeNode() zeroes the node's storage and initialize
> all the fields explicitly.

I know that's your preference so I try not to spend too much time
arguing about it but personally I don't like it.  If I want to find
the places where a particular structure member gets set to a value,
the places where it's getting set to NULL aren't interesting, because
I have to think about that case anyway; somebody might have inserted a
makeNode() call without explicit initializations someplace; people do
that sometimes.  So for me, the places where we reinitialize storage
that is already-zeroed seems like a waste not only of cycles but of
brainpower.  However, as I say, I recognize that we see the world
differently on this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 0/NULL/NIL assignments in build_*_rel()

2017-02-05 Thread Ashutosh Bapat
On Mon, Feb 6, 2017 at 11:10 AM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> On Mon, Feb 6, 2017 at 10:34 AM, Tom Lane  wrote:
>>> I'd vote for not.  The general programming style in the backend is to
>>> ignore the fact that makeNode() zeroes the node's storage and initialize
>>> all the fields explicitly.  The primary advantage of that, IMO, is that
>>> you can grep for references to a particular field and understand
>>> everyplace that affects it.  As an example of the value, if you want to
>>> add a field X and you try to figure out what you need to modify by
>>> grepping for references to existing field Y, with this proposal you would
>>> miss places that were node initializations unless Y *always* needed to be
>>> initialized nonzero.
>
>> In the case of adding a new field, I would rather rely on where the
>> structure itself is used rather than another member.
>
> Maybe.  There's certainly something to be said for developing a different
> style in which we only initialize fields that need to be nonzero ... but
> if we were to go for that, relnode.c would not be the only place to fix,
> or even the best place to start.
>

As against any other Node structure, RelOptInfo somehow stood out
clearly for me in this aspect. May be because it's a large structure.
But yes, this might be a problem with other structures as well.

> Also, grepping for makeNode(FooStruct) works for cases where FooStructs
> are *only* built that way.  But there's more than one place in the backend
> where we build structs in other ways, typically by declaring one as a
> local variable.  It would take some effort to define a universal
> convention here and make sure all existing code follows it.
>

I think only makeNode() or palloc0(... * sizeof(nodename)) usages can
benefit from this. In other cases the fields need to be explicitly
initialized. Also, that would be a lot of code churn. May be we should
restrict to some large Node structures like RelOptInfo.

>> Should we at least move those assignments into a separate function?
>
> As far as relnode.c goes, I don't really think that's an improvement,
> because it complicates dealing with fields that need to be initialized
> differently for baserels and joinrels.
>

I am thinking more on the lines of makeVar() - create a function
makeRelOptInfo() or such, which accepts values for fields which need
assignment other than zero value and call it from build_join_rel() and
build_simple_rel() passing joinrel and base rel specific values resp.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 0/NULL/NIL assignments in build_*_rel()

2017-02-05 Thread Tom Lane
Ashutosh Bapat  writes:
> On Mon, Feb 6, 2017 at 10:34 AM, Tom Lane  wrote:
>> I'd vote for not.  The general programming style in the backend is to
>> ignore the fact that makeNode() zeroes the node's storage and initialize
>> all the fields explicitly.  The primary advantage of that, IMO, is that
>> you can grep for references to a particular field and understand
>> everyplace that affects it.  As an example of the value, if you want to
>> add a field X and you try to figure out what you need to modify by
>> grepping for references to existing field Y, with this proposal you would
>> miss places that were node initializations unless Y *always* needed to be
>> initialized nonzero.

> In the case of adding a new field, I would rather rely on where the
> structure itself is used rather than another member.

Maybe.  There's certainly something to be said for developing a different
style in which we only initialize fields that need to be nonzero ... but
if we were to go for that, relnode.c would not be the only place to fix,
or even the best place to start.

Also, grepping for makeNode(FooStruct) works for cases where FooStructs
are *only* built that way.  But there's more than one place in the backend
where we build structs in other ways, typically by declaring one as a
local variable.  It would take some effort to define a universal
convention here and make sure all existing code follows it.

> Should we at least move those assignments into a separate function?

As far as relnode.c goes, I don't really think that's an improvement,
because it complicates dealing with fields that need to be initialized
differently for baserels and joinrels.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 0/NULL/NIL assignments in build_*_rel()

2017-02-05 Thread Ashutosh Bapat
On Mon, Feb 6, 2017 at 10:34 AM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> Both build_simple_rel() and build_join_rel() allocate RelOptInfo using
>> makeNode(), which returned a zeroed out memory. The functions then
>> assign values like false, NULL, 0 or NIL which essentially contain
>> zero valued bytes. This looks like needless work. So, we are spending
>> some CPU cycles unnecessarily in those assignments. That may not be
>> much time wasted, but whenever someone adds a field to RelOptInfo,
>> those functions need to be updated with possibly a zero value
>> assignment. That looks like an unnecessary maintenance burden. Should
>> we just drop all those zero value assignments from there?
>
> I'd vote for not.  The general programming style in the backend is to
> ignore the fact that makeNode() zeroes the node's storage and initialize
> all the fields explicitly.  The primary advantage of that, IMO, is that
> you can grep for references to a particular field and understand
> everyplace that affects it.  As an example of the value, if you want to
> add a field X and you try to figure out what you need to modify by
> grepping for references to existing field Y, with this proposal you would
> miss places that were node initializations unless Y *always* needed to be
> initialized nonzero.

In the case of adding a new field, I would rather rely on where the
structure itself is used rather than another member. For example while
adding a field to RelOptInfo, to find out places to initialize it, I
would grep for makeNode(RelOptInfo) or such to find where a new node
gets allocated, rather than say relids. Grepping for usages of a
particular field might find zero valued assignments useful, but not
necessarily worth maintaining that code.

>
> I could be convinced that it is a good idea to depart from this theory
> in places where it makes a significant performance difference ... but
> you've not offered any reason to think relnode.c is one.
>

I don't think that will bring any measurable performance improvement,
unless we start creating millions of RelOptInfos in a query. My real
pain is

>> whenever someone adds a field to RelOptInfo,
>> those functions need to be updated with possibly a zero value
>> assignment. That looks like an unnecessary maintenance burden.

Should we at least move those assignments into a separate function?
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 0/NULL/NIL assignments in build_*_rel()

2017-02-05 Thread Tom Lane
Ashutosh Bapat  writes:
> Both build_simple_rel() and build_join_rel() allocate RelOptInfo using
> makeNode(), which returned a zeroed out memory. The functions then
> assign values like false, NULL, 0 or NIL which essentially contain
> zero valued bytes. This looks like needless work. So, we are spending
> some CPU cycles unnecessarily in those assignments. That may not be
> much time wasted, but whenever someone adds a field to RelOptInfo,
> those functions need to be updated with possibly a zero value
> assignment. That looks like an unnecessary maintenance burden. Should
> we just drop all those zero value assignments from there?

I'd vote for not.  The general programming style in the backend is to
ignore the fact that makeNode() zeroes the node's storage and initialize
all the fields explicitly.  The primary advantage of that, IMO, is that
you can grep for references to a particular field and understand
everyplace that affects it.  As an example of the value, if you want to
add a field X and you try to figure out what you need to modify by
grepping for references to existing field Y, with this proposal you would
miss places that were node initializations unless Y *always* needed to be
initialized nonzero.

I could be convinced that it is a good idea to depart from this theory
in places where it makes a significant performance difference ... but
you've not offered any reason to think relnode.c is one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers