Re: [HACKERS] Time to drop old-style (V0) functions?

Andres Freund Tue, 20 Dec 2016 00:12:05 -0800

On 2016-12-19 15:25:42 -0500, Robert Haas wrote:
> On Mon, Dec 19, 2016 at 3:13 PM, Peter Eisentraut
> <peter.eisentr...@2ndquadrant.com> wrote:
> > On 12/9/16 7:52 AM, Robert Haas wrote:
> >> It's kind of ironic, at least IMHO, that the DirectionFunctionCall
> >> macros are anything but direct.  Each one is a non-inlined function
> >> call that does a minimum of 8 variable assignments before actually
> >> calling the function.
> >
> > If this is a problem (it might be), then we can just make those calls,
> > er, direct C function calls to different function.  For example,
> >
> >     result = DatumGetObjectId(DirectFunctionCall1(oidin,
> >                               CStringGetDatum(pro_name_or_oid)));
> >
> > could just be
> >
> >     result = oidin_internal(pro_name_or_oid);
>
> Yeah, true -- that works for simple cases, and might be beneficial
> where you're doing lots and lots of calls in a tight loop.
>
> For the more general problem, I wonder if we should try to figure out
> something where we have one calling convention for "simple" functions
> (those that little or none of the information in fcinfo and can pretty
> much just take their SQL args as C args) and another for "complex"
> functions (where we do the full push-ups).


Unfortunately I think so would likely also increase overhead in a bunch
of places, because suddenly we need branches determining which callling
convention is in use. There's a fair amount of places, some of them
performance sensitive, that deal with functions that can get different
number of arguments. Prominently nodeAgg.c's transition functions for
example.

When JITing expressions that doesn't matter, because we avoid doing so
repetitively. But that's not really sufficient imo, non JITed
performance matters a lot too.

There's imo primarily two parts that make our calling convention
expensive:
a) Additional instructions required to push to/from fcinfo->arg[null],
   including stack spilling required to have space for the arguments.
b) Increased number of cachelines touched, even for even trivial
   functions. We need to read/write to at least five:
   1) fcinfo->isnull to reset it
   2) fcinfo->arg[0...] to set the argument
   3) fcinfo->argnull[0...] to set argnull (separate cacheline)
   4) fcinfo->flinfo->fn_addr to get the actual address to call
   5) instruction cache miss for the function's contents

We can doctor around this some. A retively easy way to reduce the
overhead of a similar function call interface would be to restructure
things so we have something like:

typedef struct FunctionArg2
{
        Datum arg;
        bool argnull;
} FunctionArg2;

typedef struct FunctionCallInfoData2
{
        /* cached function address from ->flinfo */
        PGFunction      fn_addr;
        /* ptr to lookup info used for this call */
        FmgrInfo       *flinfo;
        /* function must set true if result is NULL */
        bool            isnull;
        /* # arguments actually passed */
        short           nargs;
        /* collation for function to use */
        Oid             fncollation;    /* collation for function to use */
        /* pass or return extra info about result */
        fmNodePtr       resultinfo;
        /* array big enough to fit flinfo->fn_nargs */
        FunctionArg2    args[FLEXIBLE_ARRAY_MEMBER];
} FunctionCallInfoData2;

That'd mean some code changes because accessing arguments can't be done
quite as now, but it'd be fairly minor.  We'd end up with a lot fewer
cache misses for common cases, because there's no need to fetch fn_addr,
and because the first two arg/argnull pairs still fit within the first
cacheline (which they never can in the current interface!).

But we'd still be passing arguments via memory, instead of using
registers.

I think a more efficient variant would make the function signature look
something like:

Datum /* directly returned argument */
pgfunc(
        /* extra information about function call */
        FunctionCallInfo *fcinfo,
        /* bitmap of NULL arguments */
        uint64_t nulls,
        /* first argument */
        Datum arg0,
        /* second argument */
        Datum arg1,
        /* returned NULL */
        bool *isnull
);

with additional arguments passed via fcinfo as currently done. Since
most of the performance critical function calls only have two arguments
that should allow to keep usage of fcinfo-> to the cases where we need
the extra information.  On 64bit linuxes the above will be entirely in
registers, on 64bit windows isnull would be placed on the stack.

I don't quite see how we could avoid stack usage at all for windows, any
of the above arguments seems important.

There'd be a need for some trickery to make PG_GETARG_DATUM() work
efficiently - afaics the argument numbers passed to PG_GETARG_*() are
always constants, so that shouldn't be too bad.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time to drop old-style (V0) functions?

Reply via email to