Re: Using Expanded Objects other than Arrays from plpgsql

Michel Pelletier Wed, 23 Oct 2024 18:58:44 -0700

On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <t...@sss.pgh.pa.us> wrote:


> Michel Pelletier <pelletier.mic...@gmail.com> writes:
> > Here's another example:
>
> > CREATE OR REPLACE FUNCTION test2(graph matrix)
> >     RETURNS bigint LANGUAGE plpgsql AS
> >     $$
> >     BEGIN
> >     perform set_element(graph, 1, 1, 1);
> >     RETURN nvals(graph);
> >     end;
> >     $$;
> > CREATE FUNCTION
> > postgres=# select test2(matrix('int32'));
> > DEBUG:  new_matrix
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  flatten_matrix
> > DEBUG:  scalar_int32
> > DEBUG:  new_scalar
> > DEBUG:  matrix_set_element
> > DEBUG:  DatumGetMatrix
> > DEBUG:  expand_matrix
> > DEBUG:  new_matrix
> > DEBUG:  DatumGetScalar
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  flatten_matrix
> > DEBUG:  context_callback_matrix_free
> > DEBUG:  context_callback_scalar_free
> > DEBUG:  matrix_nvals
> > DEBUG:  DatumGetMatrix
> > DEBUG:  expand_matrix
> > DEBUG:  new_matrix
> > DEBUG:  context_callback_matrix_free
> > DEBUG:  context_callback_matrix_free
> >  test2
> > -------
> >      0
> > (1 row)
>
> I'm a little confused by your debug output.  What are "scalar_int32"
> and "new_scalar", and what part of the plpgsql function is causing
> them to be invoked?
>

GraphBLAS scalars hold a single element value for the matrix type.
Internally, they are simply 1x1 matrices (much like vectors are 1xn
matrices).  The function signature is:

set_element(a matrix, i bigint, j bigint, s scalar)

There is a "CAST (integer as scalar)" function (scalar_int32) that casts
Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls
new_scalar because like vectors and matrices, they are expanded objects
which have a GrB_Scalar "handle").  Scalars are useful for working with
individual values, for example reduce() returns a scalar.  There are way
more efficient ways to push huge C arrays of values into matrices but for
now I'm just working at the element level.

Another thing that confuses me is why there's a second flatten_matrix
> operation happening here.  Shouldn't set_element return its result
> as a R/W expanded object?
>

That confuses me too, and my default assumption is always that I'm doing it
wrong.  set_element does return a R/W object afaict, here is the return:

https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726

where:

#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)


> > I would expect that to return 1.  If I do "graph = set_element(graph, 1,
> 1,
> > 1)" it works.
>
> I think you have a faulty understanding of PERFORM.  It's defined as
> "evaluate this expression and throw away the result", so it's *not*
> going to change "graph", not even if set_element declares that
> argument as INOUT.


Faulty indeed, I was going from the plpgsql statement documentation:

"Sometimes it is useful to evaluate an expression or SELECT query but
discard the result, for example when calling a function that has
side-effects but no useful result value."

My understanding of "side-effects" was flawed there, but I'm fine with "x =
set_element(x...)" anyway as I was trying to follow the example of
array_append et al.


> (Our interpretation of OUT arguments for functions
> is that they're just an alternate notation for specifying the function
> result.)  If you want to avoid the explicit assignment back to "graph"
> then the thing to do would be to declare set_element as a procedure,
> not a function, with an INOUT argument and then call it with CALL.
>

I'll stick with the assignment.

That's only cosmetically different though, in that the updated
> "graph" value is still passed back much as if it were a function
> result, and then the CALL infrastructure knows it has to assign that
> back to the argument variable.  And, as I tried to explain earlier,
> that code path currently has no mechanism for avoiding making a copy
> of the graph somewhere along the line: it will pass "graph" to the
> procedure as either a flat Datum or a R/O expanded object, so that
> set_element will be required to copy that before modifying it.
>

Right, I'm still figuring out exactly what that code flow is.  This is my
first dive into these corners of the code so thank you for being patient
with me.  I promise to write up some expanded object documentation if this
works!


> We can imagine extending whatever we do for "x := f(x)" cases so that
> it also works during "CALL p(x)".  But I think that's only going to
> yield cosmetic or notational improvements so I don't want to start
> with doing that.  Let's focus first on improving the existing
> infrastructure for the f(x) case.
>

+1

-Michel

Re: Using Expanded Objects other than Arrays from plpgsql

Reply via email to