Re: general purpose array_sort

Junwang Zhao Thu, 07 Nov 2024 05:57:06 -0800

Hi Michael,

On Mon, Nov 4, 2024 at 1:46 PM Michael Paquier <mich...@paquier.xyz> wrote:
>
> On Sun, Nov 03, 2024 at 11:33:05AM +0800, Junwang Zhao wrote:
> > Rebase needed due to array_reverse committed, PFA v11.
>
> There has been another conflict since you have posted this version
> (noticed that after my business in 027124a872d7).  I have looked at
> 0001.
>
> +    if (ARR_NDIM(array) < 1)
> +        PG_RETURN_ARRAYTYPE_P(array);
> There is no point in doing a sort if the array has only one element.
> You can add a check based on "ARR_DIMS(array)[0] < 2" to achieve that.


Yeah, this is reasonable but one case I can't be sure:

SELECT array_sort('{{2,3,4}}'::xid[]);

This will return the array as is, but xid doesn't have a LT_OPR, should
I error out in this case? like:

could not identify ordering operator for type xid[]

>
> +    typentry = lookup_type_cache(elmtyp, TYPECACHE_LT_OPR);
> +    if (!OidIsValid(typentry->lt_opr))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_UNDEFINED_FUNCTION),
> +                 errmsg("could not identify ordering operator for type %s",
> +                    format_type_be(elmtyp))));
>
> The patch introduces two error paths based on the fact that ordering
> operators could not be found depending on a data type that lacks the
> ordering operator and the array ordering operator part.  It is right
> to issue an error if these are lacking, like the various stats paths.
> Should we have some regression tests with specific data types for
> these errors, though?  The stats paths don't care much about these
> error cases, but it does not mean that we should not care about them.
> In short, let's have negative test coverage if we can.
>
> +typedef struct ArraySortCachedInfo
> +{
> +    TypeCacheEntry *typentry;
> +    TypeCacheEntry *array_typentry;
> +} ArraySortCachedInfo;
>
> Let's put that at the top of the file, with a comment about how it
> links to array_sort() for the caching with fn_extra.  Let's also
> document the meaning of the fields.

Will fix it in the following patch set.

>
> FWIW, I am confused by this implementation, where you have to allocate
> the two TypeCacheEntry because of the fact that you have to deal with
> the 1-dimension case and the multi-dimension case.  In the context of
> a single function call, why do you need both typentry and
> array_typentry, actually?  Wouldn't it be enough to use one typentry
> that points to the typcache, meaning that you don't really need to use
> the extra business with fn_mcxt, no?  If you require both (because I
> may be wrong), perhaps you should have a regression test that's able
> to break when removing array_typentry, changing the code to only rely
> on typentry.   Note: I have just removed array_typentry in a quick
> test, current coverage was happy about it.  Feel free to prove me
> wrong.
>
> Agreed that the function should be immutable.  The results are fixed
> depending on the input even with the COLLATE clauses appended.
>
> Let's add something when there is only one element in the first
> dimension of the array, say two cases one with an int and one with an
> array of ints like:
> SELECT array_sort('{1}'::int[]);
> SELECT array_sort('{{1}}'::int[]);

Will add.

> --
> Michael



-- 
Regards
Junwang Zhao

Re: general purpose array_sort

Reply via email to