Hey Mike, Thanks a lot for following up! I realize that one aspect wasn't very explicit in my description: I only want one permutation of each combination, e.g., I only want one pair out of (a1, a2) and (a2, a1). The Cartesian product contains both. In your example data, for "pkey": 2, I only want one of (3, 5) and (5, 3).
(This is really just a performance optimization: of all of the pairs, I eventually pick the one that maximizes some metric, and that metric is symmetric, so I can could pick any of the equivalent permutations. However, I doubt that any query optimizer can detect that, and for 3-combinations, that optimization makes a really significant difference.) Off topic: ROOT [1], the standard framework used by physicists today, and interestingly PrestoDB as well [2], have a built-in function "combinations" for that purpose. Cheers, Ingo [1] https://root.cern/doc/master/group__vecops.html#gabbbf82257156b750c7b9bdd39a53ee33 [2] https://prestodb.io/docs/current/functions/array.html#combinations > -----Original Message----- > From: Mike Carey <dtab...@gmail.com> > Sent: Thursday, August 5, 2021 1:03 AM > To: users@asterixdb.apache.org > Subject: Re: Producing combinations of array elements in SQL++ > > Ingo, > > So if I'm understanding the use case right, and order is irrelevant, and it's > just > about computing unique pairs, here is a potentially simpler version: > > SELECT pkey, > (SELECT DISTINCT a1, a2 > FROM ds.array_field AS a1, > ds.array_field AS a2 > ) AS upairs > FROM my_dataset AS ds; > > I.e., do a cartesian product of the array with itself and eliminate duplicate > pairs. > No positional info needed. Does that meet the problem statement? (I may be > missing a part of it.) > > > Cheers, > > Mike > > PS - Here is an example that can be run to test it w/o creating data (just > pretending to do so with WITH): > > WITH my_dataset AS > ([{"pkey": 1, "array_field":[1, 2, 2, 4]}, > {"pkey": 2, "array_field":[3, 5]}]) > SELECT pkey, > (SELECT DISTINCT a1, a2 > FROM ds.array_field AS a1, > ds.array_field AS a2 > ) AS upairs > FROM my_dataset AS ds;