On Mon, Jul 18, 2022 at 3:03 PM Martin Kalcher
<martin.kalc...@aboutsource.net> wrote:
> Thanks for all your feedback and help. I got a patch that i consider
> ready for review. It introduces two new functions:
>
>   array_shuffle(anyarray) -> anyarray
>   array_sample(anyarray, integer) -> anyarray
>
> array_shuffle() shuffles an array (obviously). array_sample() picks n
> random elements from an array.

I like this idea.

I think it's questionable whether the behavior of array_shuffle() is
correct for a multi-dimensional array. The implemented behavior is to
keep the dimensions as they were, but permute the elements across all
levels at random. But there are at least two other behaviors that seem
potentially defensible: (1) always return a 1-dimensional array, (2)
shuffle the sub-arrays at the top level without the possibility of
moving elements within or between sub-arrays. Whatever behavior we
decide is best here should be documented.

array_sample() will return elements in random order when sample_size <
array_size, but in the original order when sample_size >= array_size.
Similarly, it will always return a 1-dimensional array in the former
case, but will keep the original dimensions in the latter case. That
seems pretty hard to defend. I think it should always return a
1-dimensional array with elements in random order, and I think this
should be documented.

I also think you should add test cases involving multi-dimensional
arrays, as well as arrays with non-default bounds, e.g. try shuffling
or sampling some values like
'[8:10][-6:-5]={{1,2},{3,4},{5,6}}'::int[]

-- 
Robert Haas
EDB: http://www.enterprisedb.com
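
To make the candidate behaviors concrete, here is a rough Python sketch.
It is not the patch's C code: the function names are hypothetical, a 2-D
array is modeled as a list of equal-length lists, and array bounds such
as [8:10][-6:-5] are not modeled at all. It only illustrates the three
shuffle semantics discussed above and the "always 1-D, always random
order" rule suggested for array_sample().

```python
import random


def _flatten(arr):
    """Return all elements of a 2-D list in row-major order."""
    return [x for row in arr for x in row]


def shuffle_keep_dims(arr, rng):
    """Keep the shape, permute elements across all levels
    (the behavior described for the patch)."""
    width = len(arr[0]) if arr else 0
    if width == 0:
        # Nothing to permute; return a copy with the same shape.
        return [list(row) for row in arr]
    flat = _flatten(arr)
    rng.shuffle(flat)
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def shuffle_flatten(arr, rng):
    """Alternative (1): always return a 1-dimensional result."""
    flat = _flatten(arr)
    rng.shuffle(flat)
    return flat


def shuffle_top_level(arr, rng):
    """Alternative (2): permute the sub-arrays only; elements never
    move within or between sub-arrays."""
    rows = [list(row) for row in arr]
    rng.shuffle(rows)
    return rows


def sample_flat(arr, n, rng):
    """Suggested array_sample() rule: 1-dimensional result in random
    order, even when n >= number of elements. Assumes n >= 0."""
    flat = _flatten(arr)
    return rng.sample(flat, min(n, len(flat)))


if __name__ == "__main__":
    rng = random.Random()
    a = [[1, 2], [3, 4], [5, 6]]
    print(shuffle_keep_dims(a, rng))   # 3x2, elements mixed across rows
    print(shuffle_flatten(a, rng))     # 6 elements, 1-D
    print(shuffle_top_level(a, rng))   # rows reordered, each row intact
    print(sample_flat(a, 10, rng))     # all 6 elements, 1-D, random order
```

With sample_flat(), asking for more elements than exist just returns a
shuffled 1-D copy, so the result shape and ordering rule do not change
at the sample_size >= array_size boundary.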