Hi John,

I played with this again today and found an optimization that seems to 
dramatically improve the performance:

```
+static void
+radix_sort_tuple(SortTuple *begin, size_t n_elems, int level, Tuplesortstate 
*state)
+{
+       RadixPartitionInfo partitions[256] = {0};
+       uint8_t         remaining_partitions[256] = {0};
```

Here partitions and remaining_partitions are just temporary buffers, allocating 
memory from stack and initialize them seems slow. By passing them as function 
parameters are much faster. See attached diff for my change.

V5 patch: by the way, v5 is very faster than v1.
```
evantest=# select * from test_multi order by category, name;
Time: 299.912 ms
evantest=# select * from test_multi order by category, name;
Time: 298.793 ms
evantest=# select * from test_multi order by category, name;
Time: 300.306 ms
evantest=# select * from test_multi order by category, name;
Time: 302.140 ms
```

v5 + my change:
```
evantest=# select * from test_multi order by category, name;
Time: 152.572 ms
evantest=# select * from test_multi order by category, name;
Time: 143.296 ms
evantest=# select * from test_multi order by category, name;
Time: 151.606 ms
```

The test I did today is just the high cardinality first column test I had done 
before:
```
drop table if exists test_multi;
create unlogged table test_multi (category int, name text);
insert into test_multi select (random() * 1000000)::int as category,  
md5(random()::text) || md5(random()::text) as name from generate_series(1, 
1000000);
vacuum freeze test_multi;
\timing on
\o /dev/null
set wip_radix_sort = ‘on;
set work_mem = ‘2GB’;
select * from test_multi order by category, name;
```

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment: tuplesort_chaoli.diff
Description: Binary data




Reply via email to