Pádraig Brady wrote:
...
> I still get bad performance for the above with SUBTHREAD_LINES_HEURISTIC=128K

Sorry I haven't had time for this today.
I'll investigate tomorrow.

> So as you suggested, the large memory allocation when reading from a pipe
> is a problem, and in fact seems to be the main problem.  Given that the
> memory isn't actually used, it shouldn't be such an issue, but if one has
> MALLOC_PERTURB_ set, then it is used, and it has a huge impact.  Compare:
>
> $ for i in $(seq 33); do seq 88 | MALLOC_PERTURB_= timeout 2 sort --para=1 >/dev/null & done
> $ for i in $(seq 33); do seq 88 | MALLOC_PERTURB_=1 timeout 2 sort --para=1 >/dev/null & done

Good point!
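
For anyone wanting to repeat that comparison, here is a rough sketch of how
the two loops could be wrapped and timed.  It is not from the original
message: run_batch is a hypothetical helper, the job count and the `seq 88`
input simply mirror the quoted commands, and `--parallel=1` is the
unabbreviated form of `--para=1`.  Timing with `date +%s` is an assumption
and only gives whole seconds, which may be too coarse on a fast machine.

```shell
#!/bin/sh
# Hypothetical helper: start N concurrent single-threaded sorts, each fed
# from a pipe, and print the elapsed wall-clock time in whole seconds.
#   $1: value for MALLOC_PERTURB_ (an empty string leaves perturbing off)
#   $2: number of concurrent jobs (defaults to 33, as in the quoted loops)
run_batch() {
    perturb=$1
    jobs=${2:-33}
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$jobs" ]; do
        # Each pipeline runs in the background; timeout caps it at 2 seconds.
        seq 88 | MALLOC_PERTURB_=$perturb timeout 2 sort --parallel=1 >/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background pipeline has finished
    end=$(date +%s)
    echo $((end - start))
}
```

Running `run_batch ""` and then `run_batch 1` should then give two numbers
to compare, bearing in mind that `timeout 2` caps each job.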

> So we should be more conservative in memory allocation in sort,
> and be more aligned with CPU cache sizes than RAM sizes I suspect.
> This will be an increasing problem as we tend to run more in parallel.
> It would be interesting I think to sort first by L1 cache size,
> then by L2, etc, but as a first pass, a more sensible default
> of 8MB or so seems appropriate.
>
> As a general note, MALLOC_PERTURB_ should be unset when benchmarking
> anything to do with `sort`.
