On 28/01/2019 09:20, Bertram Felgenhauer wrote:
I've made sure that the machine is mostly idle; things will be much
different if several processes start competing for the ressources.

I intended to experiment with smaller numbers but have not yet done so.
On another, similar machine, --gcthreads=2 was just as fast as 6.

It is increasingle clear that the limiting factor is the bandwidth between main memory and the processors. There needs to be sufficient parallelism to saturate this channel; more than that will not help and may well make things worse.

How many threads are needed to saturate the memory channel depends on the processor speed, the memory speed and the application so there's no single answer to this. Garbage collection can be particularly bad. Each word in the live data is read in and if it is a pointer it is followed. There's no reuse so caching doesn't help.

The idea behind developing the 32-bit value model for the 64-bit processor was that by reducing the size of an ML cell more cells could be transferred through the memory channel in a given time. That seems to have been borne out by the results.

isabelle-dev mailing list

Reply via email to