On 28/01/2019 09:20, Bertram Felgenhauer wrote:
> I've made sure that the machine is mostly idle; things will be much
> different if several processes start competing for the resources.
> I intended to experiment with smaller numbers but have not yet done so.
> On another, similar machine, --gcthreads=2 was just as fast as 6.
It is increasingly clear that the limiting factor is the bandwidth
between main memory and the processors. There needs to be sufficient
parallelism to saturate this channel; more than that will not help and
may well make things worse.
How many threads are needed to saturate the memory channel depends on
the processor speed, the memory speed and the application, so there's
no single answer. Garbage collection can be particularly bad.
Each word in the live data is read in, and if it is a pointer it is
followed. There's no reuse, so caching doesn't help.
The idea behind developing the 32-bit value model for the 64-bit
processor was that by reducing the size of an ML cell more cells could
be transferred through the memory channel in a given time. That seems
to have been borne out by the results.
isabelle-dev mailing list