Thanks to all who shared their experience.
Here is a brief summary of the observations:
4 combinations of Intel Fortran or gfortran with Xeon or AMD processors
(of approximately the same base frequency) provided similar speed but
different results. Time comparison is not straightforward as the number
of iterations required for convergence varied between these 4 versions
(FOCEI, LAPLACIAN, and SAEM with ADVAN13 were used for all tests).
Results are numerically different but not meaningfully so: parameter
estimates differ by no more than their respective confidence intervals,
a few percent for the well-defined parameters and more
for parameters with large RSEs. Thus, any of these 4 combinations can be
used, but it is better not to mix them in one analysis. Also it seems to
be a good practice to specify not only OS and compiler with options, but
also processor or at least processor type to ensure exact
reproducibility of results.
Unlike in earlier reports (10+ years ago), the Intel compiler (old, v.11)
seems to provide similar speed on new Intel and AMD processors.
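A minimal sketch of recording the environment alongside a run, in the spirit of the recommendation above to report the processor together with the OS and compiler. It uses only Python's standard platform module. The helper name run_environment and the free-text compiler fields are assumptions for illustration, not part of NONMEM or any tool mentioned in this thread.

```python
import json
import platform


def run_environment(compiler: str, compiler_options: str) -> dict:
    """Collect details worth reporting with a run (hypothetical helper)."""
    return {
        "os": platform.platform(),          # OS name, release and version
        "machine": platform.machine(),      # architecture, e.g. x86_64
        "processor": platform.processor(),  # may be empty on some systems
        "compiler": compiler,               # supplied by the user
        "compiler_options": compiler_options,
    }


if __name__ == "__main__":
    # Placeholder values: fill in whatever was actually used for the build.
    print(json.dumps(run_environment("gfortran", "<options used>"), indent=2))
```

platform.processor() can return an empty string; on Linux the "model name" line in /proc/cpuinfo is usually a better source for the exact processor.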
Thanks!
Leonid
On 11/19/2019 4:32 AM, Rikard Nordgren wrote:
Hi Leonid,
When upgrading from gfortran 4.4.7 to 5.1.1 we ran around 20 models with
both compilers, and also with -ffast-math turned off. The runs were on the
same hardware. The differences in the parameter estimates and OFV were
in general small. One big difference we could see was that the success
of the covariance step was seemingly random. It could succeed on one
compiler version, but not the other and it could also start failing when
the option was turned off. I have kept the runs, so let me know if you
would be interested. I also started some experiments using machine
dependent compiler flags, but as our cluster is heterogeneous I
abandoned this testing.
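One reason an option such as -ffast-math can change results: floating-point addition is not associative, and that option allows the compiler to reorder arithmetic. The following is a small Python illustration of the underlying effect only; it does not reproduce what gfortran does to any particular model.

```python
# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)   # False
print(repr(left))      # 0.6000000000000001
print(repr(right))     # 0.6
```

A difference in the last digit like this one is harmless on its own, but an iterative estimation can amplify it into a different minimization path.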
I think that getting identical results could be possible, but that it
would be quite a challenge. There are many components that affect the
results. The compiler, the compiler flags, the libc implementation, the
hardware and sometimes the operating system. To see, for example, where
the standard libraries come into play, you can run nm nonmem on the
nonmem executable (on Linux) to list all symbols compiled in. Some are
functions from external libraries; for example, my exponential function is
from libc: exp@@GLIBC_2.2.5. Even the functions that read in numbers
from text strings could introduce rounding errors since the text
representation is decimal and the internal floating point number is binary.
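The point about decimal text versus binary floating point can be seen in a few lines of Python (used here only as a convenient way to inspect IEEE 754 doubles; this is not how NONMEM reads its input):

```python
from decimal import Decimal

# The decimal text "0.1" has no exact binary representation, so the stored
# double is only the nearest representable value.
x = float("0.1")
print(Decimal(x))
# 0.1000000000000000055511151231257827021181583404541015625

# Two different decimal strings can map to the same double.
print(float("0.1") == float("0.10000000000000001"))  # True
```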
Best regards,
Rikard Nordgren
--
Rikard Nordgren
Systems developer
Dept of Pharmaceutical Biosciences
Faculty of Pharmacy
Uppsala University
Box 591
75124 Uppsala
Phone: +46 18 4714308
www.farmbio.uu.se/research/researchgroups/pharmacometrics/
On 2019-11-18 23:54, Leonid Gibiansky wrote:
Hi Jeroen,
Thanks for your input, very interesting. As for the goal, I am mostly
interested in finding options that would give identical results on two
platforms, rather than in speed. So far no luck: 4 combinations of
gfortran / Intel compilers on Xeon / AMD processors give 4 sets of
results that are close but not identical.
Related question to the group: has anybody experimented with gfortran
options (rather than using the defaults provided by the NONMEM distribution)?
Any recommendations? Same goal: maximum reproducibility across
different OSs, parallelization options, and processor types.
Thanks
Leonid
On 11/18/2019 5:28 PM, Jeroen Elassaiss-Schaap (PD-value B.V.) wrote:
Hi Leonid,
"A while" back we compared model development trajectories and results
between two computational platforms, Itanium and Xeon, see
https://www.page-meeting.org/?abstract=1188. Roughly, the results
were: 1/3 equal, 1/3 rounding differences and 1/3 really different
results. From discussions with the technically knowledgeable people I
worked with at the time, I recall that there are three levels/sources
of those differences:
1) computational (hardware) platform
2) compilers (+ optimization settings)
3) libraries (floating point handling does matter)
Assuming you would like to compare the speed of the platforms wrt
NONMEM, my advice would be to test a large series of different
models, from simple ADVAN1 or 2 to complex ODE, ranging from FO to
LAPLACIAN INT NUMERICAL, while keeping compilers and libraries the
same. Also small and large datasets, as in some instances you might
be testing only the L1/L2/L3 cache strategies and Turbo settings. And
with and without parallelization - as that might determine runtime
bottlenecks in practice.
Just had a peek at Epyc, and it seems interesting (noticed results with
gcc 7.4 compilation). As long as you are able to hold the computation
in cache, a big if for the 64-core, there might be an advantage.
All in all I am not sure that it is worth the trouble. For any given
PK-PD model there is a lot you can tune to gain speed, but the
optimal settings might be very different for the next and overrule
any platform differences.
Hope this helps,
Jeroen
http://pd-value.com
jer...@pd-value.com
@PD_value
+31 6 23118438
-- More value out of your data!
On 18/11/19 6:34 pm, Leonid Gibiansky wrote:
Thanks Bob and Peter!
The model is quite stable, but this is LAPLACIAN, so it requires second
derivatives. At iteration 0, gradients differ by about 50 to 100%
between Intel and AMD. This leads to differences in the minimization
path and slightly different results: not different enough to change
the recommended dose, but different enough to notice (OF
difference of 6 points; 50% more model evaluations to reach
convergence).
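A hedged illustration of why second derivatives are sensitive to tiny floating-point differences. This is a generic central-difference sketch in Python, not NONMEM's actual algorithm; the test function (exp) and the step sizes are assumptions chosen for illustration. The numerator is a difference of nearly equal numbers, so rounding errors of order 1e-16 in the function values are divided by h squared and can dominate the estimate.

```python
import math


def second_derivative(f, x, h):
    """Central-difference estimate of f''(x) with step h."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


exact = math.exp(1.0)  # every derivative of exp at x = 1 equals e

for h in (1e-3, 1e-5, 1e-7):
    estimate = second_derivative(math.exp, 1.0, h)
    error = abs(estimate - exact)
    print(f"h={h:g}  estimate={estimate:.10f}  abs error={error:.2e}")
```

With the smallest step the error is dominated by rounding in the function values, which is the kind of error that can vary with the compiler, the math library and the processor.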
Thanks
Leonid
On 11/18/2019 12:15 PM, Bonate, Peter wrote:
Leonid - when you say different, what do you mean? Fixed effects
and random effects? Different OFV?
We did a poster at AAPS a decade or so ago comparing results across
different platforms using the same data and model. We got
different results for the standard errors (which relates to matrix
inversion and how it is carried out on different software-hardware
configurations). And with overparameterized models we got different
error messages - some platforms converged with no problem while
some did not converge and gave R matrix singularity.
Did your problems go beyond this?
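To illustrate why standard errors are especially platform-sensitive: they come from inverting a matrix, and when that matrix is nearly singular (as with overparameterized models) a tiny change in one entry changes the inverse substantially. A minimal pure-Python sketch with a made-up 2x2 matrix; the numbers are illustrative and not taken from any real model.

```python
def invert_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the closed-form adjugate formula."""
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Nearly singular matrix: the two rows are almost identical.
inv1 = invert_2x2(1.0, 1.0, 1.0, 1.0 + 1e-8)
# Change one entry in the 8th decimal place.
inv2 = invert_2x2(1.0, 1.0, 1.0, 1.0 + 2e-8)

print(inv1[0][0])  # about 1e8
print(inv2[0][0])  # about 5e7, half of the value above
```

A relative change of about 1e-8 in a single entry halves every element of the inverse, so rounding differences between platforms can plausibly show up in the standard errors long before they affect the estimates themselves.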
pete
Peter Bonate, PhD
Executive Director
Pharmacokinetics, Modeling, and Simulation
Astellas
1 Astellas Way, N3.158
Northbrook, IL 60062
peter.bon...@astellas.com
(224) 205-5855
Details are irrelevant in terms of decision making - Joe Biden.
-----Original Message-----
From: owner-nmus...@globomaxnm.com <owner-nmus...@globomaxnm.com>
On Behalf Of Leonid Gibiansky
Sent: Monday, November 18, 2019 11:05 AM
To: nmusers <nmusers@globomaxnm.com>
Subject: [NMusers] AMD vs Intel
Dear All,
I am testing the new Epyc processors from AMD (comparing with Intel
Xeon), and getting different results. Just wondering whether
anybody faced the problem of differences between AMD and Intel
processors and knows how to solve it. I am using the Intel compiler but
am ready to switch to gfortran or anything else if this would help to
get identical results.
There were reports of Intel slowing the AMD execution in the past,
but in my tests, speed is comparable but the results differ.
Thanks
Leonid