This is a response to the email from Fokke Dijkstra concerning the
performance of MOM4.

Most of my work has focused on the difference in performance of
MOM4 with the STATIC and DYNAMIC memory option. This option is not available
in the current release of MOM4; we anticipate that it will be released
with MOM4 later in Spring 2003. The MOM4 study showed the following results:

With STATIC_MEMORY the total runtime = 1153.968941 seconds
With DYNAMIC_MEMORY the total runtime = 1877.834289 seconds

which is a difference of approximately 40%: the STATIC_MEMORY run takes
about 39% less time or, equivalently, the DYNAMIC_MEMORY run takes about 63%
longer. The results for the entire model were obtained on 60 processors for
a 15-day integration. The above times were obtained on an SGI 3800 system
with 600 MHz processors.

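The percentage quoted above depends on which run is taken as the baseline.
A minimal sketch of the arithmetic, using only the two runtimes reported
above (Python is used purely for illustration; the function name
runtime_comparison is hypothetical and not part of MOM4 or perfex):

```python
# Total runtimes in seconds, copied from the figures reported above.
static_s = 1153.968941
dynamic_s = 1877.834289


def runtime_comparison(static_s, dynamic_s):
    """Return (slowdown factor, % extra time for DYNAMIC, % time saved by STATIC)."""
    slowdown = dynamic_s / static_s                    # DYNAMIC relative to STATIC
    extra_pct = (slowdown - 1.0) * 100.0               # baseline: STATIC runtime
    saved_pct = (1.0 - static_s / dynamic_s) * 100.0   # baseline: DYNAMIC runtime
    return slowdown, extra_pct, saved_pct


slowdown, extra_pct, saved_pct = runtime_comparison(static_s, dynamic_s)
print(f"DYNAMIC / STATIC runtime ratio: {slowdown:.2f}")
print(f"DYNAMIC takes {extra_pct:.0f}% longer than STATIC")
print(f"STATIC takes {saved_pct:.0f}% less time than DYNAMIC")
```

Measured against the DYNAMIC_MEMORY runtime the gap is close to the 40%
quoted; measured against the STATIC_MEMORY runtime it is noticeably larger.
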
We then examined the perfex files for the entire code. They showed that the
primary difference was in the decoded loads:

STATIC_MEMORY the decoded loads = 9934669073472
DYNAMIC_MEMORY the decoded loads = 15047988514480

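To put the two counts in proportion, here is a small sketch of the
arithmetic. It assumes the STATIC_MEMORY count is 9934669073472 (any stray
characters around that figure are treated as transmission debris), and the
function name load_increase is hypothetical, used only for illustration:

```python
# Decoded-load counts from the perfex output quoted above.
# Assumption: the STATIC_MEMORY count is 9934669073472.
static_loads = 9934669073472
dynamic_loads = 15047988514480


def load_increase(static_loads, dynamic_loads):
    """Return (extra decoded loads, DYNAMIC / STATIC ratio)."""
    extra = dynamic_loads - static_loads
    ratio = dynamic_loads / static_loads
    return extra, ratio


extra, ratio = load_increase(static_loads, dynamic_loads)
print(f"Extra decoded loads with DYNAMIC_MEMORY: {extra}")
print(f"DYNAMIC / STATIC decoded-load ratio: {ratio:.2f}")
```

Under that assumption the DYNAMIC_MEMORY run issues roughly half again as
many decoded loads, which is of the same order as the runtime ratio of
about 1.63 and is consistent with the loads being the primary difference.
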
To report the performance problem to the SGI Compiler Group we needed to
produce a "simple" test case that demonstrates the above behavior. After
some analysis, we constructed such a case. It turns out that the problem is
the result of the code generator working on loop constructs that have array
syntax and derived types that have been allocated with DYNAMIC memory. I
have attached the test case, and you can see from the *.w2f output file that
the poorly performing code produces an increased number of temporary arrays,
which would account for the performance degradation.

This performance bug has already been forwarded to the SGI Compiler Group
and we are waiting to hear back from them. In our studies, we also uncovered
several other loop constructs that performed poorly. They behaved similarly
to the case described above. We will wait for SGI's response before
proceeding with these studies, as these may all be part of the same
family of performance bugs.

I did read the email from Fokke Dijkstra and I think he should see an
improvement in MOM4 performance when the above performance fix is made. It is
an open question whether the STATIC and DYNAMIC memory options will yield
similar performance improvements on other systems.

Fokke Dijkstra did observe an increase by a factor of two in the number of
floating point operations. My results show no increase in this number. After
talking with Matt Harrison, I think we may want to look at the use of stencil
operators in MOM4 and see if they are contributing to an increase in floating
point operations.

Christopher

Dr. Christopher L. Kerr
Geophysical Fluid Dynamics Laboratory
Forrestal Campus
Princeton University                             
Princeton, New Jersey  08542
Telephone: (609) 452-6573
Fax:       (609) 987-5063
Email:     [EMAIL PROTECTED]
Web:       http://www.gfdl.gov/~ck

Attachment: perf_sgi_example
Description: Unix tar archive
