This is a response to the email from Fokke Dijkstra concerning the performance of MOM4.

Most of my work has focused on the difference in performance of MOM4 with the STATIC and DYNAMIC memory option. This option is not available in the current release of MOM4; we anticipate that it will be released with MOM4 later in Spring 2003. The MOM4 study showed the following results:

With STATIC_MEMORY the total runtime = 1153.968941 seconds
With DYNAMIC_MEMORY the total runtime = 1877.834289 seconds

which is a difference of approximately 40% in performance. The results for the entire model were obtained on 60 processors running a 15-day integration. The above times were obtained on an SGI 3800 system with 600 MHz processors.

We then examined the perfex files for the entire code. They showed that the primary differences were in the decoded loads:

STATIC_MEMORY the decoded loads = 9934669073472
DYNAMIC_MEMORY the decoded loads = 15047988514480

To report the performance problem to the SGI Compiler Group we needed to produce a "simple" test case that demonstrates the above behavior. After some analysis, we constructed such a case. It turns out that the problem is the result of the code generator working on loop constructs that have array syntax and derived types that have been allocated with DYNAMIC memory. I have attached the test case, and you can see from the *.w2f output file that the poorly performing code produces an increased number of temporary arrays, which would account for the performance degradation.

This performance bug has already been forwarded to the SGI Compiler Group and we are waiting to hear back from them. In our studies, we also uncovered several other loop constructs that performed poorly. They behaved similarly to the case described above. Before we proceed with these studies, we are waiting to hear back from SGI, as these may all be part of the same family of performance bugs.
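As a quick check on the runtime and decoded-load figures quoted above, the relative differences can be worked out with a few lines of arithmetic. This is only a sketch in Python; the variable and function names are illustrative and do not come from MOM4 or perfex.

```python
# Figures quoted in this message (60 processors, 15-day integration).
static_runtime = 1153.968941    # seconds, STATIC_MEMORY
dynamic_runtime = 1877.834289   # seconds, DYNAMIC_MEMORY
static_loads = 9934669073472    # decoded loads, STATIC_MEMORY
dynamic_loads = 15047988514480  # decoded loads, DYNAMIC_MEMORY


def relative_saving(smaller, larger):
    """Fraction by which `smaller` falls below `larger` (hypothetical helper)."""
    return (larger - smaller) / larger


# Saving of the static build relative to the dynamic build.
runtime_saving = relative_saving(static_runtime, dynamic_runtime)
loads_saving = relative_saving(static_loads, dynamic_loads)

# The same comparison expressed as a dynamic/static ratio.
runtime_ratio = dynamic_runtime / static_runtime
loads_ratio = dynamic_loads / static_loads

print(f"runtime: static is {runtime_saving:.1%} lower, dynamic/static = {runtime_ratio:.2f}")
print(f"loads:   static is {loads_saving:.1%} lower, dynamic/static = {loads_ratio:.2f}")
```

On these numbers the static build runs in roughly 38% less time and issues about a third fewer decoded loads, so the extra loads track the runtime gap closely but not exactly.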
I did read the email from Fokke Dijkstra, and I think he should see an improvement in MOM4 performance when the above performance fix is made. It is an open question whether the STATIC and DYNAMIC memory options will yield similar performance improvements on other systems.
Fokke Dijkstra did observe an increase by a factor of two in the number of floating point operations. My results show no increase in this number. After talking with Matt Harrison, I think we may want to look at the use of stencil operators in MOM4 and see if they are contributing to an increase in floating point operations.

Christopher

Dr. Christopher L. Kerr
Geophysical Fluid Dynamics Laboratory
Forrestal Campus
Princeton University
Princeton, New Jersey 08542
Telephone: (609) 452-6573
Fax: (609) 987-5063
Email: [EMAIL PROTECTED]
Web: http://www.gfdl.gov/~ck
perf_sgi_example
Description: Unix tar archive
