Okay, so here's a follow-up on my progress.  Apologies in advance for
the long email here, but I'd like to be thorough about this before I
forget.

For the sake of completeness, here's my setup.  I'm running Python
2.7.1 compiled from source with icc.  I'm running Ubuntu 10.10 on one
of Intel's new processors (an i7-2600).  The goal is to compile numpy
and scipy both with Intel's compiler and Intel's MKL.

I finally got numpy to compile with icc / ifort with pretty much all
of the tests passing.  It was a bit of work, partly because I was
trying to be an optimization junkie, but I thought I'd share my
discoveries.  Scipy also compiles, but with some errors (which are
likely due to me not configuring f2py correctly).

First, I wanted to compile things with Intel's interprocedural
optimization enabled, and that seems to work, but only if -O2 is used
for the compiling stage and -O1 is used for the linking stage.  If -O3
is given for the compiling stage, then the einsum test goes into some
sort of infinite loop and hangs.  If -O2 or -O3 is given for the
linker, then there are random other segfaults (I forget where).
However, with -O2 for compiling and -O1 for linking, things are
stable.  Also, if I turn off -ipo, then -O3 works fine for compiling.
I'm not sure if this reflects bugs in the flags I'm passing to the
Intel compiler or in icc/ifort itself.

Second, to use -ipo, it's critical that xiar is used instead of ar to
create object archives.  This needed to be changed in
fcompiler/intel.py and intelccompiler.py. I've attached a diff of
these files that gives working options for me.  I don't know if these
options are set in the correct place or not, but perhaps they would be
helpful.

The essence of it is the following (from intelccompiler.py):

linker_flags = '-O1 -ipo -openmp -lpthread -fno-alias -xHOST -fPIC '
compiler_opt_flags = '-static -ipo -xHOST -O2 -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias '
icc_run_string = 'icc ' + compiler_opt_flags
icpc_run_string = 'icpc ' + compiler_opt_flags
linker_run_string = 'icc ' + linker_flags + ' -shared '

with the rest of the diff applying these options.  In this case,
-openmp and -lpthread are required for linking with the threaded layer
of the MKL.  They could possibly be removed.  Also, -fno-alias is
critical for the C compiler -- random segfaults and memory corruption
occur without it.  -DMKL_LP64 is there to ensure proper linking with
the LP64 (32-bit indices) part of MKL, instead of ILP64 (64-bit
indices).  The latter isn't supported by the lapack_lite module --
things compile, but don't work.  -mkl may or may not help things.

For the fortran side, this was the compiler string:

compiler_opt_flags = '-static -ipo -xHOST -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias -O3'

Here -fno-alias doesn't seem to be needed (though it is still in the
string above), and -O3 seems to work.
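
One thing to watch when flag strings like these are split into
argument lists (the attached diff does this with .split(' ')): a
trailing space produces an empty-string argument.  The sketch below
only shows that behaviour using the linker string from this message.
Whether it has anything to do with the get_flags attempt that I could
not get working is a guess, not something I verified.

```python
# Linker flag string copied from this message (note the trailing space).
linker_flags = '-O1 -ipo -openmp -lpthread -fno-alias -xHOST -fPIC '

# split(' ') keeps an empty string where the trailing space was ...
naive = linker_flags.split(' ')

# ... while split() with no argument discards surrounding whitespace.
clean = linker_flags.split()

print(repr(naive[-1]))
print(clean)
```

If an empty string ends up in the argument list handed to the
compiler, it is passed as an argument of its own, so split() without
an argument looks like the safer choice.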

Third, it was a bit of a pain to figure out how to get the
linking/detection done correctly, as somehow order matters, and it was
easy to get undefined symbols, runtime errors, etc. Very annoying.  In
the end, my site.cfg file looked like this:

[DEFAULT]
library_dirs=/usr/intel/current/mkl/lib/intel64
include_dirs=/usr/intel/current/mkl/include
mkl_libs = mkl_rt, mkl_core, mkl_intel_thread, mkl_intel_lp64
blas_libs = mkl_blas95_lp64
lapack_libs = mkl_lapack95_lp64

[lapack_opt]
library_dirs=/usr/intel/current/mkl/lib/intel64
include_dirs=/usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_lapack95_lp64

[blas_opt]
library_dirs = /usr/intel/current/mkl/lib/intel64
include_dirs = /usr/intel/current/mkl/include/intel64/lp64
libraries = mkl_blas95_lp64

where /usr/intel/current/ points to my intel install location.  It's
critical that the mkl_libs are given in that order.  I didn't find
another combination that worked.
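
Since the order of mkl_libs matters, it may be worth checking what
order the file actually contains before building.  This is a minimal
sketch using Python 3's configparser on an inlined excerpt of the
site.cfg above.  The helper name read_lib_list is made up for
illustration, and this is not how numpy.distutils itself reads the
file; it only echoes back what was written.

```python
import configparser

# Excerpt of the site.cfg from this message, inlined to keep the sketch
# self-contained.
SITE_CFG = """
[DEFAULT]
library_dirs=/usr/intel/current/mkl/lib/intel64
include_dirs=/usr/intel/current/mkl/include
mkl_libs = mkl_rt, mkl_core, mkl_intel_thread, mkl_intel_lp64
"""


def read_lib_list(cfg_text, option, section="DEFAULT"):
    """Return a comma-separated option as a list, preserving its order.

    Hypothetical helper, not part of numpy.distutils.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get(section, option)
    return [name.strip() for name in raw.split(",") if name.strip()]


mkl_libs = read_lib_list(SITE_CFG, "mkl_libs")
print(mkl_libs)
```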

Finally, I attached my bash setup script for environment variables.  I
don't know how much of a role those play in things, but I had them in
place when things started working, so I should put them here.

Now, on to scipy.  With all these options in place, scipy compiles
fine.  However, there are two problems, and these don't seem to go
away at any optimization level. I'm looking for suggestions.  I'm
guessing it's some sort of configuration error.

1) The CloughTocher2DInterpolator segfaults every time it's called to
interpolate values.  I couldn't manage to track it down -- it's in the
Cython code somewhere -- but I can give more details next time.  I
disabled it for now.

2) f2py isn't getting the interfaces right.  When I run the test
suite, I get about 250 errors, all of the form:

ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (5,5,)

and so on, with different tuples on the end.
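
With about 250 of these, it might help to group the errors by the
shape they report and see whether a pattern shows up.  This is only a
sketch: the name tally_shapes is made up, the regular expression
assumes the message wording quoted above, and the sample text is just
that one quoted message, not real test output.

```python
import re
from collections import Counter

# Matches the message quoted above; \s+ tolerates a line wrap after "must".
SHAPE_RE = re.compile(r"must\s+have defined dimensions but got \(([^)]*)\)")


def tally_shapes(log_text):
    """Count how often each reported shape appears in the test output."""
    shapes = Counter()
    for match in SHAPE_RE.finditer(log_text):
        # "5,5," -> (5, 5); the trailing comma leaves an empty piece to skip.
        dims = tuple(int(d) for d in match.group(1).split(",") if d.strip())
        shapes[dims] += 1
    return shapes


# Sample input: the single message quoted in this email.
sample = (
    "ValueError: failed to create intent(cache|hide)|optional array-- must\n"
    "have defined dimensions but got (5,5,)\n"
)
print(tally_shapes(sample))
```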

Other than these errors, everything seemed to work great.  What might
I be doing wrong there?
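
For the CloughTocher2DInterpolator segfault in (1), one way to narrow
it down without taking the whole test run with it is to run a small
reproduction in a child interpreter.  The sketch below assumes Python
3.5 or later (subprocess.run and -X faulthandler do not exist in the
2.7.1 I'm using), and the points and values in REPRO are arbitrary.
run_isolated and describe are made-up helper names.

```python
import subprocess
import sys

# Hypothetical reproduction snippet; the data is arbitrary.
REPRO = """
import numpy as np
from scipy.interpolate import CloughTocher2DInterpolator
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.4, 0.6]])
values = np.array([0.0, 1.0, 1.0, 2.0, 1.0])
interp = CloughTocher2DInterpolator(points, values)
print(interp(np.array([[0.5, 0.5]])))
"""


def run_isolated(code):
    """Run `code` in a child interpreter and return its exit status.

    -X faulthandler asks the child to dump a Python traceback if it
    receives a fatal signal such as SIGSEGV.
    """
    result = subprocess.run([sys.executable, "-X", "faulthandler", "-c", code])
    return result.returncode


def describe(returncode):
    """On POSIX, a negative status means the child was killed by a signal."""
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

Calling describe(run_isolated(REPRO)) against the icc-built scipy
should then report either a normal exit or the signal that killed the
child, with the faulthandler traceback on stderr pointing at the last
Python-level call.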

Thanks!
-- Hoyt

++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoy...@gmail.com
++++++++++++++++++++++++++++++++++++++++++

Attachment: setupenv.sh
Description: Bourne shell script

diff --git a/numpy/distutils/fcompiler/intel.py b/numpy/distutils/fcompiler/intel.py
index b593a91..aa632e6 100644
--- a/numpy/distutils/fcompiler/intel.py
+++ b/numpy/distutils/fcompiler/intel.py
@@ -10,6 +10,8 @@ compilers = ['IntelFCompiler', 'IntelVisualFCompiler',
              'IntelItaniumFCompiler', 'IntelItaniumVisualFCompiler',
              'IntelEM64VisualFCompiler', 'IntelEM64TFCompiler']
 
+compiler_opt_flags = '-static -ipo -xHOST -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias -O3'
+
 def intel_version_match(type):
     # Match against the important stuff in the version string
     return simple_version_match(start=r'Intel.*?Fortran.*?(?:%s).*?Version' % (type,))
@@ -35,7 +37,7 @@ class IntelFCompiler(BaseIntelFCompiler):
         'compiler_f90' : [None],
         'compiler_fix' : [None, "-FI"],
         'linker_so'    : ["<F90>", "-shared"],
-        'archiver'     : ["ar", "-cr"],
+        'archiver'     : ["xiar", "-cr"],
         'ranlib'       : ["ranlib"]
         }
 
@@ -51,13 +53,13 @@ class IntelFCompiler(BaseIntelFCompiler):
         else:
             pic_flags = ['-KPIC']
         opt = pic_flags + ["-cm"]
-        return opt
+        return opt + compiler_opt_flags.split(' ')
 
     def get_flags_free(self):
         return ["-FR"]
 
     def get_flags_opt(self):
-        return ['-O3','-unroll']
+        return compiler_opt_flags.split(' ')
 
     def get_flags_arch(self):
         v = self.get_version()
@@ -129,7 +131,7 @@ class IntelItaniumFCompiler(IntelFCompiler):
         'compiler_fix' : [None, "-FI"],
         'compiler_f90' : [None],
         'linker_so'    : ['<F90>', "-shared"],
-        'archiver'     : ["ar", "-cr"],
+        'archiver'     : ["xiar", "-cr"],
         'ranlib'       : ["ranlib"]
         }
 
@@ -148,10 +150,10 @@ class IntelEM64TFCompiler(IntelFCompiler):
         'compiler_fix' : [None, "-FI"],
         'compiler_f90' : [None],
         'linker_so'    : ['<F90>', "-shared"],
-        'archiver'     : ["ar", "-cr"],
+        'archiver'     : ["xiar", "-cr"],
         'ranlib'       : ["ranlib"]
         }
-
+    
     def get_flags_arch(self):
         opt = []
         if cpu.is_PentiumIV() or cpu.is_Xeon():
diff --git a/numpy/distutils/intelccompiler.py b/numpy/distutils/intelccompiler.py
index b82445a..d7f4fd7 100644
--- a/numpy/distutils/intelccompiler.py
+++ b/numpy/distutils/intelccompiler.py
@@ -2,24 +2,39 @@
 from distutils.unixccompiler import UnixCCompiler
 from numpy.distutils.exec_command import find_executable
 
+linker_flags = '-O1 -ipo -openmp -lpthread -fno-alias -xHOST -fPIC '
+compiler_opt_flags = '-static -ipo -xHOST -O2 -fPIC -DMKL_LP64 -mkl -wd188 -g -fno-alias '
+icc_run_string = 'icc ' + compiler_opt_flags
+icpc_run_string = 'icpc ' + compiler_opt_flags
+linker_run_string = 'icc ' + linker_flags + ' -shared '
+
 class IntelCCompiler(UnixCCompiler):
 
     """ A modified Intel compiler compatible with an gcc built Python.
     """
 
     compiler_type = 'intel'
-    cc_exe = 'icc'
-    cc_args = 'fPIC'
+    cc_exe = icc_run_string
+    cc_args = ''
 
     def __init__ (self, verbose=0, dry_run=0, force=0):
         UnixCCompiler.__init__ (self, verbose,dry_run, force)
-        self.cc_exe = 'icc -fPIC'
+        self.cc_exe = icc_run_string
         compiler = self.cc_exe
         self.set_executables(compiler=compiler,
                              compiler_so=compiler,
                              compiler_cxx=compiler,
                              linker_exe=compiler,
-                             linker_so=compiler + ' -shared')
+                             linker_so=linker_run_string,
+                             archiver = ["xiar", "-cr"])
+
+    # Could NOT get this to work!!!!  Grrr...
+
+    # def get_flags(self):
+    #     return compiler_opt_flags.split(' ')
+
+    # def get_flags_linker_so(self):
+    #     return linker_flags.split(' ')
 
 class IntelItaniumCCompiler(IntelCCompiler):
     compiler_type = 'intele'
@@ -32,18 +47,19 @@ class IntelItaniumCCompiler(IntelCCompiler):
 
 class IntelEM64TCCompiler(UnixCCompiler):
 
-""" A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
+    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
     """
 
     compiler_type = 'intelem'
-    cc_exe = 'icc -m64 -fPIC'
-    cc_args = "-fPIC"
+    cc_exe = icc_run_string + " -m64"
+    cc_args = ""
     def __init__ (self, verbose=0, dry_run=0, force=0):
         UnixCCompiler.__init__ (self, verbose,dry_run, force)
-        self.cc_exe = 'icc -m64 -fPIC'
+        self.cc_exe =  icc_run_string + " -m64"
         compiler = self.cc_exe
         self.set_executables(compiler=compiler,
                              compiler_so=compiler,
                              compiler_cxx=compiler,
                              linker_exe=compiler,
-                             linker_so=compiler + ' -shared')
+                             linker_so=linker_run_string,
+                             archiver = ["xiar", "-cr"])
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
