Zha0q1 opened a new pull request #18915:
URL: https://github.com/apache/incubator-mxnet/pull/18915


   As part of the effort to support large tensors by default in 2.0, we are 
going to switch to ILP64 OpenBLAS by default. This PR changes the static 
build script to:
   1. compile OpenBLAS in ILP64 mode
   2. link statically to OpenBLAS
   3. fix the gfortran (needed by opencv) linkage issue that arises when doing (2)
   4. default to int64 indexing in linux_native.cmake
   
   cblas-dependent operators seem to work fine after this migration: there is 
no more name clashing, and indexing beyond INT_MAX works. For example:
   ```
   def run_test():
       import mxnet as mx
       from mxnet import nd

       # Large tensor: 2**31 columns exceeds INT_MAX, so this only works with int64 BLAS
       A = mx.nd.ones(shape=(1, 2**31))
       nd.linalg.syrk(A)
       nd.waitall()

   if __name__ == '__main__':
       run_test()
   ```
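
To make the INT_MAX point concrete, here is a minimal sketch in plain Python (no MXNet required) of what happens to a dimension of 2**31 when it is stored in a 32-bit signed integer, as an LP64 BLAS interface would, versus a 64-bit one. The helper names (`as_int32`, `as_int64`, `float32_bytes`) are made up for this illustration and are not part of MXNet or OpenBLAS. The footprint estimate assumes the default float32 dtype of `nd.ones`.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed index can hold


def as_int32(n):
    """Value of n after being stored in a 32-bit signed int (ctypes wraps silently)."""
    return ctypes.c_int32(n).value


def as_int64(n):
    """Value of n after being stored in a 64-bit signed int."""
    return ctypes.c_int64(n).value


def float32_bytes(shape):
    """Bytes needed for a dense float32 tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * 4  # 4 bytes per float32 element


if __name__ == '__main__':
    n = 2**31
    print(n > INT32_MAX)                  # True: the dimension does not fit
    print(as_int32(n))                    # -2147483648: wraps to a negative size
    print(as_int64(n))                    # 2147483648: preserved with 64-bit indices
    print(float32_bytes((1, n)) / 2**30)  # 8.0 (GiB of memory for A alone)
```

Note that the example above therefore needs roughly 8 GiB of free memory for the input tensor alone, in addition to an int64 build.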
   
   lapack-dependent operators, however, are broken. For example, the script 
below fails with the error that follows it:
   ```
   def run_test():
       import mxnet as mx
       from mxnet import nd

       # Small 2x2 matrix: the failure is unrelated to tensor size (see SGETRI in the log)
       A = nd.array([[1., 4.], [2., 3.]])
       B = nd.linalg.inverse(A)
       print(B)
       print(nd.dot(B, A))
       nd.waitall()

   if __name__ == '__main__':
       run_test()
   ```
   
   ```
   [21:13:28] ../src/storage/storage.cc:198: Using Pooled (Naive) StorageManager for CPU
    ** On entry to SGETRI parameter number  3 had an illegal value
   Traceback (most recent call last):
     File "inverse.py", line 14, in <module>
       run_test()
     File "inverse.py", line 9, in run_test
       print(B)
     File "/home/ubuntu/openblasfix/python/mxnet/ndarray/ndarray.py", line 286, in __repr__
       return '\n%s\n<%s %s @%s>' % (str(self.asnumpy()),
     File "/home/ubuntu/openblasfix/python/mxnet/ndarray/ndarray.py", line 2595, in asnumpy
       ctypes.c_size_t(data.size)))
     File "/home/ubuntu/openblasfix/python/mxnet/base.py", line 246, in check_call
       raise get_last_ffi_error()
   mxnet.base.MXNetError: Traceback (most recent call last):
     File "../src/storage/./pooled_storage_manager.h", line 188
   MXNetError: Memory allocation failed Cannot allocate memory
   ```
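
For comparison, this is what a working build would be expected to return for the same 2x2 input. The sketch computes the inverse with the closed-form 2x2 formula in plain Python, so it does not touch BLAS or LAPACK at all. The helpers `inverse_2x2` and `matmul_2x2` are hypothetical names for this illustration, not MXNet functions.

```python
def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Plain-Python product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


if __name__ == '__main__':
    A = [[1.0, 4.0], [2.0, 3.0]]
    B = inverse_2x2(A)      # det(A) = 1*3 - 4*2 = -5
    print(B)                # approximately [[-0.6, 0.8], [0.4, -0.2]]
    print(matmul_2x2(B, A)) # approximately the identity matrix
```

Once the lapack-related ops are fixed, `nd.linalg.inverse(A)` and `nd.dot(B, A)` should agree with these values up to float32 rounding.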
   
   This PR is complete. Follow-up work for other PRs:
   1. add tests for cblas-related ops
   2. fix lapack-related ops
   3. default to int64 indexing in the other cmake config files
   



