Hi,

I have been able to re-implement this with otwrapy. In case someone has pickling issues, I added a new keyword for the module to call this new function:

    import numpy as np
    import openturns as ot


    def _exec_sample_pathos(func, n_cpus):
        """Return a function that executes a sample in parallel using pathos.

        Parameters
        ----------
        func : Function or callable
            A callable python object, usually a function. The function should
            take an input vector as argument and return an output vector.
        n_cpus : int
            Number of CPUs on which to distribute the function calls.

        Returns
        -------
        _exec_sample : Function or callable
            The parallelized function.
        """
        from pathos.multiprocessing import ProcessingPool

        def _exec_sample(X):
            # A single point does not need a pool.
            if len(X) == 1:
                return func(X)

            p = ProcessingPool(n_cpus)
            X = np.array(X)
            # One chunk of the sample per worker.
            X_splitted = np.array_split(X, n_cpus)
            pipe = []
            for i in range(n_cpus):
                pipe.append(p.apipe(func, X_splitted[i]))

            # Collect the results in submission order.
            rs = []
            for i in range(n_cpus):
                rs.append(pipe[i].get())

            # Flatten the per-chunk results into a single list of points.
            rs = [item for sublist in rs for item in sublist]

            return ot.NumericalSample(rs)
        return _exec_sample

It uses pathos as a multiprocessing replacement. Basically, it uses dill instead of pickle to serialize almost everything. I will try to open a PR for this.

Then I finally used it with:

    import otwrapy as otw
    model = otw.Parallelizer(Wrapper(...), backend='pathos')

A side note about the n_cpus keyword: if it is not set, the default value takes all the available CPUs, but one is needed by the scheduler. So the default should be changed to n_cpus - 1 ;)

Pamphile ROY

From: "Pamphile ROY" <[email protected]>
To: "regis lebrun" <[email protected]>
Cc: "users" <[email protected]>
Sent: Sunday, 23 October 2016 10:26:02
Subject: Re: Sobol indices computation->what is new in the 1.8 version?

Hi,

Regarding Polynomial Chaos, we chose not to use it and went with Kriging instead (for now with Scikit-Learn). I have indeed seen the new upcoming class and plan to move on to it, but I am a little confused about which method to use or prefer. I suppose I will have a look at all of them.
Thanks,

Pamphile ROY

----- Original message -----
From: "regis lebrun" <[email protected]>
To: "Pamphile ROY" <[email protected]>
Cc: "users" <[email protected]>
Sent: Saturday, 22 October 2016 16:37:14
Subject: Sobol indices computation->what is new in the 1.8 version?

Hi again,

For your study, you can also have a look at the FunctionalChaosRandomVector class
http://openturns.github.io/user_manual/response_surface/_generated/openturns.FunctionalChaosRandomVector.html?highlight=functionalchaos
to compute all the indices you want. If your model is reasonably smooth, then since you are in low dimension (only 2 inputs) you can use a basis of high degree and get a very good approximation of your model. OpenTURNS has many unique features to compute this kind of approximation, including the ability to build orthogonal polynomial bases for arbitrary continuous distributions (assuming that they have moments of any order and are characterized by their moments).

The upcoming version (1.8) will give you access to many more capabilities to compute Sobol indices, see
http://openturns.github.io/user_manual/_generated/openturns.SobolIndicesAlgorithm.html

On the positive side, you have access to confidence intervals (asymptotic for one of the algorithms, by bootstrap for the others), and the evaluation of the model is completely decoupled from the computation of the indices and their confidence intervals by one of the different algorithms. This makes it very clear that you have to provide a parallel _exec_sample method to speed things up.

On the negative side, there is no longer a blockSize parameter, so everything has to fit in memory. For a sampling of size N with an input dimension of 2, the input sample is of size 4*N if you only need the first and total order indices, and 6*N if you also need the second order indices. The resulting storage is 6*N*(400+2)*8 bytes, which means about 184 MB if N=1e4.
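The storage estimate above can be reproduced with a few lines of Python. This is only a sketch: the helper name sample_storage_bytes and its parameters are made up for illustration and are not part of OpenTURNS. It assumes 8-byte floats and that each row holds the 2 inputs plus the 400 outputs, as in the formula above.

```python
def sample_storage_bytes(n, blocks, input_dim=2, output_dim=400, bytes_per_value=8):
    """Rough memory needed to hold the Sobol design and its outputs.

    n      : base sample size N
    blocks : number of blocks of size N in the design
             (4 for first/total order, 6 with second order, for 2 inputs)
    """
    rows = blocks * n
    values_per_row = input_dim + output_dim
    return rows * values_per_row * bytes_per_value


total = sample_storage_bytes(10000, blocks=6)
print(total)                              # 192960000 bytes
print("%.1f MiB" % (total / 2.0 ** 20))   # about 184 MiB
```

With blocks=4 (first and total order indices only) the same call gives two thirds of that figure.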
You will also get new capabilities to compute chaos decompositions: more efficient and more robust polynomial generation for arbitrary distributions, non-polynomial bases, and new specific polynomial bases. The drawback is that the SensitivityAnalysis class has been removed in favor of the SobolIndicesAlgorithm class, so you will have to adapt your scripts.

Cheers

Régis LEBRUN

----- Original message -----
From: Pamphile ROY <[email protected]>
To: regis lebrun <[email protected]>
Cc: users <[email protected]>
Sent: Saturday, 22 October 2016 13:26
Subject: Re: [ot-users] Pickling and parallelism

Hi Régis,

Thanks for your fast reply.

From what I understand, otwrapy would do the same thing as sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads"))). So you are suggesting to do this instead:

    import otwrapy as otw
    model = otw.Parallelizer(ot.PythonFunction(2, 400, func), n_cpus=10)

If so, what is the benefit of doing it? This will only be useful when doing sobol.getFirstOrderIndices(0).

What I am trying to do is to perform several analyses, as I have a functional output:

    model = ot.PythonFunction(2, 400, func)
    sobol = ot.SensitivityAnalysis(sample1, sample2, model)
    sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads")))
    indices = [[], [], []]
    for i in range(400):
        indices[1].append(np.array(sobol.getFirstOrderIndices(i)))
        indices[2].append(np.array(sobol.getTotalOrderIndices(i)))

This works, but I want the for loop to be parallel, so I tried:

    from pathos.multiprocessing import ProcessingPool, cpu_count

    model = ot.PythonFunction(2, 400, func)
    sobol = ot.SensitivityAnalysis(sample1, sample2, model)
    sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads")))

    def map_indices(i):
        first = np.array(sobol.getFirstOrderIndices(i))
        total = np.array(sobol.getTotalOrderIndices(i))
        return first, total

    pool = ProcessingPool(cpu_count())
    results = pool.imap(map_indices, range(400))
    # One row per output component, one column per input variable.
    first = np.empty((400, 2))
    total = np.empty((400, 2))
    for i in range(400):
        first[i], total[i] = next(results)

But in order for this to work, the map_indices function has to be pickled (here pathos uses dill, which can serialize close to everything). Hence the error I get.

Thanks again,

Pamphile ROY

----- Original message -----
From: "regis lebrun" <[email protected]>
To: "Pamphile ROY" <[email protected]>, "users" <[email protected]>
Sent: Saturday, 22 October 2016 12:48:29
Subject: Re: [ot-users] Pickling and parallelism

Hi,

I understand that you have a model that has been interfaced with OpenTURNS using the OpenTURNSPythonFunction class, that you want to perform sensitivity analysis using the SensitivityAnalysis class, and that you would like to benefit from some parallelism in the execution of the analysis. The correct way to do that is to provide the _exec_sample method. In this method, you are free to use any multithreading/multiprocessing capability you want. You may consider either otwrapy (http://felipeam86.github.io/otwrapy/), one of the solutions proposed here:
http://openturns.github.io/developer_guide/wrapper_development.html
or your favorite tool. Then you will get rid of the GIL.

I cannot help with the second point, as it is far beyond my knowledge of Python and serialization.

Cheers

Régis LEBRUN

________________________________
From: Pamphile ROY <[email protected]>
To: [email protected]
Sent: Friday, 21 October 2016 21:56
Subject: [ot-users] Pickling and parallelism

Hi,

I have 2 questions:

1. I have seen that the class (and others) allows some multithreading. From my understanding, it is based on TBB and only multithreads the tasks. Thus it is subject to the GIL. Is there an automatic setup, like the one for multithreading, but for multiprocessing instead? Any advice?

2. Using pathos for multiprocessing, I am trying to dump an instance of SensitivityAnalysis, but I cannot get it to work even with dill.

For information, I am running under macOS Sierra and this is OT 1.6 (maybe it is coming from here... I am going to upgrade, but it means a refactoring on my side).

Here is the traceback:

    sobol = ot.SensitivityAnalysis(sample1, sample2, sobol_model)
    _f = dill.dumps(sobol)

      File "/Users/Pamphile/.virtualenvs/jpod/lib/python2.7/site-packages/dill/dill.py", line 243, in dumps
        dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
      File "/Users/Pamphile/.virtualenvs/jpod/lib/python2.7/site-packages/dill/dill.py", line 236, in dump
        pik.dump(obj)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
        self.save(obj)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
        rv = reduce(self.proto)
      File "/Applications/OpenTURNS/openturns/lib/python2.7/site-packages/openturns/common.py", line 258, in Object___getstate__
        study.add('instance', self)
      File "/Applications/OpenTURNS/openturns/lib/python2.7/site-packages/openturns/common.py", line 688, in add
        return _common.Study_add(self, *args)
    NotImplementedError: Wrong number or type of arguments for overloaded function 'Study_add'.
      Possible C/C++ prototypes are:
        OT::Study::add(OT::InterfaceObject const &)
        OT::Study::add(OT::String const &,OT::InterfaceObject const &,OT::Bool)
        OT::Study::add(OT::String const &,OT::InterfaceObject const &)
        OT::Study::add(OT::PersistentObject const &)
        OT::Study::add(OT::String const &,OT::PersistentObject const &,OT::Bool)
        OT::Study::add(OT::String const &,OT::PersistentObject const &)

Thank you for your help!

Pamphile

_______________________________________________
OpenTURNS users mailing list
[email protected]
http://openturns.org/mailman/listinfo/users
