Yes for the first point; not exactly for the second. By default, blockSize is equal to 1, so the computations on the scrambled samples are done one by one. If _exec_sample is parallelized, only the evaluations over the two initial samples will be done in parallel; the evaluations over the scrambled samples are done blockSize by blockSize, i.e. one by one by default. Set blockSize to a value at least as large as the number of cores you have access to and you will get the full benefit from your parallelized _exec_sample too. Personally, I would use the largest value allowed by the memory resources, in order to reduce the overhead of the Python/SWIG/C++ wrapping and to give the scheduler room to balance the load if the computations have very different execution times.
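The effect of blockSize can be sketched in plain Python (no OpenTURNS involved; exec_sample and evaluate_in_blocks are illustrative stand-ins, not the library's internals):

```python
# The driver below mimics how SensitivityAnalysis feeds the scrambled
# samples to the model in chunks of block_size.  A parallelized
# _exec_sample can only spread work across the points it receives in a
# single call, so block_size=1 starves it.

def exec_sample(points):
    """Stand-in for a (possibly parallel) _exec_sample: one call, one
    batch of points.  A real wrapper could fan these out to a pool."""
    return [[x * x for x in p] for p in points]

def evaluate_in_blocks(sample, block_size):
    """Chunked evaluation loop: full chunks of block_size, plus a
    smaller final chunk holding the remainder."""
    batch_sizes = []
    results = []
    for start in range(0, len(sample), block_size):
        chunk = sample[start:start + block_size]
        batch_sizes.append(len(chunk))
        results.extend(exec_sample(chunk))
    return results, batch_sizes

sample = [[float(i), float(i) + 0.5] for i in range(10)]

# block_size=1 (the default): ten calls of one point each.
_, sizes_default = evaluate_in_blocks(sample, 1)
# block_size=4: three calls, the last chunk holding the remainder.
_, sizes_tuned = evaluate_in_blocks(sample, 4)

print(sizes_default)  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(sizes_tuned)    # [4, 4, 2]
```

With block_size=1, a parallel _exec_sample receives one point per call and has nothing to distribute; with a larger block, each call carries enough points to keep all workers busy.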
A++

Régis LEBRUN

>________________________________
> From: Pamphile ROY <[email protected]>
> To: regis lebrun <[email protected]>
> Cc: users <[email protected]>
> Sent: Saturday, 22 October 2016, 15:40
> Subject: Re: [ot-users] Pickling and parallelism
>
> Ok, thank you for this precise answer.
>
> So to sum up, if I understand correctly:
> 1. I only have to call getSecondOrderIndices() and everything is computed.
>    The for loop only accesses the computed values.
> 2. setBlockSize() defines the number of chunked elements. It is linked to
>    the number of processes. In the current case, without a wrapper and
>    without TBB, this means that the computation is performed on one chunk
>    after another. And if the wrapper allows multiprocessing, several chunks
>    are computed at the same time.
>
> Pamphile ROY
>
>________________________________
>
> From: "regis lebrun" <[email protected]>
> To: "Pamphile ROY" <[email protected]>
> Cc: "users" <[email protected]>
> Sent: Saturday, 22 October 2016, 15:16:09
> Subject: Re: [ot-users] Pickling and parallelism
>
> Hi Pamphile,
>
> First of all, the SensitivityAnalysis class allows you to compute first and
> second order Sobol indices, in addition to the total order indices.
> These indices are computed by the computeSobolIndices() method, which is
> protected, so it is not exposed in the Python interface.
> The behaviour of the class is the following:
> + At the first call to either getFirstOrderIndices() or
>   getTotalOrderIndices(), if no previous call to getSecondOrderIndices() has
>   been made, a call to the computeSobolIndices() method is done and both the
>   first order and total order indices are computed. THIS COMPUTATION IS DONE
>   FOR ALL THE OUTPUTS IN ONE RUN.
> + At a second call to either getFirstOrderIndices() or getTotalOrderIndices(),
>   no new computation is done; you get a value already computed and stored in
>   the relevant data structure.
> + At the first call to getSecondOrderIndices(), all the indices (first order,
>   total order and second order) are computed. THIS COMPUTATION IS DONE FOR
>   ALL THE OUTPUTS IN ONE RUN.
> + At a second call to either getFirstOrderIndices(), getTotalOrderIndices()
>   or getSecondOrderIndices(), no new computation is done.
>
> The good practice is thus to call getSecondOrderIndices() first if you need
> both first and second order indices; otherwise the first order indices will
> be computed twice.
>
> Concerning the loop over the outputs, you will not get any acceleration if
> you parallelize the loop, as the FIRST call to any of the getXXXIndices(int)
> methods starts the computation of ALL the indices for ALL the outputs. The
> key point is thus to efficiently compute all these indices, which is done by
> providing an efficient implementation of the _exec_sample() method in your
> wrapper.
>
> I checked the source code of SensitivityAnalysis::computeSobolIndices() in
> OT 1.6, and the conclusions are:
>
> + there is no use of TBB here;
> + the model you provide to the constructor is called over samples (hence the
>   interest in providing an _exec_sample method in your wrapper);
> + the model is called over the two samples you provide to the constructor,
>   each call being N evaluations;
> + the model *would be* called over samples of much larger size if all the
>   scrambled inputs were pre-computed, which could exhaust the available
>   memory. To avoid this, the evaluations over the scrambled inputs are
>   partitioned into chunks of size blockSize (except the last chunk, which
>   can have a smaller size as it contains the remaining computations), which
>   is precisely what you tune with the setBlockSize() method.
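The lazy computation scheme described above can be modelled in a few lines of plain Python (a toy class, not the OpenTURNS implementation): the first getter triggers the whole computation, later getters just read the cache, and the call order decides how many full runs you pay for.

```python
# Toy model of the lazy caching behaviour: calling the second-order
# getter first computes everything in one run; calling the first-order
# getter first forces a second full run later.

class LazySobol:
    def __init__(self):
        self.runs = 0          # how many full computations were triggered
        self._first = None
        self._second = None

    def _compute(self, with_second):
        self.runs += 1
        self._first = "first+total"
        if with_second:
            self._second = "second"

    def get_first_order_indices(self):
        if self._first is None:
            self._compute(with_second=False)
        return self._first

    def get_second_order_indices(self):
        if self._second is None:
            self._compute(with_second=True)   # also (re)computes first order
        return self._second

# Bad ordering: first order computed, then everything recomputed.
bad = LazySobol()
bad.get_first_order_indices()
bad.get_second_order_indices()
print(bad.runs)   # 2

# Good ordering: one run computes everything, later calls hit the cache.
good = LazySobol()
good.get_second_order_indices()
good.get_first_order_indices()
print(good.runs)  # 1
```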
>
> In your case, depending on the amount of memory you have and the
> distribution scheduler you use, you should set the block size to the largest
> possible value and call getSecondOrderIndices() first (the argument can be
> any valid index value) if you need second order indices, as it will compute
> all the indices at once, instead of calling getFirstOrderIndices() first and
> then getSecondOrderIndices(), which would compute the first order and total
> order indices twice.
>
> The technology you use to implement _exec_sample is your choice, but here
> otwrapy could help.
>
> The key point is that you don't need to parallelize the loop over the output
> dimension, but the evaluation of your model over a sample.
>
> Cheers
>
> Régis LEBRUN
>
> ----- Original message -----
>
> From: Pamphile ROY <[email protected]>
>> To: regis lebrun <[email protected]>
>> Cc: users <[email protected]>
>> Sent: Saturday, 22 October 2016, 13:26
>> Subject: Re: [ot-users] Pickling and parallelism
>>
>> Hi Régis,
>>
>> Thanks for your fast reply.
>>
>> From what I understand, otwrapy would do the same thing as
>> sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads"))).
>> So you are suggesting to do instead:
>>
>> import otwrapy as otw
>> model = otw.Parallelizer(ot.PythonFunction(2, 400, func), n_cpus=10)
>>
>> If so, what is the benefit of doing it?
>>
>> This will only be useful when doing sobol.getFirstOrderIndices(0).
>>
>> What I am trying to do is to perform several analyses, as I have a
>> functional output:
>>
>> model = ot.PythonFunction(2, 400, func)
>> sobol = ot.SensitivityAnalysis(sample1, sample2, model)
>> sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads")))
>> indices = [[], [], []]
>> for i in range(400):
>>     indices[1].append(np.array(sobol.getFirstOrderIndices(i)))
>>     indices[2].append(np.array(sobol.getTotalOrderIndices(i)))
>>
>> This works, but I want the for loop to be parallel, so I tried:
>>
>> from pathos.multiprocessing import ProcessingPool, cpu_count
>>
>> model = ot.PythonFunction(2, 400, func)
>> sobol = ot.SensitivityAnalysis(sample1, sample2, sobol_model)
>> sobol.setBlockSize(int(ot.ResourceMap.Get("parallel-threads")))
>>
>> def map_indices(i):
>>     first = np.array(sobol.getFirstOrderIndices(i))
>>     total = np.array(sobol.getTotalOrderIndices(i))
>>     return first, total
>>
>> pool = ProcessingPool(cpu_count())
>> results = pool.imap(map_indices, range(400))
>> first = np.empty(400)
>> total = np.empty(400)
>>
>> for i in range(400):
>>     first[i], total[i] = results.next()
>>
>> But in order for this to work, the map_indices function has to be pickled
>> (here it uses dill, which can serialize close to everything).
>> Hence the error I get.
>>
>> Thanks again,
>>
>> Pamphile ROY
>>
>> ----- Original message -----
>> From: "regis lebrun" <[email protected]>
>> To: "Pamphile ROY" <[email protected]>,
>>     "users" <[email protected]>
>> Sent: Saturday, 22 October 2016, 12:48:29
>> Subject: Re: [ot-users] Pickling and parallelism
>>
>> Hi,
>>
>> I understand that you have a model that has been interfaced with OpenTURNS
>> using the OpenTURNSPythonFunction class, that you want to perform
>> sensitivity analysis using the SensitivityAnalysis class, and that you
>> would like to benefit from some parallelism in the execution of the
>> analysis.
>>
>> The correct way to do that is to provide the _exec_sample method.
>> In this method, you are free to use any multithreading/multiprocessing
>> capability you want. You may consider either otwrapy
>> (http://felipeam86.github.io/otwrapy/) or one of the solutions proposed
>> here:
>>
>> http://openturns.github.io/developer_guide/wrapper_development.html
>>
>> or your favorite tool. Then, you will get rid of the GIL.
>>
>> I cannot help with the second point, as it is far beyond my knowledge of
>> Python and serialization.
>>
>> Cheers
>>
>> Régis LEBRUN
>>
>>________________________________
>>> From: Pamphile ROY <[email protected]>
>>> To: [email protected]
>>> Sent: Friday, 21 October 2016, 21:56
>>> Subject: [ot-users] Pickling and parallelism
>>>
>>> Hi,
>>>
>>> I have 2 questions:
>>>
>>> 1. I have seen that the class (and others) allows some multithreading.
>>>    From my understanding, it is based on TBB and only multithreads the
>>>    tasks. Thus it is affected by the GIL. Is there an automatic setup like
>>>    the one for multithreading, but for multiprocessing instead? Any advice?
>>>
>>> 2. Using pathos for multiprocessing, I am trying to dump an instance of
>>>    SensitivityAnalysis, but I cannot get it to work, even with dill.
>>>    For information, I am running under macOS Sierra and this is OT 1.6
>>>    (maybe it is coming from here... I am going to upgrade, but it means a
>>>    refactoring on my side).
>>> Here is the traceback:
>>>
>>> sobol = ot.SensitivityAnalysis(sample1, sample2, sobol_model)
>>> _f = dill.dumps(sobol)
>>>
>>> File "/Users/Pamphile/.virtualenvs/jpod/lib/python2.7/site-packages/dill/dill.py",
>>>   line 243, in dumps
>>>     dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
>>> File "/Users/Pamphile/.virtualenvs/jpod/lib/python2.7/site-packages/dill/dill.py",
>>>   line 236, in dump
>>>     pik.dump(obj)
>>> File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
>>>   line 224, in dump
>>>     self.save(obj)
>>> File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
>>>   line 306, in save
>>>     rv = reduce(self.proto)
>>> File "/Applications/OpenTURNS/openturns/lib/python2.7/site-packages/openturns/common.py",
>>>   line 258, in Object___getstate__
>>>     study.add('instance', self)
>>> File "/Applications/OpenTURNS/openturns/lib/python2.7/site-packages/openturns/common.py",
>>>   line 688, in add
>>>     return _common.Study_add(self, *args)
>>> NotImplementedError: Wrong number or type of arguments for overloaded
>>> function 'Study_add'.
>>> Possible C/C++ prototypes are:
>>>   OT::Study::add(OT::InterfaceObject const &)
>>>   OT::Study::add(OT::String const &, OT::InterfaceObject const &, OT::Bool)
>>>   OT::Study::add(OT::String const &, OT::InterfaceObject const &)
>>>   OT::Study::add(OT::PersistentObject const &)
>>>   OT::Study::add(OT::String const &, OT::PersistentObject const &, OT::Bool)
>>>   OT::Study::add(OT::String const &, OT::PersistentObject const &)
>>>
>>> Thank you for your help!
>>>
>>> Pamphile
>>> _______________________________________________
>>> OpenTURNS users mailing list
>>> [email protected]
>>> http://openturns.org/mailman/listinfo/users

_______________________________________________
OpenTURNS users mailing list
[email protected]
http://openturns.org/mailman/listinfo/users
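A way out of the pickling problem discussed in this thread is the one Régis recommends: parallelize the model evaluation itself rather than the loop over outputs, so that only plain data (input points and results) ever crosses a process boundary, never the SensitivityAnalysis object. A minimal sketch in plain Python (costly_point_eval and the pool size are illustrative placeholders; in a real wrapper this logic would live in the _exec_sample method of an OpenTURNSPythonFunction subclass):

```python
import multiprocessing

def costly_point_eval(point):
    # Stand-in for one expensive model evaluation: 2 inputs -> 3 outputs.
    x, y = point
    return [x + y, x * y, x - y]

def exec_sample(points):
    """What a parallel _exec_sample could do: evaluate a whole chunk of
    input points across worker processes.  Only the points and results
    are pickled, so no OpenTURNS object ever needs to be serialized."""
    with multiprocessing.Pool(processes=4) as pool:
        return pool.map(costly_point_eval, points)

if __name__ == "__main__":
    sample = [[float(i), 2.0] for i in range(8)]
    assert exec_sample(sample) == [costly_point_eval(p) for p in sample]
```

With such a wrapper, the serial for-loop over getFirstOrderIndices(i) and getTotalOrderIndices(i) is cheap: all the heavy work happens once, in parallel, inside _exec_sample.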
