Re: [ot-users] Sample transformation

roy Sun, 29 Oct 2017 05:16:45 -0700

Hi Regis,

For information, I have trouble pickling the class ot.FixedStrategy.
From the example script :


adaptiveStrategy = ot.FixedStrategy(basis, 
enumerateFunction.getStrataCumulatedCardinal(deg))

If I try to pickle this :

import pickle
path = './model.dat'
with open(path, 'wb') as f:
    pickler = pickle.Pickler(f)
    pickler.dump(adaptiveStrategy)

This works but then the deserialization does not :

with open(path, 'rb') as f:
    unpickler = pickle.Unpickler(f)
    adaptiveStrategy = unpickler.load()

I get this error:

Traceback (most recent call last):
  File "example.py", line 58, in <module>
    adaptiveStrategy = unpickler.load()
  File 
"/Users/roy/Applications/miniconda3/envs/batman3/lib/python3.6/site-packages/openturns/common.py",
 line 344, in Object___setstate__
    self.__init__()
  File 
"/Users/roy/Applications/miniconda3/envs/batman3/lib/python3.6/site-packages/openturns/metamodel.py",
 line 1908, in __init__
    this = _metamodel.new_FixedStrategy(*args)
NotImplementedError: Wrong number or type of arguments for overloaded function 
'new_FixedStrategy'.
  Possible C/C++ prototypes are:
    OT::FixedStrategy::FixedStrategy(OT::OrthogonalBasis const 
&,OT::UnsignedInteger const)
    OT::FixedStrategy::FixedStrategy(OT::FixedStrategy const &)

Thanks in advance.


Pamphile ROY
Chercheur doctorant en Quantification d’Incertitudes
CERFACS - Toulouse (31) - France
+33 (0) 5 61 19 31 57
+33 (0) 7 86 43 24 22



> Le 12 oct. 2017 à 16:16, roy <[email protected]> a écrit :
> 
> Hi Regis,
> 
> This is great thanks. It is now working as expected.
> Maybe this can be clarified in the documentation.
> 
> 
> Pamphile ROY
> Chercheur doctorant en Quantification d’Incertitudes
> CERFACS - Toulouse (31) - France
> +33 (0) 5 61 19 31 57
> +33 (0) 7 86 43 24 22
> 
> 
> 
>> Le 12 oct. 2017 à 00:11, regis lebrun <[email protected] 
>> <mailto:[email protected]>> a écrit :
>> 
>> I found it. In order to speed up the computation of the coefficients of the 
>> polynomial expansion, we developed a class named DesignProxy, which acts 
>> like a cache for the evaluation of the multivariate basis oven the input 
>> sample. Essentially, it contains a large matrix, with a default size given 
>> by ResourceMap.GetAsUnsignedInteger("DesignProxy-DefaultCacheSize") and 
>> equals to 16777216, it means 128Mo.
>> 
>> 
>> So if you add 
>> ot.ResourceMap.SetAsUnsignedInteger("DesignProxy-DefaultCacheSize", 
>> smallSize) with smallSize adapted to your memory budget (eg. smallSize=0), 
>> then everything should be ok.
>> 
>> You can also run the algorithm on the whole output sample. The DesignProxy 
>> instance is built once and shared among the different marginals. You can see 
>> that the memory cost of the algorithm is essentially the same for an output 
>> sample of dimension 1 or 14. Concerning the computation time, a part of the 
>> computation is shared between the marginals so the total cost is not 
>> proportional to the output dimension, even if no parallelization is 
>> implemented here (but the linear algebra is already parallelized using 
>> threads).
>> 
>> Tell me if it solved your problem!
>> 
>> Régis
>> 
>> ________________________________
>> De : roy <[email protected] <mailto:[email protected]>>
>> À : regis lebrun <[email protected] 
>> <mailto:[email protected]>> 
>> Envoyé le : Mercredi 11 octobre 2017 10h40
>> Objet : Re: [ot-users] Sample transformation
>> 
>> 
>> 
>> I was able to make an extract.
>> 
>> I am fitting a case with functional output. So to parallelize the fitting I 
>> use a function that independently construct a model per feature.
>> The memory consumption is coming from every call to run() with a bump of 
>> ~130 Mo each time. Maybe OT can handle itself the parallelization?
>> I saw that it was working without needing the loop, so maybe I should do 
>> that instead.
>> 
>> But still, 130 Mo for a model is quite a lot.
>> 
>> 
>> Cheers,
>> 
>> Pamphile ROY
>> Chercheur doctorant en Quantification d’Incertitudes
>> CERFACS - Toulouse (31) - France
>> +33 (0) 5 61 19 31 57
>> +33 (0) 7 86 43 24 22
>> 
>> 
>> 
>> Le 11 oct. 2017 à 08:44, regis lebrun <[email protected] 
>> <mailto:[email protected]>> a écrit :
>>> 
>>> ouch! 3-4 Go is crazy! Do you have any script to share in order to help us 
>>> catching the bug? I use the FunctionalChaosAlgorithm class more than often 
>>> and I never faced this kind of behavior. If there is a bug it should be a 
>>> good thing to catch it asap: we enter the 1.10 release candidate phase, a 
>>> good slot to fix this kind of bugs.
>>> 
>>> Cheers
>>> 
>>> Régis
>>> 
>>> 
>>> 
>>> ________________________________
>>> De : roy <[email protected] <mailto:[email protected]>>
>>> À : regis lebrun <[email protected] 
>>> <mailto:[email protected]>> 
>>> Cc : users <[email protected] <mailto:[email protected]>>
>>> Envoyé le : Mercredi 11 octobre 2017 0h21
>>> Objet : Re: [ot-users] Sample transformation
>>> 
>>> 
>>> 
>>> Hi Régis,
>>> 
>>> Not sure about the leak as I only do python.
>>> But using the tool I know, I was not able to free the memory(using some del 
>>> and gc.collect()).
>>> 
>>> I saw the issue when constructing a model on a cluster (Quadrature with 121 
>>> points, degree 10 in 2d) and the batch manager killed the job
>>> due to memory consumption. On my Mac the memory goes to 3-4 Go for this but 
>>> on the cluster it explodes.
>>> 
>>> As always, thanks for the quick reply :)
>>> 
>>> 
>>> Pamphile ROY
>>> Chercheur doctorant en Quantification d’Incertitudes
>>> CERFACS - Toulouse (31) - France
>>> +33 (0) 5 61 19 31 57
>>> +33 (0) 7 86 43 24 22
>>> 
>>> 
>>> 
>>> Le 10 oct. 2017 à 23:13, regis lebrun <[email protected] 
>>> <mailto:[email protected]>> a écrit :
>>> 
>>> 
>>>> Hi Pamphil,
>>>> 
>>>> Nice to know that the code *seems* to work ;-)
>>>> 
>>>> Are you sure that there is a memory leak? The algorithm creates 
>>>> potentially large objects, which are stored into the FunctionalChaosResult 
>>>> member of the algorithm. If there is a pending reference to this object, 
>>>> the memory will not be released. Maybe Denis, Julien or Sofiane have more 
>>>> insight on this point?
>>>> 
>>>> Cheers
>>>> 
>>>> Régis
>>>> 
>>>> 
>>>> 
>>>> ________________________________
>>>> De : roy <[email protected] <mailto:[email protected]>>
>>>> À : regis lebrun <[email protected] 
>>>> <mailto:[email protected]>> 
>>>> Cc : users <[email protected] <mailto:[email protected]>>
>>>> Envoyé le : Mardi 10 octobre 2017 6h35
>>>> Objet : Re: [ot-users] Sample transformation
>>>> 
>>>> 
>>>> 
>>>> Hi Regis,
>>>> 
>>>> Thanks for this long and well detailed answer!
>>>> The code you provided seems to work as expected.
>>>> 
>>>> However during my tests I noticed that the memory was not freed correctly.
>>>> Once the class FunctionalChaosAlgorithm is called, there is a memory bump 
>>>> and even after calling del
>>>> and gc.collect(), memory is still not freed (using memory_profiler for 
>>>> that). Might be a memory leak?
>>>> 
>>>> Kind regards,
>>>> 
>>>> Pamphile ROY
>>>> Chercheur doctorant en Quantification d’Incertitudes
>>>> CERFACS - Toulouse (31) - France
>>>> +33 (0) 5 61 19 31 57
>>>> +33 (0) 7 86 43 24 22
>>>> 
>>>> 
>>>> 
>>>> Le 7 oct. 2017 à 19:59, regis lebrun <[email protected] 
>>>> <mailto:[email protected]>> a écrit :
>>>> 
>>>> 
>>>> 
>>>> Hi Pamphil,
>>>>> 
>>>>> You were almost right: the AdaptiveStieltjesAlgorithm is very close to 
>>>>> what you are looking for, but not exactly what you need. It is the 
>>>>> algorithmic part of the factory of orthonormal polynomials, the class you 
>>>>> have to use is StandardDistributionPolynomialFactory, ie a factory (=able 
>>>>> to build something) and not an algorithm (=something able to compute 
>>>>> something). You have all the details here:
>>>>> 
>>>>> http://openturns.github.io/openturns/master/user_manual/_generated/openturns.StandardDistributionPolynomialFactory.html
>>>>>  
>>>>> <http://openturns.github.io/openturns/master/user_manual/_generated/openturns.StandardDistributionPolynomialFactory.html>
>>>>> 
>>>>> I agree on the fact that the difference is quite subtle, as it can be 
>>>>> seen by comparing the API of the two classes. The distinction was made at 
>>>>> a time were several algorithms were competing for the task 
>>>>> (GramSchmidtAlgorithm, ChebychevAlgorithm) but in fact the 
>>>>> AdaptiveStieltjesAlgorithm proved to be much more accurate and reliable 
>>>>> than the other algorithms, and now it is the only orthonormalization 
>>>>> algorithm available.
>>>>> 
>>>>> Another subtle trick is the following.
>>>>> 
>>>>> If you create a basis this way:
>>>>> basis = ot.StandardDistributionPolynomialFactory(dist)
>>>>> you will get the basis associated to the *standard representative* 
>>>>> distribution in the parametric family to which dist belongs. It means the 
>>>>> distribution with zero mean and unit variance, or with support equals to 
>>>>> [-1, 1], or dist itself if no affine transformation is able to reduce the 
>>>>> number of parameters of the distribution. 
>>>>> It is managed automatically within the FunctionalChaosAlgorithm, but can 
>>>>> be disturbing if you do things by hand.
>>>>> 
>>>>> If you create a basis this way:
>>>>> basis = 
>>>>> ot.StandardDistributionPolynomialFactory(ot.AdaptiveStieltjesAlgorithm(dist))
>>>>> then the distribution is preserved, and you get the orthonormal 
>>>>> polynomials corresponding to dist. Be aware of the fact that the 
>>>>> algorithm may have hard time to build the polynomials if your 
>>>>> distribution is far away from its standard representative, as it may 
>>>>> involves the computation of recurrence coefficients with a much wider 
>>>>> range of variation. The benefit is that the orthonormality measure is 
>>>>> exactly your distribution, assuming that its copula is the independent 
>>>>> one, so you don't have to introduce a marginal transformation between 
>>>>> both measures.
>>>>> 
>>>>> Some additional remarks:
>>>>> + it looks like dist has dimension>1, as you extract its marginal 
>>>>> distributions later on. AdaptiveStieltjesAlgorithm and 
>>>>> StandardDistributionPolynomialFactory only work with 1D distributions (it 
>>>>> is not checked by the library, my shame). What you have to do is:
>>>>> 
>>>>> basis = 
>>>>> ot.OrthogonalProductPolynomialFactory([ot.StandardDistributionPolynomialFactory(ot.AdaptiveStieltjesAlgorithm(dist.getMarginal(i)))
>>>>>  for i in range(dist.getDimension())])
>>>>> Quite a long line, I know...
>>>>> It will build a multivariate polynomial basis orthonormal with respect to 
>>>>> the product distribution (ie with independent copula) sharing the same 1D 
>>>>> marginal distributions as dist.
>>>>> 
>>>>> 
>>>>> After that, everything will work as expected and you will NOT have to 
>>>>> build the transformation (if you build it it will coincide with the 
>>>>> identity function). If you encounter performance issues (the polynomials 
>>>>> of high degrees take ages to be built as in 
>>>>> http://trac.openturns.org/ticket/885, or there is an overflow, or the 
>>>>> numerical precision is bad) then use:
>>>>> basis = 
>>>>> ot.OrthogonalProductPolynomialFactory([ot.StandardDistributionPolynomialFactory(dist.getMarginal(i))
>>>>>  for i in range(dist.getDimension())])
>>>>> and build the transformation the way you do it.
>>>>> 
>>>>> + if you use the FunctionalChaosAlgorithm class by providing an input 
>>>>> sample and an output sample, you also have to provide the weights of the 
>>>>> input sample EVEN IF the experiment given in the projection strategy 
>>>>> would allow to recompute them. It is because the fact that you provide 
>>>>> the input sample overwrite the weighted experiment of the projection 
>>>>> stratey by a FixedExperiment doe.
>>>>> 
>>>>> I attached two complete examples: one using the exact marginal 
>>>>> distributions and the other using the standard representatives.
>>>>> 
>>>>> Best regards
>>>>> 
>>>>> Régis
>>>>> 
>>>>> ________________________________
>>>>> De : roy <[email protected]>
>>>>> À : regis lebrun <[email protected]> 
>>>>> Cc : users <[email protected]>
>>>>> Envoyé le : Vendredi 6 octobre 2017 14h22
>>>>> Objet : Re: [ot-users] Sample transformation
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Regis,
>>>>> 
>>>>> Thank you for this detailed answer.
>>>>> 
>>>>> - I am using the latest release from conda (OT 1.9, python 3.6.2, latest 
>>>>> numpy, etc.) ,
>>>>> - For the sample, I need it to generate externally the output (cost code 
>>>>> that cannot be integrated into OT as model),
>>>>> - I have to convert ot.Sample into np.array because it is then used by 
>>>>> other functions to create the simulations, etc.
>>>>> 
>>>>> If I understood correctly, I can create the projection strategy using 
>>>>> this snippet:
>>>>> 
>>>>> basis = ot.AdaptiveStieltjesAlgorithm(dist)
>>>>> measure = basis.getMeasure()
>>>>> quad = ot.Indices(in_dim)
>>>>> for i in range(in_dim):
>>>>> quad[i] = degree + 1
>>>>> 
>>>>> comp_dist = ot.GaussProductExperiment(measure, quad)
>>>>> proj_strategy = ot.IntegrationStrategy(comp_dist)
>>>>> 
>>>>> inv_trans = 
>>>>> ot.Function(ot.MarginalTransformationEvaluation([measure.getMarginal(i) 
>>>>> for i in range(in_dim)], distributions))
>>>>> sample = np.array(inv_trans(comp_dist.generate()))
>>>>> 
>>>>> 
>>>>> It seems to work. Except that the basis does not work with 
>>>>> ot.FixedStrategy(basis, dim_basis). I get a non implemented method error.
>>>>> 
>>>>> After I get the sample and the corresponding output, what is the way to 
>>>>> go? Which arguments do I need to use for the
>>>>> ot.FunctionalChaosAlgorithm? 
>>>>> 
>>>>> I am comparing the Q2 and on Ishigami and I was only able to get correct 
>>>>> results using:
>>>>> 
>>>>> pc_algo = ot.FunctionalChaosAlgorithm(sample, output, dist, 
>>>>> trunc_strategy)
>>>>> 
>>>>> But for least square strategy I had to use this:
>>>>> 
>>>>> pc_algo = ot.FunctionalChaosAlgorithm(sample, output)
>>>>> 
>>>>> 
>>>>> Is it normal?
>>>>> 
>>>>> 
>>>>> Pamphile ROY
>>>>> Chercheur doctorant en Quantification d’Incertitudes
>>>>> CERFACS - Toulouse (31) - France
>>>>> +33 (0) 5 61 19 31 57
>>>>> +33 (0) 7 86 43 24 22
>>>>> 
>>>>> 
>>>>> 
>>>>> Le 5 oct. 2017 à 15:40, regis lebrun <[email protected]> 
>>>>> a écrit :
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Pamphile,
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 1) The problem:
>>>>>> The problem you get is due to the fact that in your version of OpenTURNS 
>>>>>> (1.7 I suppose), the GaussProductExperiment class has a different way to 
>>>>>> handle the input distribution than the other WeightedExperiment classes: 
>>>>>> it generates the quadrature rule of the *standard representatives* of 
>>>>>> the marginal distributions instead of the marginal distributions. It 
>>>>>> does not change the rate of convergence of the PCE algorithm and allows 
>>>>>> to use specific algorithms for distributions with known orthonormal 
>>>>>> polynomials. It is not explained in the documentation and if you ask the 
>>>>>> doe for its distribution it will give you the initial distribution 
>>>>>> instead of the standardized one.
>>>>>> 
>>>>>> 2) The mathematical background:
>>>>>> The generation of quadrature rules for arbitrary 1D distributions is a 
>>>>>> badly conditioned problem. Even if the quadrature rule is well-defined 
>>>>>> (existence of moments of any order, distribution characterized by these 
>>>>>> moments), the application that maps the recurrence coefficients of the 
>>>>>> orthogonal polynomials to their value can have a very large condition 
>>>>>> number. As a result, the adaptive integration used to compute the 
>>>>>> recurrence coefficients of order n, based on the values of the 
>>>>>> polynomials of degree n-1 and n-2, can lead to wrong values and all the 
>>>>>> process falls down.
>>>>>> 
>>>>>> 3) The current state of the software:
>>>>>> Since version 1.8, OpenTURNS no more generates the quadrature rule of 
>>>>>> the standard representatives, but the quadrature rule of the actual 
>>>>>> marginal distributions. The AdaptiveStieltjesAlgorithm class, introduced 
>>>>>> in release 1.8, is much more robust than the previous orthonormalization 
>>>>>> algorithms and is able to handle even stiff problems. There are still 
>>>>>> difficult situations (distributions with discontinuous PDF inside of the 
>>>>>> range, fixed in OT 1.9, or really badly conditioned distributions, 
>>>>>> hopefully fixed when ticket#885 will be solved) but most usual 
>>>>>> situations are under control even with marginal degrees of order 20.
>>>>>> 
>>>>>> 4) The (probable) bug in your code and the way to solve it
>>>>>> You must be aware of the fact that the distribution you put into your 
>>>>>> WeightedExperiment object will be superseded by the distribution 
>>>>>> corresponding to your OrthogonalBasisFactory inside of the 
>>>>>> FunctionalChaosAlgorithm. If you need to have the input sample before to 
>>>>>> run the functional chaos algorithm, then you have to build your 
>>>>>> transformation by hand. Assuming that you already defined your 
>>>>>> projection basis called 'myBasis', your marginal integration degrees 
>>>>>> 'myDegrees' and your marginal distributions 'myMarginals', you have to 
>>>>>> write (in OT 1.7):
>>>>>> 
>>>>>> # Here the explicit cast into a NumericalMathFunction is to be able to 
>>>>>> evaluate the transformation over a sample
>>>>>> myTransformation = 
>>>>>> ot.NumericalMathFunction(ot.MarginalTransformationEvaluation([myBasis.getDistribution().getMarginal(i)
>>>>>>  for i in range(dimension), myMarginals))
>>>>>> sample = 
>>>>>> myTransformation(ot.GaussProductExperiment(myBasis.getDistribution(), 
>>>>>> myDegrees).generate())
>>>>>> 
>>>>>> 
>>>>>> You should avoid to cast OT objects into np objects as much as possible, 
>>>>>> and if you cannot avoid these casts you should do them only in the 
>>>>>> sections where they are needed. They can be expansive for large objects, 
>>>>>> and if the sample you get from generate() is used only as an argument of 
>>>>>> a NumericalMathFunction, then it will be converted back into a 
>>>>>> NumericalSample!
>>>>>> 
>>>>>> Best regards
>>>>>> 
>>>>>> Régis
>>>>>> ________________________________
>>>>>> De : roy <[email protected]>
>>>>>> À : users <[email protected]> 
>>>>>> Envoyé le : Jeudi 5 octobre 2017 11h13
>>>>>> Objet : [ot-users] Sample transformation
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am facing consistency concerns in the API regarding distributions and 
>>>>>> sampling.
>>>>>> 
>>>>>> The initial goal was to get the sampling for Polynomial Chaos as I must 
>>>>>> not use the model variable.
>>>>>> So for least square strategy I do something like this:
>>>>>> 
>>>>>> proj_strategy = ot.LeastSquaresStrategy(montecarlo_design)
>>>>>> sample = np.array(proj_strategy.getExperiment().generate())
>>>>>> 
>>>>>> sample is correct as the bounds of each feature lie in the corresponding 
>>>>>> ranges.
>>>>>> 
>>>>>> But now if I want to use IntegrationStrategy:
>>>>>> 
>>>>>> ot.IntegrationStrategy(ot.GaussProductExperiment(dists, list))
>>>>>> sample = np.array(proj_strategy.getExperiment().generate())
>>>>>> 
>>>>>> sample’s outputs lie between [-1, 1] which does not corresponds to the 
>>>>>> distribution I have initially.
>>>>>> 
>>>>>> So I used the conversion class but it does not work well with 
>>>>>> GaussProductExperiment as it requires [0, 1] instead of [-1, 1].
>>>>>> 
>>>>>> Thus I use this hack:
>>>>>> 
>>>>>> # Convert from [-1, 1] -> input distributions
>>>>>> marg_inv_transf = ot.MarginalTransformationEvaluation(distributions, 1)
>>>>>> sample = (proj_strategy.getExperiment().generate() + 1) / 2.
>>>>>> 
>>>>>> 
>>>>>> Is it normal that the distribution classes are not returning in the same 
>>>>>> intervals?
>>>>>> 
>>>>>> 
>>>>>> Thanks for your support!
>>>>>> 
>>>>>> 
>>>>>> Pamphile ROY
>>>>>> Chercheur doctorant en Quantification d’Incertitudes
>>>>>> CERFACS - Toulouse (31) - France
>>>>>> +33 (0) 5 61 19 31 57
>>>>>> +33 (0) 7 86 43 24 22
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> OpenTURNS users mailing list
>>>>>> [email protected]
>>>>>> http://openturns.org/mailman/listinfo/users
>>>>>> <example.py><example_standard.py>
>>>>>> 
>>>>>> _______________________________________________
>>>> OpenTURNS users mailing list
>>>> [email protected] <mailto:[email protected]>
>>>> http://openturns.org/mailman/listinfo/users
>>>> 
>>>> 
>>> _______________________________________________
>>> OpenTURNS users mailing list
>>> [email protected] <mailto:[email protected]>
>>> http://openturns.org/mailman/listinfo/users
>>> 
>

_______________________________________________
OpenTURNS users mailing list
[email protected]
http://openturns.org/mailman/listinfo/users

Re: [ot-users] Sample transformation

Reply via email to