Hi Régis,

Not sure about the leak, as I only do Python.
But using the tools I know, I was not able to free the memory (using some del
and gc.collect()).

I saw the issue when constructing a model on a cluster (quadrature with 121
points, degree 10 in 2D), and the batch manager killed the job due to memory
consumption. On my Mac the memory goes up to 3-4 GB for this, but on the
cluster it explodes.

As always, thanks for the quick reply :)


Pamphile ROY
Doctoral researcher in Uncertainty Quantification
CERFACS - Toulouse (31) - France
+33 (0) 5 61 19 31 57
+33 (0) 7 86 43 24 22



> On 10 Oct 2017, at 23:13, regis lebrun <[email protected]> wrote:
> 
> Hi Pamphile,
> 
> Nice to know that the code *seems* to work ;-)
> 
> Are you sure that there is a memory leak? The algorithm creates potentially
> large objects, which are stored in the FunctionalChaosResult member of the
> algorithm. If there is a pending reference to this object, the memory will
> not be released. Maybe Denis, Julien or Sofiane have more insight on this
> point?
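> A minimal sketch (plain Python, nothing OpenTURNS-specific; the class name
> BigResult is hypothetical) of what "pending reference" means here: as long
> as any name still points at the object, del on another name plus
> gc.collect() cannot release the memory.

```python
import gc
import weakref

class BigResult:
    """Stand-in for a large result object (e.g. a FunctionalChaosResult)."""
    pass

result = BigResult()
probe = weakref.ref(result)   # lets us observe when the object is really freed

pending = result              # a second, "pending" reference to the same object
del result
gc.collect()
print(probe() is None)        # False: 'pending' still keeps the object alive

del pending
gc.collect()
print(probe() is None)        # True: no references left, memory can be released
```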
> 
> Cheers
> 
> Régis
> 
> 
> 
> ________________________________
> From: roy <[email protected]>
> To: regis lebrun <[email protected]>
> Cc: users <[email protected]>
> Sent: Tuesday, 10 October 2017, 6:35 AM
> Subject: Re: [ot-users] Sample transformation
> 
> 
> 
> Hi Regis,
> 
> Thanks for this long and well detailed answer!
> The code you provided seems to work as expected.
> 
> However, during my tests I noticed that the memory was not freed correctly.
> Once the FunctionalChaosAlgorithm class is called, there is a memory bump,
> and even after calling del and gc.collect(), the memory is still not freed
> (I am using memory_profiler to check this). Might this be a memory leak?
> 
> Kind regards,
> 
> Pamphile ROY
> Doctoral researcher in Uncertainty Quantification
> CERFACS - Toulouse (31) - France
> +33 (0) 5 61 19 31 57
> +33 (0) 7 86 43 24 22
> 
> 
> 
> On 7 Oct 2017, at 19:59, regis lebrun <[email protected]> wrote:
>> 
>> Hi Pamphile,
>> 
>> You were almost right: the AdaptiveStieltjesAlgorithm is very close to what
>> you are looking for, but not exactly what you need. It is the algorithmic
>> part of the factory of orthonormal polynomials; the class you have to use is
>> StandardDistributionPolynomialFactory, i.e. a factory (able to build
>> something) and not an algorithm (able to compute something). You have all
>> the details here:
>> 
>> http://openturns.github.io/openturns/master/user_manual/_generated/openturns.StandardDistributionPolynomialFactory.html
>> 
>> I agree that the difference is quite subtle, as can be seen by comparing the
>> APIs of the two classes. The distinction was made at a time when several
>> algorithms were competing for the task (GramSchmidtAlgorithm,
>> ChebychevAlgorithm), but the AdaptiveStieltjesAlgorithm proved to be much
>> more accurate and reliable than the others, and it is now the only
>> orthonormalization algorithm available.
>> 
>> Another subtle trick is the following.
>> 
>> If you create a basis this way:
>> basis = ot.StandardDistributionPolynomialFactory(dist)
>> you will get the basis associated with the *standard representative*
>> distribution of the parametric family to which dist belongs. That means the
>> distribution with zero mean and unit variance, or with support equal to
>> [-1, 1], or dist itself if no affine transformation is able to reduce the
>> number of parameters of the distribution.
>> This is managed automatically within the FunctionalChaosAlgorithm, but can
>> be disturbing if you do things by hand.
>> 
>> If you create a basis this way:
>> basis = ot.StandardDistributionPolynomialFactory(
>>     ot.AdaptiveStieltjesAlgorithm(dist))
>> then the distribution is preserved, and you get the orthonormal polynomials
>> corresponding to dist. Be aware that the algorithm may have a hard time
>> building the polynomials if your distribution is far from its standard
>> representative, as it may involve the computation of recurrence coefficients
>> with a much wider range of variation. The benefit is that the orthonormality
>> measure is exactly your distribution, assuming that its copula is the
>> independent one, so you do not have to introduce a marginal transformation
>> between the two measures.
>> 
>> Some additional remarks:
>> + it looks like dist has dimension > 1, as you extract its marginal
>> distributions later on. AdaptiveStieltjesAlgorithm and
>> StandardDistributionPolynomialFactory only work with 1D distributions (this
>> is not checked by the library, my bad). What you have to do is:
>> 
>> basis = ot.OrthogonalProductPolynomialFactory(
>>     [ot.StandardDistributionPolynomialFactory(
>>         ot.AdaptiveStieltjesAlgorithm(dist.getMarginal(i)))
>>      for i in range(dist.getDimension())])
>> 
>> Quite a long line, I know...
>> It will build a multivariate polynomial basis orthonormal with respect to
>> the product distribution (i.e. with independent copula) sharing the same 1D
>> marginal distributions as dist.
>> 
>> 
>> After that, everything will work as expected and you will NOT have to build
>> the transformation (if you build it, it will coincide with the identity
>> function). If you encounter performance issues (the polynomials of high
>> degree take ages to build, as in http://trac.openturns.org/ticket/885, or
>> there is an overflow, or the numerical precision is bad), then use:
>> 
>> basis = ot.OrthogonalProductPolynomialFactory(
>>     [ot.StandardDistributionPolynomialFactory(dist.getMarginal(i))
>>      for i in range(dist.getDimension())])
>> 
>> and build the transformation the way you do it.
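>> For intuition, a marginal transformation of this kind is the composition
>> T = F_target^{-1} o F_source. A hedged sketch in pure Python (not the
>> OpenTURNS API; the Uniform(-1, 1) to Uniform(a, b) example is an assumption
>> chosen for simplicity):

```python
# Sketch of the CDF-composition idea behind a marginal transformation:
# T = F_target^{-1} o F_source, mapping points distributed as the standard
# representative Uniform(-1, 1) to the actual marginal Uniform(a, b).

def cdf_uniform_m1_1(x):
    """CDF of Uniform(-1, 1), the standard representative."""
    return (x + 1.0) / 2.0

def quantile_uniform(p, a, b):
    """Quantile (inverse CDF) of Uniform(a, b), the actual marginal."""
    return a + p * (b - a)

def transform(x, a, b):
    return quantile_uniform(cdf_uniform_m1_1(x), a, b)

# The endpoints of the standard support map to the endpoints of [a, b]:
print(transform(-1.0, 3.0, 7.0))  # 3.0
print(transform(1.0, 3.0, 7.0))   # 7.0
print(transform(0.0, 3.0, 7.0))   # 5.0
```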
>> 
>> + if you use the FunctionalChaosAlgorithm class by providing an input sample
>> and an output sample, you also have to provide the weights of the input
>> sample, EVEN IF the experiment given in the projection strategy would allow
>> them to be recomputed. This is because providing the input sample overwrites
>> the weighted experiment of the projection strategy with a FixedExperiment
>> design of experiments.
>> 
>> I attached two complete examples: one using the exact marginal distributions 
>> and the other using the standard representatives.
>> 
>> Best regards
>> 
>> Régis
>> 
>> ________________________________
>> From: roy <[email protected]>
>> To: regis lebrun <[email protected]>
>> Cc: users <[email protected]>
>> Sent: Friday, 6 October 2017, 2:22 PM
>> Subject: Re: [ot-users] Sample transformation
>> 
>> 
>> 
>> Hi Regis,
>> 
>> Thank you for this detailed answer.
>> 
>> - I am using the latest release from conda (OT 1.9, Python 3.6.2, latest
>> numpy, etc.),
>> - For the sample, I need it so I can generate the output externally (a
>> costly code that cannot be integrated into OT as a model),
>> - I have to convert ot.Sample into np.array because it is then used by other
>> functions to create the simulations, etc.
>> 
>> If I understood correctly, I can create the projection strategy using this 
>> snippet:
>> 
>> basis = ot.AdaptiveStieltjesAlgorithm(dist)
>> measure = basis.getMeasure()
>> quad = ot.Indices(in_dim)
>> for i in range(in_dim):
>>     quad[i] = degree + 1
>> 
>> comp_dist = ot.GaussProductExperiment(measure, quad)
>> proj_strategy = ot.IntegrationStrategy(comp_dist)
>> 
>> inv_trans = ot.Function(ot.MarginalTransformationEvaluation(
>>     [measure.getMarginal(i) for i in range(in_dim)], distributions))
>> sample = np.array(inv_trans(comp_dist.generate()))
>> 
>> 
>> It seems to work, except that the basis does not work with
>> ot.FixedStrategy(basis, dim_basis): I get a not-implemented-method error.
>> 
>> After I get the sample and the corresponding output, what is the way to go?
>> Which arguments do I need to use for ot.FunctionalChaosAlgorithm?
>> 
>> I am comparing the Q2 on the Ishigami function, and I was only able to get
>> correct results using:
>> 
>> pc_algo = ot.FunctionalChaosAlgorithm(sample, output, dist, trunc_strategy)
>> 
>> But for the least squares strategy I had to use this:
>> 
>> pc_algo = ot.FunctionalChaosAlgorithm(sample, output)
>> 
>> 
>> Is this normal?
>> 
>> 
>> Pamphile ROY
>> Doctoral researcher in Uncertainty Quantification
>> CERFACS - Toulouse (31) - France
>> +33 (0) 5 61 19 31 57
>> +33 (0) 7 86 43 24 22
>> 
>> 
>> 
>> On 5 Oct 2017, at 15:40, regis lebrun <[email protected]> wrote:
>> 
>> 
>>> Hi Pamphile,
>>> 
>>> 
>>> 1) The problem:
>>> The problem you get is due to the fact that in your version of OpenTURNS
>>> (1.7, I suppose), the GaussProductExperiment class handles the input
>>> distribution differently from the other WeightedExperiment classes: it
>>> generates the quadrature rule of the *standard representatives* of the
>>> marginal distributions instead of the marginal distributions themselves.
>>> This does not change the rate of convergence of the PCE algorithm, and it
>>> allows the use of specific algorithms for distributions with known
>>> orthonormal polynomials. It is not explained in the documentation, and if
>>> you ask the doe for its distribution, it will give you the initial
>>> distribution instead of the standardized one.
>>> 
>>> 2) The mathematical background:
>>> The generation of quadrature rules for arbitrary 1D distributions is a
>>> badly conditioned problem. Even if the quadrature rule is well-defined
>>> (existence of moments of any order, distribution characterized by these
>>> moments), the mapping from the recurrence coefficients of the orthogonal
>>> polynomials to their values can have a very large condition number. As a
>>> result, the adaptive integration used to compute the recurrence
>>> coefficients of order n, based on the values of the polynomials of degree
>>> n-1 and n-2, can lead to wrong values, and the whole process breaks down.
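>>> For reference, this is the classical (Stieltjes-type) three-term
>>> recurrence: the coefficients a_n and b_n are integrals against the
>>> measure, so any quadrature error in them contaminates pi_{n+1} and then
>>> propagates to every later degree. In standard textbook notation (monic
>>> polynomials, not the exact OpenTURNS internals):

```latex
\[
  \pi_{n+1}(x) = (x - a_n)\,\pi_n(x) - b_n\,\pi_{n-1}(x),
  \qquad \pi_{-1} \equiv 0, \quad \pi_0 \equiv 1,
\]
\[
  a_n = \frac{\langle x\,\pi_n, \pi_n \rangle}{\langle \pi_n, \pi_n \rangle},
  \qquad
  b_n = \frac{\langle \pi_n, \pi_n \rangle}{\langle \pi_{n-1}, \pi_{n-1} \rangle},
  \qquad
  \langle f, g \rangle = \int f(x)\, g(x)\, \mathrm{d}\mu(x).
\]
```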
>>> 
>>> 3) The current state of the software:
>>> Since version 1.8, OpenTURNS no longer generates the quadrature rule of
>>> the standard representatives, but the quadrature rule of the actual
>>> marginal distributions. The AdaptiveStieltjesAlgorithm class, introduced
>>> in release 1.8, is much more robust than the previous orthonormalization
>>> algorithms and is able to handle even stiff problems. There are still
>>> difficult situations (distributions with a discontinuous PDF inside the
>>> range, fixed in OT 1.9, or really badly conditioned distributions,
>>> hopefully fixed when ticket #885 is solved), but most usual situations are
>>> under control, even with marginal degrees of order 20.
>>> 
>>> 4) The (probable) bug in your code and the way to solve it
>>> You must be aware of the fact that the distribution you put into your
>>> WeightedExperiment object will be superseded by the distribution
>>> corresponding to your OrthogonalBasisFactory inside the
>>> FunctionalChaosAlgorithm. If you need to have the input sample before
>>> running the functional chaos algorithm, then you have to build your
>>> transformation by hand. Assuming that you have already defined your
>>> projection basis 'myBasis', your marginal integration degrees 'myDegrees'
>>> and your marginal distributions 'myMarginals', you have to write (in OT
>>> 1.7):
>>> 
>>> # Here the explicit cast into a NumericalMathFunction is needed to be able
>>> # to evaluate the transformation over a sample
>>> myTransformation = ot.NumericalMathFunction(
>>>     ot.MarginalTransformationEvaluation(
>>>         [myBasis.getDistribution().getMarginal(i)
>>>          for i in range(dimension)], myMarginals))
>>> sample = myTransformation(
>>>     ot.GaussProductExperiment(myBasis.getDistribution(),
>>>                               myDegrees).generate())
>>> 
>>> 
>>> You should avoid casting OT objects into np objects as much as possible,
>>> and if you cannot avoid these casts, you should do them only in the
>>> sections where they are needed. They can be expensive for large objects,
>>> and if the sample you get from generate() is used only as an argument of a
>>> NumericalMathFunction, then it will be converted back into a
>>> NumericalSample!
>>> 
>>> Best regards
>>> 
>>> Régis
>>> ________________________________
>>> From: roy <[email protected]>
>>> To: users <[email protected]>
>>> Sent: Thursday, 5 October 2017, 11:13 AM
>>> Subject: [ot-users] Sample transformation
>>> 
>>> 
>>> 
>>> Hi,
>>> 
>>> I am facing consistency concerns in the API regarding distributions and
>>> sampling.
>>> 
>>> The initial goal was to get the sampling for Polynomial Chaos, as I must
>>> not use the model variable.
>>> So for the least squares strategy I do something like this:
>>> 
>>> proj_strategy = ot.LeastSquaresStrategy(montecarlo_design)
>>> sample = np.array(proj_strategy.getExperiment().generate())
>>> 
>>> sample is correct as the bounds of each feature lie in the corresponding 
>>> ranges.
>>> 
>>> But now if I want to use IntegrationStrategy:
>>> 
>>> proj_strategy = ot.IntegrationStrategy(ot.GaussProductExperiment(dists, list))
>>> sample = np.array(proj_strategy.getExperiment().generate())
>>> 
>>> the sample's outputs lie in [-1, 1], which does not correspond to the
>>> distribution I have initially.
>>> 
>>> So I used the conversion class but it does not work well with 
>>> GaussProductExperiment as it requires [0, 1] instead of [-1, 1].
>>> 
>>> Thus I use this hack:
>>> 
>>> # Convert from [-1, 1] -> input distributions
>>> marg_inv_transf = ot.MarginalTransformationEvaluation(distributions, 1)
>>> sample = (proj_strategy.getExperiment().generate() + 1) / 2.
>>> 
>>> 
>>> Is it normal that the distribution classes are not returning samples in
>>> the same intervals?
>>> 
>>> 
>>> Thanks for your support!
>>> 
>>> 
>>> Pamphile ROY
>>> Doctoral researcher in Uncertainty Quantification
>>> CERFACS - Toulouse (31) - France
>>> +33 (0) 5 61 19 31 57
>>> +33 (0) 7 86 43 24 22
>>> 
>>> 
>>> _______________________________________________
>>> OpenTURNS users mailing list
>>> [email protected]
>>> http://openturns.org/mailman/listinfo/users
>>> <example.py><example_standard.py>
> 

