Hi,

Actually, there is no theoretical reason to limit the dimension of a low discrepancy sequence. A better indicator is the ratio of the number of points in the sample to the dimension: in the end, any DOE must fill the space. Note that the number of points in a low discrepancy sequence must be chosen according to the properties of the sequence. The Sobol' sequence is a base-2 sequence: the number of points must be a power of 2 in order to fill the elementary volumes. Moreover, for the Sobol' sequence, there is a minimum number of points to be used (i.e. a minimum value of the exponent). The same reasoning applies to all sequences; the corresponding discussion for Halton would make this e-mail much longer!
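A minimal sketch of this base-2 behaviour, using the one-dimensional Van der Corput sequence in base 2 (which coincides with the first coordinate of the Sobol' sequence) rather than OpenTURNS itself:

```python
def van_der_corput(n, base=2):
    """Radical inverse of n in the given base (the dimension-1 Sobol' point)."""
    q, denom = 0.0, 1.0
    while n:
        n, r = divmod(n, base)
        denom *= base
        q += r / denom
    return q

m = 3
points = [van_der_corput(i) for i in range(2 ** m)]
# With exactly 2**m points, each of the 2**m equal sub-intervals of [0, 1)
# contains exactly one point: the elementary volumes are filled.
cells = sorted(int(p * 2 ** m) for p in points)
print(cells)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With a sample size that is not a power of 2 (say 6 points against 8 intervals), some intervals receive two points and others none, which is why the exponent matters.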
These methods behave as follows: if the number of points is not large enough, the performance is lower than Monte Carlo, but once the number exceeds this threshold, the low discrepancy sequence outperforms MC by far (i.e. roughly 1/n instead of 1/sqrt(n)). In other software (e.g. Scilab), the Sobol' sequence goes up to dimensions larger than 1000. The most important technical limitation for Sobol' is the storage of the coefficients of the polynomials associated with the generation of the Sobol' sequence: this is a pre-computed table embedded in the source code (in some software it is a file read at startup). It would be straightforward to update the OT source code to fix this issue; I could give the URL, but I do not have internet access at the moment. Other sequences (e.g. Halton) do not have this limitation.

A last word: you have optimized LHS designs in OpenTURNS. This is much better than plain LHS, especially when the ratio of number of points to dimension is unfavourable.

Best regards,
Michaël

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Wednesday 20 June 2018 15:12
To: [email protected]
Cc: [email protected]
Subject: Re: [ot-users] Sobol indices with Saltelli2010 and Sobol sequence

Hi Regis,

In Saltelli 2010 (Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index), there is a clear statement about this. (In that paper, they also showed better convergence of this estimator compared to Jansen, and thus to Saltelli 2002.) There is further evidence in Sobol' 2005 (On global sensitivity analysis of quasi-Monte Carlo algorithms).

One thing to note: the recommended way (I do not have a reference, but this comes from Kucherenko at the SAMO UQ school) to create A and B with Sobol' sequences is to create a single sample of shape (n_samples, n_dim * 2) and then split it vertically, as I showed in my example. But if you do this, you will only be able to go up to dim = 20.
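The vertical split described above can be sketched as follows; the uniform random sample here is only a stand-in for an actual (n_samples, 2 * n_dim) Sobol' sample:

```python
import numpy as np

n_samples, n_dim = 8, 3
# Stand-in for a low discrepancy sample of shape (n_samples, 2 * n_dim);
# in practice this would come from a Sobol' generator in dimension 2 * n_dim.
sample = np.random.default_rng(0).random((n_samples, 2 * n_dim))

A = sample[:, :n_dim]   # first n_dim columns
B = sample[:, n_dim:]   # last n_dim columns
print(A.shape, B.shape)  # (8, 3) (8, 3)
```

This is why a generator limited to dimension 40 only supports dim = 20 for the indices: the A/B construction consumes twice the model dimension.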
I don't know whether there are references about the maximal acceptable dimension for Sobol' sequences. It will highly depend on the quality of the Sobol' generator. BRODA's seems to be recommended and offers dim <= 51. For higher dimensions, it is clear that LHS would be the way to go, as it is known to converge more quickly than crude MC.

Cheers,
Pamphile

> On 20 June 2018 at 14:44, regis lebrun <[email protected]> wrote:
>
> Hi Pamphile,
>
> Do you have a reference on the use of Sobol' sequences to compute Sobol' indices? Does the better convergence property still hold in high dimension? In order to change the default behaviour, we need a reference stating that the different estimators still converge when using low discrepancy sequences, and that there is a gain in using them. In that case, as the Sobol' sequences are limited to dimension less than or equal to 40 in OT, I propose to use them up to dimension 40, then to fall back to LHS for higher dimensions (if we have a reference on the convergence of LHS-based estimates) or to Monte Carlo.
>
> Cheers
>
> Régis
>
> On 20 June 2018 at 09:20:20 UTC+2, roy <[email protected]> wrote:
>
> Hi Regis,
>
> Thank you for this input.
>
> I didn't know there was a method to randomize the Sobol' sequence in OT. This should be the default behaviour for computing the indices, as it is known to converge faster. Or at least an option to do it directly would be handy.
>
> Sincerely,
>
> Pamphile ROY
> PhD candidate in Uncertainty Quantification
> CERFACS, Toulouse (31), France
> +33 (0) 5 61 19 31 57
> +33 (0) 7 86 43 24 22
>
>> On 18 June 2018 at 23:08, regis lebrun <[email protected]> wrote:
>>
>> Hi Pamphile,
>>
>> I don't know what the differences are between Saltelli 2002 and Saltelli 2010, as it is not my main field of interest, but OpenTURNS proposes alternative sampling algorithms to compute Sobol' indices (first and second order).
>> You are not forced to use Monte Carlo sampling: you can use any weighted experiment with uniform weights, including LHS experiments and low discrepancy experiments. This is demonstrated in the following script:
>>
>> ===== BEGIN =====
>> from openturns import *
>>
>> mode = "QMC"
>> size = 10000
>>
>> # Toy model
>> model = SymbolicFunction(["x0", "x1", "x2", "x3"], ["x0+x1+x2+x3+x1*x2"])
>>
>> # Here you can use any multivariate distribution with independent copula
>> distribution = Normal(4)
>> if mode == "QMC":
>>     sampling_doe = LowDiscrepancyExperiment(SobolSequence(4), distribution, size, False)
>>     sampling_doe.setRandomize(True)  # If set to False, the results are wrong
>>     sampling_doe.setRestart(False)  # or True
>> elif mode == "LHS":
>>     sampling_doe = LHSExperiment(distribution, size)
>>     sampling_doe.setAlwaysShuffle(True)  # If set to False, the results are wrong
>>     sampling_doe.setRandomShift(False)  # or True
>> else:
>>     sampling_doe = MonteCarloExperiment(distribution, size)
>> sobol_doe = SobolIndicesExperiment(sampling_doe, True)
>> inputDesign = sobol_doe.generate()
>> outputDesign = model(inputDesign)
>>
>> # Example of use with the different Sobol' algorithms
>> Log.Show(Log.NONE)
>> algo = JansenSensitivityAlgorithm(inputDesign, outputDesign, size)
>> print("Jansen")
>> print("First order=", algo.getFirstOrderIndices())
>> print("Second order=", algo.getSecondOrderIndices())
>> algo = MartinezSensitivityAlgorithm(inputDesign, outputDesign, size)
>> print("Martinez")
>> print("First order=", algo.getFirstOrderIndices())
>> print("Second order=", algo.getSecondOrderIndices())
>> algo = MauntzKucherenkoSensitivityAlgorithm(inputDesign, outputDesign, size)
>> print("MauntzKucherenko")
>> print("First order=", algo.getFirstOrderIndices())
>> print("Second order=", algo.getSecondOrderIndices())
>> algo = SaltelliSensitivityAlgorithm(inputDesign, outputDesign, size)
>> print("Saltelli")
>> print("First order=", algo.getFirstOrderIndices())
>> print("Second order=", algo.getSecondOrderIndices())
>> ===== END =====
>>
>> Note that you have to tweak the LowDiscrepancyExperiment and LHSExperiment classes a little to get correct results. The fact that the permutation has to be regenerated each time an LHS experiment is sampled is quite natural, but the fact that the low discrepancy sequence has to be randomly shifted is less clear to me... I have not played with the other low discrepancy sequences (Halton, Faure, etc.) to see whether this behaviour is general or not.
>>
>> Best regards,
>>
>> Régis LEBRUN
>>
>> On 18 June 2018 at 17:54:52 UTC+2, roy <[email protected]> wrote:
>>
>> Hi everyone!
>>
>> From the documentation, Sobol' indices with Saltelli's formulation are computed using his 2002 formula. In Saltelli 2010, there is another formulation which is demonstrated to be better.
>>
>> Another point: it is possible to use an experiment in SobolIndicesExperiment(), but it is not possible to use a Sobol' sequence there. It would be great to add this possibility, as it is known to converge faster than MonteCarloExperiment(). To get A and B with a Sobol' sequence, we must split the matrix vertically:
>>
>> input_design = np.array(ot.SobolSequence(dim * 2).generate(size))
>> A = input_design[:, :dim]
>> B = input_design[:, dim:]
>>
>> Another possibility is to use a scrambled Sobol' sequence and split horizontally. But from what I know, this sampler is not available in OT?
>>
>> From the documentation, it is not really clear that the matrices are stacked as [A, B, A_B, B_A]. I had to look at the code to really understand that. Maybe this could be improved?
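The random shift mentioned above can be sketched as a Cranley-Patterson rotation; this is a plain NumPy illustration of the idea, not the OT internals:

```python
import numpy as np

def random_shift(points, rng):
    # Cranley-Patterson rotation: add one uniform shift per dimension,
    # modulo 1. The low discrepancy structure of the point set is
    # preserved while the sample becomes a randomized estimator.
    shift = rng.random(points.shape[1])
    return (points + shift) % 1.0

rng = np.random.default_rng(42)
pts = rng.random((16, 4))  # stand-in for a low discrepancy sample in [0, 1)^4
shifted = random_shift(pts, rng)
print(shifted.shape)  # (16, 4)
```

Averaging estimates over independent shifts also gives an empirical error estimate, which an unrandomized sequence cannot provide.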
>>
>> Thanks for the support :)
>>
>> Pamphile ROY
>> PhD candidate in Uncertainty Quantification
>> CERFACS, Toulouse (31), France
>> +33 (0) 5 61 19 31 57
>> +33 (0) 7 86 43 24 22
>>
>> _______________________________________________
>> OpenTURNS users mailing list
>> [email protected]
>> http://openturns.org/mailman/listinfo/users
