Hi Régis, Considering a group of output variables would be interesting indeed. I cannot however clearly see how this can be done or used. Would it allow to select the variables having influence on several outputs ?
By “produce the conditional expectations as functions”, you mean create a function which takes a group of variables as input and returns the conditional expectation given the values of the variables in the group as output ? That is, if the group is denoted by Xu, the function is E(Y|Xu). In other words, this produces a function which averages the output Y for the variables not in the group Xu. Ok : this reduces the dimension of the function, because this function has only Xu as inputs, instead of all the input vector X. Was it your purpose ? This could be very useful, as a post-processing of the chaos : this would combine the chaos, the sensitivity analysis step and the dimension reduction in one single step. Best regards, Michaël De : [email protected] [mailto:[email protected]] Envoyé : mardi 3 juillet 2018 18:23 À : [email protected]; BAUDIN Michael <[email protected]> Objet : Re: RE: [ot-users] Grouped Sobol indices Hi Michael, You may also want to combine the contribution of a group of input variables on a group of output variables, eg when the output is the vector of coefficients of a Karhunen Loeve expansion... At the end it produces a lot of combinations! It may also be interesting to produce the conditional expectations as functions as a post-processing of a chaos expansion. Your opinion on this point? Cheers Régis Le mardi 3 juillet 2018 à 17:49:31 UTC+2, BAUDIN Michael <[email protected]<mailto:[email protected]>> a écrit : Hi Régis, Indeed, my mail is back, but not the data… yet. Hence, I would be happy to show a basic Python implementation of the total group Sobol’ indice, but the most efficient way is to get a template script … which is in the encrypted hard drive of the failed computer. ☹ This should be solved within a couple of days, I hope. Indeed a generic regular expression engine could to the trick, but it is much more complicated than required, I think. After all, here is all the choices we have : • First or total indice • By variable or by group This generates only 4 indices that can be required, to my knowledge : • first order indice by variable • total order indice by variable • first order indice by group • total order indice by group and this is it. In attachment, I put three files in order to show a basic test case for what I would like to do • crue-generate-with-dependency.py : generates a sample with 8 variables in input and 1 variable in output • crue-8var-copula.csv • chaos-by-group.py : creates a chaos from the sample and explores the indices from it. This last script ends with the lines : # Groups of variables (known) allgroups = [[0],[1],[2],[3,4,5],[6,7]] # Create a functional chaos model algo = ot.FunctionalChaosAlgorithm(x, y) print(algo.getAdaptiveStrategy()) print(algo.getProjectionStrategy()) algo.run() result = algo.getResult() fccr = ot.FunctionalChaosRandomVector(result) for group in allgroups: s = fccr.getSobolGroupedIndex(group) # TODO : get the total order indice of each group print("Group=%s, S=%.2f" % (str(group),s)) The script I have in my hard drive allow to loop over the multi-indices, computing the required indices within the loop : I will complete my message when possible. Meanwhile, I think that there is an issue within the API since it does not seem possible to get the enumeration rule from the functional chaos “algo” : am I wrong ? Your analysis of the estimate of these grouped indices by sampling make it clear that this is an advantage of the chaos : estimating the group indices is much easier with chaos than by sampling. Thank you very much with the copula decomposition script : I think that I can use almost as is ! Best regards, Michaël De : [email protected]<mailto:[email protected]> [mailto:[email protected]] Envoyé : mardi 19 juin 2018 10:40 À : [email protected]<mailto:[email protected]>; BAUDIN Michael <[email protected]<mailto:[email protected]>> Objet : Re: [ot-users] Grouped Sobol indices Hi Michael, Nice to see that your email is back ;-) Concerning your first point and a part of your post-scriptum, an idea could be to provide a kind of regular expression based filter to compute the indices you want. Given a set of input variables, you may want to collect all the coefficients involving each of the variables within this set with a positive exponent, or all the coefficients involving at least one of these variables with a positive exponent, or at least all these variables with a positive exponent, or all the coefficients involving none of these variables with a positive exponent and so on. And you may want to collect these coefficients output component by output component, or for a given set of output component, or... You see what I mean? At one point we have to bound the number of methods, so a generic extraction mechanism and a dedicated set of filters may be the most efficient way to go. Do you have a list of indices of interest, with their formal definition in terms of conditional expectation? It would help to see how to implement this method. Concerning the estimation of group indices by sampling, as soon as you are able to express them in terms of variance of conditional expectations, it is quite straightforward to implement a scheme similar to the Saltelli method. The main difficulties are the potentially large number of indices to compute, and the huge number of simulations required to sample conditional expectations with a large conditioning dimension. For the last point, there is currently no way to inspect a given multivariate distribution to see how to factor it into smaller pieces. The point of view in OpenTURNS is the opposite: you can build such a product multivariate distribution using the ComposedCopula class (each sub-copula describes the dependence within a group of dependent variables), but there is no easy way to perform the inverse computation. From a C++ perspective, you can ask for the copula of your distribution and try to dynamic_cast it into a ComposedCopula to get the associated ComposedCopula, but if it is the case, odds are large that you could have access to this information by looking at your Python script a few lines above! Nevertheless, if you really need this kind of service, it can be implemented quite easily using Csiszar divergences. In the following script, I use the (squared) Hellinger divergence to check if a given distribution is a product of two lower dimensional distributions at a given position. The computation of the divergence is done by sampling (Monte Carlo). If you have *exact* independence, the sampling size can be very low (less than 100) as the divergence is *exactly* zero and the kernel function drops to nearly zero at each point of the sample. From this script, it is straightforward to get the associated block distributions. import openturns as ot import numpy as np from time import time def isProduct(distribution, cut, N = 10000, epsilon = 1.0e-16): dim = distribution.getDimension() distribution1 = distribution.getMarginal([i for i in range(cut)]) distribution2 = distribution.getMarginal([cut+i for i in range(dim-cut)]) def kernel(x): x = np.array(x) s = x.shape x1 = x[:, 0:cut] pdf1 = np.array(distribution1.computePDF(x1)) x2 = x[:, cut:dim] pdf2 = np.array(distribution2.computePDF(x2)) pdf = np.array(distribution.computePDF(x)) return ((1.0 - np.sqrt(pdf1 * pdf2 / pdf))**2 * pdf).reshape(s[0], 1) value = ot.PythonFunction(dim, 1, func_sample=kernel)(distribution.getSample(N)).computeMean()[0] return (value < epsilon) def decompose(distribution, N = 10000, epsilon = 1.0e-16): dim = distribution.getDimension() blocks = list() currentBlock = [0] for i in range(1, dim-1): if isProduct(distribution, i, N, epsilon): blocks.append(currentBlock) currentBlock = [i] else: currentBlock.append(i) currentBlock.append(dim-1) blocks.append(currentBlock) return blocks if __name__ == "__main__": d1 = 2 d2 = 1 d3 = 4 dimension = d1 + d2 + d3 R = ot.CorrelationMatrix(dimension) for i in range(d1): for j in range(i): R[i, j] = 0.1 / (1 + i + j) for i in range(d2): for j in range(i): R[i + d1, j + d1] = 0.1 / (1 + i + j) for i in range(d3): for j in range(i): R[i + d1 + d2, j + d1 + d2] = 0.1 / (1 + i + j) print(R) t0 = time() print("Test 1: Normal distribution") distribution = ot.Normal([0.0]*dimension, [1.0]*dimension, R) print("blocks=", decompose(distribution)) print("Test 2: Composed copula") distribution = ot.ComposedCopula([ot.ClaytonCopula(5.0)]*5 + [ot.NormalCopula(R)]) print("blocks=", decompose(distribution)) print("t=", time() - t0, "s") Cheers Régis LEBRUN Le lundi 18 juin 2018 à 16:53:20 UTC+2, BAUDIN Michael <[email protected]<mailto:[email protected]>> a écrit : Hi ! I have a computer code with dependent input vector and I would like to perform sensitivity analysis on the vector output. To do this, I would like to estimate the Sobol’ indices. In order to workaround the dependent inputs, I would like to gather dependent input variables into independent groups and to estimate the Sobol’ indices on these groups. • My first idea is to use a chaos expansion in order to estimate Sobol’ indices. It is easy to estimate the first order Sobol’ indices with the getSobolGroupedIndex method of the FunctionnalChaosSobolIndices class: http://openturns.github.io/openturns/master/user_manual/response_surface/_generated/openturns.FunctionalChaosSobolIndices.html However, the corresponding total order indices method seem to be unavailable: there is no getSobolGroupedTotalIndex ? If so, this is a pity, since this is just another “simple” arrangement of the coefficients from the chaos expansion. It would allow to estimate the total effect of a group. Compared to the group Sobol first indice, it would allow to see if there are interactions with this group and other variables outside from the group. Also, if the group Sobol total indice is close to zero, this means that this group has no effect of the variability of the output. • The second topic of my post is the SobolIndicesAlgorithm class : http://openturns.github.io/openturns/master/user_manual/_generated/openturns.SobolIndicesAlgorithm.html I can see that there is no “group” method here. Is there a theoretical way of computing grouped Sobol’ indices with these estimators? • Finally, I would like to know if there is a way of using the joint distribution in order to generate the independent groups of dependent input variables? In other words, with a given multivariate joint distribution, is there a way of generating a list of sub-lists of indices such that: o Each sub-list contain dependent variables, o Two sub-lists are independent. With such a goal, what classes / methods to use ? Best regards, Michaël PS By the way, the total Sobol’ indices of a group of variables lead to a way of removing input variables from the model. Indeed, we could compute the largest group having a total Sobol’ indice lower than a given small threshold (says 0.01 for example). By design, all variables from this group could be replaced by fixed inputs without changing the variability of the output. Furthermore, if we had the distribution of this estimator, we could guarantee this property with, say, 95% confidence (i.e. including the variability from the estimate). In order to compute this group, the variables could first be ordered by decreasing total Sobol’ indices. Then, we could add the variables into the group until the total Sobol’ indice exceeds the threshold: the final group is the one just before this step. With this raw algorithm, the chaos expansion has a great advantage: given that the decomposition is known, the estimate is almost CPU-free. Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message. Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. _______________________________________________ OpenTURNS users mailing list [email protected]<mailto:[email protected]> http://openturns.org/mailman/listinfo/users Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message. Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message. Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free.
_______________________________________________ OpenTURNS users mailing list [email protected] http://openturns.org/mailman/listinfo/users
