Dear list members, I am working on so called automatic random variate generators, that is, routines that allow to sample from larger classes of distributions, e.g., distributions with log-concave densities. I have implemented all my algorithms in a library called UNU.RAN (universal non-uniform random variate generators) which is available at http://statmath.wu.ac.at/unuran/
It is also packed in ROOT (http://root.cern.ch/) and I have written a wrapper for R (package "Runuran"). The advantage of this approach is, that it allows to sample from quite arbitrary distributions. Distributions can be used with a couple of different methods depending on one's particular generation problem. A possible disadvantage is the two-step approach: in Step 1 a generator object is created, which can be used in Step 2 to draw one or many random samples, see below. I have been asked to make the library available for Matlab similarly to my "Runuran" package. Since Octave is open-source I have written such a package for octave instead using oct-files (which seems to be superior to mex-files; I never have used matlab before). (For me OOS means that I can look how others solved problems which significantly flattens the learning curve for me.) I have added the help pages of the octave functions as well as a short example of its application below. I have a few question / remarks: - Do you think that octave-forge is a proper place for hosting this package? - The package requires the UNU.RAN libray to be installed on the system. However, this makes it for Windows users almost impossible to use (since I have no time to learn how to prepare a binary package for the different windows OSes). Thus I have package the UNU.RAN sources in the "Runuran" package and let CRAN to this work for me. Is this a good approach for Octave packages, too? - I usually implement my algorithms using C. Thus my C++ code may look like a C file. I hope this is not a problem. Josef Examples: %% Initialize library unuran_config(); %% A simple example using the standard normal distribution. %% It uses a default generation method ("auto"). %% First we have to create the generator object. normgen = unuran_gen("normal"); %% It can be then used to draw a sample. X = unuran_rnd(normgen, 1, 5) %% For the next example we use a truncated gamma distribution with %% shape parameter 5 and select the fast generation method AROU %% (automatic ratio-of-uniforms method). gammagen = unuran_gen("gamma(5); domain=(3,inf)", "arou"); X = unuran_rnd(gammagen, 1, 5) %% Recently the Generalized Inverse Gaussian distribution has %% become quite popular in financial mathematics. %% Here is an example with a particular GIG distribution that %% shows that we can create samples from distributions with %% quite arbitrary user-defined density functions. %% The example also demonstrates that it is also possible to %% create a distribution object first which is then used for %% creating the generator object. %% We use this distribution object in combination with two %% different generation methods. %% The first example uses an exact method based on the %% acceptance-rejection method that only works for log-concave %densities). gigdistr = unuran_distr("cont; logpdf='2*log(x)-(x+1/x)'; domain=(0,inf)"); giggen1 = unuran_gen(gigdistr, "tdr; variant_ia"); X = unuran_rnd(giggen1, 1, 5) %% The second example uses an approximate inversion method giggen2 = unuran_gen(gigdistr, "pinv"); X = unuran_rnd(giggen2, 1, 5) %% It is also possible to use octave functions for creating the %% distribution object. function retval = lgigpdf(x) if (x>0) retval = 2*log(x)-(x+1/x); else retval = 0; endif endfunction gigdistr = unuran_cont(0, Inf, @lgigpdf, true); giggen3 = unuran_gen(gigdistr, "pinv; u_resolution=1.e-12; keepcdf=true"); X = unuran_rnd(giggen3, 1, 5) %% A nice by-product of method "pinv" is that it allows to %% compute cumulative distribution function and its inverse %% approximately. %% The maximal tolerated error is set to 1e-12 in our example. %% density function at 1, 2, 3, 4, and 5 X = unuran_pdf(giggen3, 1:5) %% cumulative density function at 1, 2, 3, 4, and 5 X = unuran_cdf(giggen3, 1:5) %% inverse of cumulative density function at 0.1, ..., 0.9 X = unuran_inv(giggen3, (1:9)/10) Help pages: -- Loadable Function: unuran_distr (DISTR) Create and return a UNU.RAN distribution object. Argument DISTR is a string that declares the distribution and which is passed to and interpreted by the UNU.RAN library. For the syntax for this string consult the UNU.RAN user manual, `http://statmath.wu.ac.at/unuran'. Alternatively one can create UNU.RAN distribution objects using commands `unuan_cont' and `unuran_discr'. These commands allow to use density functions defined by octave functions. See also: unuran_cont, unuran_discr, unuran_gen, unuran_pdf, unuran_cdf, unuran_inv -- Loadable Function: unuran_cont (LB, UB, PDF) -- Loadable Function: unuran_cont (LB, UB, PDF, ISLOG) -- Loadable Function: unuran_cont (LB, UB, PDF, DPDF, ISLOG) -- Loadable Function: unuran_cont (LB, UB, PDF, DPDF, CDF, ISLOG) Create and return a UNU.RAN distribution object for a _continuous_ distribution. In opposition to `unuran_distr', density functions are provided by octave functions. Arguments: LB Lower bound of domain. UB Upper bound of domain. PDF Probablity density function. Notice that the given function can be any multiple of some density. Use `NA' for not setting this function. DPDF Derivative of the given PDF. Use `NA' for not setting this function. CDF Cumulative distribution function. Use `NA' for not setting this function. ISLOG Boolean. If `true' then the given PDF and CDF are interpreted as the logarithms of the corresponding functions. Notice that DPDF is then the derivative of the log-density. If ISLOG is omitted or `false', then PDF and CDF are used as given. Notice that argument ISLOG can be omitted in all cases. Arguments PDF, DPDF, and CDF can be function handles (e.g., `@exp' for the exponential function) or inline function (e.g., `inline ("exp(-x)")'). Notice, however, that anonymous functions (e.g., `@() (exp(-x))') *must not* be used. Some generation methods need more data about the distribution which must be set by means of `unuran_cont_data'. See also: unuran_cont_data, unuran_distr, unuran_discr, unuran_gen -- Loadable Function: unuran_cont_data (UD, MODE, CENTER, AREA) Set additional data for a _continuous_ UNU.RAN distribution object UD. Arguments: UD UNU.RAN distribution object for _continuous_ distribution. MODE Mode of distribution. Use `NA' for not setting the MODE. CENTER A point where the given PDF does not vanish, e.g., a point near the mode. If omitted then MODE is used as default. If MODE is not set, then UNU.RAN tries to find such a point when it is required by the chosen generation method. Use `NA' for not setting the CENTER. AREA Area below the given PDF. Use `NA' for not setting the AREA. Notice that function `unuran_cont_data' does not return an object but modifies the distribution object UD. See also: unuran_cont -- Loadable Function: unuran_discr (LB, UB, PDF) -- Loadable Function: unuran_discr (LB, UB, PDF, CDF) Create and return a UNU.RAN distribution object for a _discrete_ distribution, In opposition to `unuran_distr' density functions can be provided by octave functions. Arguments: LB Lower bound of domain. UB Upper bound of domain. PDF Probablity mass function or a numeric vector that contains (a multiple) of a probability vector. Use `NA' for not setting the PDF. CDF Cumulative distribution function. Use `NA' for not setting this function. Arguments CDF and PDF (if the latter is a function) can be function handles (e.g., `@exp' for the exponential function) or inline function (e.g., `inline ("exp(-x)")'). Notice, however, that anonymous functions (e.g., `@() (exp(-x))') *must not* be used. Some generation methods need more data about the distribution which must be set by means of `unuran_discr_data'. See also: unuran_discr_data, unuran_distr, unuran_cont, unuran_gen -- Loadable Function: unuran_discr_data (UD, MODE, CENTER, SUM) Set additional data for a _discrete_ UNU.RAN distribution object UD. Arguments: UD UNU.RAN distribution object for _discrete_ distribution. MODE Mode of distribution. Use `NA' for not setting the MODE. CENTER Not implemented. Value is ignored. SUM Sum of probabilities of the given multiple of a probability mass function or probability vector. Use `NA' for not setting the SUM. Notice that function `unuran_discr_data' does not return an object but modifies the distribution object UD. See also: unuran_discr -- Loadable Function: unuran_gen (DISTR) -- Loadable Function: unuran_gen (DISTR, METHOD) Create and return a UNU.RAN generator object. The arguments are: DISTR Specifies the target distribution. DISTR is either is a string that is passed to and interpreted by the UNU.RAN library. Or it is an UNU.RAN distribution object created by a `unuran_distr', `unuran_cont', or `unuran_discr' call. METHOD A string declared the generation method that is passed to and interpreted by the UNU.RAN library. If METHOD is omitted then method `"auto"' is used by default. For the syntax of the UNU.RAN string consult the UNU.RAN user manual, `http://statmath.wu.ac.at/unuran'. The following example illustrate the use of this package. The first simple example creates a generator object `normgen' for the standard normal distribution. It uses a default generation method (AUTO) which is omitted. This object is then used to draw a random sample. normgen = unuran_gen("normal"); X = unuran_rnd(normgen, 1, 5); For the next example we use a truncated gamma distribution with shape parameter 5 and select the fast generation method `AROU' (automatic ratio-of-uniforms method). gammagen = unuran_gen("gamma(5); domain=(3,inf)", "arou"); X = unuran_rnd(gammagen, 1, 10); Recently the Generalized Inverse Gaussian distribution has become quite popular in financial mathematics. Here is an example with a particular GIG distribution that shows that we can create samples from distributions with quite arbitrary user-defined density functions. The example also demonstrates that it is also possible to create a distribution object first which is then used for creating the generator object. We use this distribution object in combination with two different generation methods. The first example uses an exact method based on the acceptance-rejection method that only works for log-concave densities. gigd = unuran_distr("cont; logpdf='2*log(x)-(x+1/x)'; domain=(0,inf)"); giggen1 = unuran_gen(gigd, "tdr; variant_ia"); X = unuran_rnd(giggen1, 1, 5); The second example uses an approximate inversion method. giggen2 = unuran_gen(gigd, "pinv"); X = unuran_rnd(giggen2, 1, 5); It is also possible to use octave functions for creating the distribution object. function retval = lgigpdf(x) if (x>0) retval = 2*log(x)-(x+1/x); else retval = 0; endif endfunction gigd = unuran_cont(0, Inf, @lgigpdf, true); giggen3 = unuran_gen(gigd, "pinv; u_resolution=1.e-12; keepcdf=true"); X = unuran_rnd(giggen3, 1, 5); A nice by-product of method `pinv' is that it allows to compute cumulative distribution function and its inverse approximately. The maximal tolerated error is set to 1e-12 in our example. %% density function at 1, 2, 3, 4, and 5 X = unuran_pdf(giggen3, 1:5); %% cumulative density function at 1, 2, 3, 4, and 5 X = unuran_cdf(giggen3, 1:5); %% inverse of cumulative density function at 0.1, ..., 0.9 X = unuran_inv(giggen3, (1:9)/10); See also: unuran_distr, unuran_cont, unuran_discr, unuran_rnd, unuran_pdf, unuran_cdf, unuran_inv -- Loadable Function: unuran_rnd (UG) -- Loadable Function: unuran_rnd (UG, N) -- Loadable Function: unuran_rnd (UG, M, N) -- Loadable Function: unuran_rnd (UG, [M N]) Return a matrix with random elements drawn from UNU.RAN generator object UG. If invoked with a single scalar argument N, return a square NxN random matrix. If N is omitted, N=1 is used by default. If supplied two scalar arguments (M, N), `unuran_rnd' takes them to be the number of rows and columns; that is, return an MxN random matrix. If given a vector with two elements, `unuran_rnd' uses the values of the elements as the number of rows and columns, respectively. The function calls the octave uniform random number. Thus the seed for generating the random matrix can be set by the `rand("state")' command. See also: unuran_gen, rand -- Loadable Function: unuran_pdf (UDG, X) -- Loadable Function: unuran_pdf (UDG, X, ISLOG) For each element of X, compute the probability density function (PDF) of the distribution implemented in the UNU.RAN object UDG. Argument UDG can either be a UNU.RAN distribution object or a UNU.RAN generator object. If the optional boolean argument ISLOG is `true', then the logarithm of the PDF is returned. If the given UNU.RAN object UDG does not contain the PDF of the distribution, then `NA' is returned. *Important:* `unuran_pdf' just evaluates the density function that is stored in UNU.RAN object UDG. It ignores the boundaries of the domain of the distribution, i.e., it does not return `0.0' outside the domain unless the implementation of the PDF handles this case correctly. This behavior is in particular important when UNU.RAN built-in distribution objects (e.g., for the normal distribution) are truncated by explicitly setting the domain boundaries. See also: unuran_distr, unuran_cont, unuran_discr, unuran_gen -- Loadable Function: unuran_cdf (UDG, X) For each element of X, compute the cumulative distribution function (CDF) of the distribution implemented in the UNU.RAN object UDG. Argument UDG can either be a UNU.RAN distribution object or a UNU.RAN generator object. For the computation of the CDF the following alternatives are tried (in the given order): 1. The CDF is available in UNU.RAN object UDG: the function is evaluated and the result is returned. *Important:* In this case `unuran_cdf' just evaluates the cumulative distribution function but ignores the boundaries of the domain of the distribution, i.e., it does not return `0.0' and `1.0', resp., outside the domain unless the implementation of the CDF handles this case correctly. This behavior is in particular important when UNU.RAN built-in distribution objects (e.g., for the normal distribution) are truncated by explicitly setting the domain boundaries. 2. UDG is a UNU.RAN generator object created by method `"pinv; keepcdf=true"': The approximate CDF is computed and returned. 3. UNU.RAN object UDG neither contains the CDF nor its approximation: `NA' is returned. See also: unuran_distr, unuran_cont, unuran_discr, unuran_gen -- Loadable Function: unuran_inv (UDG, P) For each element of P, compute the quantile function (the inverse CDF) of the distribution implemented in the UNU.RAN object UDG. Argument UDG can either be a UNU.RAN distribution object or a UNU.RAN generator object. For the computation of the inverse CDF the following alternatives are tried (in the given order): 1. The inverse CDF is available in UNU.RAN object UDG: the function is evaluated and the result is returned. *Important:* In this case `unuran_inv' just evaluates the inverse CDF but ignores the boundaries of the domain of the distribution, i.e., in case of a truncated distributuin it just returns the values for the untruncated distribution, unless the implementation of the inverse CDF handles this case correctly. This behavior is in particular important when UNU.RAN built-in distribution objects (e.g., for the normal distribution) are truncated by explicitly setting the domain boundaries. 2. UDG is a UNU.RAN generator object created by method that implements an inversion method, i.e., one of `PINV', `HINV', `NINV', or `DGT': Then the inverse CDF is computed approximately and the result is returned. 3. UNU.RAN object UDG neither contains the inverse CDF nor its approximation: `NA' is returned. See also: unuran_distr, unuran_cont, unuran_discr, unuran_gen -- ----------------------------------------------------------------------------- Josef Leydold | WU (Vienna University of Economics and Business) | Institute for Statistics and Mathematics ----------------------------------------------------------------------------- Augasse 2-6 | Tel. +43 1 31336 4695 A-1090 Vienna | FAX +43 1 31336 774 European Union | email josef.leyd...@wu.ac.at ----------------------------------------------------------------------------- Alles Unglueck kam daher, dass die Denkenden nicht mehr handeln konnten, und die Handelnden keine Zeit mehr fanden zu denken. (Marlen Haushofer) ------------------------------------------------------------------------------ Write once. Port to many. Get the SDK and tools to simplify cross-platform app development. Create new or port existing apps to sell to consumers worldwide. Explore the Intel AppUpSM program developer opportunity. appdeveloper.intel.com/join http://p.sf.net/sfu/intel-appdev _______________________________________________ Octave-dev mailing list Octave-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/octave-dev