Re: Distribution generator for simulator in C++

Donald F. Burrill Sun, 9 Jan 2000 06:14:37 -0800
You report an empirical distribution like this:

|       / \
|      /   \
|     /     \   /\      !           /\         ^   .  . .
|   *'        `'  `*********** ******* ********
         @      @       @            @         @

Have you considered modelling it as a collection (sometimes called a 
mixture) of separate distributions with means at the locations marked 
(approximately) with "@"?  It's not clear whether the proposed separate 
distributions would best be modelled as normal (Gaussian) or Poisson or 
lognormal -- depends partly on how you think the data might have been 
constructed "in the raw".  It is also not clear, if one chooses "normal", 
whether it would be reasonable to assume the same variance for all;  your 
"!" looks as though it was meant to represent a quite narrow peak, which 
would argue for heteroscedastic distributions (unequal variances).
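
Since you asked about C++: a minimal sketch of drawing random values from 
such a mixture, assuming normal components with unequal variances.  The 
names Component and MixtureSampler are hypothetical, and the weights, means 
and sigmas would have to be estimated from your own data; a lognormal or 
Poisson component could be swapped in the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One mixture component (hypothetical type): a normal distribution with
// its own mean and standard deviation, plus a relative weight.
struct Component {
    double weight;  // relative frequency of this peak (need not sum to 1)
    double mean;    // location of the peak (an "@" in the sketch above)
    double sigma;   // spread; may differ per component (heteroscedastic)
};

class MixtureSampler {
public:
    explicit MixtureSampler(const std::vector<Component>& comps)
        : comps_(comps), pick_(make_picker(comps)) {}

    // Draw one value: first choose a component in proportion to its
    // weight, then draw from that component's normal distribution.
    template <class Rng>
    double operator()(Rng& rng) {
        const Component& c = comps_[pick_(rng)];
        std::normal_distribution<double> normal(c.mean, c.sigma);
        return normal(rng);
    }

private:
    static std::discrete_distribution<std::size_t>
    make_picker(const std::vector<Component>& comps) {
        assert(!comps.empty());
        std::vector<double> weights;
        for (const Component& c : comps) weights.push_back(c.weight);
        // discrete_distribution normalizes the weights itself.
        return std::discrete_distribution<std::size_t>(weights.begin(),
                                                       weights.end());
    }

    std::vector<Component> comps_;
    std::discrete_distribution<std::size_t> pick_;
};
```

Construct it with one Component per peak and call it with any standard 
engine (std::mt19937, say).  A handful of components replaces the millions 
of bins, and the far-right "flyers" survive as low-weight components rather 
than being thrown out.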
        Representation as a mixture would be rather more strongly 
supported (more strongly, that is, than as pure empiricism) if the 
several values "@" were themselves interestingly distributed, and in a 
way that invited some theoretical thought.  (For a simple-minded example: 
if they fell at roughly equal intervals, with relative frequencies that 
diminished exponentially to the right.  Or if the data in those peaks 
turned out to be associated with useful categories.)
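
To make the simple-minded example concrete, here is a small C++ sketch that 
lays out peaks at equal intervals with weights shrinking by a constant 
factor from one peak to the next.  Peak and geometric_peaks are hypothetical 
names, and the spacing and ratio are assumptions to be checked against the 
data, not properties I can see in your sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one peak: where it sits and what share
// of the observations it accounts for.
struct Peak {
    double location;
    double weight;
};

// Build `count` peaks starting at `first`, spaced `step` apart, with
// each weight equal to `ratio` times the previous one (0 < ratio < 1).
// Weights are normalized so that they sum to 1.
std::vector<Peak> geometric_peaks(int count, double first, double step,
                                  double ratio) {
    assert(count > 0 && ratio > 0.0 && ratio < 1.0);
    std::vector<Peak> peaks;
    double w = 1.0;
    double total = 0.0;
    for (int k = 0; k < count; ++k) {
        peaks.push_back({first + k * step, w});
        total += w;
        w *= ratio;  // exponential decay to the right
    }
    for (Peak& p : peaks) p.weight /= total;
    return peaks;
}
```

If the fitted locations and weights came close to such a pattern, two or 
three parameters would describe all the peaks at once, which is the kind of 
regularity that would support the mixture interpretation.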

On Sun, 9 Jan 2000, Dave and Kim Nulton wrote:

> I'm writing a simulator in C++.  So far I have written a program to collect
> data from a database and hope to be able to generate an algorithm to return
> a random value with a distribution that matches my real world data.  What
> I'm finding is that the data is UGLY.  In order to generate a reasonable
> representation of the data, I'd need almost 3 million bins, and then most of
> the information would be crammed into the first 1000 or so bins.  I've drawn
> an ASCII art representation below.
> 
> I don't want to give up those flyers, because they sum up to a considerable
> amount.  I'm modeling man loading in a manufacturing facility, so throwing
> out the flyers will really skew my simulator.
> 
> Has anyone ever encountered such a problem?  Better yet, can someone
> recommend a C++ algorithm to model my data?  I'm thinking I may have to go
> to some sort of a logarithmic distribution, but it is important to base my
> simulator on real world data and not generic algorithms.  I would be willing
> to fit a model if I knew of a good model and how to utilize it in C++.
> 
> -dnult
>     / \
>    /   \
>   /     \   /\      !           /\         ^   .  . .
> *'        `'  `*********** ******* ********

                                                        -- DFB.
 ------------------------------------------------------------------------
 Donald F. Burrill                                 [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,          [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264                                 603-535-2597
 184 Nashua Road, Bedford, NH 03110                          603-471-7128