I'm writing a simulator in C++.  So far I have written a program that
collects data from a database, and I hope to come up with an algorithm that
returns a random value with a distribution matching my real-world data.  What
I'm finding is that the data is UGLY.  To represent the data reasonably with
a histogram, I'd need almost 3 million bins, and then most of the
information would be crammed into the first 1000 or so bins.  I've drawn
an ASCII art representation below.

I don't want to give up those flyers, because they sum up to a considerable
amount.  I'm modeling man loading in a manufacturing facility, so throwing
out the flyers will really skew my simulator.
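
To make the binning problem concrete, here is a rough sketch of one way the
flyers could be kept without millions of bins: bin edges that grow
geometrically, fed to std::piecewise_constant_distribution from <random>.
The helper names (log_edges, bin_counts, make_dist) are made up for
illustration, and it assumes all values are positive and that treating
values as uniform inside each bin is acceptable, which may not hold for my
data.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: n_bins + 1 edges from lo to hi, geometrically spaced.
// Assumes 0 < lo < hi and n_bins >= 1.
std::vector<double> log_edges(double lo, double hi, std::size_t n_bins) {
    std::vector<double> edges(n_bins + 1);
    const double step = std::log(hi / lo) / static_cast<double>(n_bins);
    for (std::size_t i = 0; i <= n_bins; ++i)
        edges[i] = lo * std::exp(step * static_cast<double>(i));
    edges.back() = hi;  // pin the last edge so rounding cannot move it
    return edges;
}

// Hypothetical helper: count samples per bin. Values outside the edge range
// are clamped into the first or last bin rather than thrown away.
std::vector<double> bin_counts(const std::vector<double>& samples,
                               const std::vector<double>& edges) {
    std::vector<double> counts(edges.size() - 1, 0.0);
    for (double x : samples) {
        auto it = std::upper_bound(edges.begin(), edges.end(), x);
        std::size_t idx = (it == edges.begin())
            ? 0
            : static_cast<std::size_t>(it - edges.begin()) - 1;
        if (idx >= counts.size()) idx = counts.size() - 1;
        counts[idx] += 1.0;
    }
    return counts;
}

// Build a sampler whose per-bin probability is proportional to the counts.
// Within a bin the density is constant (an assumption of this sketch).
std::piecewise_constant_distribution<double>
make_dist(const std::vector<double>& edges, const std::vector<double>& counts) {
    return std::piecewise_constant_distribution<double>(
        edges.begin(), edges.end(), counts.begin());
}
```

With something like this, a range from 1 to 3 million could be covered by
a few dozen bins, and a draw would just be dist(rng) with a std::mt19937.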

Has anyone ever encountered such a problem?  Better yet, can someone
recommend a C++ algorithm to model my data?  I'm thinking I may have to go
to some sort of logarithmic distribution, but it is important to base my
simulator on real-world data and not generic algorithms.  I would be willing
to fit a model if I knew of a good model and how to use it in C++.
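
For reference, the most direct "real data, no model" approach I can think
of is to skip bins entirely: sort the raw observations and draw from them
through the inverse of the empirical CDF, interpolating between neighbors.
This is only a sketch. The class name EmpiricalSampler is made up, and it
assumes the whole data set fits in memory as doubles.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: returns values distributed like the observed samples.
class EmpiricalSampler {
public:
    explicit EmpiricalSampler(std::vector<double> samples)
        : sorted_(std::move(samples)) {
        if (sorted_.empty()) throw std::invalid_argument("no samples");
        std::sort(sorted_.begin(), sorted_.end());
    }

    // Interpolated quantile for u in [0, 1]: u = 0 gives the smallest
    // observation, u = 1 the largest, so the flyers stay reachable.
    double quantile(double u) const {
        const double pos = u * static_cast<double>(sorted_.size() - 1);
        const std::size_t lo = static_cast<std::size_t>(pos);
        const std::size_t hi = std::min(lo + 1, sorted_.size() - 1);
        const double frac = pos - static_cast<double>(lo);
        return sorted_[lo] + frac * (sorted_[hi] - sorted_[lo]);
    }

    // Draw one value using any standard random engine.
    template <class Rng>
    double operator()(Rng& rng) const {
        std::uniform_real_distribution<double> unif(0.0, 1.0);
        return quantile(unif(rng));
    }

private:
    std::vector<double> sorted_;
};
```

Usage would be along the lines of constructing EmpiricalSampler from the
values pulled from the database and calling sampler(rng) with a
std::mt19937.  It never produces anything beyond the largest observed
value, which may or may not be what I want.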

-dnult
    / \
   /   \
  /     \   /\      !           /\         ^   .  . .
*'        `'  `*********** ******* ********
