On Wednesday 10 August 2005 06:21 pm, boblq wrote:
> On Wednesday 10 August 2005 06:09 pm, Todd Walton wrote:
> > On 8/10/05, boblq <[EMAIL PROTECTED]> wrote:
> > > I will take a look at it. I have been hacking out a simple
> > > script that does a lot of what I want.
> >
> > Do you plan to share it with the world? Perhaps from the
> > kernel-panic.org wiki?
> >
> > -todd
>
> I will start by showing some results. Then likely ask
> for code review from some Perl gurus on the list.
> Then if it survives those tests I will GPL the code.
>
> I am actually quite excited about what can be learned
> from this kind of analysis. I suspect we will find
> well-defined signatures for all kinds of behavior we are all
> familiar with, e.g. flame wars, trolling, dialup vs broadband,
> etc. We shall see.
>
> The creative aspect is the data analysis, but first the
> data needs to be collected. That is coming along.
>
> BobLQ

Ok, I wrote a simple Perl script to pull out the data and ran it
against the Kooler list archive I have. I have only begun looking
at aggregate statistics.

Approximately 8000 messages from 80+ posters gives a simple
average of roughly 100 messages/poster. Roughly 25% of the posters
sent only one message. The most prolific poster sent 1200 messages.

I find an exponential distribution of the number of posts versus
the rank of the poster. I suspect there is a very simple model that
will account for this result, but I have not yet had time to develop
that model. Something to do with queuing.

The data and results are at
http://www.prencesita.com/MailStat/

For a little about the exponential distribution see
http://mathworld.wolfram.com/ExponentialDistribution.html

Good clean fun,

BobLQ

-- 
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
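
For anyone who wants to try the same kind of analysis on their own
archive, here is a minimal sketch in Python. It is not the Perl script
mentioned above, which is not shown in this message. It assumes posters
are identified by the address in the From header and that "exponential"
means posts ~ a * exp(-b * rank); the function names count_posts and
fit_exponential are made up for illustration, and the demo messages are
synthetic, not data from the list.

```python
import email
import math
from collections import Counter
from email.utils import parseaddr


def count_posts(messages):
    """Count messages per sender address (assumption: From header identifies the poster)."""
    counts = Counter()
    for msg in messages:
        # parseaddr splits "Name <addr>" into (name, addr); addr is "" if unparsable.
        _, addr = parseaddr(str(msg.get("From", "")))
        if addr:
            counts[addr.lower()] += 1
    return counts


def fit_exponential(counts):
    """Least-squares line through (rank, log(posts)); returns (a, b) for a * exp(-b * rank)."""
    ranked = sorted(counts.values(), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two posters to fit a line")
    xs = list(range(1, len(ranked) + 1))
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic messages for illustration only.
    raw = [
        "From: Alice <alice@example.org>\n\none\n",
        "From: Alice <alice@example.org>\n\ntwo\n",
        "From: Alice <alice@example.org>\n\nthree\n",
        "From: Bob <bob@example.org>\n\none\n",
        "From: Bob <bob@example.org>\n\ntwo\n",
        "From: Carol <carol@example.org>\n\none\n",
    ]
    counts = count_posts(email.message_from_string(r) for r in raw)
    for rank, (addr, n) in enumerate(counts.most_common(), start=1):
        print(rank, addr, n)
    a, b = fit_exponential(counts)
    print(f"posts ~ {a:.2f} * exp(-{b:.3f} * rank)")
```

To run it against a real archive in mbox format, the standard mailbox
module can supply the messages, e.g. count_posts(mailbox.mbox(path))
with path pointing at your own file. A straight line on a plot of
log(posts) against rank is what an exponential fall-off would look
like; a clear curve there would suggest some other distribution.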
