Thanks Ted. Is there any way to use the MMPP (Markov-modulated Poisson process) algorithm (www.datalab.uci.edu/papers/tkdd07.pdf) in Mahout 0.4? Can you please direct me to some examples?

Thanks,
Mubarak

On Wed, Oct 20, 2010 at 4:06 PM, Ted Dunning <[email protected]> wrote:

> For many situations, this can be done very simply, especially if you are
> working with web-based systems. For that case, it is straightforward to
> model transactions as arriving according to a Poisson process with a
> time-varying rate. In the simplest case, very simple seasonality models
> can be used to estimate the time-varying rate. I have used hourly
> estimates from one day ago and one week ago as good indicators in the
> past. These indicators did not model long weekends as well as I would
> have liked, but the alarms based on these models were better than any
> other system available. Long-term seasonality was handled very well
> because of the short-term nature of the expected volume estimates. For
> tighter bounds, it should be possible to use something akin to
> generalized linear models to incorporate more information and get better
> rate predictions. Since the failures I was trying to detect quickly were
> typically total failures, I just had to raise an alert as quickly as
> possible when the inter-transaction time exceeded a reasonable bound. For
> a specified false positive rate, this was very easily done and the
> results were very nearly optimal. More importantly, the alerts were
> almost always faster than our CEO, who had an eagle eye for these things.
>
> For brick-and-mortar systems, this can be a bit more difficult because
> business practices tend to cause some very irregular volumes. If you are
> dealing with transactions that are being reported in real time rather
> than in batches, then you should be fine. Batch reporting based on human
> triggers could probably be handled using longer/softer rate averaging
> windows, however.
>
> I really don't expect that you need anything all that fancy for the rate
> estimation.
>
> Can you say more about your data? Can you post anonymous sample data for
> a two-week period?
>
> On Tue, Oct 19, 2010 at 11:26 PM, Mubarak Seyed <[email protected]> wrote:
>
> > My requirements are as follows:
> >
> > - The client system sends its transactions through a hub. We have
> >   historical data, and from it we can predict the trend of the
> >   min/avg/max number of transactions for a given interval.
> > - Using the historical data, mine the data to produce the predictions.
> > - Need to build an intelligent system (using ML techniques, neural
> >   network algorithms): if there is no transaction for a client within
> >   the given prediction range, then the system needs to send alarms.
> >
> > For example, Walmart sells gift cards. Each sale is a transaction and
> > it needs to come to the main system (from the hub). We have historical
> > sales data for Walmart (for each day, each hour, each 10 mins, peak
> > volume, holiday season). If there is no transaction from Walmart for X
> > range of time and that range does not fall within the predicted data,
> > then the intelligent system needs to raise an alarm.

--
Thanks,
Mubarak Seyed.
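
For reference, the inter-transaction-time alert described in Ted's quoted reply can be sketched roughly as below. This is only an illustration under stated assumptions, not code from Mahout or from the thread: the function names, the 50/50 blend of the day-ago and week-ago hourly counts, and the default false positive rate are all hypothetical. It also assumes the rate is roughly constant over the gap being tested, so the gap is exponentially distributed with P(gap > t) = exp(-rate * t), and it treats the false positive rate as a per-gap probability.

```python
import math


def expected_rate(count_day_ago, count_week_ago, weight_day=0.5):
    """Blend two historical hourly counts into an expected rate (per hour).

    The equal weighting is an assumption made for illustration only.
    """
    return weight_day * count_day_ago + (1.0 - weight_day) * count_week_ago


def max_gap_hours(rate_per_hour, false_positive_rate):
    """Longest gap (in hours) to tolerate before alerting.

    For a Poisson process the gap between events is exponential, so
    P(gap > t) = exp(-rate * t). Setting that equal to the false positive
    rate and solving for t gives t = -ln(false_positive_rate) / rate.
    """
    if rate_per_hour <= 0:
        # No traffic expected, so silence is never surprising.
        return math.inf
    return -math.log(false_positive_rate) / rate_per_hour


def should_alert(hours_since_last, rate_per_hour, false_positive_rate=1e-4):
    """Return True when the current silence is longer than the bound."""
    return hours_since_last > max_gap_hours(rate_per_hour, false_positive_rate)


if __name__ == "__main__":
    # Hypothetical counts: 120 transactions in this hour yesterday,
    # 100 in this hour one week ago.
    rate = expected_rate(120, 100)
    print("expected rate per hour:", rate)
    print("alert after (minutes):", 60 * max_gap_hours(rate, 1e-4))
    print("alert now?", should_alert(0.1, rate))
```

With these made-up numbers the bound works out to roughly five minutes of silence. A lower expected rate (for example, overnight) stretches the bound automatically, which is the point of feeding the check a time-varying rate.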
