On Sep 28, 2008, at 6:26 PM, werner mueller wrote:
Hello,
finally I have found some time to ask boring questions :)
I more or less stumbled across the Mahout project at ApacheCon 08 in
Amsterdam, but I haven't found the time to look into it deeply.
I would like to ask for some hints / links / directions for a
'predictions' feature. I read through the Mahout wiki and found some
interesting links, but since I come more from the application side and
am not that much into databases, I need some help getting started.
We develop a reporting application for a telecommunications company.
We mainly store data in an Oracle cluster, organized as a star schema.
The application mainly offers reports on two data sources:
costs and traffic. The data volume is about 1-2 terabytes.
The idea came up to implement some 'alarming' features, so that customers
could set up limits for contracts, phone numbers, etc. and get
notified once the limits are reached or the data 'behaves strangely'
(increases that are too strong over a period, other ideas to come...).
Can you give an example? It sounds like you simply want the user to
say "if contracts > X, then alarm", but I gather not, since you are
asking here. Or do you want the user not to be involved in setting the
thresholds, and instead have the system learn them from past examples
where there was a problem? For instance, you have failures from before,
but you don't particularly know why they failed (i.e. which features
caused the problem).
I would like to ask whether there is something of use in Mahout, or whether
you would recommend keeping such features 'simple', on a statistical
basis, and not using learning techniques at all.
Well, simple is usually better, if it solves your problem.
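To make "simple" a bit more concrete, here is a minimal sketch in Python of the two kinds of checks described above: a fixed user-defined limit, and a purely statistical test for an unusually strong increase. Everything in it is an assumption for illustration: the function names, the 3-sigma cutoff, and the sample figures are made up, and none of this is Mahout code.

```python
from statistics import mean, stdev


def check_limit(value, limit):
    """Fixed rule: alarm once a user-defined limit is reached."""
    return value >= limit


def check_spike(history, value, max_sigma=3.0):
    """Statistical rule: alarm if `value` lies more than `max_sigma`
    sample standard deviations above the mean of `history`.

    `max_sigma` is a hypothetical tuning knob, not a recommended setting.
    """
    if len(history) < 2:
        # Not enough past data to estimate a spread.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any increase counts as unusual.
        return value > mu
    return (value - mu) / sigma > max_sigma


# Hypothetical monthly cost figures for one contract.
monthly_costs = [100.0, 110.0, 95.0, 105.0, 102.0, 98.0]

print(check_limit(180.0, limit=150.0))    # fixed limit reached?
print(check_spike(monthly_costs, 180.0))  # far above the usual range?
print(check_spike(monthly_costs, 108.0))  # within the usual range?
```

If checks along these lines already catch the cases you care about, there may be no need for learning techniques; the history per contract or phone number could presumably come from an aggregate query against the existing star schema.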
On the other hand, the more boring question: do I need a Hadoop cluster
for your implementations, or could I run them on Oracle-based clusters
as well?
I don't know enough about Oracle clusters to render an opinion. If
you're asking if Mahout will run inside the Oracle JVM, I'm guessing
that would be a stretch at this point, but I don't have anything to
base that on.
-Grant