Re: Question about data warehousing and mining through Mahout

Sean Owen Tue, 31 Aug 2010 15:04:28 -0700

On Tue, Aug 31, 2010 at 10:55 PM, hdev ml <[email protected]> wrote:
> Per my understanding of hive, we can do some statistical reporting, like
> frequency of user sessions, which geographical region, which device he is
> using the most etc.


Yes, that's about what Hive is good for, if you're looking for some
open-source libraries along those lines.

>
> But we also want to mine this data to get some predictive capabilities like
> what is the likelihood that the user will use the same device again or if we
> get sales/marketing data (on the roadmap for future), we want to possibly
> predict which region to put more marketing/sales efforts. What is the
> pattern for growth of user base, in which geographical regions etc. What is
> the pattern of user requests failing and a number of requirements like these
> from the business.

This is pretty broad but I can try to give you the names of problems
this sounds like, to guide your search.

Predicting which device a user will use sounds like a classification
problem: you develop a probabilistic model of user behavior.
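
To make the classification framing concrete, here is a minimal sketch
in Python. It is not Mahout code, and it is a deliberately simple
stand-in for a real classifier: it estimates P(next device | current
device) from transition counts with add-one smoothing. The function
names and the toy session data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_device_model(sessions, alpha=1.0):
    """Estimate P(next device | current device) from device sequences.

    sessions: list of per-user device sequences (hypothetical format).
    alpha: additive smoothing so unseen transitions get nonzero mass.
    """
    counts = defaultdict(Counter)
    devices = set()
    for seq in sessions:
        devices.update(seq)
        # Count each consecutive (current, next) device pair.
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1

    model = {}
    for cur in devices:
        total = sum(counts[cur].values()) + alpha * len(devices)
        model[cur] = {d: (counts[cur][d] + alpha) / total for d in devices}
    return model


def predict_next(model, current):
    """Return the most probable next device given the current one."""
    return max(model[current], key=model[current].get)


# Toy data, invented for illustration only.
sessions = [
    ["phone", "phone", "tablet", "phone"],
    ["tablet", "tablet", "tablet"],
    ["phone", "phone", "phone", "desktop"],
]
model = fit_device_model(sessions)
p_same_phone = model["phone"]["phone"]  # chance a phone user stays on phone
```

In practice you would add more features (region, time of day, and so
on) and use a proper classifier, but the output is the same kind of
thing: a probability per class.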

Deciding where to put marketing dollars sounds like a business
problem, not machine learning. I don't think a computer can tell you
that. Some techniques might help you identify trends in sales, but
this is simple regression, not really machine learning.
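
As a rough illustration of the "simple regression" idea, the sketch
below fits a least-squares trend line to monthly sales in plain
Python. The function name and the numbers are made up for the
example; they are not from any real data set.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales for one region.
months = [1, 2, 3, 4, 5]
sales = [100.0, 120.0, 140.0, 160.0, 180.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept  # extrapolate the trend one month
```

Comparing the fitted slope across regions shows where sales are
growing fastest. What to do with that information is still a business
decision.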

Looking for patterns in failure sounds a bit like frequent pattern
mining -- trying to find events that go together unusually often.
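
A minimal sketch of the idea, assuming each failed request can be
described as a set of event tags (the tags and function name below
are hypothetical). It only counts co-occurring pairs against a
support threshold. Real frequent pattern mining algorithms such as
FP-Growth handle larger itemsets and much more data, and judging
"unusually often" would additionally require comparing each count to
what you would expect by chance.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return event pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for events in transactions:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(events)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}


# Hypothetical failed requests, each tagged with what was observed.
failed_requests = [
    {"timeout", "region_eu", "device_a"},
    {"timeout", "region_eu"},
    {"auth_error", "device_b"},
    {"timeout", "region_eu", "device_b"},
]

pairs = frequent_pairs(failed_requests, min_support=3)
```

With this toy data the only pair meeting the threshold is
("region_eu", "timeout"), which appears together in three of the four
failures.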
