Large-scale data mining tasks are classic applications for Hadoop MapReduce. For querying, check out HBase on Hadoop.

I am not sure what kind of assurances you are looking for, but my company and many others are obviously betting that Hadoop is a very good platform.

Given that there are not many alternatives, I wonder what your choices are...

Raghu.

11 Nov. wrote:
Sorry for the fuzziness. We are using Hadoop to help with some large-scale
data mining jobs, such as customer behaviour analysis. The problem is that
our data sets are too large to fit in commercial databases, at the TB to PB
level.
Fortunately, we found Hadoop. It looks suitable from a high-level design
perspective, but we don't know how usable it is in detail for real-world
heavyweight applications.

Advice is welcome.

2007/12/19, Raghu Angadi <[EMAIL PROTECTED]>:
11 Nov. wrote:
Very glad to see your replies, especially Raghu's from Yahoo :)
I really feel more confident about Hadoop now, but I still don't have a
clear picture of how mature Hadoop already is. So what kind of real-world
applications is Yahoo running on Hadoop now? Is there a benchmark or
similar data to show everybody where Hadoop is now? How far does it have
to go to become a full-fledged framework or platform?
I think you will get better responses if you describe your
requirements in more specific terms. 'Full-fledged framework' is too
generic. I would also suggest installing Hadoop on a small cluster and
playing with it to get a better idea of how solid it is. Another thing to
consider is that Hadoop is improving all the time.

For large clusters, there are a few configuration tweaks that help. I
think there is already a wiki or FAQ entry on the Hadoop site.

Raghu.


