Re: Hadoop pain points?

robert Sun, 04 Mar 2012 10:12:48 -0800

2012/3/2 Kunaal wrote:
> I am doing a general poll on what are the most prevalent pain points that
> people run into with Hadoop? These could be performance related (memory
> usage, IO latencies), usage related or anything really.
>


My biggest frustration with core Hadoop over the last year or so has
been the lack of any efficient way to implement the so-called
"analytic functions" in general with MapReduce.

These are not what you would guess from the name alone, by the
way; see Oracle's analytic functions for an example of what I mean. The big
advantage is that they often let you avoid expensive self-joins,
which can make a huge difference performance-wise.
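
To make the self-join point concrete, here is a minimal sketch in plain
Python (not Hadoop or Hive code). The sample rows and both function names
are made up for illustration. It computes a lag()-style "previous amount"
two ways: by scanning the data against itself, and by sorting once and
carrying the previous value forward. The two agree here only because the
sample days are consecutive within each user.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (user, day, amount)
rows = [
    ("alice", 1, 10),
    ("alice", 2, 15),
    ("bob", 1, 7),
    ("alice", 3, 12),
    ("bob", 2, 9),
]

def lag_by_self_join(rows):
    """Find each row's previous-day amount by joining the data to itself."""
    out = []
    for user, day, amount in rows:
        prev = None
        for other_user, other_day, other_amount in rows:  # second scan of the same data
            if other_user == user and other_day == day - 1:
                prev = other_amount
        out.append((user, day, amount, prev))
    return sorted(out)

def lag_by_single_pass(rows):
    """Sort once by (user, day), then carry the previous amount forward per user."""
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    for user, group in groupby(ordered, key=itemgetter(0)):
        prev = None  # reset at the start of each user's partition
        for _, day, amount in group:
            out.append((user, day, amount, prev))
            prev = amount
    return out
```

The single-pass version is roughly what a reducer could do if the rows
arrived grouped by user and already sorted by day, which is why the
self-join becomes unnecessary.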

(I would say that 80% of the analytic functions can be implemented with
a UDF or a UDAF in Hive, things like lead(), lag(), first() or
rank(), but it is the other 20% that would knock the ball out of the park.)
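
As a rough illustration of why functions like rank() fall in the easy 80%,
here is a sketch in plain Python. It is not the Hive UDF API, and the
function name and input shape are assumptions. It only shows the small
amount of per-group state such a function needs once rows arrive grouped
and sorted: ties share a rank and leave a gap after them.

```python
from itertools import groupby
from operator import itemgetter

def rank_within_groups(rows):
    """rows: (group_key, score) pairs. Returns (group_key, score, rank),
    highest score first within each group; ties share a rank."""
    out = []
    # Sort by group, then by descending score within the group.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    for key, group in groupby(ordered, key=itemgetter(0)):
        prev_score, rank = None, 0
        for position, (_, score) in enumerate(group, start=1):
            if score != prev_score:
                rank = position  # new score: rank jumps to the row position
            out.append((key, score, rank))
            prev_score = score
    return out
```

For example, rank_within_groups([("a", 5), ("a", 9), ("a", 9), ("b", 3)])
ranks both 9s as 1 and the 5 as 3 within group "a".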
