Re: Hive Hbase 0.94 ClassNotFoundException com.google.protobuf.Message

Jean-Daniel Cryans Thu, 25 Oct 2012 08:45:57 -0700

On Thu, Oct 25, 2012 at 8:31 AM, Nick maillard
<[email protected]> wrote:
> Hi Jean-Daniel,
>
> OK, I'll set it in the env, thanks for the advice.
> Are there other libs I might need to add?


The usual client libs... it doesn't seem like we documented them
anywhere... it's pretty much what you have in there now.

> Could I just tell hive to use its lib directory or hbase's lib directory in
> its classpath in some way?

That's a question for the hive ML.

> I could just set it in the bashrc but that's not very elegant.

I really meant that you should set HIVE_AUX_JARS_PATH in hive-env.sh.
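
A minimal sketch of what that could look like in hive-env.sh. The
directory and the jar version below are assumptions for illustration,
not taken from this thread; point them at wherever your HBase jars
actually live. Depending on the Hive version, the variable takes a
comma-separated list of jars or a directory containing them.

```shell
# hive-env.sh (sketch)
# Hypothetical location of the HBase jars; adjust to your install.
HBASE_LIB=/usr/lib/hbase/lib

# Extra jars Hive should put on its classpath and ship with its jobs.
# The protobuf version here is an assumption; use the jar that is
# actually present in your HBase lib directory.
export HIVE_AUX_JARS_PATH="${HBASE_LIB}/protobuf-java-2.4.0a.jar"
```

Additional client jars, if needed, would be appended to the same
variable, separated by commas.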

>
> Another thing: I am testing my 3-machine hadoop cluster.
> I have queried 'select * from myTestTable', which has 1719428 entries.
> The 7 map tasks and 1 reducer took almost 5 minutes to compute. Am I right to
> think that is a little slow?

You have a 1-2 minute overhead in there because you are using
MapReduce. Beyond that, one should usually set
hbase.client.scanner.caching to a better value than the default of 1.
It's a client-side setting, so hive needs to have it. But everything
will seem slow when using MR on such a small dataset; a single client
running a scan would be faster in this case.
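
To give a feel for why the caching value matters, here is a rough
back-of-the-envelope sketch. It assumes each scanner next() call
returns up to hbase.client.scanner.caching rows and ignores the extra
calls needed to open and close a scanner on each region. The value
1000 is only an illustrative choice, not a recommendation.

```shell
# Approximate number of scanner round trips for a full scan of the
# table from the question (1719428 rows), summed over all map tasks.
rows=1719428

for caching in 1 1000; do
  # Ceiling division: one RPC fetches at most $caching rows.
  rpcs=$(( (rows + caching - 1) / caching ))
  echo "caching=$caching -> about $rpcs scanner RPCs"
done
```

With a caching value of 1 every row costs a round trip to a region
server, which is where much of the scan time goes. The property has
to be visible to the client doing the scan, for example through an
hbase-site.xml on hive's classpath.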

> How could I make this go faster, more map tasks, more nodes?

Is select count(*) really the use case you want to optimize? Have you
read this? http://hbase.apache.org/book.html#performance

>
> True, I would usually never scan a whole table, but I could easily have
> queries that MR over a set of this size.
>
