Hi there,

There seems to be increasing interest in Hive on Spark from Hive users. I
understand that a few questions and problems have been reported, and I
can see some frustration at times. It's impossible for the Hive on Spark
team to respond to every inquiry, even though we wish we could. However,
there are a few items to note:

1. Hive on Spark is tested as part of the precommit tests.
2. Hive on Spark is supported in some distributions such as CDH.
3. I tried the latest master and branch-1 a couple of days ago, and both
worked with my Spark 1.5 build.

Therefore, if you are facing a problem, it's likely due to your setup.
Please refer to the wiki for how to set it up correctly. Nevertheless, I
have a few suggestions here:

1. Start simple. Try out a CDH sandbox or distribution first to see it
working before building your own. Comparing it with your setup may give
you some clues.
2. Try spark.master=local first, making sure that you have all the
necessary dependent jars, and then move to your production setup. Please
note that yarn-cluster is recommended and Mesos is not supported. I tried
both yarn-cluster and local-cluster, and both worked for me.
3. Check logs beyond hive.log, such as the Spark and YARN logs, to get
more error messages.
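
For suggestion 2, a minimal sketch of a first smoke test might look like
the following, saved as a script and run with "hive -f". The two property
names are the ones mentioned above and in the wiki; the table name
some_table is only a placeholder, so substitute any small table you have.

```sql
-- Use Spark as the execution engine for this session.
set hive.execution.engine=spark;
-- Run Spark locally first; switch to yarn-cluster once this works.
set spark.master=local;
-- An aggregate forces a real Spark job; some_table is a placeholder name.
select count(*) from some_table;
```

If this works with spark.master=local but fails on your cluster, the
difference between the two runs is a good place to start looking.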

When you report a problem, please provide as much info as possible, such
as your platform, your builds, your configurations, and relevant logs, so
that others can reproduce it.

Please note that we are not in a good position to answer questions about
Spark itself, such as spark-shell. Not only is that beyond the scope of
Hive on Spark, but the team may also not have the expertise to give you
meaningful answers. One thing to emphasize: when you build your Spark
jar, don't include Hive, as a version mismatch is very likely. Again, a
distribution may have solved the problem for you if you'd like to give it
a try.

Hope this helps.

Thanks,
Xuefu
