Hi there,

I've had a few days of unalloyed torture trying to get Hive jobs to run
via Oozie on a 5-machine AWS cluster. Even the simplest job that involves
the live metastore succeeds or fails unpredictably. The error message
wasn't very descriptive:

    Hive failed, error message[Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]]

After a lot of fun changing just about every imaginable setting, I studied
hivemetastore.log carefully (we use MySQL for the metastore) and realised
that every successful request came from 172.31.40.3. Unsuccessful requests
came from 172.31.40.2, 172.31.40.4 and 172.31.40.5. The Hive console app
makes requests without problems from 172.31.40.1.
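
For anyone who wants to repeat this check, here is a rough sketch of how the
per-address tally could be automated. It assumes only that each relevant log
line contains the client's IPv4 address somewhere in it; the function name
and the sample lines are made up for illustration and are not the real
hivemetastore.log format.

```python
import re
from collections import Counter

# Matches dotted-quad IPv4 addresses anywhere in a line.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_requests_by_ip(lines):
    """Count how many lines mention each IPv4 address."""
    counts = Counter()
    for line in lines:
        # A set avoids double-counting an address repeated within one line.
        for ip in set(IPV4.findall(line)):
            counts[ip] += 1
    return counts


# Made-up sample lines (assumption: the real log format will differ).
sample = [
    "request ok from 172.31.40.3",
    "request failed from 172.31.40.2",
    "request ok from 172.31.40.3",
]

counts = count_requests_by_ip(sample)
for ip, n in counts.most_common():
    print(ip, n)
```

To run it against a real log, pass an open file object instead of the sample
list, since the function accepts any iterable of lines. Filtering the lines
first (for example, only those containing an error marker) would give
separate tallies for successful and failed requests.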

This is getting somewhere after nearly a week of having no idea whatsoever
what is going on. The question now is: is there a config setting somewhere
I need to change to allow requests from all of 172.31.40.1-5? Or could I
funnel Oozie requests solely through 172.31.40.1 or 172.31.40.3, avoiding
2/4/5?

Why would only 172.31.40.1 and 172.31.40.3 work? There must be some
process whereby jobs submitted to Oozie get handed over to Hive.
There are 5 machines in the cluster, which matches the pattern of IP
addresses, and it looks as though each Oozie job is randomly allocated to
a machine in the cluster, which then contacts Hive and attempts to run the
query. The problem seems to be that the requests only work when they come
from a specific machine.

All ideas and suggestions warmly received.

Many thanks

Toby
