Try putting the hive-site.xml in HDFS in the workflow deployment directory, 
right next to the workflow.xml file, and specify hive-site.xml with no path.
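For example, something along these lines (assuming you deployed the app to
/user/root/examples/apps/hive2 as in your job.properties; adjust the path to
match your own layout):

$ hadoop fs -put /etc/hive/conf/hive-site.xml /user/root/examples/apps/hive2/
$ hadoop fs -ls /user/root/examples/apps/hive2
(hive-site.xml should now be listed right next to workflow.xml)

and then in the hive action reference just the bare file name:

    <job-xml>hive-site.xml</job-xml>

A relative name like that should be resolved against the workflow application
directory in HDFS, so the local /etc/hive/conf path never has to come into it.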


-----Original Message-----
From: [email protected] [mailto:[email protected]] 
Sent: Thursday, October 17, 2013 2:25 PM
To: [email protected]
Subject: Re: Why don't tables I create in Hive jobs in Oozie stay around?

Hi Paul,

Thanks for the tip.  I added the ConnectionPassword param to the hive-site.xml 
(which was the only thing missing) and then tried your suggestion.

Unfortunately, it can't get out of the starting gate.  I get the error
message: org.apache.oozie.action.ActionExecutorException: JA008: File does not 
exist: /etc/hive/conf/hive-site.xml

I tried:
* "hive-site.xml"
* the full /etc/hive/conf/hive-site.xml path (which is a symlink)
* the actual file that the symlink points to
* opening the permissions on the file and its parent dirs to 777

All variations failed with the same error message.

-Michael



On Thu, Oct 17, 2013 at 4:58 PM, Paul Chavez <[email protected]> 
wrote:

> I put the metastore database connection settings into a hive-site.xml 
> file bundled with my workflow definition and reference it using the 
> <job-xml> element within the Hive action. My hive-site.xml looks 
> almost exactly the same as your hive-settings.xml.
>
> I used to be able to explain exactly what is happening, as I had a 
> similarly maddening learning curve setting it up, but it's been over a 
> year now and I can't remember the details. Give it a shot and let us 
> know how it goes?
>
> Snippet from a workflow template I use:
>     <action name="Process">
>         <hive xmlns="uri:oozie:hive-action:0.2">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <job-xml>hive-site.xml</job-xml>
>             <script>${wf_script_path}</script>
>             <param>...</param>
>         </hive>
>         <ok to="end"/>
>         <error to="FailureEmail"/>
>     </action>
>
> Hope that helps,
> Paul Chavez
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Thursday, October 17, 2013 1:47 PM
> To: [email protected]
> Subject: Why don't tables I create in Hive jobs in Oozie stay around?
>
> I'm evaluating Oozie for my team at work to decide if we want to use it.
>  I have been working through the examples that the Apache Oozie project 
> has for download.  I have spent much of the day trying to get basic 
> Hive jobs to work, with mostly disappointing results.  I'd like some 
> help on what to do.
>
> I've tried using the hive action and running hive via a shell action.
>  After struggling with write permissions to the /user/hive HDFS 
> warehouse directory, the former only partially works (even when it 
> reports success) and the latter never works and doesn't report any useful 
> errors.
>
>
> *Hive via Hive action:*
>
> $ tree hive2
> hive2
> |-- hive-settings.xml
> |-- job.properties
> |-- script.sql
> |-- workflow.xml
>
>
> $ cat script.sql
> DROP TABLE IF EXISTS yytest2;
>
> CREATE EXTERNAL TABLE yytest2 (x int)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> LOCATION '/user/root/examples/input-data/yytable2/yytable';
>
> INSERT OVERWRITE DIRECTORY '/user/root/examples/output-data/hive2yy'
> SELECT * FROM yytest2;
>
> CREATE TABLE IF NOT EXISTS xmas (x int) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ',';
>
> LOAD DATA INPATH '/user/root/examples/input-data/yytable2/yytable' 
> INTO TABLE xmas;
>
>
> $ cat workflow.xml
> <?xml version="1.0" encoding="UTF-8"?>
> <workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
>   <start to="hive-node"/>
>
>   <action name="hive-node">
>     <hive xmlns="uri:oozie:hive-action:0.2">
>       <job-tracker>${jobTracker}</job-tracker>
>       <name-node>${nameNode}</name-node>
>       <prepare>
>         <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/hive2"/>
>         <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/>
>       </prepare>
>       <configuration>
>         <property>
>           <name>mapred.job.queue.name</name>
>           <value>${queueName}</value>
>         </property>
>         <property>
>           <name>oozie.hive.defaults</name>
>           <value>hive-settings.xml</value>
>         </property>
>       </configuration>
>       <script>script.sql</script>
>     </hive>
>     <ok to="end"/>
>     <error to="fail"/>
>   </action>
>
>   <kill name="fail">
>     <message>BOO HOO2: Hive2 failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>   </kill>
>   <end name="end"/>
> </workflow-app>
>
>
> $ cat job.properties
> nameNode=hdfs://10.230.138.159:8020
> jobTracker=http://10.230.138.159:50300
>
> queueName=default
> examplesRoot=examples
>
> oozie.use.system.libpath=true
>
> oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive2
>
>
> $ cat hive-settings.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>
>   <!-- Hive Execution Parameters -->
>   <property>
>     <name>javax.jdo.option.ConnectionURL</name>
>     <value>jdbc:mysql://10.230.138.159:3306/hive?createDatabaseIfNotExist=true</value>
>     <description>JDBC connect string for a JDBC metastore</description>
>   </property>
>
>   <property>
>     <name>javax.jdo.option.ConnectionDriverName</name>
>     <value>com.mysql.jdbc.Driver</value>
>     <description>Driver class name for a JDBC metastore</description>
>   </property>
>
>   <property>
>     <name>javax.jdo.option.ConnectionUserName</name>
>     <value>hive</value>
>     <description>username to use against metastore database</description>
>   </property>
>
>   <property>
>     <name>javax.jdo.option.ConnectionPassword</name>
>     <value>hiveuser</value>
>     <description>password to use against metastore database</description>
>   </property>
>
>   <property>
>     <name>javax.jdo.PersistenceManagerFactoryClass</name>
>     <value>org.datanucleus.jdo.JDOPersistenceManagerFactory</value>
>     <description>class implementing the jdo persistence</description>
>   </property>
>
> </configuration>
>
>
> When I run it, it now reports SUCCESS.
>
>
> It worked in the sense that the
> /user/root/examples/output-data/hive2yy/000000_0 file is present in 
> HDFS (created via the first table) and the 
> /user/root/examples/input-data/yytable2/yytable/integers.out file is 
> missing (because it got moved when Hive created the "xmas" managed table).
>
> But the tables I just created are not present in Hive nor are they in 
> the Hive metastore (MySQL):
> $ hive -e "show tables"
> OK
> tab_name
> Time taken: 2.153 seconds
>
> But they must have been created in order for the other results to be true.
>
> When I run the script.sql by hand in Hive it all works fine and the 
> tables remain.  *Why do the tables disappear when I run this via Oozie?*
>
>
> This email is long enough.  I'll send my other issue (Hive via shell
> action) in another email.
>
> *For reference:*
> I'm using a CentOS VMware VM running Hortonworks HDP 1.3.
> Oozie version: 3.3.2.1.3.2.0-111
> Hive version: 0.11 (0.11.0.1.3.2.0-111)
>
>
> Thanks for any help you can provide,
> -Michael
>
