Hi Paul, That got me a little farther, but it failed saying it could not find class "org.apache.hcatalog.security.HdfsAuthorizationProvider". I found a solution to that here: http://stackoverflow.com/a/16361987/871012
With that finally solved, I ran it again and now the first 4 (of 5) commands succeed - the tables get created and actually stick around, but the 5th command to load data into the managed table fails with: Path is not legal ''/user/root/examples/input-data/yytable2/yytable'': Move from: hdfs://10.230.138.159:8020/user/root/examples/input-data/yytable2/ yytable to: hdfs://michael-hadoop-5.tsh.thomson .com:8020/apps/hive/warehouse/xmas is not valid. Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict. I've tried various things to fix this, none of which work. I can run the hive commands in the Hive shell with no problem. Since this becoming a Hive issue, I've filed a larger posting here: http://hortonworks .com/community/forums/topic/problems-with-hive-and-oozie/ However, I will say that I've put in 12 hours (1.5 days) of effort to get a fairly simple Hive script to work in Oozie and am still unsuccessful. I also cannot get a shell action to execute hive properly. At this point, I'm very dubious about recommending using Oozie to my team at work. I hope my issue can be resolved soon and Oozie usability improves in general in the near future. In my view, Hive is an important enough feature of the Hadoop landscape that Hive should work out of the box with Oozie. -Michael On Thu, Oct 17, 2013 at 5:30 PM, Paul Chavez <pchavez@verticalsearchworks .com> wrote: > Try putting the hive-site.xml in HDFS in the workflow deployment > directory, right next to the workflow.xml file, and specify hive-site.xml > with no path. > > > -----Original Message----- > From: [email protected] [mailto:[email protected]] > Sent: Thursday, October 17, 2013 2:25 PM > To: [email protected] > Subject: Re: Why don't tables I create in Hive jobs in Oozie stay around? > > Hi Paul, > > Thanks for the tip. I added the ConnectionPassword param to the > hive-site.xml (which was the only thing missing) and then tried your > suggestion. > > Unfortunately, it can't get out of the starting gate. I get the error > message: org.apache.oozie.action.ActionExecutorException: JA008: File does > not exist: /etc/hive/conf/hive-site.xml > > I tried: > * "hive-site.xml", > * the above, which is a symlink > * the actual file that the symlink points to > * opening the permissions to the fiile and its parent dirs to 777. > > All variations failed with the same error message. > > -Michael > > > > On Thu, Oct 17, 2013 at 4:58 PM, Paul Chavez < > [email protected]> wrote: > > > I put the metastore database connection settings into a hive-site.xml > > file bundled with my workflow definition and reference it using the > > <job-xml> element within the Hive action. My hive-site.xml looks > > almost exactly the same as your hive-settings.xml. > > > > I used to be able to explain what exactly is happening as I had a > > similarly maddening learning curve setting it up but it's been over a > > year now and I can't remember the details. Give it a shot and let us > > know how it goes? > > > > Snippet from a workflow template I use: > > <action name="Process"> > > <hive xmlns="uri:oozie:hive-action:0.2"> > > <job-tracker>${jobTracker}</job-tracker> > > <name-node>${nameNode}</name-node> > > <job-xml>hive-site.xml</job-xml> > > <script>${wf_script_path}</script> > > <param>...</param> > > </hive> > > <ok to="end"/> > > <error to="FailureEmail"/> > > </action> > > > > Hope that helps, > > Paul Chavez > > > > > > -----Original Message----- > > From: [email protected] [mailto:[email protected]] > > Sent: Thursday, October 17, 2013 1:47 PM > > To: [email protected] > > Subject: Why don't tables I create in Hive jobs in Oozie stay around? > > > > I'm evaluating Oozie for my team and work to decide if we want to use it. > > I have been working through the examples that Apache Oozie project > > has for download. I have spent much of the day trying to get basic > > Hive jobs to work with mostly disappointing results. I'd like some > > help on what to do. > > > > I've tried using the hive action and running hive via a shell action. > > After struggling with write permissions to the /user/hive HDFS > > warehouse directory, the former only partially works (even when it > > reports success) and the latter never works and doesn't report any > useful errors. > > > > > > *Hive via Hive action:* > > > > $ tree hive2 > > hive2 > > |-- hive-settings.xml > > |-- job.properties > > |-- script.sql > > |-- workflow.xml > > > > > > $ cat script.sql > > DROP TABLE IF EXISTS yytest2; > > > > CREATE EXTERNAL TABLE yytest2 (x int) > > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' > > LOCATION '/user/root/examples/input-data/yytable2/yytable'; > > > > INSERT OVERWRITE DIRECTORY '/user/root/examples/output-data/hive2yy' > > SELECT > > * FROM yytest2; > > > > CREATE TABLE IF NOT EXISTS xmas (x int) ROW FORMAT DELIMITED FIELDS > > TERMINATED BY ','; > > > > LOAD DATA INPATH '/user/root/examples/input-data/yytable2/yytable' > > INTO TABLE xmas; > > > > > > $ cat workflow.xml > > <?xml version="1.0" encoding="UTF-8"?> <workflow-app > > xmlns="uri:oozie:workflow:0.2" name="hive-wf"> > > <start to="hive-node"/> > > > > <action name="hive-node"> > > <hive xmlns="uri:oozie:hive-action:0.2"> > > <job-tracker>${jobTracker}</job-tracker> > > <name-node>${nameNode}</name-node> > > <prepare> > > <delete path="${nameNode}/user/${wf:user()}/${examplesRoot > > }/output-data/hive2"/> > > <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot > > }/output-data"/> > > </prepare> > > <configuration> > > <property> > > <name>mapred.job.queue.name</name> > > <value>${queueName}</value> > > </property> > > <property> > > <name>oozie.hive.defaults</name> > > <value>hive-settings.xml</value> > > </property> > > </configuration> > > <script>script.sql</script> > > </hive> > > <ok to="end"/> > > <error to="fail"/> > > </action> > > > > <kill name="fail"> > > <message>BOO HOO2: Hive2 failed, error message[${wf:errorMessage(wf: > > lastErrorNode())}]</message> > > </kill> > > <end name="end"/> > > </workflow-app> > > > > > > $ cat job.properties > > nameNode=hdfs://10.230.138.159:8020 > > jobTracker=http://10.230.138.159:50300 > > > > queueName=default > > examplesRoot=examples > > > > oozie.use.system.libpath=true > > > > oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot > > }/apps/hive2 > > > > > > $ cat hive-settings.xml > > <?xml version="1.0"?> > > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> > > > > <configuration> > > > > <!-- Hive Execution Parameters --> > > <property> > > <name>javax.jdo.option.ConnectionURL</name> > > > > <value>jdbc:mysql://10.230.138.159:3306/hive?createDatabaseIfNotExist > > =true</value> > > <description>JDBC connect string for a JDBC metastore</description> > > </property> > > > > <property> > > <name>javax.jdo.option.ConnectionDriverName</name> > > <value>com.mysql.jdbc.Driver</value> > > <description>Driver class name for a JDBC metastore</description> > > </property> > > > > <property> > > <name>javax.jdo.option.ConnectionUserName</name> > > <value>hive</value> > > <description>username to use against metastore database</description> > > </property> > > > > <property> > > <name>javax.jdo.option.ConnectionPassword</name> > > <value>hiveuser</value> > > <description>password to use against metastore database</description> > > </property> > > > > <property> > > <name>javax.jdo.PersistenceManagerFactoryClass</name> > > <value>org.datanucleus.jdo.JDOPersistenceManagerFactory</value> > > <description>class implementing the jdo persistence</description> > > </property> > > > > </configuration> > > > > > > When I run it now reports SUCCESS. > > > > > > It worked in the sense that the > > /user/root/examples/output-data/hive2yy/000000_0 file is present in > > HDFS(created via the first table) and the > > /user/root/examples/input-data/yytable2/yytable/integers.out file is > > missing (because it got moved when Hive created the "xmas" managed > table). > > > > But the tables I just created are not present in Hive nor are they in > > the Hive metastore (MySQL): > > $ hive -e "show tables" > > OK > > tab_name > > Time taken: 2.153 seconds > > > > But they must have been created in order for the other results to be > true. > > > > When I run the script.sql by hand in Hive it all works fine and the > > tables remain. *Why do the tables disappear when I run this via Oozie? > > * > > > > > > This email is long enough. I'll send my other issue (Hive via shell > > action) in another email. > > > > * > > * > > *For reference:* > > I'm using a CentOS VMWare VM running Hortonworks HDP 1.3 Oozie version: > > 3.3.2.1.3.2.0-111 Hive version: 0.11 (0.11.0.1.3.2.0-111) > > > > > > Thanks for any help you can provide, > > -Michael > > >
