Hello,
INSERT OVERWRITE DIRECTORY '/user/oozie/examples/output-data' SELECT * FROM
test;
Job is still failing on this command. Not sure where to look.
The only thing I see in the Hive log is:
: 2015-04-08 16:57:12,669 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Open transaction: count = 1, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6622)
2015-04-08 16:57:12,671 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Commit transaction: count = 0, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6636)
2015-04-08 16:57:12,671 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Open transaction: count = 1, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:897)
2015-04-08 16:57:12,671 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Open transaction: count = 2, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:960)
2015-04-08 16:57:12,675 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Commit transaction: count = 1, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:968)
2015-04-08 16:57:12,683 DEBUG [pool-4-thread-8]: metastore.ObjectStore
(ObjectStore.java:debugLog(6713)) - Commit transaction: count = 0, isActive =
true at:
org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:899)
getTable is being called when the SELECT statement runs, but the INSERT
OVERWRITE DIRECTORY is not producing any MapReduce output.
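For reference, here is a sketch of what a script.q along the lines described in this thread might look like (the paths and table are the examples from this thread; the quoting of the target directory is an assumption on my part):

```sql
-- Sketch of the Hive script discussed in this thread.
-- ${INPUT} is resolved by Oozie from the workflow parameters.
CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}';

-- The target directory should be quoted and writable by the user
-- running the action.
INSERT OVERWRITE DIRECTORY '/user/oozie/examples/output-data'
SELECT * FROM test;
```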
Thanks,
Gazza
-----Original Message-----
From: Gary Clark
Sent: Tuesday, April 07, 2015 4:06 PM
To: [email protected]; Mohammad Islam
Subject: RE: Oozie Hive job running continuously
Hello,
CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}';
The test table gets populated with the file input. Good start.
However, the statement below fails:
INSERT OVERWRITE DIRECTORY '/user/oozie/examples/output-data/hive' SELECT *
FROM test;
When running just:
SELECT * FROM test;
the job succeeds.
Maybe it's a permissions issue, but I don't see anything to indicate that. Has
anybody seen this?
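One way to rule the permissions theory in or out is to inspect the target directory directly. A sketch of the kind of session I mean (paths as used earlier in the thread; user/group and mode are illustrative, adjust for your cluster):

```shell
# Check ownership and permissions on the output directory.
hdfs dfs -ls /user/oozie/examples/output-data

# If the directory is not writable by the user running the Hive action,
# open it up or chown it to that user (values are illustrative):
hdfs dfs -chmod -R 775 /user/oozie/examples/output-data
hdfs dfs -chown -R oozie /user/oozie/examples/output-data
```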
Thanks,
Gazza
-----Original Message-----
From: Gary Clark [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 12:33 PM
To: Mohammad Islam; [email protected]
Subject: RE: Oozie Hive job running continuously
Hello,
Thanks Mohammad.
Almost there. I had to change the hive-site.xml from what was installed:
hive.metastore.client.connect.retry.delay from 1s to 1.
hive.metastore.client.socket.timeout from 600s to 600.
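In hive-site.xml form, the two changes described above would look something like this (a sketch showing only these two properties):

```xml
<!-- Hive clients of this vintage parse these values as plain integers
     (seconds), so a time-suffixed value like "1s" triggers a
     NumberFormatException in Configuration.getInt. -->
<property>
  <name>hive.metastore.client.connect.retry.delay</name>
  <value>1</value>
</property>
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <value>600</value>
</property>
```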
The Oozie 4.1.0 Hive example almost works. The Hive table "test" does get
created with the input data. Oozie to Hive.
However, even though the script was successful in creating and loading the test
table:
2015-04-07 12:31:22,240 INFO ActionEndXCommand:541 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000009-150407093549669-oozie-oozi-W]
ACTION[0000009-150407093549669-oozie-oozi-W@hive-node] Launcher ERROR, reason:
Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]
Hmmm. Not sure at the moment.
Thanks,
Gazza
From: Mohammad Islam [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 11:58 AM
To: Gary Clark; [email protected]
Subject: Re: Oozie Hive job running continuously
Hi Gazza,
Not sure what causes this.
Were you able to run any simple Hive command? Even the example that comes with
oozie-examples?
may be related: Loading data to MySQL table from Hive using
Sqoop/Oozie<http://stackoverflow.com/questions/23663978/loading-data-to-mysql-table-from-hive-using-sqoop-oozie>
Regards,
Mohammad
On Tuesday, April 7, 2015 8:13 AM, Gary Clark
<[email protected]> wrote:
Hello,
Running the oozie job I hit strange problem: Looking at the oozie log I see the
below:
2015-04-07 09:49:16,109 WARN JobSubmitter:150 - SERVER[localhost.localdomain]
Hadoop command-line option parsing not performed. Implement the Tool interface
and execute your application with ToolRunner to remedy this.
2015-04-07 09:49:16,117 WARN JobSubmitter:259 - SERVER[localhost.localdomain]
No job jar file set. User classes may not be found. See Job or
Job#setJar(String).
2015-04-07 09:49:16,664 INFO HiveActionExecutor:541 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
checking action, hadoop job ID [job_1428416855230_0002] status [RUNNING]
2015-04-07 09:49:16,667 INFO ActionStartXCommand:541 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
[***0000001-150407093549669-oozie-oozi-W@hive-node***]Action
status=RUNNING
2015-04-07 09:49:16,667 INFO ActionStartXCommand:541 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
[***0000001-150407093549669-oozie-oozi-W@hive-node***]Action
updated in DB!
2015-04-07 09:49:36,803 INFO CallbackServlet:541 -
SERVER[localhost.localdomain] USER[-] GROUP[-] TOKEN[-] APP[-]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
callback for action
[0000001-150407093549669-oozie-oozi-W@hive-node]
2015-04-07 09:49:39,639 INFO HiveActionExecutor:541 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
action completed, external ID [job_1428416855230_0002]
2015-04-07 09:49:40,004 WARN HiveActionExecutor:544 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain],
main() threw exception, java.lang.NumberFormatException: For input string: "1s"
2015-04-07 09:49:40,006 WARN HiveActionExecutor:544 -
SERVER[localhost.localdomain] USER[oozie] GROUP[-] TOKEN[] APP[hive-wf]
JOB[0000001-150407093549669-oozie-oozi-W]
ACTION[0000001-150407093549669-oozie-oozi-W@hive-node]
Launcher exception: java.lang.NumberFormatException: For input string: "1s"
java.lang.RuntimeException: java.lang.NumberFormatException: For input string:
"1s"
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.NumberFormatException: For input string: "1s"
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1134)
at org.apache.hadoop.hive.conf.HiveConf.getIntVar(HiveConf.java:1211)
at org.apache.hadoop.hive.conf.HiveConf.getIntVar(HiveConf.java:1220)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:58)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
I'm seeing a strange error. I'm running HiveServer2 and the metastore. Ideas on
how to remedy this would be appreciated.
Thanks,
Gazza
-----Original Message-----
From: Gary Clark [mailto:[email protected]]
Sent: Tuesday, April 07, 2015 8:25 AM
To: [email protected]; Mohammad Islam
Subject: RE: Oozie Hive job running continuously
Totally agree. I will post on this thread.
-----Original Message-----
From: Mohammad Islam
[mailto:[email protected]]
Sent: Monday, April 06, 2015 4:59 PM
To: [email protected]
Subject: Re: Oozie Hive job running continuously
Please share with us how you resolved the issue. It will help others in the
future.
You should include "<job-xml>${hiveSiteXML}</job-xml>" after <name-node> and
before <configuration>; the schema has some sequential dependencies.
Regards,
Mohammad
On Monday, April 6, 2015 2:41 PM, Gary Clark
<[email protected]> wrote:
Thanks Islam. I made a wee bit of progress, in that I did get the job to
succeed. However, no table was created.
I am using 4.1.0. I perform the below:
1) ./hive --service metastore
This then connects to the local Derby database.
2) It then fails with the below:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
<start to="hive-node"/>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<job-xml>${hiveSiteXML}</job-xml>
<script>script.q</script>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive failed, error
message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content
was found starting with element 'job-xml'. One of
'{"uri:oozie:hive-action:0.2":script}' is expected.
It does not like the "job-xml"
Not sure what version to use. Please advise. I need to use job-xml to specify
the hive-site.xml.
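For what it's worth, the error message suggests the uri:oozie:hive-action:0.2 schema expects its elements in a fixed order, with <job-xml> before <configuration>. A sketch of the action reordered that way (same elements as the workflow above, just moved):

```xml
<hive xmlns="uri:oozie:hive-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- job-xml must precede configuration in this schema version -->
    <job-xml>${hiveSiteXML}</job-xml>
    <configuration>
        <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
        </property>
    </configuration>
    <script>script.q</script>
</hive>
```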
Thanks,
Gazza
-----Original Message-----
From: Mohammad Islam
[mailto:[email protected]]
Sent: Monday, April 06, 2015 12:08 PM
To: [email protected]
Subject: Re: Oozie Hive job running continuously
Hi Gazza,
You seem to have identified the problem correctly: you have a resource
shortage. You can check the RM UI to see how much resource is available and how
much is already used. You may consider using an uber job for the launcher. Also
try reducing the container memory for the launcher. Alternatively, you can
increase the number of nodes in your cluster.
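A sketch of the uber-job suggestion, expressed as properties in the workflow action's <configuration> (the oozie.launcher.* prefix forwards a property to the launcher job's configuration; the memory value is illustrative, not a recommendation):

```xml
<!-- Run the Oozie launcher as an uber task inside the AM,
     so it does not consume a separate container. -->
<property>
  <name>oozie.launcher.mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>
<!-- Shrink the launcher's container request (illustrative value). -->
<property>
  <name>oozie.launcher.mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
```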
Regards,
Mohammad
On Monday, April 6, 2015 7:48 AM, Gary Clark
<[email protected]> wrote:
Hello,
I am hoping this is a common problem. I am running a single node cluster (1
machine) with Hadoop 2.6.0 and OOZIE 4.1.0 with hiveServer2 running in the
background.
I am executing the Hive workflow example provided in the examples.
What I am seeing is the below in the hadoop logs:
2015-04-06 08:57:01,475 INFO [IPC Server handler 0 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID :
jvm_1428328525963_0001_m_000002 asked for a task
2015-04-06 08:57:01,476 INFO [IPC Server handler 0 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID:
jvm_1428328525963_0001_m_000002 given task:
attempt_1428328525963_0001_m_000000_0
2015-04-06 08:57:09,052 INFO [IPC Server handler 29 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:57:39,354 INFO [IPC Server handler 27 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:58:09,474 INFO [IPC Server handler 29 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:58:36,866 INFO [IPC Server handler 25 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:59:07,026 INFO [IPC Server handler 25 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:59:37,142 INFO [IPC Server handler 26 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 09:00:07,278 INFO [IPC Server handler 28 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 09:00:37,389 INFO [IPC Server handler 26 on 59454]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt
attempt_1428328525963_0001_m_000000_0 is : 1.0
It looks like a resource constraint.
My yarn-site.xml:
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>5120</value>
<description>Amount of physical memory, in MB, that can be allocated
for containers.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>512</value>
<description>
The minimum allocation for every container request at the RM,
in MBs. Memory requests lower than this won't take effect,
and the specified value will get allocated at minimum.
</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
<description>
The maximum allocation for every container request at the RM,
in MBs. Memory requests higher than this won't take effect,
and will get capped to this value.
</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
</configuration>
Anybody seen this? I am running this on a virtual machine.
Much Appreciated,
Gazza