Move the <job-xml> element above <configuration>, as shown below.

<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>${hiveSiteXML}</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>script.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
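
For reference, the hive-action 0.2 schema defines its child elements as a fixed sequence, so order matters. The expected order is roughly as sketched below (from memory; verify against the hive-action-0.2.xsd shipped with your Oozie build):

```
<!-- Sketch of the child-element order expected by uri:oozie:hive-action:0.2
     (verify against hive-action-0.2.xsd): -->
<hive xmlns="uri:oozie:hive-action:0.2">
    <job-tracker/>     <!-- required -->
    <name-node/>       <!-- required -->
    <prepare/>         <!-- optional -->
    <job-xml/>         <!-- optional; must come BEFORE configuration -->
    <configuration/>   <!-- optional -->
    <script/>          <!-- required -->
    <param/>           <!-- optional, repeatable -->
    <file/>            <!-- optional, repeatable -->
    <archive/>         <!-- optional, repeatable -->
</hive>
```

Putting <job-xml> after <configuration>, as in the failing workflow, violates this sequence and produces exactly the cvc-complex-type.2.4.a schema error quoted below.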
From: Gary Clark <[email protected]>
To: "[email protected]" <[email protected]>; Mohammad Islam <[email protected]>
Sent: Monday, April 6, 2015 2:40 PM
Subject: RE: Oozie Hive job running continuously

Thanks, Islam. I made a wee bit of progress, in that I did get a succeeded job.
However, no table was created.

I am using 4.1.0. I perform the below:

1) ./hive --service metastore

This then connects to the local Derby database.

2) I then submit the workflow below, and it fails:

<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <job-xml>${hiveSiteXML}</job-xml>
            <script>script.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>


Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'job-xml'. One of '{"uri:oozie:hive-action:0.2":script}' is expected.

It does not like the "job-xml" element.

Not sure what version to use. Please advise. I need to use job-xml to specify 
the hive-site.xml.
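
For what it's worth, ${hiveSiteXML} is just a workflow property; it is typically defined in job.properties alongside the other parameters, pointing at a hive-site.xml deployed with the workflow application. A sketch (the hostnames, ports, and paths here are placeholders, not taken from this setup):

```
# job.properties sketch -- hostnames/paths below are placeholders
nameNode=hdfs://localhost:8020
jobTracker=localhost:8032
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/hive
# Relative to the workflow application directory on HDFS
hiveSiteXML=hive-site.xml
```

The file named by <job-xml> must be readable from HDFS at job-submission time, so copy hive-site.xml into the workflow app directory before submitting.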

Thanks,
Gazza

-----Original Message-----
From: Mohammad Islam [mailto:[email protected]] 
Sent: Monday, April 06, 2015 12:08 PM
To: [email protected]
Subject: Re: Oozie Hive job running continuously

Hi Gazza,

You seem to have identified the problem correctly: you have a resource
shortage. You can check the RM UI to see how much of your resources are
available and already in use. You may consider using an uber job for the
launcher, and also try reducing the container memory for the launcher.
Alternatively, you can increase the number of nodes in your cluster.

Regards,
Mohammad
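
The launcher tweaks suggested above are set as properties in the action's <configuration> block; Oozie strips the oozie.launcher. prefix and applies the rest to the launcher job. A sketch, assuming Oozie 4.x property names (verify against your version's documentation):

```
<configuration>
    <!-- Run the launcher as an uber task so it does not occupy a
         separate map container in addition to its AM (assumed
         property name; verify for your Oozie version) -->
    <property>
        <name>oozie.launcher.mapreduce.job.ubertask.enable</name>
        <value>true</value>
    </property>
    <!-- Shrink the launcher's own containers so the Hive child job
         can still get resources on a small single-node cluster -->
    <property>
        <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>512</value>
    </property>
</configuration>
```

On a single node the classic failure mode is the launcher's AM and map container consuming enough memory that the Hive child job can never get its own containers, which matches the "progress stuck at 1.0" log pattern below.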


On Monday, April 6, 2015 7:48 AM, Gary Clark <[email protected]> wrote:

Hello,

I am hoping this is a common problem. I am running a single-node cluster (1
machine) with Hadoop 2.6.0 and Oozie 4.1.0, with HiveServer2 running in the
background.

I am executing the Hive workflow example provided in the examples.

What I am seeing in the Hadoop logs is the below:

2015-04-06 08:57:01,475 INFO [IPC Server handler 0 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : 
jvm_1428328525963_0001_m_000002 asked for a task
2015-04-06 08:57:01,476 INFO [IPC Server handler 0 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: 
jvm_1428328525963_0001_m_000002 given task: 
attempt_1428328525963_0001_m_000000_0
2015-04-06 08:57:09,052 INFO [IPC Server handler 29 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:57:39,354 INFO [IPC Server handler 27 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:58:09,474 INFO [IPC Server handler 29 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:58:36,866 INFO [IPC Server handler 25 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:59:07,026 INFO [IPC Server handler 25 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 08:59:37,142 INFO [IPC Server handler 26 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 09:00:07,278 INFO [IPC Server handler 28 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0
2015-04-06 09:00:37,389 INFO [IPC Server handler 26 on 59454] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1428328525963_0001_m_000000_0 is : 1.0

It looks like a resource constraint.

My yarn-site.xml:

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->

 <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>5120</value>
    <description>Amount of physical memory, in MB, that can be allocated
      for containers.</description>
  </property>

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
    <description>
      The minimum allocation for every container request at the RM,
      in MBs. Memory requests lower than this won't take effect,
      and the specified value will get allocated at minimum.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>
      The maximum allocation for every container request at the RM,
      in MBs. Memory requests higher than this won't take effect,
      and will get capped to this value.
    </description>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

</configuration>

Anybody seen this? I am running this on a virtual machine.

Much Appreciated,
Gazza