You can implement multiple Sqoop actions for this and even run them in
parallel using a fork
<https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.1.5_Fork_and_Join_Control_Nodes>
.
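For example, here's a minimal sketch of such a workflow that forks two
Sqoop imports and joins them before the end node (the action names, table
names, target directories and the JDBC connect string are placeholders
you'd replace with your own):

<workflow-app xmlns="uri:oozie:workflow:0.5" name="parallel-sqoop">
    <start to="fork-imports"/>
    <!-- the fork starts both import actions concurrently -->
    <fork name="fork-imports">
        <path start="import-test1"/>
        <path start="import-test2"/>
    </fork>
    <action name="import-test1">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://dbhost/mydb --table test1 --target-dir /user/myuser/test1</command>
        </sqoop>
        <ok to="join-imports"/>
        <error to="fail"/>
    </action>
    <action name="import-test2">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://dbhost/mydb --table test2 --target-dir /user/myuser/test2</command>
        </sqoop>
        <ok to="join-imports"/>
        <error to="fail"/>
    </action>
    <!-- the join waits for every forked path before continuing -->
    <join name="join-imports" to="end"/>
    <kill name="fail">
        <message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

The join node makes the workflow wait until both imports finish before
moving on, so downstream actions see all tables imported.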
As an alternative, you can use the --import-all-tables
<http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal>
Sqoop option together with --exclude-tables to import all tables in one
action.
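For instance, a sketch of a single Sqoop action along those lines (the
connect string, credentials, excluded table names and warehouse directory
below are placeholders):

<action name="sqoop-import-all">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- imports every table except the excluded ones under one warehouse dir -->
        <command>import-all-tables --connect jdbc:mysql://dbhost/mydb --username myuser --password-file /user/myuser/.sqoop.pwd --exclude-tables tmp_table1,tmp_table2 --warehouse-dir /user/myuser/warehouse</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>

Keep in mind that per the Sqoop docs, import-all-tables expects each table
to have a single-column primary key; tables without one may need a single
mapper (e.g. --num-mappers 1).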
br
gp

On Wed, Sep 7, 2016 at 4:38 AM, wangwei <[email protected]> wrote:

> Hi,
>  For sqoop, I would like to implement the following operation:
> #!/usr/bin/env bash
> # Need to use sqoop to import multiple tables
> # sqoop.sh
> for table in test1 test2 test3 test4 test5 test6 .....
> do
>     sqoop import .....
> done
> but Oozie seems to schedule only a single Sqoop action,
> so I would like to run the Sqoop imports in bulk through a shell action:
> <workflow-app xmlns="uri:oozie:workflow:0.5" name="shell-python">
>     <start to="shell-sqoop"/>
>     <action name="shell-sqoop">
>         <shell xmlns="uri:oozie:shell-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <prepare>
>                 <delete path="${nameNode}/user/${wf:user()}/src_bbd_hguan"/>
>             </prepare>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>             <exec>sqoop.sh</exec>
>             <file>sqoop.sh#sqoop.sh</file>
>             <capture-output/>
>         </shell>
>         <ok to="end"/>
>         <error to="fail"/>
>     </action>
>     <kill name="fail">
>         <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>     <end name="end"/>
> </workflow-app>
> but the YARN logs show the exception below:
>
>
> 2016-09-06 17:41:06,682 INFO [Thread-56] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
> 2016-09-06 17:41:06,683 INFO [Thread-56] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://kunlundev02:8020/user/bbd/.staging/job_1470312512846_0152/job.splitmetainfo
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1580)
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1444)
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1402)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1333)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1101)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1540)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1536)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1469)
> Caused by: java.io.FileNotFoundException: File does not exist: hdfs://kunlundev02:8020/user/bbd/.staging/job_1470312512846_0152/job.splitmetainfo
>         at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
>         at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
>         at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1575)
>         ... 17 more
>
> ------------------ Original Message ------------------
> From: "Peter Cseh";<[email protected]>;
> Date: Tuesday, September 6, 2016, 8:11 PM
> To: "user"<[email protected]>;
>
> Subject: Re: oozie execute shell(content hive or sqoop command)
>
>
>
> Hi,
> you may use the Sqoop action to do the import:
> https://oozie.apache.org/docs/4.2.0/DG_SqoopActionExtension.html
>
> gp
>
> On Tue, Sep 6, 2016 at 1:51 PM, wangwei <[email protected]> wrote:
>
> > Hi,
> >  I have a scenario where a lot of tables need to be imported from MySQL
> > with Sqoop, so I need to put the Sqoop commands in a shell script to loop
> > through all the tables.
> >   It still produces the same error.
> >
> >
> >
> >
> > ------------------ Original Message ------------------
> > From: "satish saley";<[email protected]>;
> > Date: Tuesday, September 6, 2016, 7:21 PM
> > To: "user"<[email protected]>;
> >
> > Subject: Re: oozie execute shell(content hive or sqoop command)
> >
> >
> >
> > Hi,
> > For Hive scripts, use the hive-action. It would be easy for others to
> > follow the pipeline and to debug, since Oozie will show the Hive job URL
> > directly in the UI.
> >
> > https://oozie.apache.org/docs/4.2.0/DG_HiveActionExtension.html
> > https://oozie.apache.org/docs/4.2.0/DG_Hive2ActionExtension.html
> >
> > On Tue, Sep 6, 2016 at 3:21 AM, wangwei <[email protected]> wrote:
> >
> > > Hi:
> > >  my shell script content (hive.sh):
> > >   #!/bin/bash
> > >   hive -e "select count(*) from test;"
> > >  my workflow content (workflow.xml):
> > >
> > > The following error occurred:
> > >
> > > How can I solve this? Please advise.
> > >
> > >
> > >
> > >
> >
>
>
>
> --
> Peter Cseh
> Software Engineer
> <http://www.cloudera.com>




-- 
Peter Cseh
Software Engineer
<http://www.cloudera.com>
