[ 
https://issues.apache.org/jira/browse/FALCON-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Mittal resolved FALCON-455.
----------------------------------

    Resolution: Won't Fix

> Replication of output feed of an HCatalog process not working
> -------------------------------------------------------------
>
>                 Key: FALCON-455
>                 URL: https://issues.apache.org/jira/browse/FALCON-455
>             Project: Falcon
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Satish Mittal
>         Attachments: hcat-in-feed.xml, hcat-out-feed.xml, hcat-process.xml, 
> workflow.xml
>
>
> Suppose there is an HCatalog process (java type) that takes an HCat input 
> feed and produces another HCat feed as output, and this output feed is 
> configured for replication across 2 clusters.
> The replication of the output feed fails during the Hive import step. The 
> reason is that the HCat process job output on HDFS contains a '_logs' 
> directory if the process writes to a static partition (or an empty 
> '_temporary' directory if the process writes to a dynamic partition). 
> The Hive import job logs contain the following error:
> {noformat}
> 9036 [main] INFO  org.apache.hadoop.hive.ql.Driver  - Starting command: 
> import table table5 partition 
> (minute='25',month='05',year='2014',hour='12',day='29') from 
> 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data'
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - </PERFLOG 
> method=TimeToSubmit start=1401367057244 end=1401367057579 duration=335 
> from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - <PERFLOG 
> method=runTasks from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - <PERFLOG 
> method=task.COPY.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
> 9036 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying data from 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25
>  to 
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9069 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_SUCCESS
> 9096 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_logs
> 9190 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/part-r-00000
> 9222 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - <PERFLOG 
> method=task.DDL.Stage-1 from=org.apache.hadoop.hive.ql.Driver>
> 9580 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - </PERFLOG 
> method=task.COPY.Stage-0 start=1401367057579 end=1401367058123 duration=544 
> from=org.apache.hadoop.hive.ql.Driver>
> 9580 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - <PERFLOG 
> method=task.MOVE.Stage-2 from=org.apache.hadoop.hive.ql.Driver>
> 9581 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Loading data to 
> table default.table5 partition (day=29, hour=12, minute=25, month=05, 
> year=2014) from 
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9598 [main] INFO  org.apache.hadoop.hive.ql.exec.MoveTask  - Partition is: 
> {day=29, hour=12, minute=25, month=05, year=2014}
> 9668 [main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Failed with 
> exception checkPaths: 
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
>  has nested 
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
> org.apache.hadoop.hive.ql.metadata.HiveException: checkPaths: 
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
>  has nested 
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
>       at org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2108)
>       at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2298)
>       at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1230)
>       at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:408)
>       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>       at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>       at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1532)
>       at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1305)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1136)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:976)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:966)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:457)
>       at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:467)
>       at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>       at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:318)
>       at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:279)
>       at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>       at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>       at org.apache.hadoop.mapred.Child.main(Child.java:260)
> 9668 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  - </PERFLOG 
> method=task.MOVE.Stage-2 start=1401367058123 end=1401367058211 duration=88 
> from=org.apache.hadoop.hive.ql.Driver>
> 9672 [main] ERROR org.apache.hadoop.hive.ql.Driver  - FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}
> Apparently, the Hive import fails if the import path contains any nested 
> directory. The same behavior can be reproduced on the Hive CLI:
> {noformat}
> hive> import table table5 partition 
> (minute='32',month='05',year='2014',hour='12',day='29') from 
> 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data'
>     > ;
> Copying data from 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32
> Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_SUCCESS
> Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_logs
> Copying file: 
> hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/part-r-00000
> Loading data to table default.table5 partition (day=29, hour=12, minute=32, 
> month=05, year=2014)
> Failed with exception checkPaths: 
> hdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000
>  has nested 
> directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000/_logs
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> hive>
> {noformat}
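
For reference, the failure mode and a possible pre-import cleanup can be sketched locally. This is a minimal illustration only: the function names (`check_paths`, `clean_staging_dir`) are hypothetical, and a real fix in Falcon would operate on HDFS via the Hadoop FileSystem API rather than `os`/`shutil`, which stand in here for a local demo.

```python
import os
import shutil

def is_hidden(name):
    """Mimic the common Hadoop/Hive hidden-file convention:
    names starting with '_' or '.' (e.g. '_logs', '_temporary')."""
    return name.startswith('_') or name.startswith('.')

def check_paths(path):
    """Mimic the check that fails above (Hive.checkPaths): reject a
    load directory that contains any nested directory."""
    for name in os.listdir(path):
        if os.path.isdir(os.path.join(path, name)):
            raise RuntimeError(
                "checkPaths: %s has nested directory %s" % (path, name))

def clean_staging_dir(path):
    """Hypothetical cleanup step: remove hidden subdirectories such as
    '_logs' or '_temporary' from the staging directory before import,
    so the nested-directory check no longer trips. Returns the names
    of the directories removed, sorted."""
    removed = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isdir(full) and is_hidden(name):
            shutil.rmtree(full)
            removed.append(name)
    return sorted(removed)
```

With a staging directory containing `_SUCCESS`, `part-r-00000`, and a `_logs` subdirectory (as in the logs above), `check_paths` raises until `clean_staging_dir` has removed `_logs`.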



--
This message was sent by Atlassian JIRA
(v6.2#6252)
