[ https://issues.apache.org/jira/browse/FALCON-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347773#comment-15347773 ]
Balu Vellanki commented on FALCON-2049:
---------------------------------------

This issue was introduced by https://issues.apache.org/jira/browse/FALCON-1844, where setDeleteMissing is set to true by default.

Assume the source dir is /tmp/source/${YEAR}/${MONTH}/${DAY} and the target dir is /tmp/target/${YEAR}/${MONTH}/${DAY}. Feed replication is triggered with DistCp, equivalent to the following CLI distcp command:

{code}
hadoop distcp -update -delete hdfs://c6401.ambari.apache.org:8020/tmp/source/1/2/3 hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/
{code}

The following scenarios can occur:

Case 1. Source dir is created and is otherwise empty, but availabilityFlag is created.
Result: DistCp succeeds, hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/ is created and availabilityFlag is copied to the target.

Case 2. Source dir is created and has files.
Result: DistCp succeeds, hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3/ is created and the target dir has the same files as the source dir, with the same dir structure.

Case 3. Source dir is created without any files and target dir is also created.
Result: DistCp succeeds, both source and target have empty dirs.

Case 4. Source dir is created but is empty, and availabilityFlag is not created.
Result: DistCp fails with the error "Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3 does not exist"

There seem to be two solutions to this problem:

1. Return success when the source dir has no files and the target dir is missing, thus avoiding Case 4, OR
2. Create the target dir and then attempt DistCp. This triggers Case 3 and the replication job succeeds.

I recommend option 2, because having an empty source/target dir is a valid use case for data directories.
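
At the CLI level, option 2 could look roughly like the sketch below. It is an illustration of the idea only, not the actual Falcon change: the function name replicate_with_mkdir and the HADOOP_CMD override are hypothetical, and it assumes "hadoop fs -mkdir -p" is run against the target cluster before the same distcp command shown above.

```shell
# Hypothetical helper illustrating option 2 (not Falcon's actual fix).
# HADOOP_CMD is an assumed override so the hadoop binary can be swapped out;
# it defaults to the real `hadoop` command.
replicate_with_mkdir() {
    src="$1"
    tgt="$2"
    hadoop_cmd="${HADOOP_CMD:-hadoop}"

    # Create the target dir first (no error if it already exists), so the
    # -delete pass at job commit has a target path to list. This turns
    # Case 4 into Case 3.
    "$hadoop_cmd" fs -mkdir -p "$tgt" || return 1

    # Same DistCp invocation as in the comment above.
    "$hadoop_cmd" distcp -update -delete "$src" "$tgt"
}
```

For example, replicate_with_mkdir hdfs://c6401.ambari.apache.org:8020/tmp/source/1/2/3 hdfs://c6401.ambari.apache.org:8020/tmp/target/1/2/3 would create the target dir if needed and then run the copy. Creating the dir unconditionally keeps the logic simple, since mkdir -p is a no-op when the dir is already there.
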
> Feed Replication with Empty Directories are failing
> ---------------------------------------------------
>
>                 Key: FALCON-2049
>                 URL: https://issues.apache.org/jira/browse/FALCON-2049
>             Project: Falcon
>          Issue Type: Bug
>          Components: feed
>    Affects Versions: 0.10
>            Reporter: Murali Ramasami
>            Priority: Critical
>             Fix For: 0.10
>
>
> Feed replication with empty directories is failing with the following error in the application log:
> {noformat}
> 2016-06-23 08:35:21,475 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml_tmp to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml
> 2016-06-23 08:35:21,476 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist_tmp to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist
> 2016-06-23 08:35:21,477 INFO [Thread-66] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
> 2016-06-23 08:35:21,479 INFO [Thread-66] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to No of maps and reduces are 0 job_1466658266370_0059
> Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/tmp/falcon-regression/FeedReplicationTest/target/2016/06/23/08/32 doesn't exist
>     at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
>     at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>     at org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:241)
>     at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
>     at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
>     at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Feed submitted:
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?><feed xmlns="uri:falcon:feed:0.1" name="A7769e4e0-49663d60" description="Input File">
>     <partitions>
>         <partition name="colo"/>
>         <partition name="eventTime"/>
>         <partition name="impressionHour"/>
>         <partition name="pricingModel"/>
>     </partitions>
>     <availabilityFlag>availabilityFlag.txt</availabilityFlag>
>     <frequency>minutes(5)</frequency>
>     <late-arrival cut-off="days(100000)"/>
>     <clusters>
>         <cluster name="A7769e4e0-0af6c74b" type="source">
>             <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
>             <retention limit="days(1000000)" action="delete"/>
>         </cluster>
>         <cluster name="A7769e4e0-25f87f0e" type="target">
>             <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
>             <retention limit="days(1000000)" action="delete"/>
>             <locations>
>                 <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/target/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
>             </locations>
>         </cluster>
>     </clusters>
>     <locations>
>         <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/source/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
>         <location type="stats" path="/data/regression/fetlrc/billing/stats"/>
>         <location type="meta" path="/data/regression/fetlrc/billing/metadata"/>
>     </locations>
>     <ACL owner="hrt_qa" group="users" permission="0x755"/>
>     <schema location="/databus/streams_local/click_rr/schema/" provider="protobuf"/>
>     <properties>
>         <property name="field1" value="value1"/>
>         <property name="field2" value="value2"/>
>         <property name="job.counter" value="true"/>
>     </properties>
> </feed>
> {noformat}
> It is failing because the target directories do not exist at replication time.