[ 
https://issues.apache.org/jira/browse/FALCON-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814099#comment-13814099
 ] 

Venkatesh Seetharam commented on FALCON-169:
--------------------------------------------

Thanks [~shwethags] and [~samarthg] for looking into this. However, I don't see 
how a repeated '/' can end up in the path, since the code normalizes it as below:

{code}
        private void propagateFileSystemCopyProperties(String pathsWithPartitions,
                                                       Map<String, String> props) throws FalconException {
            String parts = pathsWithPartitions.replaceAll("//+", "/");
            parts = StringUtils.stripEnd(parts, "/");
            props.put("sourceRelativePaths", parts);

            props.put("distcpSourcePaths", "${coord:dataIn('input')}");
            props.put("distcpTargetPaths", "${coord:dataOut('output')}");
        }
{code}
The value of sourceRelativePaths is then substituted into falcon.include.path in the workflow:
{code}
            <main-class>org.apache.falcon.replication.FeedReplicator</main-class>
            <arg>-Dfalcon.include.path=${sourceRelativePaths}</arg>
{code}
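
To make the effect of that normalization concrete, here is a minimal standalone sketch in plain Java. The class and method names and the host "host" are hypothetical, and the regex replaceAll("/+$", "") stands in for StringUtils.stripEnd(parts, "/") so that the snippet has no dependencies.

```java
class PathNormalizeSketch {
    // Mirrors the two steps quoted above: collapse runs of '/' into one,
    // then strip any trailing '/'. The second regex is a stand-in
    // (an assumption) for StringUtils.stripEnd(parts, "/").
    static String normalize(String pathsWithPartitions) {
        String parts = pathsWithPartitions.replaceAll("//+", "/");
        return parts.replaceAll("/+$", "");
    }

    public static void main(String[] args) {
        // Relative path with a doubled and a trailing slash.
        System.out.println(normalize("2012/10/01/12/10//ua3/"));
        // prints 2012/10/01/12/10/ua3

        // Full URI (hypothetical host): the scheme separator is collapsed too.
        System.out.println(normalize("hdfs://host:54310/localDC//ua3"));
        // prints hdfs:/host:54310/localDC/ua3
    }
}
```

A relative path cannot keep a doubled slash after this step. On a full URI the same regex would also reduce "hdfs://" to "hdfs:/", so a logged value that still shows both "hdfs://" and "//ua3" may not have passed through this method at all.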

I must be missing something here.

> multiple "/" in target for replication for multi target feed 
> -------------------------------------------------------------
>
>                 Key: FALCON-169
>                 URL: https://issues.apache.org/jira/browse/FALCON-169
>             Project: Falcon
>          Issue Type: Bug
>          Components: replication
>         Environment: QA
>            Reporter: Samarth Gupta
>            Assignee: Venkatesh Seetharam
>
> Multiple "/" characters are appended to the target dir before the partition 
> expression postfix is concatenated. 
> For example, while running the single-source multi-target test, the following 
> value is passed to DistCp, as can be seen in the tasktracker logs: 
> ** for patch from FALCON-163
> {quote} 
> -Dfalcon.include.path=hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3
> At the bottom of the logs it can be seen:
> 2013-11-05 06:33:20,219 INFO  - Inclusion pattern = hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3 (FilteredCopyListing:59)
> 2013-11-05 06:33:20,219 INFO  - Regex pattern = (hdfs://gs1001\.grid\.corp\.inmobi\.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3/)|(hdfs://gs1001\.grid\.corp\.inmobi\.com:54310/localDC/rc/billing/2012/10/01/12/10//ua3$) (FilteredCopyListing:60)
> 2013-11-05 06:33:20,460 INFO  - Number of paths considered for copy: 0 (CustomReplicator:57)
> 2013-11-05 06:33:20,461 INFO  - Number of bytes considered for copy: 0 (Actual number of bytes copied depends on whether any files are skipped or overwritten.) (CustomReplicator:58)
> 2013-11-05 06:33:21,212 INFO  - DistCp job-id: job_201310290719_0445 (DistCp:146)
> 2013-11-05 06:33:21,213 INFO  - DistCp job may be tracked at: http://ivoryqa-1.corp.inmobi.com:50030/jobdetails.jsp?jobid=job_201310290719_0445 (DistCp:147)
> 2013-11-05 06:33:21,213 INFO  - To cancel, run the following command: hadoop job -kill job_201310290719_0445 (DistCp:148)
> 2013-11-05 06:33:21,213 INFO  - Running job: job_201310290719_0445 (JobClient:1315)
> 2013-11-05 06:33:22,216 INFO  -  map 0% reduce 0% (JobClient:1328)
> 2013-11-05 06:33:33,244 INFO  - Job complete: job_201310290719_0445 (JobClient:1383)
> 2013-11-05 06:33:33,252 INFO  - Counters: 4 (JobClient:589)
> 2013-11-05 06:33:33,252 INFO  -   Job Counters  (JobClient:591)
> 2013-11-05 06:33:33,253 INFO  -     SLOTS_MILLIS_MAPS=5822 (JobClient:593)
> 2013-11-05 06:33:33,253 INFO  -     Total time spent by all reduces waiting after reserving slots (ms)=0 (JobClient:593)
> 2013-11-05 06:33:33,254 INFO  -     Total time spent by all maps waiting after reserving slots (ms)=0 (JobClient:593)
> 2013-11-05 06:33:33,255 INFO  -     SLOTS_MILLIS_REDUCES=0 (JobClient:593)
> 2013-11-05 06:33:33,307 INFO  - No files present in path: hdfs://ivoryqa-1.corp.inmobi.com:8020/localDC/rc/billing/ua2/2012/10/01/12/10/ua3 (FeedReplicator:146)
> 2013-11-05 06:33:33,308 INFO  - Completed DistCp (FeedReplicator:77)
> {quote}
> whereas if the same test is run on the current code from trunk, the following 
> value appears in the tasktracker logs: 
> {quote}
> -Dfalcon.include.path=hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10/ua3
> {quote}
> and replication is successful.
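
The "Regex pattern" line in the quoted logs hints at why zero paths were considered for copy. The following is a minimal sketch only: it assumes (this is not confirmed by any Falcon source shown in this thread) that the filter builds its regex as (P/)|(P$) with dots escaped, as the log line suggests, and that candidate paths carry a single slash where the include path has two. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class IncludePatternSketch {
    // Assumed construction, modelled on the logged "Regex pattern" line:
    // escape dots in the include path P, then build (P/)|(P$).
    static Pattern toRegex(String includePath) {
        String p = includePath.replace(".", "\\.");
        return Pattern.compile("(" + p + "/)|(" + p + "$)");
    }

    // True if the candidate path contains a match for the include pattern.
    static boolean matches(String includePath, String candidate) {
        return toRegex(includePath).matcher(candidate).find();
    }

    public static void main(String[] args) {
        String base = "hdfs://gs1001.grid.corp.inmobi.com:54310/localDC/rc/billing/2012/10/01/12/10";
        String candidate = base + "/ua3";   // path with a single slash

        // Include path with the doubled slash, as in the failing run.
        System.out.println(matches(base + "//ua3", candidate)); // prints false
        // Include path with a single slash, as in the run from trunk.
        System.out.println(matches(base + "/ua3", candidate));  // prints true
    }
}
```

Under these assumptions the doubled slash is matched literally, so a pattern containing "//ua3" never matches a path containing "/ua3", which would be consistent with the "Number of paths considered for copy: 0" line.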



--
This message was sent by Atlassian JIRA
(v6.1#6144)
