[ 
https://issues.apache.org/jira/browse/HADOOP-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712683#action_12712683
 ] 

Jothi Padmanabhan commented on HADOOP-5759:
-------------------------------------------

bq. It is possible to enhance this patch to create a new data structure called 
rackToNodes at the very beginning. It can be populated by iterating through all 
the blocks up front.

Agreed. However, Amareshwari and I discussed this offline, and we are not sure 
we want to build rackToNodes before the getMoreSplits method, as it would 
mean calling getBlockLocations for all the blocks twice -- once to build 
rackToNodes and once in getMoreSplits to build the blockInfo maps. Could we 
build this map incrementally in getMoreSplits instead? This would have the 
disadvantage of having incomplete information for the first few splits.
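
To make the trade-off concrete, here is a minimal sketch of the incremental 
approach. It is not the patch itself: the class name RackToNodesSketch and the 
methods add and hostsOnRack are hypothetical, and the rack and host strings 
stand in for whatever getBlockLocations reports for each block.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not code from patch-5759.txt.
class RackToNodesSketch {
    // Rack name -> hosts seen so far on that rack, in first-seen order.
    private final Map<String, Set<String>> rackToNodes = new HashMap<>();

    // Record one replica location while a block is being processed.
    // Adding the same (rack, host) pair again is a no-op.
    void add(String rack, String host) {
        rackToNodes.computeIfAbsent(rack, r -> new LinkedHashSet<>()).add(host);
    }

    // Hosts known so far for a rack. Early callers may see only a subset,
    // because later blocks have not been visited yet.
    List<String> hostsOnRack(String rack) {
        Set<String> hosts = rackToNodes.get(rack);
        return hosts == null ? new ArrayList<>() : new ArrayList<>(hosts);
    }

    public static void main(String[] args) {
        RackToNodesSketch map = new RackToNodesSketch();
        map.add("/rack1", "host1");                    // first block
        System.out.println(map.hostsOnRack("/rack1")); // partial view
        map.add("/rack1", "host2");                    // a later block
        System.out.println(map.hostsOnRack("/rack1")); // fuller view
    }
}
```

A split created after only the first block has been seen would list a single 
host for /rack1, which is the incomplete information mentioned above. The 
benefit is that each block's locations are fetched only once.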

> IllegalArgumentException when CombineFileInputFormat is used as job 
> InputFormat
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-5759
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5759
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-5759.txt
>
>
> As per my understanding, CombineFileInputFormat creates splits with the 
> rack name as the split location. 
> When I use CombineFileInputFormat as the InputFormat for a job, job 
> initialization fails with the following exception:
> 2009-04-28 14:10:40,162 ERROR mapred.EagerTaskInitializationListener (EagerTaskInitializationListener.java:run(83)) - Job initialization failed:
> java.lang.IllegalArgumentException: Network location name contains /: /default-rack
>   at org.apache.hadoop.net.NodeBase.set(NodeBase.java:76)
>   at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:57)
>   at org.apache.hadoop.mapred.JobTracker.addHostToNodeMapping(JobTracker.java:2342)
>   at org.apache.hadoop.mapred.JobTracker.resolveAndAddToTopology(JobTracker.java:2336)
>   at org.apache.hadoop.mapred.JobInProgress.createCache(JobInProgress.java:344)
>   at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:441)
>   at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:81)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>   at java.lang.Thread.run(Thread.java:619)
> When I changed CombineFileInputFormat to pass just the rack name (without 
> '/'), the JT wrongly resolved the node as /default-rack/<rack-name>.
> The solution is to pass the hostnames holding the block (on the rack) 
> instead of the rack name.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
