cgivre opened a new pull request #2139:
URL: https://github.com/apache/drill/pull/2139


   # [DRILL-6268](https://issues.apache.org/jira/browse/DRILL-6268): Drill-on-YARN client obtains HDFS URL incorrectly
   
   ## Description
   
   The Drill-on-YARN client must upload files to HDFS so that YARN can localize 
them. The code that does so is in DfsFacade. This code obtains the URL twice. 
The first time is correct:
    
   ```java
    private void loadYarnConfig() {
       ...
         URI fsUri = FileSystem.getDefaultUri( yarnConf );
         if(fsUri.toString().startsWith("file:/")) {
           System.err.println("Warning: Default DFS URI is for a local file system: " + fsUri);
         }
       }
     }
   ```
   The `fsUri` returned is `hdfs://localhost:9000`, which is the correct value 
for an out-of-the-box Hadoop 2.9.0 install after following these instructions. 
The instructions have the reader explicitly set the port number to 9000:
   ```xml
   <configuration>
       <property>
           <name>fs.defaultFS</name>
           <value>hdfs://localhost:9000</value>
       </property>
   </configuration>
   ```
   The other place that gets the URL, this time for the actual connection, is 
`DfsFacade.connect()`:
   ```java
       String dfsConnection = config.getString(DrillOnYarnConfig.DFS_CONNECTION);
   ```
   This value comes back as `hdfs://localhost/`, which carries no explicit port, 
so HDFS tries to connect on port 8020 (the Hadoop default), resulting in the 
following error:
   ```
   Connecting to DFS... Connected.
   Uploading /Users/paulrogers/bin/apache-drill-1.13.0.tar.gz to 
/users/drill/apache-drill-1.13.0.tar.gz ... Failed.
   Failed to upload Drill archive
     Caused by: Failed to create DFS directory: /users/drill
     Caused by: Call From Pauls-MBP/192.168.1.243 to localhost:8020 failed on 
connection exception: java.net.ConnectException: Connection refused;
   ```
   
   For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
   (Shout out here to arjun kr for suggesting we include the extra exception 
details; very helpful here.)
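
The port fallback described above can be illustrated with a minimal sketch that uses only `java.net.URI`. The class name `DefaultPortDemo`, the helper `effectivePort`, and the hard-coded constant are hypothetical and exist only for this illustration; this is not Hadoop's actual resolution code, just the same idea: a URI without a port reports `-1`, and the client then substitutes the default.

```java
import java.net.URI;

public class DefaultPortDemo {
    // Assumption for illustration: 8020 is the default NameNode port,
    // as stated in the description above.
    static final int ASSUMED_DEFAULT_PORT = 8020;

    // Returns the explicit port if the URI has one, otherwise the assumed default.
    static int effectivePort(String uri) {
        int port = URI.create(uri).getPort(); // -1 when no port is present
        return port == -1 ? ASSUMED_DEFAULT_PORT : port;
    }

    public static void main(String[] args) {
        // Value returned by FileSystem.getDefaultUri(yarnConf) in this report
        System.out.println(effectivePort("hdfs://localhost:9000"));
        // Value returned by the DFS_CONNECTION config setting in this report
        System.out.println(effectivePort("hdfs://localhost/"));
    }
}
```

With the two URIs from this report, the first resolves to port 9000 and the second to 8020, which matches the `localhost:8020` seen in the connection error.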
   
   The workaround is to manually change the port to 8020 in the `fs.defaultFS` 
setting shown above.
   The full fix is to change the code to use the following line in `connect()` 
(`getDefaultUri()` returns a `URI`, so it is converted to a string):
   ```java
       String dfsConnection = FileSystem.getDefaultUri(yarnConf).toString();
   ```
   This bug is serious because it prevents users from running HDFS on a 
non-default port unless they apply the workaround.
   
   ## Documentation
   No user-facing changes.
   
   ## Testing
   Unit tests pass. 

