Github user JuhongPark commented on the pull request:

    https://github.com/apache/spark/pull/7252#issuecomment-119817605
  
    Thank you for your advice @nchammas @shivaram
    
    I checked boto's auth process and found that `boto.provider` holds the key information.
    
    When a provider is created, it searches for `~/.aws/credentials`.
    
    In `boto/provider.py`, I can use the `Provider` class from `boto.provider` to find this configuration.
    
    ```python
        def __init__(self, name, access_key=None, secret_key=None,
                                security_token=None, profile_name=None):
    (...skip...)
                # Load shared credentials file if it exists
                shared_path = os.path.join(expanduser('~'), '.' + name, 'credentials')
                self.shared_credentials = Config(do_load=False)
                if os.path.isfile(shared_path):
                        self.shared_credentials.load_from_path(shared_path)
    ```
    
    However, the problem is that when there is no key information in any of `/etc/boto.cfg`, `~/.boto`, or `~/.aws/credentials`, `provider.__init__()` sends a request to 'http://169.254.169.254/latest/meta-data/iam/security-credentials/' and the following error is raised.
    
    ```
    ERROR:boto:Caught exception reading instance data
    Traceback (most recent call last):
      File "/usr/local/spark-1.4.0-bin-hadoop2.4/ec2/lib/boto-2.34.0/boto/utils.py", line 214, in retry_url
        r = opener.open(req, timeout=timeout)
      File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
        response = self._open(req, data)
      File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
        '_open', req)
      File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
        result = func(*args)
      File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
        return self.do_open(httplib.HTTPConnection, req)
      File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1184, in do_open
        raise URLError(err)
    URLError: <urlopen error timed out>
    ERROR:boto:Unable to read instance data, giving up
    ```
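
    One way to avoid triggering that metadata request is to look for the credential files directly, before any boto object is created. The sketch below is only an illustration written against the Python 3 standard library: the helper name `find_boto_config_files` and its `home` parameter are hypothetical, the three paths are the ones listed above, and it does not look at environment variables or anything else boto may read.

```python
import os


def find_boto_config_files(home=None):
    """Return the boto config/credential files that exist on this machine.

    Hypothetical helper, not part of boto. It only tests whether the files
    exist, so it never contacts the instance metadata service.
    """
    if home is None:
        home = os.path.expanduser("~")
    # The three locations mentioned in this comment.
    candidates = [
        "/etc/boto.cfg",
        os.path.join(home, ".boto"),
        os.path.join(home, ".aws", "credentials"),
    ]
    return [path for path in candidates if os.path.isfile(path)]
```

    An empty list would mean none of the files is present, which is the case where creating a provider falls through to the metadata request. Note that a file that exists but contains no keys would still be reported here.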
    
    169.254.169.254 is the address of the EC2 instance metadata service, which is only reachable from inside an EC2 instance. See the following URL for details:
    http://docs.aws.amazon.com/AWSEC2/2007-03-01/DeveloperGuide/AESDG-chapter-instancedata.html
    
    
    I also tried making a dummy connection (with `connect_s3()` or similar), but I got the error described above because a `boto.provider` is still created.
    
    If Spark used boto3 (rather than the original boto), the solution would be simple: check only `~/.aws/credentials`.
    https://boto3.readthedocs.org/en/latest/guide/quickstart.html#configuration
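
    As a rough illustration of what checking only `~/.aws/credentials` could look like, here is a sketch that parses the file as INI with the Python 3 standard library rather than with boto3 itself. The helper name `credentials_file_has_keys` and the `default` profile argument are assumptions made for the example; `aws_access_key_id` and `aws_secret_access_key` are the option names used in the shared credentials file.

```python
import configparser
import os


def credentials_file_has_keys(path=None, profile="default"):
    """Return True if the credentials file defines both keys for `profile`.

    Hypothetical helper, not part of boto or boto3.
    """
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".aws", "credentials")
    parser = configparser.ConfigParser()
    try:
        # read() silently skips a file that does not exist.
        parser.read(path)
    except configparser.Error:
        # The file exists but is not valid INI.
        return False
    if not parser.has_section(profile):
        return False
    return (parser.has_option(profile, "aws_access_key_id")
            and parser.has_option(profile, "aws_secret_access_key"))
```

    This checks only that the two options are present, not that the keys are valid.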
    
    But with the original boto, I could not find a simpler, clearer way to check the configuration.
    
    I think the original commit of this PR is better.
    
    Please take a look and give me feedback :)


