Github user JuhongPark commented on the pull request:
https://github.com/apache/spark/pull/7252#issuecomment-119817605
Thank you for your advice, @nchammas @shivaram.
I checked boto's auth process
and found that `boto.provider` holds the key information.
When a provider is created, it looks for `~/.aws/credentials`.
The provider class in `boto/provider.py` can be used to find this
configuration.
```python
def __init__(self, name, access_key=None, secret_key=None,
             security_token=None, profile_name=None):
    # (...skip...)
    # Load shared credentials file if it exists
    shared_path = os.path.join(expanduser('~'), '.' + name,
                               'credentials')
    self.shared_credentials = Config(do_load=False)
    if os.path.isfile(shared_path):
        self.shared_credentials.load_from_path(shared_path)
```
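For comparison, here is a minimal sketch of reading the same shared credentials file directly with the standard library, without creating a boto provider. It is not code from boto or from this PR: the helper names `shared_credentials_path` and `has_shared_credentials` are hypothetical, and it assumes the usual INI layout with a `[default]` section containing `aws_access_key_id` and `aws_secret_access_key`.

```python
import os
from configparser import RawConfigParser  # on Python 2 the module is named ConfigParser


def shared_credentials_path(name="aws", home=None):
    """Build ~/.<name>/credentials, mirroring the path logic in the excerpt above."""
    home = home or os.path.expanduser("~")
    return os.path.join(home, "." + name, "credentials")


def has_shared_credentials(profile="default", name="aws", home=None):
    """Return True if the file exists and the profile defines both keys.

    Hypothetical helper: it only checks that the options are present,
    not that the keys are valid.
    """
    path = shared_credentials_path(name, home)
    if not os.path.isfile(path):
        return False
    # RawConfigParser avoids '%' interpolation, which secret keys could trigger.
    parser = RawConfigParser()
    parser.read(path)
    return (parser.has_option(profile, "aws_access_key_id")
            and parser.has_option(profile, "aws_secret_access_key"))
```

Because this never instantiates a provider, it cannot trigger the instance metadata request. The trade-off is that it duplicates a small part of boto's lookup logic.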
However, the problem is that when there is no key information in any of
`/etc/boto.cfg`, `~/.boto`, or `~/.aws/credentials`,
`provider.__init__()` sends a request to
`http://169.254.169.254/latest/meta-data/iam/security-credentials/`
and this error is raised:
```
ERROR:boto:Caught exception reading instance data
Traceback (most recent call last):
  File "/usr/local/spark-1.4.0-bin-hadoop2.4/ec2/lib/boto-2.34.0/boto/utils.py", line 214, in retry_url
    r = opener.open(req, timeout=timeout)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
URLError: <urlopen error timed out>
ERROR:boto:Unable to read instance data, giving up
```
169.254.169.254 is the address of the EC2 instance metadata service, so the
request times out when run outside EC2. See the following URL for details:
http://docs.aws.amazon.com/AWSEC2/2007-03-01/DeveloperGuide/AESDG-chapter-instancedata.html
I also tried making a dummy connection (such as `connect_s3()`), but I got
the same error described above, because a `boto.provider` is created anyway.
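One way to sidestep the metadata lookup would be to check the config file locations before calling into boto at all. The following is only a sketch under that assumption, not the code in this PR: `candidate_config_paths` and `existing_config_files` are hypothetical helpers, and the path list simply repeats the three locations mentioned above.

```python
import os


def candidate_config_paths(home=None):
    """Return the three config locations mentioned above, in that order."""
    home = home or os.path.expanduser("~")
    return [
        "/etc/boto.cfg",
        os.path.join(home, ".boto"),
        os.path.join(home, ".aws", "credentials"),
    ]


def existing_config_files(paths):
    """Return the subset of paths that exist as regular files."""
    return [p for p in paths if os.path.isfile(p)]
```

A script could call `existing_config_files(candidate_config_paths())` and print a clear message when the result is empty, instead of opening a connection and waiting for the timeout. Note that file existence alone does not prove the file contains usable keys.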
If Spark used boto3 (not the original boto), the solution would be simple:
check only `~/.aws/credentials`.
https://boto3.readthedocs.org/en/latest/guide/quickstart.html#configuration
But with the original boto, I could not find a simpler and clearer way to
check the configuration.
I think the original commit of this PR is better.
Please check and give feedback :)