GitHub user nchammas commented on the pull request:
https://github.com/apache/spark/pull/2988#issuecomment-61016125
Actually, we have a few options here for `spark-ec2` and the underlying
`spark_ec2.py` script.
1. We leave the script as-is and just document the fact that relative paths
won't work.
2. We update the script to preserve the user's working directory and
qualify any paths relative to that directory.
3. We update the script so that all relative paths passed in are translated
to absolute paths.
Comments:
* Option 1 is the simplest but is a bit user-unfriendly.
* Option 3 is tricky since we'd have to do the translation before the
Python script is invoked, which is annoying to do in a cross-platform
manner. Alternatively, we can pass in both the user's CWD and the
untranslated paths so that Python can do the translation itself.
* Option 2 seems feasible, but we'd have to track down all the places where
we handle files.
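
To make the Option 3 alternative concrete, here is a rough sketch of how the Python side could resolve paths if the wrapper handed it the user's original working directory. The names `resolve_path` and `user_cwd` are made up for illustration and do not exist in `spark_ec2.py`; how the CWD would actually get passed in (flag, environment variable, etc.) is left open.

```python
import os


def resolve_path(path, user_cwd):
    """Return an absolute version of `path`.

    Hypothetical helper: relative paths are interpreted against `user_cwd`
    (the directory the user ran the wrapper from), not against the
    directory the script has since changed into.
    """
    # Expand a leading "~" so that "~/.ssh/key.pem" is treated as absolute.
    expanded = os.path.expanduser(path)
    if os.path.isabs(expanded):
        return os.path.normpath(expanded)
    # Anchor relative paths at the user's original working directory.
    return os.path.normpath(os.path.join(user_cwd, expanded))


if __name__ == "__main__":
    # Assumption: the wrapper captured this before changing directories.
    user_cwd = os.getcwd()
    print(resolve_path("keys/my-key.pem", user_cwd))
    print(resolve_path("../user-data.sh", user_cwd))
```

Since `os.path` handles the platform differences, the wrapper would only need to forward one extra value, and each file-handling call site would wrap its path argument in the helper.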
So far, I see that any code that handles the following files would be
affected if we go with either Option 2 or 3:
* [SSH identity
file](https://github.com/apache/spark/blob/e7fd80413d531e23b6c4def0ee32e52a39da36fa/ec2/spark_ec2.py#L71)
* [user data
file](https://github.com/apache/spark/blob/e7fd80413d531e23b6c4def0ee32e52a39da36fa/ec2/spark_ec2.py#L151)
* [deploy.generic
file](https://github.com/apache/spark/blob/e7fd80413d531e23b6c4def0ee32e52a39da36fa/ec2/spark_ec2.py#L589)
This turned out to be a bit more involved than I expected...
@shivaram What would you recommend?