Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1715#issuecomment-51093637
Let's say we're using the `--jars` approach, and I run the following two
commands. (Correct me if I'm misunderstanding your proposal; I'm basing
this on @liancheng's pseudocode in an earlier comment.) Here I am
assuming that `--arg1` and `--arg2` are also valid spark-submit arguments:
```
bin/spark-submit --jars app.jar --arg1 -- --arg2
```
Here we treat `--arg1` as a spark-submit argument, but pass `--arg2` to the
application. The user may also specify the primary jar explicitly as follows:
```
bin/spark-submit app.jar --arg1 -- --arg2
```
Here we treat both `--arg1` and `--arg2` as spark-submit arguments, but whether
`--` is passed to the application depends on what `--arg1` is. For instance,
if `--arg1` is `--name`, which takes a value, then `--` will become the app
name. If instead `--arg1` is `--supervise`, which does not take a value,
then `--` will be passed through to the application.
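To make the ambiguity concrete, here is a minimal sketch of the kind of scanner I have in mind. This is not the actual `SparkSubmitArguments` code; the object name, the `split` helper, and the flag tables are assumptions for illustration only (though the flags listed are real spark-submit flags):
```
object DashDashSketch {
  // Hypothetical flag tables; the real parser knows many more flags.
  private val takesValue = Set("--name", "--jars", "--class")
  private val noValue    = Set("--supervise", "--verbose")

  /** Split argv into (spark-submit args, application args). */
  def split(argv: List[String]): (List[String], List[String]) = argv match {
    case Nil => (Nil, Nil)
    case flag :: value :: rest if takesValue(flag) =>
      // A value-taking flag consumes the next token unconditionally, so in
      // `--name -- --arg2` the separator becomes the app name and
      // --arg2 silently stays a spark-submit argument.
      val (submit, app) = split(rest)
      (flag :: value :: submit, app)
    case flag :: rest if noValue(flag) =>
      val (submit, app) = split(rest)
      (flag :: submit, app)
    case "--" :: rest =>
      // A `--` that was not consumed as a value ends spark-submit's args.
      (Nil, rest)
    case token :: rest =>
      val (submit, app) = split(rest)
      (token :: submit, app)
  }

  def main(args: Array[String]): Unit = {
    println(split(List("--supervise", "--", "--arg2")))
    // (List(--supervise), List(--arg2))
    println(split(List("--name", "--", "--arg2")))
    // (List(--name, --, --arg2), List())
  }
}
```
The second call is the surprising case: `--` is swallowed as the value of `--name`, so the separator the user typed never acts as one.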
From the user's perspective, the ways we specify the primary resource in
these two commands are nearly equivalent. However, the arguments are actually
parsed very differently. On the other hand, if we simply add a new Spark
specific option (`--spark-application-args` or something) and keep the way we
specify the primary resource the same, we get backwards compatibility for free
while still providing this new functionality. This latter approach just seems
simpler to me.
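For concreteness, the alternative would look something like this. The option name is just the placeholder from above, and its placement and quoting are assumptions; the key point is that the first form is exactly what works today:
```
bin/spark-submit --name myApp app.jar arg1 arg2

bin/spark-submit --name myApp --spark-application-args "--arg2" app.jar
```
The first invocation keeps working unchanged, while the second lets the user pass application arguments that would otherwise look like spark-submit flags, with no ambiguity about `--`.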