[
https://issues.apache.org/jira/browse/SPARK-25995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682762#comment-16682762
]
Felix Cheung commented on SPARK-25995:
--------------------------------------
sparkR just passes the whole string through as-is:
[https://github.com/apache/spark/blob/141953f4c44dbad1c2a7059e92bec5fe770af932/R/pkg/R/client.R#L59]
You can see that sparkSubmitOpts comes before args (args is the temp file
holding the port number).
I think we should avoid duplicating the spark-submit argument parsing in R,
which we would need in order to split the string before
{code:java}
fooarg
{code}
Would it be easier/better to always pass the temp file with the port as the
last arg instead?
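
To make the "port file as the last arg" idea concrete, here is a minimal sketch in Python (not SparkR code). The names build_command and port_file_from_app_args are hypothetical and exist only for illustration. The launcher appends the port file after the user-supplied options without parsing them, and the consumer always reads the last application argument.

```python
import shlex
from typing import List


def build_command(submit_opts: str, port_file: str) -> List[str]:
    """Hypothetical launcher: append the port file after the user options."""
    # The options are only tokenized, never interpreted as spark-submit flags.
    return ["spark-submit", *shlex.split(submit_opts), port_file]


def port_file_from_app_args(app_args: List[str]) -> str:
    """Hypothetical consumer: treat the LAST application argument as the port file."""
    if not app_args:
        raise ValueError("expected at least the port file argument")
    return app_args[-1]


cmd = build_command(
    "--master yarn --deploy-mode client sparkr-shell fooarg",
    "/tmp/Rtmp6XBGz2/backend_port162806ea36bca",
)
# For demonstration only: everything after "sparkr-shell" is an application argument.
app_args = cmd[cmd.index("sparkr-shell") + 1:]
port_file = port_file_from_app_args(app_args)
```

With this convention a stray user argument such as fooarg no longer changes where the port file is found, and the R side does not need to understand spark-submit options.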
> sparkR should ensure user args are after the argument used for the port
> -----------------------------------------------------------------------
>
> Key: SPARK-25995
> URL: https://issues.apache.org/jira/browse/SPARK-25995
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 2.3.2
> Reporter: Thomas Graves
> Priority: Minor
>
> Currently, if you run sparkR and accidentally specify an argument, it fails
> with an unhelpful error message. For example:
> $SPARK_HOME/bin/sparkR --master yarn --deploy-mode client fooarg
> This gets turned into:
> Launching java with spark-submit command spark-submit "--master" "yarn"
> "--deploy-mode" "client" "sparkr-shell" "fooarg"
> /tmp/Rtmp6XBGz2/backend_port162806ea36bca
> Notice that "fooarg" was placed before the /tmp file, which is how R and the
> JVM know which port to connect to. SparkR eventually fails with a timeout
> exception after 10 seconds.
>
> SparkR should either not allow args or make sure the order is correct so the
> backend_port is always first. See
> https://github.com/apache/spark/blob/master/R/pkg/R/sparkR.R#L129
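
The ordering problem in the description can be checked by tokenizing the logged command. This is a small Python sketch, not part of Spark. It assumes, as the description implies, that the JVM side takes the first application argument after "sparkr-shell" as the port file.

```python
import shlex

# Command line copied from the issue description.
logged = (
    'spark-submit "--master" "yarn" "--deploy-mode" "client" '
    '"sparkr-shell" "fooarg" /tmp/Rtmp6XBGz2/backend_port162806ea36bca'
)

tokens = shlex.split(logged)
# Everything after the "sparkr-shell" primary resource is an application argument.
app_args = tokens[tokens.index("sparkr-shell") + 1:]
# Under the assumption above, this is what would be read as the port file.
first_arg = app_args[0]
```

Here first_arg is "fooarg" rather than the backend_port path, which would explain why the connection never succeeds and the timeout is hit.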