[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823876#comment-16823876 ]
Marco Gaido commented on SPARK-27287:
-------------------------------------
[~dharmesh.kakadia] the point is: if you set a config on the {{SparkSession}},
it is not "copied" to the {{SparkContext}} until the first SQL job runs.
Since the {{PCAModel.load}} method uses the {{SparkContext}}, if you do not
submit a SQL job before calling it, all the configs set on the SQL session are
ignored. Above I suggested 2 ideas to fix this on the Spark side and avoid such
a problem.
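
Until this is fixed in Spark, one possible user-side workaround (a sketch only,
not verified against this issue) is to hand Hadoop filesystem settings to Spark
under the {{spark.hadoop.}} prefix, since Spark copies properties with that
prefix into the Hadoop configuration of the {{SparkContext}} when it is created.
The helper name {{with_hadoop_prefix}} below is hypothetical; the key and value
are the placeholders from the repro.

```python
def with_hadoop_prefix(options):
    """Return a copy of `options` where every key that is not already a
    Spark property ("spark.*") gets the "spark.hadoop." prefix.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    prefix = "spark.hadoop."
    prefixed = {}
    for key, value in options.items():
        if key.startswith("spark."):
            # Already a Spark property: leave it untouched.
            prefixed[key] = value
        else:
            # Plain Hadoop key: prefix it so Spark forwards it to Hadoop.
            prefixed[prefix + key] = value
    return prefixed


# Placeholder key/value taken from the repro in the issue description.
settings = with_hadoop_prefix(
    {"fs.azure.account.key.test.blob.core.windows.net": "Xosad=="}
)
```

Each pair in {{settings}} could then be passed to
{{SparkSession.builder.config(key, value)}} before {{getOrCreate()}}. The other
option, as in the reporter's second snippet, is to run any SQL job before
calling {{PCAModel.load}}.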
> PCAModel.load() does not honor spark configs
> --------------------------------------------
>
> Key: SPARK-27287
> URL: https://issues.apache.org/jira/browse/SPARK-27287
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.4.0
> Reporter: Dharmesh Kakadia
> Priority: Major
>
> PCAModel.load() does not seem to be using the configurations set on the
> current spark session.
> Repro:
>
> The following will fail to read the data because the storage account
> credentials config is not used/propagated.
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> model = PCAModel.load('wasb://[email protected]/model')
>
> The following however works:
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> blah = spark.read.json('wasb://[email protected]/somethingelse/')
> blah.show()
> model = PCAModel.load('wasb://[email protected]/model')
>
> It looks like spark.read...() does force the use of the config once and then
> PCAModel.load() will work correctly.