[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823876#comment-16823876 ]
Marco Gaido commented on SPARK-27287:
-------------------------------------

[~dharmesh.kakadia] the point is: if you set a config on the {{SparkSession}}, it is not "copied" to the {{SparkContext}} until the first SQL job runs. Since {{PCAModel.load}} uses the {{SparkContext}} directly, if you do not submit a SQL job before calling it, all of the configuration set on the SQL session is ignored. Above I suggested two ideas to fix this on the Spark side and avoid this problem.

> PCAModel.load() does not honor spark configs
> --------------------------------------------
>
>                 Key: SPARK-27287
>                 URL: https://issues.apache.org/jira/browse/SPARK-27287
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: Dharmesh Kakadia
>            Priority: Major
>
> PCAModel.load() does not seem to be using the configurations set on the
> current spark session.
>
> Repro:
>
> The following will fail to read the data, because the storage account
> credentials config is not used/propagated:
>
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> model = PCAModel.load('wasb://t...@test.blob.core.windows.net/model')
>
> The following, however, works:
>
> conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
> spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
> blah = spark.read.json('wasb://t...@test.blob.core.windows.net/somethingelse/')
> blah.show()
> model = PCAModel.load('wasb://t...@test.blob.core.windows.net/model')
>
> It looks like spark.read...() does force the use of the config once, and then
> PCAModel.load() works correctly.
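Until a fix lands on the Spark side, the behaviour described above can be worked around on the client. The sketch below shows two options: setting the credential directly on the SparkContext's Hadoop configuration (which is what the ML reader consults), or triggering a trivial SQL job so the session confs get propagated first. This is a minimal sketch, not runnable standalone: it assumes a live Spark deployment, and the account name, key, and paths are placeholders; `_jsc` is a PySpark-internal handle to the JavaSparkContext, so this approach relies on an internal API.

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession
from pyspark.ml.feature import PCAModel

conf = SparkConf()
conf.set("fs.azure.account.key.example.blob.core.windows.net", "<storage-key>")
spark = (SparkSession.builder
         .appName("pca-load-workaround")
         .config(conf=conf)
         .getOrCreate())

# Workaround 1: set the credential directly on the SparkContext's
# Hadoop configuration, bypassing the SQL-session conf entirely.
spark.sparkContext._jsc.hadoopConfiguration().set(
    "fs.azure.account.key.example.blob.core.windows.net", "<storage-key>")

# Workaround 2: run any SQL job first; this may be enough to propagate
# the session confs to the SparkContext before the model is loaded.
spark.range(1).count()

model = PCAModel.load("wasb://container@example.blob.core.windows.net/model")
```

Either option makes the credential visible to the Hadoop filesystem layer that PCAModel.load() goes through, which is why the spark.read.json() call in the repro above also "fixes" it as a side effect.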
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org