i guess it simply is never set, in which case it is created in:

    protected final def sparkSession: SparkSession = {
      if (optionSparkSession.isEmpty) {
        optionSparkSession = Some(SparkSession.builder().getOrCreate())
      }
      optionSparkSession.get
    }
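the pattern above is plain memoize-on-first-access: check the Option, fill it via getOrCreate on first use, then return the cached value. a minimal self-contained sketch of that same pattern, with a hypothetical Session case class standing in for SparkSession (nothing here is real Spark API):

```scala
object SessionHolder {
  // hypothetical stand-in for SparkSession, for illustration only
  final case class Session(id: Int)

  // counts how many times a "session" was actually constructed
  private var created = 0
  private var optionSession: Option[Session] = None

  // mirrors the shape of the sparkSession accessor quoted above:
  // construct on first access, then keep returning the cached instance
  def session: Session = {
    if (optionSession.isEmpty) {
      created += 1
      optionSession = Some(Session(created))
    }
    optionSession.get
  }
}
```

calling SessionHolder.session repeatedly constructs the session once and hands back the same instance after that, which is why nothing in the ml code path ever needs to set it explicitly — the first reader/writer that touches it triggers getOrCreate.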
On Fri, May 10, 2019 at 4:31 PM Koert Kuipers <ko...@tresata.com> wrote:

> i am trying to understand how ml persists pipelines. it seems a
> SparkSession or SparkContext is needed for this, to write to hdfs.
>
> MLWriter and MLReader both extend BaseReadWrite to have access to a
> SparkSession. but this is where it gets confusing... the only way to set
> the SparkSession seems to be in BaseReadWrite:
>
>     def session(sparkSession: SparkSession): this.type
>
> and i can find no place this is actually used, except for in one unit
> test: org.apache.spark.ml.util.JavaDefaultReadWriteSuite
>
> i confirmed it is not used by simply adding a line inside that method that
> throws an error, and all unit tests pass except for
> JavaDefaultReadWriteSuite.
>
> how is the sparkSession set?
> thanks!
>
> koert