i guess it simply is never set, in which case it is created in:

    protected final def sparkSession: SparkSession = {
      if (optionSparkSession.isEmpty) {
        optionSparkSession = Some(SparkSession.builder().getOrCreate())
      }
      optionSparkSession.get
    }
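the pattern above is plain memoize-on-first-access: check the Option, fill it via getOrCreate on first use, then return the cached value. a minimal self-contained sketch of that same pattern, with a hypothetical Session case class standing in for SparkSession (nothing here is real Spark API):

```scala
object SessionHolder {
  // hypothetical stand-in for SparkSession, for illustration only
  final case class Session(id: Int)

  // counts how many times a "session" was actually constructed
  private var created = 0
  private var optionSession: Option[Session] = None

  // mirrors the shape of the sparkSession accessor quoted above:
  // construct on first access, then keep returning the cached instance
  def session: Session = {
    if (optionSession.isEmpty) {
      created += 1
      optionSession = Some(Session(created))
    }
    optionSession.get
  }
}
```

calling SessionHolder.session repeatedly constructs the session once and hands back the same instance after that, which is why nothing in the ml code path ever needs to set it explicitly — the first reader/writer that touches it triggers getOrCreate.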
On Fri, May 10, 2019 at 4:31 PM Koert Kuipers <ko...@tresata.com> wrote:

> i am trying to understand how ml persists pipelines. it seems a
> SparkSession or SparkContext is needed for this, to write to hdfs.
>
> MLWriter and MLReader both extend BaseReadWrite to have access to a
> SparkSession. but this is where it gets confusing... the only way to set
> the SparkSession seems to be in BaseReadWrite:
>
>     def session(sparkSession: SparkSession): this.type
>
> and i can find no place this is actually used, except for in one unit
> test: org.apache.spark.ml.util.JavaDefaultReadWriteSuite
>
> i confirmed it is not used by simply adding a line inside that method that
> throws an error, and all unit tests pass except for
> JavaDefaultReadWriteSuite.
>
> how is the sparkSession set?
> thanks!
>
> koert