[ https://issues.apache.org/jira/browse/SPARK-29235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Bedford updated SPARK-29235:
------------------------------------
    Description: 
 
 Right after a CrossValidatorModel is trained, it has avgMetrics.  After the 
model is written to disk and read later, it no longer has avgMetrics.  To 
reproduce:

{{from pyspark.ml.tuning import CrossValidator, CrossValidatorModel}}

{{cv = CrossValidator(...) #fill with params}}

{{cvModel = cv.fit(trainDF) #given dataframe with training data}}

{{print(cvModel.avgMetrics) #prints a nonempty list as expected}}

{{cvModel.write().save("/tmp/model")}}

{{cvModel2 = CrossValidatorModel.read().load("/tmp/model")}}

{{print(cvModel2.avgMetrics) #BUG - prints an empty list}}



> CrossValidatorModel.avgMetrics disappears after model is written/read again
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-29235
>                 URL: https://issues.apache.org/jira/browse/SPARK-29235
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.4.1
>         Environment: Databricks cluster:
> {
>     "num_workers": 4,
>     "cluster_name": "mabedfor-test-classfix",
>     "spark_version": "5.3.x-cpu-ml-scala2.11",
>     "spark_conf": {
>         "spark.databricks.delta.preview.enabled": "true"
>     },
>     "node_type_id": "Standard_DS12_v2",
>     "driver_node_type_id": "Standard_DS12_v2",
>     "ssh_public_keys": [],
>     "custom_tags": {},
>     "spark_env_vars": {
>         "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
>     },
>     "autotermination_minutes": 120,
>     "enable_elastic_disk": true,
>     "cluster_source": "UI",
>     "init_scripts": [],
>     "cluster_id": "0722-165622-calls746"
> }
>            Reporter: Matthew Bedford
>            Priority: Minor
>
>  
>  Right after a CrossValidatorModel is trained, it has avgMetrics.  After the 
> model is written to disk and read later, it no longer has avgMetrics.  To 
> reproduce:
> {{from pyspark.ml.tuning import CrossValidator, CrossValidatorModel}}
> {{cv = CrossValidator(...) #fill with params}}
> {{cvModel = cv.fit(trainDF) #given dataframe with training data}}
> {{print(cvModel.avgMetrics) #prints a nonempty list as expected}}
> {{cvModel.write().save("/tmp/model")}}
> {{cvModel2 = CrossValidatorModel.read().load("/tmp/model")}}
> {{print(cvModel2.avgMetrics) #BUG - prints an empty list}}
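
As long as a reloaded model comes back without avgMetrics, one possible workaround is to store the metrics in a small JSON file next to the saved model and re-attach them after loading. The sketch below is only an illustration and not part of Spark: save_avg_metrics, restore_avg_metrics, and the avgMetrics.json file name are hypothetical, and it assumes the sidecar path is on a filesystem the driver can read and write directly. The demo uses plain stand-in objects so it runs without a cluster; with a real model, cvModel and cvModel2 from the steps above would be passed instead.

```python
import json
import os
import tempfile
from types import SimpleNamespace


def save_avg_metrics(model, metrics_path):
    """Write model.avgMetrics to a JSON sidecar file (hypothetical helper)."""
    with open(metrics_path, "w") as f:
        # Convert to plain floats so NumPy scalars also serialize cleanly.
        json.dump([float(m) for m in model.avgMetrics], f)


def restore_avg_metrics(model, metrics_path):
    """Re-attach avgMetrics from the sidecar file to a reloaded model."""
    with open(metrics_path) as f:
        model.avgMetrics = json.load(f)
    return model


if __name__ == "__main__":
    # Stand-ins for cvModel (freshly trained) and cvModel2 (reloaded, metrics lost).
    trained = SimpleNamespace(avgMetrics=[0.81, 0.79])
    reloaded = SimpleNamespace(avgMetrics=[])

    sidecar = os.path.join(tempfile.mkdtemp(), "avgMetrics.json")
    save_avg_metrics(trained, sidecar)       # call right after cvModel.write().save(...)
    restore_avg_metrics(reloaded, sidecar)   # call right after CrossValidatorModel.read().load(...)
    print(reloaded.avgMetrics)
```

The sidecar is kept separate from the model directory so that the files Spark writes are left untouched; where it should live on a shared cluster is a deployment choice this sketch does not address.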



