[ 
https://issues.apache.org/jira/browse/SPARK-51473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950265#comment-17950265
 ] 

Weichen Xu commented on SPARK-51473:
------------------------------------

I found an issue:

{code:python}
model_a = estimator.fit(df1)
del df1

model_a.summary.XXX(...)
{code}
Assume df1 contains the id of model-B.

model-B is released once `del df1` executes, but `model_a.summary` might still use model-B, because the summary contains the prediction dataframe.
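
To make the ordering problem concrete, here is a toy sketch in plain Python. It does not use any Spark API: `ModelCache`, `TrackedFrame`, `Summary` and `ModelReleasedError` are hypothetical names invented for illustration, and it assumes that dropping the dataframe handle is what releases the model. An explicit `close()` stands in for `del df1` so the behaviour is deterministic.

```python
class ModelReleasedError(RuntimeError):
    """Raised when a released model is looked up (hypothetical)."""


class ModelCache:
    """Hypothetical stand-in for a server-side cache of fitted models."""

    def __init__(self):
        self._models = {}

    def register(self, model_id, payload):
        self._models[model_id] = payload

    def release(self, model_id):
        # Drop the model; later lookups will fail.
        self._models.pop(model_id, None)

    def get(self, model_id):
        if model_id not in self._models:
            raise ModelReleasedError(f"model {model_id!r} was already released")
        return self._models[model_id]


class TrackedFrame:
    """Hypothetical dataframe handle that owns the model ids it depends on."""

    def __init__(self, cache, model_ids):
        self._cache = cache
        self.model_ids = tuple(model_ids)

    def close(self):
        # Stands in for `del df1`: every model the frame references is released.
        for model_id in self.model_ids:
            self._cache.release(model_id)


class Summary:
    """Hypothetical summary whose prediction data still depends on the
    input frame's models, but which holds no reference of its own."""

    def __init__(self, cache, model_ids):
        self._cache = cache
        self._model_ids = tuple(model_ids)

    def metric(self):
        # Computing a metric has to touch every model behind the predictions.
        for model_id in self._model_ids:
            self._cache.get(model_id)
        return "ok"


cache = ModelCache()
cache.register("model-B", "weights-B")

df1 = TrackedFrame(cache, ["model-B"])    # df1 contains the id of model-B
summary = Summary(cache, df1.model_ids)   # predictions are derived from df1
df1.close()                               # analogous to `del df1`

try:
    outcome = summary.metric()
except ModelReleasedError as exc:
    outcome = f"failed: {exc}"

print(outcome)
```

In this sketch the summary fails because nothing keeps model-B alive after the input frame is gone. One possible remedy, under the same assumptions, would be for the summary to take its own reference to every model its prediction dataframe depends on.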

The `model.evaluate` API has a similar issue.

[~podongfeng] 

> ML transformed dataframe keep a reference to the model
> ------------------------------------------------------
>
>                 Key: SPARK-51473
>                 URL: https://issues.apache.org/jira/browse/SPARK-51473
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, ML
>    Affects Versions: 4.1.0
>            Reporter: Ruifeng Zheng
>            Assignee: Ruifeng Zheng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>