Louiszr commented on a change in pull request #29445:
URL: https://github.com/apache/spark/pull/29445#discussion_r471731819



##########
File path: python/pyspark/ml/tuning.py
##########
@@ -536,7 +536,7 @@ def copy(self, extra=None):
         bestModel = self.bestModel.copy(extra)
         avgMetrics = self.avgMetrics
         subModels = self.subModels
-        return CrossValidatorModel(bestModel, avgMetrics, subModels)
+        return self._copyValues(CrossValidatorModel(bestModel, avgMetrics, 
subModels), extra=extra)

Review comment:
       We can. However I just found one potential issue with using 
`Params.copy` (not specific to `CrossValidator`). It creates a shallow copy of 
`self` (i.e. the models). Hence if we run the below snippet
   ```python
   cvModelCopied = cvModel.copy()
   cvModel.avgMetrics[0] = 'foo'
   assert cvModelCopied.avgMetrics[0] != 'foo'  # This will fail
   ```
   Based on the Scala equivalent I think `avgMetrics` should be shallow copied 
and `subModels` should be copied with the copying actions delegated to `copy()` 
of each model. I will push a change to this function.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to