Github user yanboliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17151#discussion_r104274732
  
    --- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
 ---
    @@ -372,16 +372,18 @@ class DecisionTreeClassifierSuite
         // Categorical splits with tree depth 2
         val categoricalData: DataFrame =
           TreeTests.setMetadata(rdd, Map(0 -> 2, 1 -> 3), numClasses = 2)
    -    testEstimatorAndModelReadWrite(dt, categoricalData, allParamSettings, 
checkModelData)
    +    testEstimatorAndModelReadWrite(dt, categoricalData, allParamSettings,
    --- End diff --
    
    @sethah Thanks for your kindly remind. I was planing to write as your 
suggestion before I'm thinking:
    * Actually all ```Model``` only need to extends a fraction of ```Params``` 
from ```Estimator```, so all ```testEstimatorAndModelReadWrite(estimator, 
dataset, testParams, checkModelData)``` should be changed to 
```testEstimatorAndModelReadWrite(estimator, dataset, testEstimatorParams, 
testModelParams, checkModelData)``` eventually. I explicitly write with the 
later way in test suites to remind developers should separate their estimator 
and model params when adding new algorithms' read/write test. I'm afraid that 
developers are not aware of the separation if they refer other test suites and 
find almost all test cases only pass in ```testParams```.
    *  Though in the currently change, we pass in ```allParamSettings``` to 
both ```testEstimatorParams``` and ```testModelParams```, this is because they 
share the same params set. Others like ```ALS``` will be pass in separate 
params. I think we should push forward to refactor ```***``` and ```***Model``` 
to separate their params, which could make models more succinct.
    * If this is a public API, I totally agree with you. However, this is an 
internal auxiliary function, I think all test cases will need to pass in 
separate params eventually, so I settle a matter at one go.
    
    This is my two cents, I'm still open to hear your thoughts. If you have 
strongly opinion, I can update according your suggestion. Thanks. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to