hirakendu commented on pull request #33710:
URL: https://github.com/apache/spark/pull/33710#issuecomment-913234281


   @srowen  Thanks a lot for making this change and starting to introduce this 
very important functionality.
   
   @zhengruifeng  While it may be good and convenient to have alternative 
ways to load a model from a file, at a minimum we need the ability to specify 
the initial weights as a vector / matrix. It's also best to keep this 
consistent with the other parameters. For example, the 
[pyspark.ml.classification.LogisticRegression](http://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.classification.LogisticRegression.html)
 class already has the parameters `lowerBoundsOnCoefficients`, 
`upperBoundsOnCoefficients`, `lowerBoundsOnIntercepts`, and 
`upperBoundsOnIntercepts`. Following the same pattern, it would be best to 
introduce new parameters `initialCoefficients` and `initialIntercepts`.
   
   To use a model file, one can simply load it and pass the coefficients and 
intercepts:
   
   ```python
   from pyspark.ml.classification import LogisticRegression, LogisticRegressionModel
   
   ## Load a previously saved model file.
   lr_model1 = LogisticRegressionModel.read().load("/path/to/lr_model1.bin")
   
   ## Pass its parameters as the initial state. Note the proposed
   ## initialIntercepts takes a vector, so use interceptVector rather
   ## than the scalar intercept attribute.
   lr_classifier2 = LogisticRegression(
       initialCoefficients=lr_model1.coefficients,
       initialIntercepts=lr_model1.interceptVector,
   )
   lr_model2 = lr_classifier2.fit(train_df)
   ```
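   For context on what the proposed parameters would do under the hood: training 
would simply resume gradient-based optimization from the supplied values instead 
of zeros. A minimal pure-Python sketch of the idea (toy single-feature data; 
`train_logistic` is just for illustration, not Spark API):
   
   ```python
   # Illustration only -- not Spark code. A toy logistic regression
   # (one feature, plain gradient descent) showing that a "warm start"
   # just means resuming optimization from supplied parameters.
   import math
   
   def train_logistic(xs, ys, init_w, init_b, lr=0.5, steps=100):
       """Gradient-descent logistic regression started from (init_w, init_b)."""
       w, b = init_w, init_b
       for _ in range(steps):
           gw = gb = 0.0
           for x, y in zip(xs, ys):
               p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
               gw += (p - y) * x
               gb += (p - y)
           w -= lr * gw / len(xs)
           b -= lr * gb / len(xs)
       return w, b
   
   xs = [-2.0, -1.0, 1.0, 2.0]
   ys = [0, 0, 1, 1]
   
   # Cold start from zeros, as fit() does today...
   w_cold, b_cold = train_logistic(xs, ys, 0.0, 0.0)
   # ...versus a warm start from the previous fit's parameters, which is
   # what initialCoefficients / initialIntercepts would enable: the second
   # run continues improving the fit rather than starting over.
   w_warm, b_warm = train_logistic(xs, ys, w_cold, b_cold, steps=10)
   ```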


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


