hirakendu commented on pull request #33710: URL: https://github.com/apache/spark/pull/33710#issuecomment-913234281
@srowen Thanks a lot for making this change and starting to introduce this very important functionality. @zhengruifeng While it may be good and convenient to have an alternative way of loading the initial model from a file, at a minimum we need the ability to specify the initial weights as a vector / matrix, and it is best to keep this consistent with the other parameters. For example, the [pyspark.ml.classification.LogisticRegression](http://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.classification.LogisticRegression.html) class already has the parameters `lowerBoundsOnCoefficients`, `upperBoundsOnCoefficients`, `lowerBoundsOnIntercepts`, and `upperBoundsOnIntercepts`. In the same format, it would be best to introduce new parameters `initialCoefficients` and `initialIntercepts` (see the sketch after the example below for how the existing bound parameters are passed). To use a model file, one can simply load it and pass its coefficients and intercepts:

```python
from pyspark.ml.classification import LogisticRegression, LogisticRegressionModel

## Load previously saved model file.
lr_model1 = LogisticRegressionModel.read().load("/path/to/lr_model1.bin")

## Pass it as initial state via the proposed parameters.
lr_classifier2 = LogisticRegression(
    initialCoefficients=lr_model1.coefficients,
    initialIntercepts=lr_model1.intercept
)
lr_model2 = lr_classifier2.fit(train_df)
```
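
For reference, a minimal sketch of how the existing bound parameters are passed today, assuming a binary problem with three features; the proposed `initialCoefficients` / `initialIntercepts` would presumably accept the same `Matrix` / `Vector` types:

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.linalg import Matrices, Vectors

## The existing bound parameters take an ml.linalg Matrix of shape
## (numCoefficientSets x numFeatures) and a Vector of length numCoefficientSets.
lr = LogisticRegression(
    lowerBoundsOnCoefficients=Matrices.dense(1, 3, [0.0, -1.0, -1.0]),
    lowerBoundsOnIntercepts=Vectors.dense([0.0])
)
```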
