GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/20633

    [SPARK-23455][ML] Default Params in ML should be saved separately in metadata

    ## What changes were proposed in this pull request?
    
    Currently, we save an ML instance's user-supplied params and default params
    as a single entity in the metadata. When loading a saved model, we set all
    of the loaded params on the newly created ML model instance as
    user-supplied params.
    
    This causes problems. For example, if we strictly disallow certain params
    from being set at the same time, a default param can fail the param check
    because it is treated as a user-supplied param after loading.
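
A minimal sketch of the failure mode described above, in plain Python rather than Spark code. The `Params` class, the helper functions, and the `threshold`/`thresholds` pair are illustrative assumptions, not the actual implementation touched by this patch.

```python
class Params:
    """Toy stand-in for an ML params holder (hypothetical, not Spark's class)."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # values provided by the library
        self.user_set = {}              # values the caller set explicitly

    def set(self, name, value):
        self.user_set[name] = value

    def is_set(self, name):
        return name in self.user_set

    def validate(self):
        # Strict rule: the two params must not both be user-supplied.
        if self.is_set("threshold") and self.is_set("thresholds"):
            raise ValueError("threshold and thresholds are mutually exclusive")


def old_style_save(p):
    # Defaults and user-supplied values are merged into one map.
    return {**p.defaults, **p.user_set}


def old_style_load(saved, defaults):
    p = Params(defaults)
    for name, value in saved.items():
        p.set(name, value)  # every loaded value becomes "user-supplied"
    return p


defaults = {"threshold": 0.5}
original = Params(defaults)
original.set("thresholds", [0.3, 0.7])
original.validate()  # passes: only `thresholds` is user-supplied

reloaded = old_style_load(old_style_save(original), defaults)
try:
    reloaded.validate()
    check_failed = False
except ValueError:
    # The default `threshold` now looks user-supplied, so the check fails.
    check_failed = True
```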
    
    Loaded default params should not be set as user-supplied params. We should
    save ML default params separately in the metadata.
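
As a rough illustration of saving the two groups separately, the sketch below writes them to distinct JSON fields. The field names `paramMap` and `defaultParamMap`, the class name, and the `build_metadata` helper are assumptions made for this example and are not taken from the patch itself.

```python
import json


def build_metadata(class_name, user_params, default_params):
    """Build a metadata dict that keeps the two kinds of params apart."""
    return {
        "class": class_name,                       # hypothetical class name
        "paramMap": dict(user_params),             # user-supplied params only
        "defaultParamMap": dict(default_params),   # defaults, stored separately
    }


metadata_json = json.dumps(
    build_metadata(
        "example.ToyClassifier",
        user_params={"thresholds": [0.3, 0.7]},
        default_params={"threshold": 0.5},
    ),
    sort_keys=True,
)
```

With this layout, a loader can restore defaults as defaults and only mark the entries under the user-supplied field as explicitly set.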
    
    For backward compatibility, when loading a metadata file written by a
    previous Spark version, we shouldn't raise an error if the default param
    field is missing.
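
The backward-compatible load can be sketched as follows: a missing default-param field is treated as empty instead of raising. As before, `paramMap`, `defaultParamMap`, and `load_params` are hypothetical names used only for illustration.

```python
import json


def load_params(metadata_json):
    """Return (user_params, default_params) from a metadata JSON string."""
    metadata = json.loads(metadata_json)
    user_params = metadata["paramMap"]
    # Metadata written by an older version has no default-param field;
    # fall back to an empty map rather than raising an error.
    default_params = metadata.get("defaultParamMap", {})
    return user_params, default_params


# Old-style metadata: everything in one field, no default-param field.
old_json = json.dumps({"paramMap": {"threshold": 0.5, "maxIter": 10}})
# New-style metadata: defaults are stored in their own field.
new_json = json.dumps(
    {"paramMap": {"maxIter": 10}, "defaultParamMap": {"threshold": 0.5}}
)

old_user, old_defaults = load_params(old_json)
new_user, new_defaults = load_params(new_json)
```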
    
    ## How was this patch tested?
    
    Existing tests pass, and new tests were added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 save-ml-default-params

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20633
    
----
commit 69648d67546b292037d26ef3a282bf26afd4863e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-02-17T02:34:11Z

    Save default params separately in JSON.

----


---
