[GitHub] spark issue #17849: [SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Val...

BryanCutler Mon, 08 May 2017 11:33:39 -0700

Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/17849
  
    Thanks @holdenk for the review!  I wrote the description in a bit of a 
rush, so let me clarify... 
    
    The temporary "fix" just creates empty params in the Python model if they 
exist in the Java model but not the Python one.  There should be no risk in 
adding these to the Python model, since they are empty when created and have no 
value defined yet.  These params are set in 2 ways: 1) after the model is fit, 
in the call to `_copy_values`, where the value is copied from the estimator for 
any matching params; 2) when the model is loaded, in the call to 
`_transfer_params_from_java`, which copies the value if the Java param has been 
explicitly set (I think I need to add something here for the case where the 
Java model has a default value but the Python model doesn't). 
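
    To make the flow above concrete, here is a rough sketch in plain Python.  
It does not use the real PySpark classes: `ToyParams`, `add_missing_params`, 
`copy_values` and `transfer_from_java` are hypothetical stand-ins that only 
mimic the behavior described, not the actual code in the PR. 

```python
class ToyParams:
    """Toy stand-in for a Params-like object (not the real pyspark class)."""

    def __init__(self, declared=(), values=None, defaults=None):
        self.declared = set(declared)         # param names the object knows about
        self.values = dict(values or {})      # explicitly set values
        self.defaults = dict(defaults or {})  # default values


def add_missing_params(py_model, java_param_names):
    """Declare, with no value, every param the Java side has but Python lacks."""
    added = set(java_param_names) - py_model.declared
    py_model.declared |= added
    return added


def copy_values(estimator, model):
    """After fit: copy the estimator's set values for any matching params."""
    for name, value in estimator.values.items():
        if name in model.declared:
            model.values[name] = value


def transfer_from_java(java_model, py_model):
    """On load: copy only values that were explicitly set on the Java side.

    Java-side defaults are deliberately not copied here, which mirrors the
    open case mentioned above (Java has a default, Python has none).
    """
    for name, value in java_model.values.items():
        if name in py_model.declared:
            py_model.values[name] = value
```

    In this sketch a param added by `add_missing_params` stays without a value 
until one of the two copy steps fills it in, which is why adding it should be 
harmless. 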
    
    I think the best way forward to get parity with the Scala API is to then 
organize a JIRA with subtasks to update the Python ML class hierarchies to 
match the Scala ones, so that the Params are defined that way, with proper 
"get" and "set" methods too.  It might also be good to have a Python test that 
checks for matching params in Java for both the estimators and the models.  It 
could be ignored by default and then enabled during the QA period.  The 
temporary fix here would continue to work and not interfere while the params 
are being added.  It could be removed once we feel that most of the params have 
been properly added and we are close to matching the Scala API.
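
    A parity test along those lines might look roughly like the sketch below.  
The environment variable `PYSPARK_ML_PARAM_PARITY`, the helper 
`missing_in_python` and the hard-coded name sets are all made up for 
illustration; a real test would read the param names from the Java and Python 
objects instead. 

```python
import os
import unittest

# Hypothetical switch: the check is skipped unless this is set, e.g. during QA.
RUN_PARITY = os.environ.get("PYSPARK_ML_PARAM_PARITY") == "1"


def missing_in_python(java_param_names, python_param_names):
    """Return the sorted param names present on the Java side but not in Python."""
    return sorted(set(java_param_names) - set(python_param_names))


@unittest.skipUnless(RUN_PARITY, "param parity check is only enabled during QA")
class ParamParityTest(unittest.TestCase):
    # Placeholder data standing in for names read from real estimators/models.
    JAVA_MODEL_PARAMS = {"featuresCol", "predictionCol"}
    PYTHON_MODEL_PARAMS = {"featuresCol", "predictionCol"}

    def test_model_params_match(self):
        missing = missing_in_python(self.JAVA_MODEL_PARAMS, self.PYTHON_MODEL_PARAMS)
        self.assertEqual(missing, [], "Python model is missing params: %s" % missing)
```

    Run through `python -m unittest`, this is reported as skipped unless the 
switch is set, so it would not get in the way while the params are still being 
added. 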



