[GitHub] spark pull request #16770: [SPARK-15009][PYTHON][ML] Construct a CountVector...

BryanCutler Wed, 14 Mar 2018 11:06:34 -0700

Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16770#discussion_r174557635
  
    --- Diff: python/pyspark/ml/tests.py ---
    @@ -1980,8 +1997,8 @@ def test_java_params(self):
                        pyspark.ml.regression]
             for module in modules:
                 for name, cls in inspect.getmembers(module, inspect.isclass):
    -                if not name.endswith('Model') and issubclass(cls, JavaParams)\
    -                        and not inspect.isabstract(cls):
    +                if not name.endswith('Model') and not name.endswith('Params')\
    --- End diff --
    
    Yes, that's pretty much right, but this only checks estimators and skips
    models. We should also have an explicit check for
    `CountVectorizer.from_vocabulary` here, since that is possible.
    Unfortunately, a new param `maxDF` was recently added in Scala, so the param
    check would fail. Once that is in Python, we can add the check for it here.
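
To illustrate why models are skipped, here is a minimal sketch of the name filter. It reproduces only the part of the new condition that is visible in the diff above (the rest of the condition is cut off there), and it uses a hypothetical stand-in module with empty placeholder classes rather than the real `pyspark.ml` modules:

```python
import inspect
import types


def classes_to_check(module):
    """Return the class names kept by the name filter shown in the diff."""
    return [
        name
        for name, cls in inspect.getmembers(module, inspect.isclass)
        if not name.endswith('Model') and not name.endswith('Params')
    ]


# Hypothetical stand-in module; the real test iterates over pyspark.ml modules.
fake_module = types.ModuleType('fake_feature')
for cls_name in ('CountVectorizer', 'CountVectorizerModel',
                 '_CountVectorizerParams'):
    # Empty placeholder classes, used only to exercise the name filter.
    setattr(fake_module, cls_name, type(cls_name, (object,), {}))

kept = classes_to_check(fake_module)
print(kept)
```

Any class whose name ends in `Model` is dropped by this filter, so a model would need its own explicit check.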



---

