Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162717142
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala ---
@@ -201,9 +184,13 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") override val uid: String)
@Since("1.4.0")
override def transformSchema(schema: StructType): StructType = {
- if (isBucketizeMultipleColumns()) {
+ ParamValidators.checkExclusiveParams(this, "inputCol", "inputCols")
--- End diff ---
The problem with using a general method like this is that it is hard to capture
model-specific requirements. As written, it misses two checks: that exactly one
(not just <= 1) Param from each pair is set, and that either all of the
single-column Params or all of the multi-column Params are set. (The same issue
occurs in https://github.com/apache/spark/pull/20146 .) It will also be hard to
perform these checks while accounting for default values.
I'd argue that it's not worth trying to use generic checking functions here.
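To illustrate the missing checks, here is a minimal sketch (plain Scala, not Spark's actual `ParamValidators` API; the names `validate`, `inputCol`, etc. are hypothetical stand-ins) of a model-specific validation that requires either all single-column Params or all multi-column Params to be set, and never a mix or neither:

```scala
// Hypothetical sketch, not Spark code: a transformer is validly configured
// only when ALL single-column Params are set and NO multi-column Params are,
// or vice versa. A generic "exclusive params" check that only forbids setting
// both of (inputCol, inputCols) would miss the "exactly one mode, fully set"
// requirement discussed above.
object ExclusiveParamCheck {
  def validate(
      inputCol: Option[String],
      outputCol: Option[String],
      inputCols: Option[Seq[String]],
      outputCols: Option[Seq[String]]): Either[String, Unit] = {
    val singleSet = Seq(inputCol.isDefined, outputCol.isDefined)
    val multiSet = Seq(inputCols.isDefined, outputCols.isDefined)
    if (singleSet.forall(identity) && multiSet.forall(s => !s)) {
      Right(()) // single-column mode, fully configured
    } else if (multiSet.forall(identity) && singleSet.forall(s => !s)) {
      Right(()) // multi-column mode, fully configured
    } else {
      Left("Set either all of (inputCol, outputCol) or all of " +
        "(inputCols, outputCols), but not a mix and not neither.")
    }
  }
}
```

Note that even this sketch ignores default values; a Param with a default but no user-set value would need separate handling, which is part of why a generic helper is hard to get right.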
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]