[GitHub] spark issue #15414: [SPARK-17848][ML] Move LabelCol datatype cast into Predi...

sethah Mon, 10 Oct 2016 14:31:45 -0700

Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/15414
  
    What do you think about adding a new suite, `PredictorSuite`, where we can
    create a mock predictor and call `train` on data of various types? The
    `train` method can just require that the label column is `DoubleType`:
    
    ````scala
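    import org.apache.spark.ml.{PredictionModel, Predictor}
    import org.apache.spark.ml.linalg.Vector
    import org.apache.spark.ml.param.ParamMap
    import org.apache.spark.sql.Dataset
    import org.apache.spark.sql.types.DoubleType

    // Mock estimator: training only checks the label column's type.
    class MockPredictor(override val uid: String)
      extends Predictor[Vector, MockPredictor, MockPredictionModel] {

      override def train(dataset: Dataset[_]): MockPredictionModel = {
        // Fails unless the label column has already been cast to DoubleType.
        require(dataset.schema("label").dataType == DoubleType)
        new MockPredictionModel(uid)
      }

      override def copy(extra: ParamMap): MockPredictor = defaultCopy(extra)
    }

    // Mock model: always predicts 1.0, since the predictions are irrelevant here.
    class MockPredictionModel(override val uid: String)
      extends PredictionModel[Vector, MockPredictionModel] {

      override def predict(features: Vector): Double = 1.0

      override def copy(extra: ParamMap): MockPredictionModel = defaultCopy(extra)
    }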
    ````
    
    Then we just have a test that calls `fit` for each type of data.
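
    A rough sketch of what that check could look like, written as a plain
    `main` rather than a real suite so it does not depend on any test
    harness. It assumes the `MockPredictor` above is in scope and that
    `Predictor.fit` casts the label column to `DoubleType` before calling
    `train` (the change proposed in this PR). The object name, the sample
    rows, and the list of types are made up for illustration:

    ````scala
    import org.apache.spark.ml.linalg.Vectors
    import org.apache.spark.ml.util.Identifiable
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col
    import org.apache.spark.sql.types._

    // Hypothetical driver; assumes MockPredictor from the snippet above.
    object PredictorLabelTypeCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("PredictorLabelTypeCheck")
          .getOrCreate()
        try {
          // Illustrative data: integer labels plus a features vector.
          val df = spark.createDataFrame(Seq(
            (0, Vectors.dense(0.0, 2.0, 3.0)),
            (1, Vectors.dense(0.0, 3.0, 9.0)),
            (0, Vectors.dense(0.0, 2.0, 6.0))
          )).toDF("label", "features")

          val numericTypes = Seq(ByteType, ShortType, IntegerType, LongType,
            FloatType, DoubleType, DecimalType(10, 0))

          val predictor = new MockPredictor(Identifiable.randomUID("mock"))

          numericTypes.foreach { t =>
            // MockPredictor.train throws IllegalArgumentException (via require)
            // if the label column reaches it as anything other than DoubleType.
            predictor.fit(df.select(col("label").cast(t), col("features")))
          }
        } finally {
          spark.stop()
        }
      }
    }
    ````

    In an actual suite the same loop would sit inside a single test case,
    probably alongside a case asserting that a non-numeric label (e.g. a
    `StringType` column) is rejected.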


