[GitHub] spark pull request: [SPARK-8542][MLlib]PMML export for Decision Tr...

JasmineGeorge Thu, 10 Sep 2015 06:10:37 -0700

Github user JasmineGeorge commented on the pull request:

    https://github.com/apache/spark/pull/7842#issuecomment-139230019
  
    When I ran the evaluator, I realized that it expects the root node's
    predicate to always be True, and every node to carry the predicate
    condition that selects that node.
    This is unlike MLlib, where the split decides between the left and right
    child node: for a continuous feature, go left if feature <= threshold,
    else right; for a categorical feature, go left if the feature value is in
    the split's category set, else right.
    I had to move the predicates onto the left and right nodes accordingly in
    the PMML, instead of keeping them on the parent node, which holds the
    split in MLlib.
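
    The move from parent-held splits to child-held predicates can be sketched
    roughly as follows. This is a minimal Python illustration, not the code in
    the pull request: the Split and MLNode classes, the dict-based output, and
    the "field_<index>" naming are assumptions made for the example. The
    operator names (lessOrEqual, greaterThan, isIn, isNotIn) are the ones PMML
    defines for SimplePredicate and SimpleSetPredicate.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Split:
    """Hypothetical stand-in for an MLlib split held by the parent node."""
    feature: int
    threshold: Optional[float] = None               # set for continuous features
    categories: Optional[FrozenSet[float]] = None   # "go left" set for categorical features


@dataclass
class MLNode:
    """Hypothetical stand-in for an MLlib tree node."""
    prediction: float
    split: Optional[Split] = None
    left: Optional["MLNode"] = None
    right: Optional["MLNode"] = None


def child_predicates(split):
    """Turn one parent split into a (left, right) pair of child predicates."""
    field = f"field_{split.feature}"  # assumed naming scheme
    if split.categories is not None:
        values = sorted(split.categories)
        left = {"type": "SimpleSetPredicate", "field": field,
                "booleanOperator": "isIn", "values": values}
        right = {"type": "SimpleSetPredicate", "field": field,
                 "booleanOperator": "isNotIn", "values": values}
        return left, right
    left = {"type": "SimplePredicate", "field": field,
            "operator": "lessOrEqual", "value": split.threshold}
    right = {"type": "SimplePredicate", "field": field,
             "operator": "greaterThan", "value": split.threshold}
    return left, right


def to_pmml_node(node, predicate=None):
    """Build a PMML-style node; the root (no predicate passed in) gets True."""
    out = {"predicate": predicate or {"type": "True"},
           "score": node.prediction,
           "children": []}
    if node.split is not None:
        left_pred, right_pred = child_predicates(node.split)
        out["children"] = [to_pmml_node(node.left, left_pred),
                           to_pmml_node(node.right, right_pred)]
    return out
```

    Each child ends up with the condition under which it is chosen, so an
    evaluator can select a node by testing only that node's own predicate.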
    
    Also, for classification there must be a class field among the data
    fields and mining fields to hold the predicted values.
    Added those too.
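
    As a rough illustration of what such a class field could look like, the
    sketch below builds a DataDictionary and a MiningSchema with Python's
    xml.etree.ElementTree. The build_schema helper, the "class" field name,
    the "field_<index>" feature names, and the choice of usageType="target"
    (the PMML 4.2 spelling) are assumptions for the example, not taken from
    the pull request.

```python
import xml.etree.ElementTree as ET


def build_schema(num_features, class_labels, target_name="class"):
    """Return (DataDictionary, MiningSchema) elements including a class field."""
    data_dict = ET.Element("DataDictionary")
    mining = ET.Element("MiningSchema")

    # One continuous input field per feature index.
    for i in range(num_features):
        name = f"field_{i}"  # assumed naming scheme
        ET.SubElement(data_dict, "DataField",
                      name=name, optype="continuous", dataType="double")
        ET.SubElement(mining, "MiningField", name=name, usageType="active")

    # The class field: categorical, listing each label the tree can predict.
    target = ET.SubElement(data_dict, "DataField",
                           name=target_name, optype="categorical",
                           dataType="string")
    for label in class_labels:
        ET.SubElement(target, "Value", value=str(label))
    ET.SubElement(mining, "MiningField", name=target_name, usageType="target")

    data_dict.set("numberOfFields", str(num_features + 1))
    return data_dict, mining
```

    Calling build_schema(2, [0.0, 1.0]) and passing either result to
    ET.tostring shows the generated XML.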
    
    About keeping the MiningFields in the same order as the input:
    the input is a vector of Double, and the features are essentially ordered
    according to their index in the vector.
    
    I changed the code to sort the mining fields and data fields by the
    numeric part of the field names, which is their index in the input vector.
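
    The ordering idea can be sketched like this in Python. The
    "field_<index>" name pattern and the helper names are assumptions for the
    example; a field without a trailing number (such as a class field) would
    need separate handling.

```python
import re


def field_index(name):
    """Extract the trailing number of a field name, e.g. 'field_10' -> 10."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no numeric suffix in field name: {name!r}")
    return int(match.group(1))


def sort_fields(names):
    """Order field names by their numeric suffix, not lexicographically."""
    return sorted(names, key=field_index)
```

    A plain string sort would put "field_10" before "field_2"; sorting on the
    parsed index keeps the fields aligned with the input vector.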





