mgaido91 commented on a change in pull request #23773: [SPARK-26721][ML] Avoid per-tree normalization in featureImportance for GBT
URL: https://github.com/apache/spark/pull/23773#discussion_r256430166
 
 

 ##########
 File path: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala
 ##########
 @@ -341,11 +341,12 @@ class GBTClassificationModel private[ml](
    * The importance vector is normalized to sum to 1. This method is suggested 
by Hastie et al.
 
 Review comment:
   This PR removes the normalization of the importance vector for each tree, but
the vector is still normalized at the end. As a diagram, before the PR it was:
   `tree importance` -> `normalization` -> `sum` -> `normalization`
   Now it is:
   `tree importance` -> `sum` -> `normalization`
   So the final result is still normalized.
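
   To make the two pipelines concrete, here is a minimal sketch in plain Python. It is not the Spark code: the helper names and the two example trees are hypothetical, and it ignores any per-tree weights. It only shows that both variants end with a vector summing to 1, while the relative importances can differ.

```python
# Illustrative sketch only; not Spark's implementation.
# All names and numbers below are hypothetical.

def normalize(vec):
    """Scale a vector so its entries sum to 1 (left unchanged if the sum is 0)."""
    total = sum(vec)
    return [v / total for v in vec] if total != 0 else list(vec)

def sum_vectors(vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(col) for col in zip(*vectors)]

def importance_before(tree_importances):
    # tree importance -> normalization -> sum -> normalization
    return normalize(sum_vectors([normalize(t) for t in tree_importances]))

def importance_after(tree_importances):
    # tree importance -> sum -> normalization
    return normalize(sum_vectors(tree_importances))

# Raw importances for two trees over three features. The second tree
# contributes very little in absolute terms.
trees = [[8.0, 2.0, 0.0], [0.0, 0.1, 0.1]]

before = importance_before(trees)
after = importance_after(trees)

print("before:", before, "sum =", sum(before))
print("after: ", after, "sum =", sum(after))
```

   With per-tree normalization the weak second tree counts as much as the first one. Without it, each tree contributes in proportion to its raw importance, and the final vector is normalized either way.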

