Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/10298#discussion_r47959777
--- Diff: python/pyspark/mllib/feature.py ---
@@ -174,6 +174,38 @@ def setWithStd(self, withStd):
self.call("setWithStd", withStd)
return self
+ @property
+ @since('2.0.0')
+ def withStd(self):
+ """
+ Returns if the model scales the data to unit standard deviation.
+ """
+ return self.call("withStd")
+
+ @property
+ @since('2.0.0')
+ def withMean(self):
+ """
+ Returns if the model centers the data before scaling.
+ """
+ return self.call("withMean")
+
+ @property
+ @since('2.0.0')
+ def std(self):
+ """
+ Return the column standard deviation values. Only set if model was
+ trained withStd.
+ """
+ return self.call("std")
+
+ @property
+ @since('2.0.0')
+ def mean(self):
--- End diff --
That's a good point. It turns out the comment was a little off: currently we
compute the mean and std regardless of the withMean/withStd settings. There is
a TODO to skip the computation when both are set to false, although I'm not
sure that makes sense, since it wouldn't produce a usable model. I've updated
the comment text.
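
To illustrate the behaviour described above, here is a minimal pure-Python sketch, not the actual PySpark or Scala implementation. `ScalerModelSketch` and `fit` are hypothetical names made up for this example, and the use of the sample standard deviation and the handling of zero-variance columns are assumptions. The point is only that the fitted statistics exist regardless of the flags, while the flags control what `transform` applies.

```python
from statistics import mean, stdev


class ScalerModelSketch:
    """Hypothetical stand-in for a fitted scaler model (not the PySpark class)."""

    def __init__(self, with_mean, with_std, mean_values, std_values):
        self._with_mean = with_mean
        self._with_std = with_std
        self._mean = mean_values
        self._std = std_values

    @property
    def withMean(self):
        """Whether transform() centers the data before scaling."""
        return self._with_mean

    @property
    def withStd(self):
        """Whether transform() scales the data to unit standard deviation."""
        return self._with_std

    @property
    def mean(self):
        """Column means; populated regardless of withMean in this sketch."""
        return self._mean

    @property
    def std(self):
        """Column standard deviations; populated regardless of withStd."""
        return self._std

    def transform(self, row):
        out = list(row)
        if self._with_mean:
            out = [x - m for x, m in zip(out, self._mean)]
        if self._with_std:
            # Assumption: zero-variance columns map to 0.0 to avoid dividing by zero.
            out = [x / s if s != 0 else 0.0 for x, s in zip(out, self._std)]
        return out


def fit(rows, with_mean=False, with_std=True):
    """Compute both statistics unconditionally, whatever the flags say."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]  # sample standard deviation (assumption)
    return ScalerModelSketch(with_mean, with_std, means, stds)


# Both flags off: the statistics are still available, but transform is a no-op.
model = fit([[1.0, 10.0], [3.0, 30.0]], with_mean=False, with_std=False)
```

With both flags set to false, `model.mean` and `model.std` are still populated but `transform` returns its input unchanged, which is why skipping the computation in that case would not change the output.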