[ https://issues.apache.org/jira/browse/SPARK-20711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-20711:
------------------------------------

    Assignee:     (was: Apache Spark)

> MultivariateOnlineSummarizer incorrect min/max for NaN value
> ------------------------------------------------------------
>
>                 Key: SPARK-20711
>                 URL: https://issues.apache.org/jira/browse/SPARK-20711
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: zhengruifeng
>            Priority: Minor
>
> {code}
> scala> val summarizer = new MultivariateOnlineSummarizer()
> summarizer: org.apache.spark.mllib.stat.MultivariateOnlineSummarizer =
> org.apache.spark.mllib.stat.MultivariateOnlineSummarizer@2ac58d
> scala> summarizer.add(Vectors.dense(Double.NaN, -10.0))
> res20: summarizer.type =
> org.apache.spark.mllib.stat.MultivariateOnlineSummarizer@2ac58d
> scala> summarizer.add(Vectors.dense(Double.NaN, 2.0))
> res21: summarizer.type =
> org.apache.spark.mllib.stat.MultivariateOnlineSummarizer@2ac58d
> scala> summarizer.min
> res22: org.apache.spark.mllib.linalg.Vector = [1.7976931348623157E308,-10.0]
> scala> summarizer.max
> res23: org.apache.spark.mllib.linalg.Vector = [-1.7976931348623157E308,2.0]
> {code}
> For a feature only containing {{Double.NaN}}, the returned max is
> {{Double.MinValue}} and the min is {{Double.MaxValue}}.
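
A plausible explanation for the reported values, offered as a minimal sketch rather than as the Spark implementation: if a running min/max is seeded with {{Double.MaxValue}} and {{Double.MinValue}} and updated with plain {{<}} / {{>}} comparisons, a NaN can never replace the seed, because every ordered comparison involving NaN is false. The names {{NaNMinMaxSketch}}, {{RunningMinMax}} and {{summarize}} below are hypothetical and exist only for illustration.

```scala
// Hypothetical stand-in for a per-feature running min/max; NOT the Spark source.
object NaNMinMaxSketch {

  final class RunningMinMax {
    // Assumed sentinels: chosen so that any finite value replaces them.
    private var currMin: Double = Double.MaxValue
    private var currMax: Double = Double.MinValue

    def add(value: Double): this.type = {
      // Ordered comparisons with NaN are always false,
      // so a NaN input leaves both bounds untouched.
      if (value < currMin) currMin = value
      if (value > currMax) currMax = value
      this
    }

    def min: Double = currMin
    def max: Double = currMax
  }

  // Feeds all values of one feature through the accumulator and returns (min, max).
  def summarize(values: Seq[Double]): (Double, Double) = {
    val acc = new RunningMinMax
    values.foreach(v => acc.add(v))
    (acc.min, acc.max)
  }

  def main(args: Array[String]): Unit = {
    // Feature containing only NaN: the sentinels are returned unchanged.
    println(summarize(Seq(Double.NaN, Double.NaN)))
    // prints (1.7976931348623157E308,-1.7976931348623157E308)

    // Feature with ordinary values behaves as expected.
    println(summarize(Seq(-10.0, 2.0)))
    // prints (-10.0,2.0)
  }
}
```

Under these assumptions the sketch yields the same pair of values as the first column in the report; whether NaN should instead be returned for such a feature is the question the issue raises.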