[
https://issues.apache.org/jira/browse/SPARK-22271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shafique Jamal updated SPARK-22271:
-----------------------------------
Description:
Please excuse me if this issue was addressed already - I was unable to find it.
Calling .describe().show() on my dataframe results in a value of null for the
row "mean":
{noformat}
val foo = spark.read.parquet("decimalNumbers.parquet")
foo.select(col("numericvariable")).describe().show()
foo: org.apache.spark.sql.DataFrame = [numericvariable: decimal(38,32)]
+-------+--------------------+
|summary|     numericvariable|
+-------+--------------------+
|  count|                 299|
|   mean|                null|
| stddev|  0.2376438793946738|
|    min|0.037815489727642...|
|    max|2.138189366554511...|
+-------+--------------------+
{noformat}
But all of the values in this column look fine (I can attach a parquet file).
When I round the column, however, the mean is computed correctly:
{noformat}
foo.select(bround(col("numericvariable"), 31)).describe().show()
+-------+---------------------------+
|summary|bround(numericvariable, 31)|
+-------+---------------------------+
|  count|                        299|
|   mean|       0.139522503183236...|
| stddev|         0.2376438793946738|
|    min|       0.037815489727642...|
|    max|       2.138189366554511...|
+-------+---------------------------+
{noformat}
Rounding to 32 decimal places, however, also gives null for the mean.
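A possible workaround, sketched under assumptions: since the mean only becomes null at scale 32, the null may come from decimal(38,32) leaving too few integer digits for the intermediate average computation. This is a guess and is not confirmed in this report. The snippet below casts the column to a double, or to a decimal with a smaller scale, before calling describe(). The local SparkSession, the made-up sample values standing in for decimalNumbers.parquet, and the target scale of 18 are all illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, DoubleType}

// Local session for illustration; in spark-shell the existing `spark` could be used instead.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("describe-decimal-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for decimalNumbers.parquet: a few made-up values
// cast to the same type as in the report, decimal(38,32).
val foo = Seq("0.0378", "0.1395", "2.1381").toDF("raw")
  .select(col("raw").cast(DecimalType(38, 32)).as("numericvariable"))

// Option 1: cast to double. Gives up exact decimal arithmetic, but the
// aggregate is then computed on doubles rather than on decimals.
val asDouble = foo.select(
  col("numericvariable").cast(DoubleType).as("numericvariable"))
asDouble.describe().show()

// Option 2: stay with a decimal type but use a smaller scale, which leaves
// more digits before the decimal point. The scale 18 is an arbitrary choice.
val asNarrowerDecimal = foo.select(
  col("numericvariable").cast(DecimalType(38, 18)).as("numericvariable"))
asNarrowerDecimal.describe().show()
```

Neither cast changes the stored data. Both only affect the column passed to describe(), at the cost of some precision in the reported statistics.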
> Describe results in "null" for the value of "mean" of a numeric variable
> ------------------------------------------------------------------------
>
> Key: SPARK-22271
> URL: https://issues.apache.org/jira/browse/SPARK-22271
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Environment:
> Reporter: Shafique Jamal
> Priority: Minor