[ 
https://issues.apache.org/jira/browse/SPARK-23576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491267#comment-16491267
 ] 

Hafthor Stefansson commented on SPARK-23576:
--------------------------------------------

Here's an equivalent problem:

spark.sql("select cast(1 as decimal(38,18)) as x").write.format("parquet").save("decimal.parq")

spark.read.schema(spark.sql("select cast(1 as decimal) as x").schema).parquet("decimal.parq").show

returns 1000000000000000000!

It should throw an exception, as it would if I specified a schema with x as 
float or some other incompatible type.
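
The number above is consistent with the stored unscaled integer being read back under the wrong scale. Parquet keeps a decimal as an unscaled integer, with precision and scale recorded in the file schema, and a bare `decimal` in Spark SQL means decimal(10,0). The following is only a sketch of that arithmetic in plain Scala, not Spark's actual reader code; the object name `DecimalScaleMismatch` and the helper `interpret` are made up for illustration, and the assumption that the reader applies the requested scale instead of the file's scale is inferred from the output, not confirmed.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Hypothetical helper object, for illustration only (not part of Spark).
object DecimalScaleMismatch {
  // Interpret an unscaled integer under a given scale:
  // value = unscaled * 10^(-scale).
  def interpret(unscaled: BigInteger, scale: Int): JBigDecimal =
    new JBigDecimal(unscaled, scale)

  def main(args: Array[String]): Unit = {
    // 1 written as decimal(38,18): the stored unscaled integer is 10^18.
    val one = new JBigDecimal("1").setScale(18).unscaledValue()

    // Read back with the scale it was written with: the value is 1.
    println(interpret(one, 18).toPlainString)

    // Assumed failure mode: read back with scale 0 (plain `decimal`),
    // so the unscaled integer 10^18 is shown as the value itself.
    println(interpret(one, 0).toPlainString)

    // Original report: 3 written as decimal(38,18), read as decimal(31,8),
    // which gives 3 * 10^18 / 10^8 = 3 * 10^10.
    val three = new JBigDecimal("3").setScale(18).unscaledValue()
    println(interpret(three, 8).toPlainString)
  }
}
```

Under the same assumption, the 30000000000 in the original report falls out of the same calculation: the ORC-to-Parquet step writes scale 18, and the Hive table definition asks for scale 8.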

> SparkSQL - Decimal data missing decimal point
> ---------------------------------------------
>
>                 Key: SPARK-23576
>                 URL: https://issues.apache.org/jira/browse/SPARK-23576
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>         Environment: spark 2.3.0
> linux
>            Reporter: R
>            Priority: Major
>
> Integers like 3 stored as a decimal display in sparksql as 30000000000 with 
> no decimal point. But hive displays fine as 3.
> Repro steps:
>  # Create a .csv with the value 3
>  # Use spark to read the csv, cast it as decimal(31,8) and output to an ORC 
> file
>  # Use spark to read the ORC, infer the schema (it will infer 38,18 
> precision) and output to a Parquet file
>  # Create external hive table to read the parquet ( define the hive type as 
> decimal(31,8))
>  # Use spark-sql to select from the external hive table.
>  # Notice how sparksql shows 30000000000    !!!
>  


