[ https://issues.apache.org/jira/browse/SPARK-23576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491267#comment-16491267 ]
Hafthor Stefansson commented on SPARK-23576:
--------------------------------------------

Here's an equivalent problem:

    spark.sql("select cast(1 as decimal(38,18)) as x").write.format("parquet").save("decimal.parq")
    spark.read.schema(spark.sql("select cast(1 as decimal) as x").schema).parquet("decimal.parq").show

returns 1000000000000000000! It should throw, like it would if I specified a schema with x as float, or some other type.

> SparkSQL - Decimal data missing decimal point
> ---------------------------------------------
>
>                 Key: SPARK-23576
>                 URL: https://issues.apache.org/jira/browse/SPARK-23576
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>         Environment: spark 2.3.0
> linux
>            Reporter: R
>            Priority: Major
>
> Integers like 3 stored as a decimal display in sparksql as 30000000000 with
> no decimal point. But hive displays fine as 3.
> Repro steps:
> # Create a .csv with the value 3
> # Use spark to read the csv, cast it as decimal(31,8) and output to an ORC
> file
> # Use spark to read the ORC, infer the schema (it will infer 38,18
> precision) and output to a Parquet file
> # Create external hive table to read the parquet ( define the hive type as
> decimal(31,8))
> # Use spark-sql to select from the external hive table.
> # Notice how sparksql shows 30000000000 !!!
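
Both numbers above are consistent with one explanation: the stored unscaled integer is being read back under the scale of the declared schema instead of the scale it was written with. The following is only a minimal sketch of that arithmetic using java.math.BigDecimal, with no Spark involved. The helper name reinterpret is hypothetical, and the sketch assumes that an unqualified decimal in Spark SQL means decimal(10,0) and that the reader applies the declared scale directly to the stored digits; it does not show Spark's actual code path.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Hypothetical helper (not a Spark API): take a value written with
// `writtenScale` fractional digits, keep only its unscaled integer,
// and reinterpret that integer under `readScale`.
def reinterpret(value: String, writtenScale: Int, readScale: Int): JBigDecimal = {
  // Widening the scale only appends zeros, so no rounding mode is needed here.
  val unscaled: BigInteger = new JBigDecimal(value).setScale(writtenScale).unscaledValue()
  new JBigDecimal(unscaled, readScale)
}

// Comment's case: written as decimal(38,18), read with a scale-0 schema.
val fromComment = reinterpret("1", 18, 0)

// Original report: written as decimal(38,18), read as decimal(31,8).
val fromReport = reinterpret("3", 18, 8)

println(fromComment.toPlainString) // the digits 1 followed by 18 zeros
println(fromReport.stripTrailingZeros.toPlainString) // 3 * 10^10
```

Under these assumptions, 1 comes back as 10^18 and 3 comes back as 3 * 10^(18-8) = 30000000000, matching the values quoted in the comment and in the issue description.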