[GitHub] [spark] cloud-fan commented on pull request #31284: [SPARK-34167][SQL]Reading parquet with IntDecimal written as a LongDecimal blows up

GitBox Thu, 28 Jan 2021 08:43:42 -0800


cloud-fan commented on pull request #31284:
URL: https://github.com/apache/spark/pull/31284#issuecomment-769215659



   Another simpler idea is to fix the schema inference:
   
   
https://github.com/apache/spark/blob/v3.1.1-rc1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L112
   
   For INT64, we should make sure the inferred `DecimalType` is a long decimal,
that is, one whose precision exceeds `Decimal.MAX_INT_DIGITS` (9). Then we will
allocate long column vectors and avoid this issue. It's also probably more
efficient, since there is no down-casting, and the wasted space is not a big
deal.
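
   A minimal, Spark-free sketch of that idea follows. It is not the code in
`ParquetSchemaConverter.scala`; the object, types, and `inferDecimal` helper
are hypothetical names used only for illustration. The only facts borrowed
from Spark are the digit limits (9 digits fit in an `Int`, 18 in a `Long`).
The sketch assumes that widening the inferred precision for INT64-backed
decimals is acceptable.

```scala
// Hypothetical model of the proposed inference rule; not Spark's actual API.
object DecimalInferenceSketch {
  // Same limits Spark's Decimal uses: 9 digits fit in an Int, 18 in a Long.
  val MaxIntDigits = 9
  val MaxLongDigits = 18

  // Parquet physical types that can carry a DECIMAL annotation as an integer.
  sealed trait PhysicalType
  case object Int32 extends PhysicalType
  case object Int64 extends PhysicalType

  final case class InferredDecimal(precision: Int, scale: Int) {
    // An "int decimal" would be read into an int column vector.
    def isIntDecimal: Boolean = precision <= MaxIntDigits
  }

  // Infer the decimal type from the physical type and the declared precision/scale.
  def inferDecimal(physical: PhysicalType, precision: Int, scale: Int): InferredDecimal =
    physical match {
      case Int32 =>
        require(precision <= MaxIntDigits, s"precision $precision too large for INT32")
        InferredDecimal(precision, scale)
      case Int64 =>
        require(precision <= MaxLongDigits, s"precision $precision too large for INT64")
        // Assumption: widen small precisions so the result is always a long decimal.
        InferredDecimal(math.max(precision, MaxIntDigits + 1), scale)
    }
}
```

   With this rule, an INT64 column declared as `decimal(5, 2)` would be
inferred with precision 10 and scale 2, so it would never be classified as an
int decimal. The trade-off is that the inferred schema no longer shows the
precision the writer declared.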
   
   





