pgaref commented on pull request #1823:
URL: https://github.com/apache/hive/pull/1823#issuecomment-771227835


   > All q.out files show data size increase for tables. Since most of them are 
consistently additional 4 bytes per row, that seems like not a bug. However, I 
found some irregular increases too like 16 bytes per row. Can you explain why 
data size increased so we can check the irregularities and make sure they are 
expected?
   
   Hey @mustafaiman -- the main size differences are on Timestamp columns, 
where we now support nanosecond precision. The column stats carry 2 extra 
variables for this, one for the lower bound and one for the upper bound -- see 
[ORC-611](https://issues.apache.org/jira/browse/ORC-611).
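
To illustrate why nanosecond precision needs extra stats fields, here is a minimal sketch in Python. It is not the ORC implementation: the names `TimestampStat`, `split_timestamp` and `join_timestamp` are hypothetical, and the millisecond/remainder split is an assumption made for illustration only. It shows one bound (min or max) being stored as a millisecond value plus one extra integer for the sub-millisecond part.

```python
from dataclasses import dataclass

NANOS_PER_MILLI = 1_000_000


@dataclass
class TimestampStat:
    """One bound (min or max) of a timestamp column statistic (hypothetical)."""
    millis: int       # milliseconds since epoch, as millisecond-only stats would keep
    extra_nanos: int  # assumed extra field: sub-millisecond remainder in nanoseconds


def split_timestamp(nanos_since_epoch: int) -> TimestampStat:
    # divmod floors toward negative infinity, so the remainder is always
    # in [0, NANOS_PER_MILLI), including for pre-epoch timestamps.
    millis, extra = divmod(nanos_since_epoch, NANOS_PER_MILLI)
    return TimestampStat(millis=millis, extra_nanos=extra)


def join_timestamp(stat: TimestampStat) -> int:
    # Recombine the two stored values into the full nanosecond timestamp.
    return stat.millis * NANOS_PER_MILLI + stat.extra_nanos


if __name__ == "__main__":
    lower = split_timestamp(1_612_137_600_123_456_789)
    print(lower, join_timestamp(lower))
```

With a min and a max bound per column, that is two extra integers per statistics entry, which is the kind of per-entry growth described above.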
   
   Beyond that, other changes can also affect size, such as trimming of 
StringStatistics minimum and maximum values as part of ORC-203, and the 
List and Map column statistics that were recently added as part of ORC-398.
   
   Happy to check further if you have doubts about a particular query.
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



