pgaref commented on pull request #1823: URL: https://github.com/apache/hive/pull/1823#issuecomment-771227835
> All q.out files show data size increase for tables. Since most of them are consistently additional 4 bytes per row, that seems like not a bug. However, I found some irregular increases too like 16 bytes per row. Can you explain why data size increased so we can check the irregularities and make sure they are expected?

Hey @mustafaiman, the main size differences are on Timestamp columns, where we now support nanosecond precision. The stats use 2 extra variables for the lower and the upper precision; see [ORC-611](https://issues.apache.org/jira/browse/ORC-611).

Beyond that, other changes can also affect size, such as the trimming of StringStatistics minimum and maximum values (ORC-203) and the List and Map column statistics that were recently added (ORC-398).

Happy to check further if you have doubts about a particular query.