[ https://issues.apache.org/jira/browse/HIVE-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707207#comment-15707207 ]
Prasanth Jayachandran commented on HIVE-15115:
----------------------------------------------
Could you confirm whether both operating systems are using the same
timezone? My guess is that the difference in file size comes from a difference
in timezone. ORC stores the timezone ID as a string in the file footer, so if
the q.out files are generated in different timezones, the file sizes will differ.
Can you confirm whether that's the case? To view the timezone information, you can run
"hive --orcfiledump -t <path-to-orc-file>" and compare which timezone gets printed
on OSX and CentOS.
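As a rough illustration of why the timezone ID alone can shift the file size by a few bytes: the ID is written as a string, so its encoded length depends on the zone name. The sketch below only compares the UTF-8 lengths of a few example IDs. The zone names are illustrative picks, not the zones of the actual build machines, and the helper tz_id_size_delta is hypothetical. The real on-disk delta may also be affected by footer compression.

```python
def tz_id_size_delta(tz_a: str, tz_b: str) -> int:
    """Return how many more bytes tz_b needs than tz_a when stored as UTF-8."""
    return len(tz_b.encode("utf-8")) - len(tz_a.encode("utf-8"))


# Example zone pairs (assumptions for illustration only).
example_pairs = [
    ("UTC", "UTC"),
    ("UTC", "Etc/UTC"),
    ("America/Los_Angeles", "Europe/Budapest"),
]

for tz_a, tz_b in example_pairs:
    print(f"{tz_a} -> {tz_b}: {tz_id_size_delta(tz_a, tz_b):+d} bytes")
```

If the timezones printed by orcfiledump on the two machines differ in length by the same amount as the totalSize values, that would be consistent with this explanation, though not proof of it.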
The other possibility is the ordering of rows. Generating ORC files with different
row orderings produces different file sizes because of run-length encoding.
Usually we avoid such flakiness by explicitly adding ORDER BY to the INSERT or CTAS
query, something like "INSERT OVERWRITE TABLE orctable SELECT * FROM src ORDER BY
key". This avoids file sizes changing because of the encodings
(this applies to Parquet as well).
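To make the row-ordering effect concrete, here is a minimal sketch using a plain run-length encoder. This is a textbook RLE, not ORC's actual integer or string encodings, and the sample values are made up. It only shows that the same multiset of values encodes to a different number of runs depending on order.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Same values, two different row orders (made-up sample data).
unordered = [3, 1, 3, 2, 1, 2, 3, 1]
ordered = sorted(unordered)

print("runs without ordering:", len(rle_encode(unordered)))
print("runs with ordering:   ", len(rle_encode(ordered)))
```

With a fixed ORDER BY, every run of the test writes the rows in the same sequence, so the encoded size no longer depends on the order in which tasks happen to emit rows.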
> Flaky test: TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
> ----------------------------------------------------------------------
>
> Key: HIVE-15115
> URL: https://issues.apache.org/jira/browse/HIVE-15115
> Project: Hive
> Issue Type: Sub-task
> Reporter: Barna Zsombor Klara
> Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-15115.patch
>
>
> This test was identified as flaky before, and it seems it has turned flaky again.
> Earlier Jira:
> [HIVE-14976|https://issues.apache.org/jira/browse/HIVE-14976]
> New flaky runs:
> https://builds.apache.org/job/PreCommit-HIVE-Build/1931/testReport
> https://builds.apache.org/job/PreCommit-HIVE-Build/1930/testReport
> {code}
> 516c516
> < totalSize 3220
> ---
> > totalSize 3224
> 569c569
> < totalSize 3220
> ---
> > totalSize 3224
> 634c634
> < totalSize 4577
> ---
> > totalSize 4581
> {code}