[
https://issues.apache.org/jira/browse/HIVE-20573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617297#comment-16617297
]
Zoltan Haindrich commented on HIVE-20573:
-----------------------------------------
there was also some fluctuation in the plans (I think there are 2 spark drivers
which check against the same set of q.out-s)
I've ended up disabling {{hive.stats.fetch.column.stats}} in the site xml-s:
{{data/conf/spark/yarn-cluster/hive-site.xml}}
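For reference, a minimal sketch of what such an override looks like, assuming the usual Hadoop-style property layout of hive-site.xml (this is illustrative, not copied from the actual file):
{code:xml}
<!-- Illustrative only: turns off fetching of column statistics from the metastore -->
<property>
  <name>hive.stats.fetch.column.stats</name>
  <value>false</value>
</property>
{code}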
I think HIVE-18139 should be fixed before this ticket can be addressed.
> Spark: incorrect results when column stats are fetched
> ------------------------------------------------------
>
> Key: HIVE-20573
> URL: https://issues.apache.org/jira/browse/HIVE-20573
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Zoltan Haindrich
> Priority: Major
>
> there are some result set differences when column stats fetch is enabled,
> compared to the llap outputs. Examples:
> {code}
> +++ ql/src/test/results/clientpositive/spark/union_remove_12.q.out
> - totalSize 194
> + totalSize 192
> + 18
> + 18
> + 28
> + 28
> -8 18
> -8 18
> -8 28
> -8 28
> +++ ql/src/test/results/clientpositive/spark/vectorized_nested_mapjoin.q.out
> -6.06519093248863E11
> +5.744447909695194E9
> {code}
> ql/src/test/results/clientpositive/spark/union_remove_12.q.out
> ql/src/test/results/clientpositive/spark/union_remove_13.q.out
> ql/src/test/results/clientpositive/spark/union_remove_14.q.out
> ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out
> ql/src/test/results/clientpositive/spark/join32.q.out
> ql/src/test/results/clientpositive/spark/join33.q.out
> ql/src/test/results/clientpositive/spark/vectorized_nested_mapjoin.q.out