[ https://issues.apache.org/jira/browse/IMPALA-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840584#comment-16840584 ]
ASF subversion and git services commented on IMPALA-8369:
---------------------------------------------------------
Commit 3567a2b5d4f797d0d48e37efc0126d022cb6a189 in impala's branch
refs/heads/master from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3567a2b ]
IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading
This fixes three issues for functional dataset loading:
- Works around HIVE-21675, a bug in which 'CREATE VIEW IF NOT EXISTS'
does not function correctly in our current Hive build. The bug has
already been fixed, but the workaround is simple, and the 'drop and
recreate' pattern is in any case used more widely for data loading than
the 'create if not exists' one.
- Moves the creation of the 'hive_index' table from
load-dependent-tables.sql to a new load-dependent-tables-hive2.sql
file which is only executed on Hive 2.
- Moving from MR to Tez execution changed the behavior of data loading
by disabling the auto-merging of small files. With Hive-on-MR, this
behavior defaulted to true, but with Hive-on-Tez it defaults to false.
The different default is likely motivated by the fact that Tez
automatically groups small splits on the _input_ side and thus is less
likely to produce lots of small files. However, that grouping
functionality doesn't work properly in localhost clusters (TEZ-3310), so
we don't see that benefit. This patch therefore enables the post-process
merging of small files.
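As a rough illustration of the last item, the sketch below prepends a
session-level setting to a load statement when it runs on Tez. The helper
name and the choice of property are assumptions: hive.merge.tezfiles is a
standard Hive property that defaults to false, but the patch may enable
merging through a different property or in a different place.

```python
# Hypothetical helper, not taken from the patch: prepend session settings that
# turn on post-job merging of small files when a load statement runs on Tez.

# hive.merge.tezfiles is a standard Hive property (default false). That this
# patch sets exactly this property is an assumption.
TEZ_MERGE_SETTINGS = {"hive.merge.tezfiles": "true"}


def with_merge_settings(load_sql, engine):
    """Return `load_sql` prefixed with SET statements when `engine` is 'tez'."""
    if engine.lower() != "tez":
        # Per the commit message, merging already defaults to on with MR.
        return load_sql
    sets = ["SET {0}={1};".format(key, value)
            for key, value in sorted(TEZ_MERGE_SETTINGS.items())]
    return "\n".join(sets + [load_sql])
```

Calling with_merge_settings("INSERT INTO t SELECT 1;", "tez") returns the
statement with a "SET hive.merge.tezfiles=true;" line in front of it, while
any other engine value leaves the statement unchanged.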
Prior to this change, the 'alltypesaggmultifilesnopart' test table was
getting 40+ files inside it, which broke various planner tests. With
the change, it gets the expected 4 files.
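For the HIVE-21675 workaround in the first item, the 'drop and recreate'
pattern could be sketched as a rewrite of the affected statement. This is
only an illustration, not the code in the patch: the function name and the
regular expression are hypothetical, and unlike 'IF NOT EXISTS' the rewritten
form replaces a view that already exists.

```python
import re

# Matches "CREATE VIEW IF NOT EXISTS <name>" at the start of a statement.
# The pattern is a simplification and does not cover every Hive identifier.
_CREATE_VIEW_RE = re.compile(
    r"^\s*CREATE\s+VIEW\s+IF\s+NOT\s+EXISTS\s+(?P<name>[\w.`]+)",
    re.IGNORECASE,
)


def drop_and_recreate(statement):
    """Return the list of statements to run in place of `statement`."""
    match = _CREATE_VIEW_RE.match(statement)
    if match is None:
        # Not a 'CREATE VIEW IF NOT EXISTS' statement: run it unchanged.
        return [statement]
    name = match.group("name")
    # Keep everything after the view name (column list, AS SELECT ...).
    create = "CREATE VIEW " + name + statement[match.end():]
    return ["DROP VIEW IF EXISTS " + name, create]
```

Statements that are not view definitions pass through untouched, so the
rewrite can be applied to every statement in a load script.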
Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd
Reviewed-on: http://gerrit.cloudera.org:8080/13251
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Impala should be able to interoperate with Hive 3.1.0
> -----------------------------------------------------
>
> Key: IMPALA-8369
> URL: https://issues.apache.org/jira/browse/IMPALA-8369
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Labels: impala-acid
>
> Currently, Impala only works with Hive 2.1.1. Since Hive 3.1.0 has been
> released for a while, it would be good to add support for Hive 3.1.0 (HMS
> 3.1.0). This patch will focus on the ability to connect to HMS 3.1.0 and run
> existing tests. It will not focus on adding support for newer features like
> ACID in Hive 3.1.0, which can be taken up as a separate JIRA.
> It would be good to make changes to Impala source code such that it can work
> with both Hive 2.1.0 and Hive 3.1.0 without the need to create a separate
> branch. However, this should be an aspirational goal. If we hit a blocker, we
> should investigate alternative approaches.