stak...@cloudera.com - Could you please fill out your Gerrit profile? In
the top-right corner there's a drop-down menu with 'settings' as an option.

That stops you from showing up on these reviews as 'Anonymous Coward'.

On 25 September 2016 at 18:14, Anonymous Coward (Code Review) <
ger...@cloudera.org> wrote:

> Hello Michael Brown,
>
> I'd like you to reexamine a change.  Please visit
>
>     http://gerrit.cloudera.org:8080/4419
>
> to look at the new patch set (#9).
>
> Change subject: IMPALA-4101: qgen: Hive join predicates should only
> contains equality functions
> ......................................................................
>
> IMPALA-4101: qgen: Hive join predicates should only contains equality
> functions
>
> Background:
>
> Hive only supports equi-joins in its JOIN clause, while Postgres and
> Impala support more
> complex functions such as <, <=, >, >=, etc. This change modifies the
> QueryGenerator._create_relational_join_condition and
> QueryGenerator._create_boolean_func_tree methods to only construct
> equality join
> conditions under certain conditions.
>
> The _create_boolean_func_tree method is invoked via
> QueryGenerator -> create_query -> _create_from_clause ->
> _create_join_clause ->
> _create_relational_join_condition -> _create_boolean_func_tree. This
> method is invoked
> when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree
> of functions
> that would typically be found in any of these clauses.
>
> Changes:
>
> The parameter "signatures" is added to the method
> _create_boolean_func_tree, and it lists
> out all the allowed signatures the function is allowed to use. Previously,
> this list of
> signatures was populated by calling _funcs_to_allowed_signatures(FUNCS),
> and if
> "signatures" is not specified, then the code defaults back to the results
> of that method.
> A new method in the DefaultProfile called get_allowed_join_signatures is
> introduced and
> returns a list of function signatures that are allowed within a JOIN
> clause. The
> DefaultProfile allows all given signatures, while the HiveProfile only
> allows for the
> Equals and And functions, as well as any function that operates over only
> one column.
> The reason for these restrictions is that Hive only allows equality joins,
> does not allow
> OR operators in the join clause, and has some restrictions on functions
> that operate over
> multiple different tables. This last restriction is somewhat subtle; if
> one side of the
> equals operator contains a function that operates over two different
> tables, the other
> side of the operator cannot contain either of those tables. While it is
> possible to have
> functions that take in multiple input parameters, the inputs must be taken
> from specific
> tables to prevent Hive from throwing a compile time exception. Adding
> support for this in
> qgen code will require significant effort and modification to some core
> methods
> (_create_relational_join_condition and _populate_func_with_vals), so it's
> best to disable
> these for Hive altogether.
>
> Note that the _create_boolean_func_tree still allows for OR operators due
> to some logic
> around its "and_or_fill_ratio" variable. The plan is to fix this in a
> future patch that
> specifically focuses on removing OR operators from Hive JOIN clauses.
>
> Minor change to discrepancy_searcher so that the logs print out "Hive"
> instead of
> "Impala" when running against Hive.
>
> Testing:
>
> * Added a new unit test that ensures the HiveProfile only returns equality
> joins
> * Unit tests pass
> * Tested against Hive locally
> * Tested against Impala via Leopard
> * Tested against Impala via the Discrepancy Checker
>
> Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459
> ---
> M tests/comparison/discrepancy_searcher.py
> M tests/comparison/query_generator.py
> M tests/comparison/query_profile.py
> A tests/comparison/tests/hive/test_hive_create_relational_
> join_condition.py
> 4 files changed, 139 insertions(+), 20 deletions(-)
>
>
>   git pull ssh://gerrit.cloudera.org:29418/Impala-ASF
> refs/changes/19/4419/9
> --
> To view, visit http://gerrit.cloudera.org:8080/4419
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: newpatchset
> Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459
> Gerrit-PatchSet: 9
> Gerrit-Project: Impala-ASF
> Gerrit-Branch: master
> Gerrit-Owner: stak...@cloudera.com
> Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
> Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
> Gerrit-Reviewer: stak...@cloudera.com
>
> --
> You received this message because you are subscribed to the Google Groups
> "impala-cr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to impala-cr+unsubscr...@cloudera.com.
> For more options, visit https://groups.google.com/a/cloudera.com/d/optout.
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Reply via email to