[
https://issues.apache.org/jira/browse/DRILL-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941725#comment-15941725
]
ASF GitHub Bot commented on DRILL-5375:
---------------------------------------
Github user arina-ielchiieva commented on a diff in the pull request:
https://github.com/apache/drill/pull/794#discussion_r108035893
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java
---
@@ -70,27 +70,65 @@
private static final org.slf4j.Logger logger =
org.slf4j.LoggerFactory.getLogger(DrillOptiq.class);
/**
- * Converts a tree of {@link RexNode} operators into a scalar expression
in Drill syntax.
+ * Converts a tree of {@link RexNode} operators into a scalar expression
in Drill syntax using one input.
+ *
+ * @param context parse context which contains planner settings
+ * @param input data input
+ * @param expr expression to be converted
+ * @return converted expression
*/
public static LogicalExpression toDrill(DrillParseContext context,
RelNode input, RexNode expr) {
- final RexToDrill visitor = new RexToDrill(context, input);
+ return toDrill(context, Lists.newArrayList(input), expr);
+ }
+
+ /**
+ * Converts a tree of {@link RexNode} operators into a scalar expression
in Drill syntax using multiple inputs.
+ *
+ * @param context parse context which contains planner settings
+ * @param inputs multiple data inputs
+ * @param expr expression to be converted
+ * @return converted expression
+ */
+ public static LogicalExpression toDrill(DrillParseContext context,
List<RelNode> inputs, RexNode expr) {
+ final RexToDrill visitor = new RexToDrill(context, inputs);
return expr.accept(visitor);
}
private static class RexToDrill extends
RexVisitorImpl<LogicalExpression> {
- private final RelNode input;
+ private final List<RelNode> inputs;
private final DrillParseContext context;
+ private final List<RelDataTypeField> fieldList;
- RexToDrill(DrillParseContext context, RelNode input) {
+ RexToDrill(DrillParseContext context, List<RelNode> inputs) {
super(true);
this.context = context;
- this.input = input;
+ this.inputs = inputs;
+ this.fieldList = Lists.newArrayList();
+ /*
+ Fields are enumerated by their presence order in input. Details
{@link org.apache.calcite.rex.RexInputRef}.
+ Thus we can merge field list from several inputs by adding them
into the list in order of appearance.
+ Each field index in the list will match field index in the
RexInputRef instance which will allow us
+ to retrieve field from field list by index in {@link
#visitInputRef(RexInputRef)} method. Example:
+
+ Query: select t1.c1, t2.c1, t2.c2 from t1 inner join t2 on t1.c1
between t2.c1 and t2.c2
+
+ Input 1: $0
+ Input 2: $1, $2
+
+ Result: $0, $1, $2
+ */
+ for (RelNode input : inputs) {
--- End diff --
I have only allowed passing multiple inputs instead of one, since this is
needed for non-equi joins, and I have merged the fields from all inputs into
one list to improve performance. You are correct that it is Calcite that
ensures field numbering. I added the comment explaining this behavior to ease
code review and to help future developers.
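To illustrate the merged field list described in the code comment above, here
is a minimal standalone sketch. It does not use Calcite or Drill classes: the
Input class and the mergeFields and resolve helpers are hypothetical stand-ins,
and field names replace RelDataTypeField instances. It only shows why
concatenating per-input fields in order lets a global index such as $2 be
looked up directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergedFieldList {

    // Hypothetical stand-in for a join input: it only carries its field names in order.
    static final class Input {
        final List<String> fieldNames;

        Input(String... fieldNames) {
            this.fieldNames = Arrays.asList(fieldNames);
        }
    }

    // Concatenates the fields of all inputs in order of appearance,
    // so that list position equals the global input-reference index.
    static List<String> mergeFields(List<Input> inputs) {
        List<String> merged = new ArrayList<>();
        for (Input input : inputs) {
            merged.addAll(input.fieldNames);
        }
        return merged;
    }

    // Resolves a global index ($0, $1, ...) against the merged list.
    static String resolve(List<String> merged, int index) {
        return merged.get(index);
    }

    public static void main(String[] args) {
        // Mirrors the example in the comment: input 1 has $0, input 2 has $1 and $2.
        List<Input> inputs = Arrays.asList(
            new Input("t1.c1"),
            new Input("t2.c1", "t2.c2"));
        List<String> merged = mergeFields(inputs);
        for (int i = 0; i < merged.size(); i++) {
            System.out.println("$" + i + " -> " + resolve(merged, i));
        }
    }
}
```

Building the list once in the constructor means each lookup is a single
indexed access instead of a walk over the inputs on every reference.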
> Nested loop join: return correct result for left join
> -----------------------------------------------------
>
> Key: DRILL-5375
> URL: https://issues.apache.org/jira/browse/DRILL-5375
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Labels: doc-impacting
>
> Mini repro:
> 1. Create 2 Hive tables with data
> {code}
> CREATE TABLE t1 (
> FYQ varchar(999),
> dts varchar(999),
> dte varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> 2016-Q1,2016-06-01,2016-09-30
> 2016-Q2,2016-09-01,2016-12-31
> 2016-Q3,2017-01-01,2017-03-31
> 2016-Q4,2017-04-01,2017-06-30
> CREATE TABLE t2 (
> who varchar(999),
> event varchar(999),
> dt varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> aperson,did somthing,2017-01-06
> aperson,did somthing else,2017-01-12
> aperson,had chrsitmas,2016-12-26
> aperson,went wild,2016-01-01
> {code}
> 2. Impala Query shows correct result
> {code}
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +------------+---------+---------+-------------------+
> | dt | fyq | who | event |
> +------------+---------+---------+-------------------+
> | 2016-01-01 | NULL | aperson | went wild |
> | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas |
> | 2017-01-06 | 2016-Q3 | aperson | did somthing |
> | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
> +------------+---------+---------+-------------------+
> {code}
> 3. Drill query shows wrong results:
> {code}
> alter session set planner.enable_nljoin_for_scalar_only=false;
> use hive;
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +-------------+----------+----------+--------------------+
> | dt | fyq | who | event |
> +-------------+----------+----------+--------------------+
> | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas |
> | 2017-01-06 | 2016-Q3 | aperson | did somthing |
> | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
> +-------------+----------+----------+--------------------+
> 3 rows selected (2.523 seconds)
> {code}
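For reference, the left join semantics expected in the repro above can be
sketched as a plain nested loop. This is not Drill's nested loop join
implementation: the class and method names are hypothetical, rows are plain
string arrays, and BETWEEN is evaluated as a lexicographic string comparison
because the columns are varchar. The point is only that a left row with no
match must still be emitted with NULL on the right side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeftNestedLoopJoinSketch {

    // Joins t2 (left) with t1 (right) on t2.dt BETWEEN t1.dts AND t1.dte.
    // Output rows are {dt, fyq, who, event}; fyq is null when nothing matches.
    static List<String[]> leftJoin(List<String[]> t2, List<String[]> t1) {
        List<String[]> out = new ArrayList<>();
        for (String[] left : t2) {           // left row: {who, event, dt}
            String dt = left[2];
            boolean matched = false;
            for (String[] right : t1) {      // right row: {fyq, dts, dte}
                if (dt.compareTo(right[1]) >= 0 && dt.compareTo(right[2]) <= 0) {
                    out.add(new String[] {dt, right[0], left[0], left[1]});
                    matched = true;
                }
            }
            if (!matched) {
                // Left join: keep the unmatched left row, right side is NULL.
                out.add(new String[] {dt, null, left[0], left[1]});
            }
        }
        out.sort((a, b) -> a[0].compareTo(b[0])); // order by t2.dt
        return out;
    }

    // Sample data copied from the repro.
    static List<String[]> sampleT1() {
        return Arrays.asList(
            new String[] {"2016-Q1", "2016-06-01", "2016-09-30"},
            new String[] {"2016-Q2", "2016-09-01", "2016-12-31"},
            new String[] {"2016-Q3", "2017-01-01", "2017-03-31"},
            new String[] {"2016-Q4", "2017-04-01", "2017-06-30"});
    }

    static List<String[]> sampleT2() {
        return Arrays.asList(
            new String[] {"aperson", "did somthing", "2017-01-06"},
            new String[] {"aperson", "did somthing else", "2017-01-12"},
            new String[] {"aperson", "had chrsitmas", "2016-12-26"},
            new String[] {"aperson", "went wild", "2016-01-01"});
    }

    public static void main(String[] args) {
        for (String[] row : leftJoin(sampleT2(), sampleT1())) {
            System.out.println(String.join(" | ",
                row[0], String.valueOf(row[1]), row[2], row[3]));
        }
    }
}
```

With the sample data this yields four rows, including the 2016-01-01 row with
a NULL fyq, which is the row missing from the Drill output above.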