Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/794#discussion_r108036357
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java ---
    @@ -214,26 +226,62 @@ private boolean hasMore(IterOutcome outcome) {
     
       /**
        * Method generates the runtime code needed for NLJ. Other than the setup method to set the input and output value
    -   * vector references we implement two more methods
    -   * 1. emitLeft()  -> Project record from the left side
    -   * 2. emitRight() -> Project record from the right side (which is a hyper container)
    +   * vector references we implement three more methods
    +   * 1. doEval() -> Evaluates if record from left side matches record from the right side
    +   * 2. emitLeft() -> Project record from the left side
    +   * 3. emitRight() -> Project record from the right side (which is a hyper container)
        * @return the runtime generated class that implements the NestedLoopJoin interface
    -   * @throws IOException
    -   * @throws ClassTransformationException
        */
    -  private NestedLoopJoin setupWorker() throws IOException, ClassTransformationException {
    -    final CodeGenerator<NestedLoopJoin> nLJCodeGenerator = CodeGenerator.get(NestedLoopJoin.TEMPLATE_DEFINITION, context.getFunctionRegistry(), context.getOptions());
    +  private NestedLoopJoin setupWorker() throws IOException, ClassTransformationException, SchemaChangeException {
    +    final CodeGenerator<NestedLoopJoin> nLJCodeGenerator = CodeGenerator.get(
    +        NestedLoopJoin.TEMPLATE_DEFINITION, context.getFunctionRegistry(), context.getOptions());
         nLJCodeGenerator.plainJavaCapable(true);
         // Uncomment out this line to debug the generated code.
     //    nLJCodeGenerator.saveCodeForDebugging(true);
         final ClassGenerator<NestedLoopJoin> nLJClassGenerator = nLJCodeGenerator.getRoot();
     
    +    // generate doEval
    +    final ErrorCollector collector = new ErrorCollectorImpl();
    +
    +
    +    /*
    +        Logical expression may contain fields from left and right batches. During code generation (materialization)
    +        we need to indicate from which input field should be taken. Mapping sets can work with only one input at a time.
    +        But non-equality expressions can be complex:
    +          select t1.c1, t2.c1, t2.c2 from t1 inner join t2 on t1.c1 between t2.c1 and t2.c2
    +        or even contain self join which can not be transformed into filter since OR clause is present
    +          select *from t1 inner join t2 on t1.c1 >= t2.c1 or t1.c3 <> t1.c4
    +
    +        In this case logical expression can not be split according to input presence (like during equality joins
    --- End diff --
    
    The thing is that an inequality join is not only a join that has `t1.c3 <> t1.c4`, but also one that has `OR`.
    For example, the query `select * from t1 inner join t2 on t1.c1 = t2.c1 or t1.c2 = t2.c2` currently fails with the following error: `UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join`.
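
    To illustrate why an `OR` of two equalities behaves like an inequality join, here is a minimal standalone sketch. It is not Drill code: `OrJoinSketch`, `Row` and `nestedLoopJoin` are hypothetical names made up for this example. Because either equality alone can make the condition true, no single key selects the matching rows, so the sketch tests the whole condition on every (left, right) pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration only; none of these types exist in Drill.
public class OrJoinSketch {

    /** Hypothetical row with two integer columns, c1 and c2. */
    public static final class Row {
        public final int c1;
        public final int c2;

        public Row(int c1, int c2) {
            this.c1 = c1;
            this.c2 = c2;
        }
    }

    /** Tests every (left, right) pair against the full join condition. */
    public static List<Row[]> nestedLoopJoin(
            List<Row> left, List<Row> right, BiPredicate<Row, Row> condition) {
        List<Row[]> out = new ArrayList<>();
        for (Row l : left) {
            for (Row r : right) {
                if (condition.test(l, r)) {
                    out.add(new Row[] {l, r});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> t1 = List.of(new Row(1, 10), new Row(2, 20));
        List<Row> t2 = List.of(new Row(1, 99), new Row(7, 20), new Row(8, 30));

        // on t1.c1 = t2.c1 or t1.c2 = t2.c2
        BiPredicate<Row, Row> cond = (l, r) -> l.c1 == r.c1 || l.c2 == r.c2;

        for (Row[] pair : nestedLoopJoin(t1, t2, cond)) {
            System.out.println("t1(" + pair[0].c1 + "," + pair[0].c2 + ") joins "
                    + "t2(" + pair[1].c1 + "," + pair[1].c2 + ")");
        }
    }
}
```

    With this sample data, the first pair matches only through `c1` and the second only through `c2`, so a lookup keyed on either column alone would miss one of them.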
    
    The main idea of my comment is that I don't care whether it's an equality or an inequality join. I just materialize the whole expression with fields from both inputs, and to tell which input a field belongs to, I add a batch indication. If you want, I can remove the comment if it's confusing.

