[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1212


---


[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-17 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1212#discussion_r182283067
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/LateralJoinBatch.java
 ---
@@ -0,0 +1,861 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.join;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.physical.base.LateralContract;
+import org.apache.drill.exec.physical.config.LateralJoinPOP;
+import org.apache.drill.exec.record.AbstractBinaryRecordBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.JoinBatchMemoryManager;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorAccessibleUtilities;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.EMIT;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.NONE;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.OK;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OK_NEW_SCHEMA;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OUT_OF_MEMORY;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.STOP;
+
+/**
+ * RecordBatch implementation for the lateral join operator. Currently 
it's expected LATERAL to co-exists with UNNEST
+ * operator. Both LATERAL and UNNEST will share a contract with each other 
defined at {@link LateralContract}
+ */
+public class LateralJoinBatch extends 
AbstractBinaryRecordBatch implements LateralContract {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LateralJoinBatch.class);
+
+  // Input indexes to correctly update the stats
+  private static final int LEFT_INPUT = 0;
+
+  private static final int RIGHT_INPUT = 1;
+
+  // Maximum number records in the outgoing batch
+  private int maxOutputRowCount;
+
+  // Schema on the left side
+  private BatchSchema leftSchema;
+
+  // Schema on the right side
+  private BatchSchema rightSchema;
+
+  // Index in output batch to populate next row
+  private int outputIndex;
+
+  // Current index of record in left incoming which is being processed
+  private int leftJoinIndex = -1;
+
+  // Current index of record in right incoming which is being processed
+  private int rightJoinIndex = -1;
+
+  // flag to keep track if current left batch needs to be processed in 
future next call
+  private boolean processLeftBatchInFuture;
+
+  // Keep track if any matching right record was found for current left 
index record
+  private boolean matchedRecordFound;
+
+  private boolean useMemoryManager = true;
+
+  /* 

+   * Public Methods
+   * 
/
+  public LateralJoinBatch(LateralJoinPOP popConfig, FragmentContext 
context,
+  RecordBatch left, RecordBatch right) throws 
OutOfMemoryException {
+super(popConfig, context, left, right);
+

[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-17 Thread sohami
Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/1212#discussion_r182259926
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/LateralJoinBatch.java
 ---
@@ -0,0 +1,861 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.join;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.physical.base.LateralContract;
+import org.apache.drill.exec.physical.config.LateralJoinPOP;
+import org.apache.drill.exec.record.AbstractBinaryRecordBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.JoinBatchMemoryManager;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorAccessibleUtilities;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.EMIT;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.NONE;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.OK;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OK_NEW_SCHEMA;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OUT_OF_MEMORY;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.STOP;
+
+/**
+ * RecordBatch implementation for the lateral join operator. Currently 
it's expected LATERAL to co-exists with UNNEST
+ * operator. Both LATERAL and UNNEST will share a contract with each other 
defined at {@link LateralContract}
+ */
+public class LateralJoinBatch extends 
AbstractBinaryRecordBatch implements LateralContract {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LateralJoinBatch.class);
+
+  // Input indexes to correctly update the stats
+  private static final int LEFT_INPUT = 0;
+
+  private static final int RIGHT_INPUT = 1;
+
+  // Maximum number records in the outgoing batch
+  private int maxOutputRowCount;
+
+  // Schema on the left side
+  private BatchSchema leftSchema;
+
+  // Schema on the right side
+  private BatchSchema rightSchema;
+
+  // Index in output batch to populate next row
+  private int outputIndex;
+
+  // Current index of record in left incoming which is being processed
+  private int leftJoinIndex = -1;
+
+  // Current index of record in right incoming which is being processed
+  private int rightJoinIndex = -1;
+
+  // flag to keep track if current left batch needs to be processed in 
future next call
+  private boolean processLeftBatchInFuture;
+
+  // Keep track if any matching right record was found for current left 
index record
+  private boolean matchedRecordFound;
+
+  private boolean useMemoryManager = true;
+
+  /* 

+   * Public Methods
+   * 
/
+  public LateralJoinBatch(LateralJoinPOP popConfig, FragmentContext 
context,
+  RecordBatch left, RecordBatch right) throws 
OutOfMemoryException {
+super(popConfig, context, left, right);
+

[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-17 Thread sohami
Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/1212#discussion_r182262104
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -186,6 +186,11 @@ private ExecConstants() {
   public static final String BIT_ENCRYPTION_SASL_ENABLED = 
"drill.exec.security.bit.encryption.sasl.enabled";
   public static final String BIT_ENCRYPTION_SASL_MAX_WRAPPED_SIZE = 
"drill.exec.security.bit.encryption.sasl.max_wrapped_size";
 
+  /**
--- End diff --

Removed. Thanks for catching this.


---


[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-17 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1212#discussion_r182172836
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/LateralJoinBatch.java
 ---
@@ -0,0 +1,861 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.join;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.physical.base.LateralContract;
+import org.apache.drill.exec.physical.config.LateralJoinPOP;
+import org.apache.drill.exec.record.AbstractBinaryRecordBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.JoinBatchMemoryManager;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorAccessibleUtilities;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.EMIT;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.NONE;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.OK;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OK_NEW_SCHEMA;
+import static 
org.apache.drill.exec.record.RecordBatch.IterOutcome.OUT_OF_MEMORY;
+import static org.apache.drill.exec.record.RecordBatch.IterOutcome.STOP;
+
+/**
+ * RecordBatch implementation for the lateral join operator. Currently 
it's expected LATERAL to co-exists with UNNEST
+ * operator. Both LATERAL and UNNEST will share a contract with each other 
defined at {@link LateralContract}
+ */
+public class LateralJoinBatch extends 
AbstractBinaryRecordBatch implements LateralContract {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LateralJoinBatch.class);
+
+  // Input indexes to correctly update the stats
+  private static final int LEFT_INPUT = 0;
+
+  private static final int RIGHT_INPUT = 1;
+
+  // Maximum number records in the outgoing batch
+  private int maxOutputRowCount;
+
+  // Schema on the left side
+  private BatchSchema leftSchema;
+
+  // Schema on the right side
+  private BatchSchema rightSchema;
+
+  // Index in output batch to populate next row
+  private int outputIndex;
+
+  // Current index of record in left incoming which is being processed
+  private int leftJoinIndex = -1;
+
+  // Current index of record in right incoming which is being processed
+  private int rightJoinIndex = -1;
+
+  // flag to keep track if current left batch needs to be processed in 
future next call
+  private boolean processLeftBatchInFuture;
+
+  // Keep track if any matching right record was found for current left 
index record
+  private boolean matchedRecordFound;
+
+  private boolean useMemoryManager = true;
+
+  /* 

+   * Public Methods
+   * 
/
+  public LateralJoinBatch(LateralJoinPOP popConfig, FragmentContext 
context,
+  RecordBatch left, RecordBatch right) throws 
OutOfMemoryException {
+super(popConfig, context, left, right);
+

[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-17 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1212#discussion_r182172309
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -186,6 +186,11 @@ private ExecConstants() {
   public static final String BIT_ENCRYPTION_SASL_ENABLED = 
"drill.exec.security.bit.encryption.sasl.enabled";
   public static final String BIT_ENCRYPTION_SASL_MAX_WRAPPED_SIZE = 
"drill.exec.security.bit.encryption.sasl.max_wrapped_size";
 
+  /**
--- End diff --

Not needed as you don't have code gen


---


[GitHub] drill pull request #1212: DRILL-6323: Lateral Join - Initial implementation

2018-04-16 Thread sohami
GitHub user sohami opened a pull request:

https://github.com/apache/drill/pull/1212

DRILL-6323: Lateral Join - Initial implementation

@parthchandra - Please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sohami/drill DRILL-6323

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1212


commit 38b82ce8ebca3b91ba8847229a568daf88f654a3
Author: Sorabh Hamirwasia 
Date:   2018-03-06T05:09:16Z

DRILL-6323: Lateral Join - Initial implementation
Refactor Join PopConfigs

commit 0931ac974f7be41cd5ca4516524d04c4e5c6245d
Author: Sorabh Hamirwasia 
Date:   2018-02-05T22:46:19Z

DRILL-6323: Lateral Join - Initial implementation

commit 08b7d38e6ffe4fed492a2ea67c3907f0f514717e
Author: Sorabh Hamirwasia 
Date:   2018-02-20T22:47:48Z

DRILL-6323: Lateral Join - Initial implementation
Note: Refactor, fix various issues with LATERAL: a)Override 
prefetch call in BuildSchema phase for LATERAL, b)EMIT handling in buildSchema, 
c)Issue when in multilevel Lateral case
schema change is observed only on right side of UNNEST, 
d)Handle SelectionVector in incoming batch, e) Handling Schema change, f) 
Updating joinIndexes correctly when producing multiple output batches
for current left inputs.
Added tests for a)EMIT handling in buildSchema, b)Multiple 
UNNEST at same level, c)Multilevel Lateral, d)Multilevel Lateral with Schema 
change on left/right or both branches, e) Left LATERAL join
f)Schema change for UNNEST and Non-UNNEST columns, g)Error 
outcomes from left, h) Producing multiple output batches for given 
incoming, i) Consuming multiple incoming into single output batch

commit f686877198857a23227031329b2f48c163f85021
Author: Sorabh Hamirwasia 
Date:   2018-03-09T23:56:22Z

DRILL-6323: Lateral Join - Initial implementation
Note: Remove codegen and operator template class. Logic to copy 
data is moved to LateralJoinBatch itself

commit cb47407a8f47e7fc10ac04c438f7e903de92a80c
Author: Sorabh Hamirwasia 
Date:   2018-03-13T23:41:03Z

DRILL-6323: Lateral Join - Initial implementation
Note: Add some debug logs for LateralJoinBatch

commit 2ea968e3302ad8bb6d62a8032bd6b6329b900d3a
Author: Sorabh Hamirwasia 
Date:   2018-03-14T23:59:29Z

DRILL-6323: Lateral Join - Initial implementation
Note: Refactor BatchMemorySize to put outputBatchSize in 
abstract class. Created a new JoinBatchMemoryManager to be shared across join 
record batches.
  Changed merge join to use AbstractBinaryRecordBatch instead 
of AbstractRecordBatch, and use JoinBatchMemoryManager

commit fb521c65f91b69543ae9b5dbb9a727c79f852573
Author: Sorabh Hamirwasia 
Date:   2018-03-19T19:00:22Z

DRILL-6323: Lateral Join - Initial implementation
Note: Lateral Join Batch Memory manager support using the 
record batch sizer




---