[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329585#comment-16329585
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r162196004
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractCopier.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.svremover;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+
+public abstract class AbstractCopier implements Copier {
+  protected ValueVector[] vvOut;
+  protected VectorContainer outgoing;
+
+  @Override
+  public void setup(RecordBatch incoming, VectorContainer outgoing) throws 
SchemaChangeException {
--- End diff --

It is not used by the AbstractCopier itself but it is used by the 
AbstractSV2Copier and AbstractSV4Copier which override the method.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329563#comment-16329563
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r162193983
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractCopier.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.svremover;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+
+public abstract class AbstractCopier implements Copier {
+  protected ValueVector[] vvOut;
+  protected VectorContainer outgoing;
+
+  @Override
+  public void setup(RecordBatch incoming, VectorContainer outgoing) throws 
SchemaChangeException {
--- End diff --

The "incoming" parameter is not used anywhere ...


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329449#comment-16329449
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r162175059
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractSV4Copier.java
 ---
@@ -17,32 +17,30 @@
  */
 package org.apache.drill.exec.physical.impl.svremover;
 
-import javax.inject.Named;
-
 import org.apache.drill.common.types.TypeProtos.MajorType;
 import org.apache.drill.common.types.Types;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
 import org.apache.drill.exec.record.VectorWrapper;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
 
-public abstract class CopierTemplate4 implements Copier{
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(CopierTemplate4.class);
+public abstract class AbstractSV4Copier implements Copier {
--- End diff --

Done


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328243#comment-16328243
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r161950855
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractSV4Copier.java
 ---
@@ -17,32 +17,30 @@
  */
 package org.apache.drill.exec.physical.impl.svremover;
 
-import javax.inject.Named;
-
 import org.apache.drill.common.types.TypeProtos.MajorType;
 import org.apache.drill.common.types.Types;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
 import org.apache.drill.exec.record.VectorWrapper;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
 
-public abstract class CopierTemplate4 implements Copier{
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(CopierTemplate4.class);
+public abstract class AbstractSV4Copier implements Copier {
--- End diff --

The code for AbstractSV4Copier is nearly identical to the code of 
AbstractSV2Copier; I wonder if we could have combined the two somehow (like a 
super class ??)



> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328038#comment-16328038
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
Addressed review comments and rebased.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2018-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327970#comment-16327970
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r161913565
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/GenericCopierTest.java
 ---
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.svremover;
+
+import org.apache.drill.exec.memory.RootAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetBuilder;
+
+public class GenericCopierTest extends AbstractGenericCopierTest {
--- End diff --

The generated code uses the copyFromSafe methods in code generated copiers. 
Your copyEntry method which is used by the GenericCopiers also uses the 
copyFromSafe method. So we should inherit all the same supported types. I think 
if all the unit tests, functional tests, and QA tests pass that will be a 
sufficient enough vote of confidence. Ideally we would have unit tests for all 
the operators (not just the copiers) which use the complex data types, but 
there is significant amount of work left to be able to effectively unit tests 
with complex data types and adding all the machinery to do so is out of the 
scope of this PR.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299378#comment-16299378
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r158179561
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/GenericCopierTest.java
 ---
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.svremover;
+
+import org.apache.drill.exec.memory.RootAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetBuilder;
+
+public class GenericCopierTest extends AbstractGenericCopierTest {
--- End diff --

One reason I was a bit hesitant to add this support earlier is that we need 
this to work for ALL Drill's data types, including the obscure ones. Clearly, 
this mechanism works for scalars. But, what about:

* Arrays of scalars
* Maps
* Arrays of maps
* Unions
* Lists of unions
* Lists of scalars
* Lists of maps
* Repeated lists of scalars
* Repeated lists of maps
* Repeated lists of repeated lists
* Repeated lists of unions

There are probably a few more that I'm overlooking. Until we test ALL of 
these, we can't be sure if the new code works as well as the old code.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299377#comment-16299377
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r158179222
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

I would not advise changing the meaning of -1; that is baked into the code 
in many places. Instead, before adding the first row, simply set the count to 0 
when the target container is created.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297542#comment-16297542
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894540
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/Copier.java
 ---
@@ -19,13 +19,15 @@
 
 import org.apache.drill.exec.compile.TemplateClassDefinition;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
 
 public interface Copier {
-  public static TemplateClassDefinition TEMPLATE_DEFINITION2 = new 
TemplateClassDefinition(Copier.class, CopierTemplate2.class);
-  public static TemplateClassDefinition TEMPLATE_DEFINITION4 = new 
TemplateClassDefinition(Copier.class, CopierTemplate4.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION2 = new 
TemplateClassDefinition(Copier.class, CopierTemplate2.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION4 = new 
TemplateClassDefinition(Copier.class, CopierTemplate4.class);
--- End diff --

I will create a separate PR to do that since that change is unrelated to 
this PR


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297541#comment-16297541
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157894423
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate2.java
 ---
@@ -53,17 +51,32 @@ public int copyRecords(int index, int recordCount) 
throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
+
+  @Override
+  public int appendRecords(int index, int recordCount) throws 
SchemaChangeException {
+return insertRecords(outgoing.getRecordCount(), index, recordCount);
+  }
+
+  private int insertRecords(int outgoingPosition, int index, int 
recordCount) throws SchemaChangeException {
+final int endIndex = index + recordCount;
 
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, 
outgoingPosition++){
+for(int svIndex = index; svIndex < endIndex; svIndex++, 
outgoingPosition++){
   doEval(sv2.getIndex(svIndex), outgoingPosition);
 }
+
+outgoing.setRecordCount(outgoingPosition);
 return outgoingPosition;
   }
 
-  public abstract void doSetup(@Named("context") FragmentContext context,
-   @Named("incoming") RecordBatch incoming,
-   @Named("outgoing") RecordBatch outgoing)
+  public abstract void doSetup(@Named("incoming") RecordBatch incoming,
+   @Named("outgoing") VectorContainer outgoing)
--- End diff --

The copiers are only used in the SVRemover and TopN operator. I have 
replaced the code generated copiers in both now to use the GenericCopiers.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297299#comment-16297299
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157853697
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate4.java
 ---
@@ -54,17 +52,33 @@ public int copyRecords(int index, int recordCount) 
throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, 
outgoingPosition++){
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
--- End diff --

Updated the code and made an implementation of appendRecord which doesn't 
use a for loop


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296038#comment-16296038
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157646636
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate4.java
 ---
@@ -54,17 +52,33 @@ public int copyRecords(int index, int recordCount) 
throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, 
outgoingPosition++){
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
--- End diff --

Minor performance comment: If *appendRecord()* is to be used heavily, then 
the overhead of calling the for() loop (in insertRecords() ) many times for 
only a size of 1 may be significant. 


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296037#comment-16296037
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157646058
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/CopierTemplate2.java
 ---
@@ -53,17 +51,32 @@ public int copyRecords(int index, int recordCount) 
throws SchemaChangeException
   }
 }
 
-int outgoingPosition = 0;
+return insertRecords(0, index, recordCount);
+  }
+
+  @Override
+  public int appendRecord(int index) throws SchemaChangeException {
+return appendRecords(index, 1);
+  }
+
+  @Override
+  public int appendRecords(int index, int recordCount) throws 
SchemaChangeException {
+return insertRecords(outgoing.getRecordCount(), index, recordCount);
+  }
+
+  private int insertRecords(int outgoingPosition, int index, int 
recordCount) throws SchemaChangeException {
+final int endIndex = index + recordCount;
 
-for(int svIndex = index; svIndex < index + recordCount; svIndex++, 
outgoingPosition++){
+for(int svIndex = index; svIndex < endIndex; svIndex++, 
outgoingPosition++){
   doEval(sv2.getIndex(svIndex), outgoingPosition);
 }
+
+outgoing.setRecordCount(outgoingPosition);
 return outgoingPosition;
   }
 
-  public abstract void doSetup(@Named("context") FragmentContext context,
-   @Named("incoming") RecordBatch incoming,
-   @Named("outgoing") RecordBatch outgoing)
+  public abstract void doSetup(@Named("incoming") RecordBatch incoming,
+   @Named("outgoing") VectorContainer outgoing)
--- End diff --

As the "generated code" methods are replaced with "in line" methods, we 
should remove these abstract signatures (both *doSetup* and *doEval*) to avoid 
generating code, and to distinguish these methods. Maybe we should _rename_ 
them as well to make the distinction clear.  Then probably also eliminate the 
"code generation" code in **getGenerated2Copier()** and 
**getGenerated4Copier()**.  


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296039#comment-16296039
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157616077
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/Copier.java
 ---
@@ -19,13 +19,15 @@
 
 import org.apache.drill.exec.compile.TemplateClassDefinition;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.ops.FragmentContext;
 import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorContainer;
 
 public interface Copier {
-  public static TemplateClassDefinition TEMPLATE_DEFINITION2 = new 
TemplateClassDefinition(Copier.class, CopierTemplate2.class);
-  public static TemplateClassDefinition TEMPLATE_DEFINITION4 = new 
TemplateClassDefinition(Copier.class, CopierTemplate4.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION2 = new 
TemplateClassDefinition(Copier.class, CopierTemplate2.class);
+  TemplateClassDefinition TEMPLATE_DEFINITION4 = new 
TemplateClassDefinition(Copier.class, CopierTemplate4.class);
--- End diff --

Just for cleanliness, should do the same (remove all the 
public/static/abstract/final ) in **filter/Filterer.java** as well ! 


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296036#comment-16296036
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r157612792
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

(1) Need to change the initial value of *recordCount* from -1 to 0 ;
(2) That "-1" is used to note "uninitialized". So need to add a boolean 
flag for that purpose.
(3) Please change the *appendRow()* signature to return *recordCount*. 
(E.g., so the caller would know when the target container is full).
 


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275197#comment-16275197
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
@Ben-Zvi @paul-rogers As per our discussion I've added Paul's generic 
copiers and have provided a GenericCopier that works for RecordBatches that do 
not have selection vectors. I've enhanced the copier interface to include 
append methods.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271921#comment-16271921
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
Please let me know if I should make any more changes.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271919#comment-16271919
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
@paul-rogers since HashJoin does not need to support Selection Vectors 
maybe we can postpone adding the corresponding appendRow methods if and when 
they are needed. I suspect by the time anyone needs those methods we will have 
already migrated over to the new batch framework.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271912#comment-16271912
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153958856
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java
 ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
 leftIndirect.clear();
 right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+MaterializedField colA = MaterializedField.create("colA", 
Types.required(TypeProtos.MinorType.INT));
+MaterializedField colB = MaterializedField.create("colB", 
Types.required(TypeProtos.MinorType.INT));
--- End diff --

Added more interesting types. Currently RowSet classes don't support the 
Map data type. Paul asked me to look into adding support for this a while ago 
for DRILL-5870 . I'll update the test framework to support that in the next PR.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271670#comment-16271670
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153934729
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

 The immediate need for appendRow() is to distribute rows from a single 
incoming batch into multiple other batches (for the Hash Join internal 
partitioning), based on the hash value of the key columns at each row. This 
would not work well with the second suggestion (vectorizing - column 1, column 
2, etc.)  


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271671#comment-16271671
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153935071
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java
 ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
 leftIndirect.clear();
 right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+MaterializedField colA = MaterializedField.create("colA", 
Types.required(TypeProtos.MinorType.INT));
+MaterializedField colB = MaterializedField.create("colB", 
Types.required(TypeProtos.MinorType.INT));
--- End diff --

 Maybe add some "interesting" datatypes ?  Testing integers only may miss 
some issue. 



> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271600#comment-16271600
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153926390
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

This is OK for a row-by-row copy. But, you'll get better performance if you 
optimize for the entire batch. Because you have no SV4, the source and dest 
batches are the same so the vectors can be preloaded into an array of vectors 
to avoid the vector wrapper lookup per column.

Plus, if the code is written per batch, you can go a step further: 
vectorize the operation. Copy all values for column 1, then all for column 2, 
and so on. (In this case, you only get each vector once, so sticking with the 
wrappers is fine.) By vectorizing, you may get the vectorized cache-locality 
benefit that Drill promises from its operations. Worth a try to see if you get 
any speed-up.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271587#comment-16271587
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1057
  
To answer the two questions:

1. The copier is used in multiple locations, some of which include 
selection vectors. Sort uses a copier to merge rows coming from multiple sorted 
batches. The SVR compresses out SVs. A filter will produce an SV2 which the SVR 
removes. An in-memory sort produces an SV4. But, because of the ways plans are 
generated, the hash join will never see a batch with an SV. (An SVR will be 
inserted, if needed, to remove the SV.)

2. We never write a batch using an SV. The SV is always a source 
indirection. Because we do indirection on the source side (and vectors are 
append only), there can be no SV on the destination side.

Note also that the {{VectorContainer}} class, despite it's API, knows 
nothing about SVs. The SV is tacked on separately by the {{RecordBatch}}. (This 
is a less-than-ideal design, but it is how things work at present.) FWIW, the 
test-oriented {{RowSet}} abstractions came about as wrappers around both the 
{{VectorContainer}} and SV to provide a unified view.

Because of how we do SVs, you'll need three copy methods: one for no SV, 
one for an SV2 and another for an SV4.

In the fullness of time, the new "column reader" and "column writer" 
abstractions will hide all this stuff, but it will take time before those tools 
come online.


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271512#comment-16271512
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
@Ben-Zvi @paul-rogers 


> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5993) Allow Copier to Copy a Record and Append to the End of an Outgoing Batch

2017-11-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271511#comment-16271511
 ] 

ASF GitHub Bot commented on DRILL-5993:
---

GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1057

DRILL-5993 Append Row Method For VectorContainer (WIP)

## Motivation

HashJoin requires a method that can take a row from a VectorContainer and 
append it to a destination VectorContainer. This is a WIP and this PR is mainly 
opened to improve my understanding.

## Implementation

This is an initial implementation that works with simple VectorContainers 
that are not hyper batches and do not have selection vectors. It is also 
assumed that the user called **SchemaUtil.coerceContainer** on the source 
VectorContainer before using the newly added **appendRow** method.

## Questions

- Do we have to worry about selection vectors in the source container?
- Do we have to think about support hyper batches in the destination 
container?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-5993

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1057


commit ee43d6808562a1ff60c17fa7622b8358b63c7276
Author: Timothy Farkas 
Date:   2017-11-29T20:38:41Z

 - Initial Implementation of append row for a vector container




> Allow Copier to Copy a Record and Append to the End of an Outgoing Batch
> 
>
> Key: DRILL-5993
> URL: https://issues.apache.org/jira/browse/DRILL-5993
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>
> Currently the copier can only copy record from an incoming batch to the 
> beginning of an outgoing batch. We need to be able to copy a record and 
> append it to the end of the outgoing batch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)