[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584609#comment-16584609
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on issue #1344: DRILL-6461: Added basic data correctness 
tests for hash agg, and improved operator unit testing framework.
URL: https://github.com/apache/drill/pull/1344#issuecomment-414029645
 
 
   @sohami Addressed review comments. Please take another look.
   
   I also implemented some tests for OperatorTestBuilder. The simple tests used 
the project operator to pass through input batches. While implementing those 
tests I discovered a bug in Project where the recordCount is not set on output 
containers, so I fixed that as part of this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584579#comment-16584579
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211061250
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584574#comment-16584574
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211060910
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584474#comment-16584474
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211049297
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584471#comment-16584471
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211048316
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584472#comment-16584472
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211048926
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584458#comment-16584458
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on issue #1344: DRILL-6461: Added basic data correctness 
tests for hash agg, and improved operator unit testing framework.
URL: https://github.com/apache/drill/pull/1344#issuecomment-414004294
 
 
   @sohami Applied review comments. Please take another look.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584455#comment-16584455
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211046248
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/AbstractGenericCopierTest.java
 ##
 @@ -20,77 +20,106 @@
 import org.apache.drill.common.types.TypeProtos;
 import org.apache.drill.common.types.Types;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.memory.RootAllocator;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
 import org.apache.drill.exec.record.BatchSchema;
 import org.apache.drill.exec.record.MaterializedField;
 import org.apache.drill.exec.record.RecordBatch;
 import org.apache.drill.exec.record.VectorContainer;
-import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.record.VectorWrapper;
 import org.apache.drill.exec.vector.SchemaChangeCallBack;
+import org.apache.drill.test.BaseDirTestWatcher;
+import org.apache.drill.test.OperatorFixture;
+import org.apache.drill.test.rowSet.DirectRowSet;
 import org.apache.drill.test.rowSet.RowSet;
-import org.apache.drill.test.rowSet.RowSetBatch;
 import org.apache.drill.test.rowSet.RowSetBuilder;
 import org.apache.drill.test.rowSet.RowSetComparison;
 import org.apache.drill.test.rowSet.schema.SchemaBuilder;
+import org.junit.Rule;
 import org.junit.Test;
 
 public abstract class AbstractGenericCopierTest {
+  @Rule
+  public final BaseDirTestWatcher baseDirTestWatcher = new 
BaseDirTestWatcher();
+
   @Test
-  public void testCopyRecords() throws SchemaChangeException {
-try (RootAllocator allocator = new RootAllocator(10_000_000)) {
-  final TupleMetadata batchSchema = 
createTestSchema(BatchSchema.SelectionVectorMode.NONE);
+  public void testCopyRecords() throws Exception {
+try (OperatorFixture operatorFixture = new 
OperatorFixture.Builder(baseDirTestWatcher).build()) {
+  final BufferAllocator allocator = operatorFixture.allocator();
+  final BatchSchema batchSchema = 
createTestSchema(BatchSchema.SelectionVectorMode.NONE);
   final RowSet srcRowSet = createSrcRowSet(allocator);
-  final RowSet destRowSet = new RowSetBuilder(allocator, 
batchSchema).build();
-  final VectorContainer destContainer = destRowSet.container();
-  final Copier copier = createCopier(new RowSetBatch(srcRowSet), 
destContainer, null);
+  final VectorContainer destContainer = new VectorContainer(allocator, 
batchSchema);
+
+  for (VectorWrapper vectorWrapper: destContainer) {
+vectorWrapper.getValueVector().allocateNew();
 
 Review comment:
   Thanks for catching this. I've moved the allocation code into a static 
method and call it in GenericCopier.copyRecords.
   
   It is still necessary to allocate the vectors in the appendRecords test 
though, since appendRecords assumes the vector container is pre existing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584433#comment-16584433
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211041686
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584423#comment-16584423
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211040661
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584419#comment-16584419
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211040315
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584417#comment-16584417
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211039960
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584412#comment-16584412
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211039752
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584407#comment-16584407
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211038430
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+   

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584405#comment-16584405
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211038182
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java
 ##
 @@ -158,19 +210,34 @@ public IterOutcome next() {
 container.clear();
 container = new VectorContainer(allocator, inputSchema);
   }
-  container.transferIn(input);
-  container.setRecordCount(recordCount);
-
-  // Transfer the sv2 as well
-  final SelectionVector2 inputSv2 =
-(allTestContainersSv2 != null && allTestContainersSv2.size() > 0)
-  ? allTestContainersSv2.get(currentContainerIndex) : null;
-  if (inputSv2 != null) {
-sv2.allocateNewSafe(inputSv2.getCount());
-for (int i=0; i Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584384#comment-16584384
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211032924
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584385#comment-16584385
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r210989987
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/AbstractGenericCopierTest.java
 ##
 @@ -20,77 +20,106 @@
 import org.apache.drill.common.types.TypeProtos;
 import org.apache.drill.common.types.Types;
 import org.apache.drill.exec.exception.SchemaChangeException;
-import org.apache.drill.exec.memory.RootAllocator;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
 import org.apache.drill.exec.record.BatchSchema;
 import org.apache.drill.exec.record.MaterializedField;
 import org.apache.drill.exec.record.RecordBatch;
 import org.apache.drill.exec.record.VectorContainer;
-import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.record.VectorWrapper;
 import org.apache.drill.exec.vector.SchemaChangeCallBack;
+import org.apache.drill.test.BaseDirTestWatcher;
+import org.apache.drill.test.OperatorFixture;
+import org.apache.drill.test.rowSet.DirectRowSet;
 import org.apache.drill.test.rowSet.RowSet;
-import org.apache.drill.test.rowSet.RowSetBatch;
 import org.apache.drill.test.rowSet.RowSetBuilder;
 import org.apache.drill.test.rowSet.RowSetComparison;
 import org.apache.drill.test.rowSet.schema.SchemaBuilder;
+import org.junit.Rule;
 import org.junit.Test;
 
 public abstract class AbstractGenericCopierTest {
+  @Rule
+  public final BaseDirTestWatcher baseDirTestWatcher = new 
BaseDirTestWatcher();
+
   @Test
-  public void testCopyRecords() throws SchemaChangeException {
-try (RootAllocator allocator = new RootAllocator(10_000_000)) {
-  final TupleMetadata batchSchema = 
createTestSchema(BatchSchema.SelectionVectorMode.NONE);
+  public void testCopyRecords() throws Exception {
+try (OperatorFixture operatorFixture = new 
OperatorFixture.Builder(baseDirTestWatcher).build()) {
+  final BufferAllocator allocator = operatorFixture.allocator();
+  final BatchSchema batchSchema = 
createTestSchema(BatchSchema.SelectionVectorMode.NONE);
   final RowSet srcRowSet = createSrcRowSet(allocator);
-  final RowSet destRowSet = new RowSetBuilder(allocator, 
batchSchema).build();
-  final VectorContainer destContainer = destRowSet.container();
-  final Copier copier = createCopier(new RowSetBatch(srcRowSet), 
destContainer, null);
+  final VectorContainer destContainer = new VectorContainer(allocator, 
batchSchema);
+
+  for (VectorWrapper vectorWrapper: destContainer) {
+vectorWrapper.getValueVector().allocateNew();
 
 Review comment:
   Why we need to allocate memory for destination container here ? Shouldn't 
the copier allocate it when `copyRecords` is called. Since it will do so based 
on incoming record count. Also in case of `StraightCopier` it should not 
allocate any memory for ValueVectors in destination container since there will 
be direct transfer of buffer from incoming to outgoing container ValueVectors.
   
   I think the bug is 
[here](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/GenericCopier.java#L56)
 instead. It should be same as 
[AbstractCopier::copyRecords](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractCopier.java#L45).
 Also `GenericCopier` is not efficient for cases like `StraightCopier` where no 
actual copy is done only transfer of buffer happens.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584386#comment-16584386
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211000586
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584388#comment-16584388
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211032665
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584383#comment-16584383
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r210998188
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584387#comment-16584387
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211022063
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584389#comment-16584389
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r210994889
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java
 ##
 @@ -158,19 +210,34 @@ public IterOutcome next() {
 container.clear();
 container = new VectorContainer(allocator, inputSchema);
   }
-  container.transferIn(input);
-  container.setRecordCount(recordCount);
-
-  // Transfer the sv2 as well
-  final SelectionVector2 inputSv2 =
-(allTestContainersSv2 != null && allTestContainersSv2.size() > 0)
-  ? allTestContainersSv2.get(currentContainerIndex) : null;
-  if (inputSv2 != null) {
-sv2.allocateNewSafe(inputSv2.getCount());
-for (int i=0; i Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584390#comment-16584390
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r211020462
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java
 ##
 @@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import org.apache.commons.lang3.mutable.MutableInt;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.MockRecordBatch;
+import org.apache.drill.exec.physical.impl.svremover.Copier;
+import org.apache.drill.exec.physical.impl.svremover.GenericCopier;
+import org.apache.drill.exec.record.CloseableRecordBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatchSizer;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.vector.FixedWidthVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VariableWidthVector;
+import org.apache.drill.test.rowSet.DirectRowSet;
+import org.apache.drill.test.rowSet.IndirectRowSet;
+import org.apache.drill.test.rowSet.RowSet;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.Assert;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class OperatorTestBuilder {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class);
+
+  private final List expectedResults = new ArrayList<>();
+  private final List upstreamBatches = new ArrayList<>();
+  private PhysicalOpUnitTestBase physicalOpUnitTestBase;
+
+  private PhysicalOperator physicalOperator;
+  private long initReservation = AbstractBase.INIT_ALLOCATION;
+  private long maxAllocation = AbstractBase.MAX_ALLOCATION;
+  private Optional expectedNumBatchesOpt = Optional.empty();
+  private Optional expectedTotalRowsOpt = Optional.empty();
+  private boolean combineOutputBatches;
+
+  public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) {
+this.physicalOpUnitTestBase = physicalOpUnitTestBase;
+  }
+
+  @SuppressWarnings({"unchecked", "resource"})
+  public void go() throws Exception {
+final List actualResults = new ArrayList<>();
+CloseableRecordBatch testOperator = null;
+
+try {
+  validate();
+  int expectedNumBatches = 
expectedNumBatchesOpt.orElse(expectedResults.size());
+  physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, 
maxAllocation);
+
+  final BatchCreator opCreator = 
(BatchCreator) 
physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass());
+  testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, 
physicalOperator, (List)upstreamBatches);
+
+  batchIterator: for (int batchIndex = 0;; batchIndex++) {
+final RecordBatch.IterOutcome outcome = testOperator.next();
+
+switch (outcome) {
+  case NONE:
+if (!combineOutputBatches) {
+  Assert.assertEquals(expectedNumBatches, batchIndex);
+}
+// We are done iterating over batches. Now we need to compare them.
+break batchIterator;
+  case OK_NEW_SCHEMA:
+boolean skip = true;
+
+try {
+  skip = testOperator.getContainer().getRecordCount() == 0;
+} catch (IllegalStateException e) {
+  // We should skip this batch in this case. It means no data was 
included with the okay schema
+} finally {
+  if (skip) {
+batchIndex--;
+break;
+

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-16 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583004#comment-16583004
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on issue #1344: DRILL-6461: Added basic data correctness 
tests for hash agg, and improved operator unit testing framework.
URL: https://github.com/apache/drill/pull/1344#issuecomment-413667931
 
 
   @sohami The PR is ready for another round of review.
   
   I removed the fix that allowed **copyEntry** to work with empty VarLength 
vectors since value vectors should not be manipulated unless allocateNew is 
called. I fixed the failing test by calling allocateNew on all the vectors in 
the destination vector container.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581815#comment-16581815
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on issue #1344: DRILL-6461: Added basic data correctness 
tests for hash agg, and improved operator unit testing framework.
URL: https://github.com/apache/drill/pull/1344#issuecomment-413388915
 
 
   Arg! There appears to be one random test failing on Travis. I will look into 
it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581685#comment-16581685
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on issue #1344: DRILL-6461: Added basic data correctness 
tests for hash agg, and improved operator unit testing framework.
URL: https://github.com/apache/drill/pull/1344#issuecomment-413355862
 
 
   @sohami I removed the **RowSetBatch** and made the **MockRecordBatch** work 
with RowSets. Please take another look. 
   
   This proved to be a fruitful exercise because using MockRecordBatch exposed 
some bugs and corner cases in the vectors, and then those bugs and corner cases 
lead to even more bugs being exposed. The specific issues were:
   
- Previously I assumed **copyEntry** was a safe operation (does reallocs 
when needed), but this was not entirely true. **copyEntry** was safe for 
non-empty VarLength vectors, and Nullable vectors copying from another Nullable 
vector. It was not safe for empty VarLength vectors, copying from a 
non-nullable vector to a nullable vector, or fixed length vectors. So Included 
fixes for these three corner cases.
- I made **copyEntry** safe for empty VarLength vectors by always making 
sure the first entry in the offset vector existed and was zero on 
initialization of the vector. This caused memory to be leaked because some 
operators created empty VectorContainers and never closed them:
 - TopN created a dummy VectorContainer for no reason and never closed it, 
so I removed it.
 - HashTableTemplate created a dummy template VectorContainer and never 
closed it, so I close it now. We should actually probably avoid using the dummy 
VectorContainer entirely and just use a schema, but I would rather do that in a 
separate change.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578993#comment-16578993
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r209779765
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
   protected final List> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   Cool made https://issues.apache.org/jira/browse/DRILL-6683 to track that 
change.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578991#comment-16578991
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r209779483
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches. 
Currently it's assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   @sohami It's okay, I've already made the move to MockRecordBatch for this 
PR. Just doing a final test runs, and debugging straggling issues.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578988#comment-16578988
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r209779109
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
 ##
 @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) {
   // columns can be obtained from children of topColumns.
   private Map columnSizes = 
CaseInsensitiveMap.newHashMap();
 
+  private List columnSizesList = new ArrayList<>();
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578595#comment-16578595
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r209677503
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches. 
Currently it's assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   If you want you can leave it as is for now. I can take a look on how to 
refactor MockRecordBatch class along with other tests which uses it, which I 
have to do anyways. I was also thinking of making it use RowSets instead of 
container. 
   The reason for creating separate class was to use it specifically for 
testing different IterOutcomes in context of Lateral only whereas 
RowSetBatch looked more of a very simple implementation to just hold one 
container. Was not sure if MockRecordBatch will play well with other operators 
in general.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578560#comment-16578560
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r209672777
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
   protected final List> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   I agree with that and think it would be a good refactoring to move 
`getSelectionVector2` and `getSelectionVector4` from `VectorAccessible` 
interface to `RecordBatch` interface. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574065#comment-16574065
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sachouche commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208775803
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ##
 @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() {
   v.getValueVector().getMutator().setValueCount(numOutputRecords);
 }
 
+outContainer.setRecordCount(numOutputRecords);
 
 Review comment:
   Thank you Tim for the due diligence! 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574063#comment-16574063
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sachouche commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208775608
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
 ##
 @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) {
   // columns can be obtained from children of topColumns.
   private Map columnSizes = 
CaseInsensitiveMap.newHashMap();
 
+  private List columnSizesList = new ArrayList<>();
 
 Review comment:
   I see; can you please add a comment indicating this behavior (especially 
within the columnsList() API).
   
   Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574048#comment-16574048
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208770691
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
 ##
 @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) {
   // columns can be obtained from children of topColumns.
   private Map columnSizes = 
CaseInsensitiveMap.newHashMap();
 
+  private List columnSizesList = new ArrayList<>();
 
 Review comment:
   It's for convenience. VectorContainers expose their ValueVectors based on 
index, so it is handy to be able to lookup column size based on the same index 
as well. Otherwise I would have to get the Materialized field for the a column, 
extract the name, and then lookup the ColumnSize in the map.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574041#comment-16574041
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208769577
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches. 
Currently it's assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   RowSetBatch predates MockRecordBatch :/ . But I'll see if I can make 
MockRecordBatch play nice with RowSets. Since MockRecordBatch is used for many 
many tests, I will make it work with RowSets while keeping some deprecated 
constructors for VectorContainers so that I don't have to update every test.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573804#comment-16573804
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208721030
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ##
 @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() {
   v.getValueVector().getMutator().setValueCount(numOutputRecords);
 }
 
+outContainer.setRecordCount(numOutputRecords);
 
 Review comment:
   - I created an issue for centralizing setting the record counts here. We can 
probably just make VectorContainer.setRecordCount set the record count for 
value vectors as well instead of just itself. 
https://issues.apache.org/jira/browse/DRILL-6675
- Discussed offline, I'll make the logging changes to HashAgg as part of 
https://issues.apache.org/jira/browse/DRILL-6253


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573767#comment-16573767
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208711677
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
   protected final List> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   This has actually been a point of confusion for me. VectorContainer and 
RecordBatch both implement VectorAccessible, which has the following methods:
   
   ```
 SelectionVector2 getSelectionVector2();
 SelectionVector4 getSelectionVector4();
   ```
   
   If the intention is that VectorContainers should never hold selection 
vectors, then the interfaces VectorContainer implements should reflect that. 
What are you're thoughts about refactoring the VectorAccessable to remove the 
selection vector methods? In the meantime I will remove these changes from 
VectorContainer.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573758#comment-16573758
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

ilooner commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r208711677
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
   protected final List> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   This has actually been a point of confusion for me. VectorContainer and 
RecordBatch both implement VectorAccessible, which has the following methods:
   
   ```
 SelectionVector2 getSelectionVector2();
 SelectionVector4 getSelectionVector4();
   ```
   
   If the intention is that VectorContainers should never hold selection 
vectors, then the interfaces VectorContainer implements should reflect that. 
What are you're thoughts about refactoring the VectorAccessable to remove the 
selection vector methods? In the meantime I will remove the these changes from 
VectorContainer.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570554#comment-16570554
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r202754788
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches. 
Currently it's assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   This is similar to 
[MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java).
 I created that class for Lateral-Unnest project. 
   The main usage for that class was to be able to simulate different 
IterOutcome with different type of batches while unit testing an operator. May 
be we can try to merge these 2 classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570553#comment-16570553
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r207977722
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
   protected final List> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   I don't think we should include SV2/SV4 inside a VectorContainer. They are 
more of RecordBatch level members. VectorContainer should only have metadata 
and set of vectors in it which is what it is today.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570552#comment-16570552
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r202754788
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches. 
Currently it's assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   This is same as 
[MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java).
 I created that class for Lateral-Unnest project. 
   The main usage for that class was to be able to simulate different 
IterOutcome with different type of batches while unit testing an operator. May 
be we can try to merge these 2 classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565844#comment-16565844
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sachouche commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r206999463
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
 ##
 @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) {
   // columns can be obtained from children of topColumns.
   private Map columnSizes = 
CaseInsensitiveMap.newHashMap();
 
+  private List columnSizesList = new ArrayList<>();
 
 Review comment:
   Why do this? the columnSizes map could return a Collection 
(through java.util.Map.values()) efficiently  (that is, in place).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565845#comment-16565845
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sachouche commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r206996050
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ##
 @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() {
   v.getValueVector().getMutator().setValueCount(numOutputRecords);
 }
 
+outContainer.setRecordCount(numOutputRecords);
 
 Review comment:
   Noticed the old code is using INFO debug level; shouldn't it be using debug?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565843#comment-16565843
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sachouche commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r206995312
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ##
 @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() {
   v.getValueVector().getMutator().setValueCount(numOutputRecords);
 }
 
+outContainer.setRecordCount(numOutputRecords);
 
 Review comment:
   I noticed the Drill code has three places where to set the record count; 
within a:
   - RecordBatch object (look at ScanBatch)
   - VectorContainer
   - Individual vectors
   
   Can we centralize this to a utility so that setting the count is always 
consistent across all operators. I am not suggesting that you do that within 
this PR (as I don't want to extend its scope) but this is just to get some 
feedback about this behavior.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)