[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584609#comment-16584609 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on issue #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#issuecomment-414029645 @sohami Addressed review comments. Please take another look. I also implemented some tests for OperatorTestBuilder. The simple tests used the project operator to pass through input batches. While implementing those tests I discovered a bug in Project where the recordCount is not set on output containers, so I fixed that as part of this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584579#comment-16584579 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211061250 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584574#comment-16584574 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211060910 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584474#comment-16584474 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211049297 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584471#comment-16584471 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211048316 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584472#comment-16584472 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211048926 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584458#comment-16584458 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on issue #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#issuecomment-414004294 @sohami Applied review comments. Please take another look. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584455#comment-16584455 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211046248 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/AbstractGenericCopierTest.java ## @@ -20,77 +20,106 @@ import org.apache.drill.common.types.TypeProtos; import org.apache.drill.common.types.Types; import org.apache.drill.exec.exception.SchemaChangeException; -import org.apache.drill.exec.memory.RootAllocator; +import org.apache.drill.exec.memory.BufferAllocator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; import org.apache.drill.exec.record.BatchSchema; import org.apache.drill.exec.record.MaterializedField; import org.apache.drill.exec.record.RecordBatch; import org.apache.drill.exec.record.VectorContainer; -import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.exec.record.VectorWrapper; import org.apache.drill.exec.vector.SchemaChangeCallBack; +import org.apache.drill.test.BaseDirTestWatcher; +import org.apache.drill.test.OperatorFixture; +import org.apache.drill.test.rowSet.DirectRowSet; import org.apache.drill.test.rowSet.RowSet; -import org.apache.drill.test.rowSet.RowSetBatch; import org.apache.drill.test.rowSet.RowSetBuilder; import org.apache.drill.test.rowSet.RowSetComparison; import org.apache.drill.test.rowSet.schema.SchemaBuilder; +import org.junit.Rule; import org.junit.Test; public abstract class AbstractGenericCopierTest { + @Rule + public final BaseDirTestWatcher baseDirTestWatcher = new BaseDirTestWatcher(); + @Test - public void testCopyRecords() throws SchemaChangeException { -try (RootAllocator allocator = new RootAllocator(10_000_000)) { - final TupleMetadata batchSchema = createTestSchema(BatchSchema.SelectionVectorMode.NONE); + public void testCopyRecords() throws Exception { +try (OperatorFixture operatorFixture = new OperatorFixture.Builder(baseDirTestWatcher).build()) { + final BufferAllocator allocator = operatorFixture.allocator(); + final BatchSchema batchSchema = createTestSchema(BatchSchema.SelectionVectorMode.NONE); final RowSet srcRowSet = createSrcRowSet(allocator); - final RowSet destRowSet = new RowSetBuilder(allocator, batchSchema).build(); - final VectorContainer destContainer = destRowSet.container(); - final Copier copier = createCopier(new RowSetBatch(srcRowSet), destContainer, null); + final VectorContainer destContainer = new VectorContainer(allocator, batchSchema); + + for (VectorWrapper vectorWrapper: destContainer) { +vectorWrapper.getValueVector().allocateNew(); Review comment: Thanks for catching this. I've moved the allocation code into a static method and call it in GenericCopier.copyRecords. It is still necessary to allocate the vectors in the appendRecords test though, since appendRecords assumes the vector container is pre existing. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584433#comment-16584433 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211041686 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584423#comment-16584423 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211040661 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584419#comment-16584419 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211040315 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584417#comment-16584417 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211039960 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584412#comment-16584412 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211039752 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584407#comment-16584407 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211038430 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584405#comment-16584405 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211038182 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java ## @@ -158,19 +210,34 @@ public IterOutcome next() { container.clear(); container = new VectorContainer(allocator, inputSchema); } - container.transferIn(input); - container.setRecordCount(recordCount); - - // Transfer the sv2 as well - final SelectionVector2 inputSv2 = -(allTestContainersSv2 != null && allTestContainersSv2.size() > 0) - ? allTestContainersSv2.get(currentContainerIndex) : null; - if (inputSv2 != null) { -sv2.allocateNewSafe(inputSv2.getCount()); -for (int i=0; i Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584384#comment-16584384 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211032924 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584385#comment-16584385 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r210989987 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/svremover/AbstractGenericCopierTest.java ## @@ -20,77 +20,106 @@ import org.apache.drill.common.types.TypeProtos; import org.apache.drill.common.types.Types; import org.apache.drill.exec.exception.SchemaChangeException; -import org.apache.drill.exec.memory.RootAllocator; +import org.apache.drill.exec.memory.BufferAllocator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; import org.apache.drill.exec.record.BatchSchema; import org.apache.drill.exec.record.MaterializedField; import org.apache.drill.exec.record.RecordBatch; import org.apache.drill.exec.record.VectorContainer; -import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.exec.record.VectorWrapper; import org.apache.drill.exec.vector.SchemaChangeCallBack; +import org.apache.drill.test.BaseDirTestWatcher; +import org.apache.drill.test.OperatorFixture; +import org.apache.drill.test.rowSet.DirectRowSet; import org.apache.drill.test.rowSet.RowSet; -import org.apache.drill.test.rowSet.RowSetBatch; import org.apache.drill.test.rowSet.RowSetBuilder; import org.apache.drill.test.rowSet.RowSetComparison; import org.apache.drill.test.rowSet.schema.SchemaBuilder; +import org.junit.Rule; import org.junit.Test; public abstract class AbstractGenericCopierTest { + @Rule + public final BaseDirTestWatcher baseDirTestWatcher = new BaseDirTestWatcher(); + @Test - public void testCopyRecords() throws SchemaChangeException { -try (RootAllocator allocator = new RootAllocator(10_000_000)) { - final TupleMetadata batchSchema = createTestSchema(BatchSchema.SelectionVectorMode.NONE); + public void testCopyRecords() throws Exception { +try (OperatorFixture operatorFixture = new OperatorFixture.Builder(baseDirTestWatcher).build()) { + final BufferAllocator allocator = operatorFixture.allocator(); + final BatchSchema batchSchema = createTestSchema(BatchSchema.SelectionVectorMode.NONE); final RowSet srcRowSet = createSrcRowSet(allocator); - final RowSet destRowSet = new RowSetBuilder(allocator, batchSchema).build(); - final VectorContainer destContainer = destRowSet.container(); - final Copier copier = createCopier(new RowSetBatch(srcRowSet), destContainer, null); + final VectorContainer destContainer = new VectorContainer(allocator, batchSchema); + + for (VectorWrapper vectorWrapper: destContainer) { +vectorWrapper.getValueVector().allocateNew(); Review comment: Why we need to allocate memory for destination container here ? Shouldn't the copier allocate it when `copyRecords` is called. Since it will do so based on incoming record count. Also in case of `StraightCopier` it should not allocate any memory for ValueVectors in destination container since there will be direct transfer of buffer from incoming to outgoing container ValueVectors. I think the bug is [here](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/GenericCopier.java#L56) instead. It should be same as [AbstractCopier::copyRecords](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/AbstractCopier.java#L45). Also `GenericCopier` is not efficient for cases like `StraightCopier` where no actual copy is done only transfer of buffer happens. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584386#comment-16584386 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211000586 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584388#comment-16584388 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211032665 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584383#comment-16584383 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r210998188 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584387#comment-16584387 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211022063 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584389#comment-16584389 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r210994889 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java ## @@ -158,19 +210,34 @@ public IterOutcome next() { container.clear(); container = new VectorContainer(allocator, inputSchema); } - container.transferIn(input); - container.setRecordCount(recordCount); - - // Transfer the sv2 as well - final SelectionVector2 inputSv2 = -(allTestContainersSv2 != null && allTestContainersSv2.size() > 0) - ? allTestContainersSv2.get(currentContainerIndex) : null; - if (inputSv2 != null) { -sv2.allocateNewSafe(inputSv2.getCount()); -for (int i=0; i Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584390#comment-16584390 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r211020462 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/OperatorTestBuilder.java ## @@ -0,0 +1,300 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.test; + +import com.google.common.base.Preconditions; +import org.apache.commons.lang3.mutable.MutableInt; +import org.apache.drill.exec.physical.base.AbstractBase; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.MockRecordBatch; +import org.apache.drill.exec.physical.impl.svremover.Copier; +import org.apache.drill.exec.physical.impl.svremover.GenericCopier; +import org.apache.drill.exec.record.CloseableRecordBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.RecordBatchSizer; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.FixedWidthVector; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VariableWidthVector; +import org.apache.drill.test.rowSet.DirectRowSet; +import org.apache.drill.test.rowSet.IndirectRowSet; +import org.apache.drill.test.rowSet.RowSet; +import org.apache.drill.test.rowSet.RowSetComparison; +import org.junit.Assert; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; +import java.util.stream.Collectors; + +public class OperatorTestBuilder { + + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorTestBuilder.class); + + private final List expectedResults = new ArrayList<>(); + private final List upstreamBatches = new ArrayList<>(); + private PhysicalOpUnitTestBase physicalOpUnitTestBase; + + private PhysicalOperator physicalOperator; + private long initReservation = AbstractBase.INIT_ALLOCATION; + private long maxAllocation = AbstractBase.MAX_ALLOCATION; + private Optional expectedNumBatchesOpt = Optional.empty(); + private Optional expectedTotalRowsOpt = Optional.empty(); + private boolean combineOutputBatches; + + public OperatorTestBuilder(PhysicalOpUnitTestBase physicalOpUnitTestBase) { +this.physicalOpUnitTestBase = physicalOpUnitTestBase; + } + + @SuppressWarnings({"unchecked", "resource"}) + public void go() throws Exception { +final List actualResults = new ArrayList<>(); +CloseableRecordBatch testOperator = null; + +try { + validate(); + int expectedNumBatches = expectedNumBatchesOpt.orElse(expectedResults.size()); + physicalOpUnitTestBase.mockOpContext(physicalOperator, initReservation, maxAllocation); + + final BatchCreator opCreator = (BatchCreator) physicalOpUnitTestBase.opCreatorReg.getOperatorCreator(physicalOperator.getClass()); + testOperator = opCreator.getBatch(physicalOpUnitTestBase.fragContext, physicalOperator, (List)upstreamBatches); + + batchIterator: for (int batchIndex = 0;; batchIndex++) { +final RecordBatch.IterOutcome outcome = testOperator.next(); + +switch (outcome) { + case NONE: +if (!combineOutputBatches) { + Assert.assertEquals(expectedNumBatches, batchIndex); +} +// We are done iterating over batches. Now we need to compare them. +break batchIterator; + case OK_NEW_SCHEMA: +boolean skip = true; + +try { + skip = testOperator.getContainer().getRecordCount() == 0; +} catch (IllegalStateException e) { + // We should skip this batch in this case. It means no data was included with the okay schema +} finally { + if (skip) { +batchIndex--; +break; +
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583004#comment-16583004 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on issue #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#issuecomment-413667931 @sohami The PR is ready for another round of review. I removed the fix that allowed **copyEntry** to work with empty VarLength vectors since value vectors should not be manipulated unless allocateNew is called. I fixed the failing test by calling allocateNew on all the vectors in the destination vector container. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581815#comment-16581815 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on issue #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#issuecomment-413388915 Arg! There appears to be one random test failing on Travis. I will look into it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581685#comment-16581685 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on issue #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#issuecomment-413355862 @sohami I removed the **RowSetBatch** and made the **MockRecordBatch** work with RowSets. Please take another look. This proved to be a fruitful exercise because using MockRecordBatch exposed some bugs and corner cases in the vectors, and then those bugs and corner cases lead to even more bugs being exposed. The specific issues were: - Previously I assumed **copyEntry** was a safe operation (does reallocs when needed), but this was not entirely true. **copyEntry** was safe for non-empty VarLength vectors, and Nullable vectors copying from another Nullable vector. It was not safe for empty VarLength vectors, copying from a non-nullable vector to a nullable vector, or fixed length vectors. So Included fixes for these three corner cases. - I made **copyEntry** safe for empty VarLength vectors by always making sure the first entry in the offset vector existed and was zero on initialization of the vector. This caused memory to be leaked because some operators created empty VectorContainers and never closed them: - TopN created a dummy VectorContainer for no reason and never closed it, so I removed it. - HashTableTemplate created a dummy template VectorContainer and never closed it, so I close it now. We should actually probably avoid using the dummy VectorContainer entirely and just use a schema, but I would rather do that in a separate change. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578993#comment-16578993 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r209779765 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ## @@ -42,6 +42,8 @@ private final BufferAllocator allocator; protected final List> wrappers = Lists.newArrayList(); private BatchSchema schema; + private SelectionVector2 selectionVector2; + private SelectionVector4 selectionVector4; Review comment: Cool made https://issues.apache.org/jira/browse/DRILL-6683 to track that change. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578991#comment-16578991 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r209779483 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java ## @@ -29,13 +30,28 @@ import org.apache.drill.exec.record.selection.SelectionVector2; import org.apache.drill.exec.record.selection.SelectionVector4; +import java.util.ArrayList; import java.util.Iterator; +import java.util.List; -public class RowSetBatch implements RecordBatch { - private final RowSet rowSet; +/** + * A mock operator that returns the provided {@link RowSet}s as batches. Currently it's assumed that all the {@link RowSet}s have the same schema. + */ +public class RowSetBatch implements CloseableRecordBatch { Review comment: @sohami It's okay, I've already made the move to MockRecordBatch for this PR. Just doing a final test runs, and debugging straggling issues. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578988#comment-16578988 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r209779109 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ## @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) { // columns can be obtained from children of topColumns. private Map columnSizes = CaseInsensitiveMap.newHashMap(); + private List columnSizesList = new ArrayList<>(); Review comment: done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578595#comment-16578595 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r209677503 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java ## @@ -29,13 +30,28 @@ import org.apache.drill.exec.record.selection.SelectionVector2; import org.apache.drill.exec.record.selection.SelectionVector4; +import java.util.ArrayList; import java.util.Iterator; +import java.util.List; -public class RowSetBatch implements RecordBatch { - private final RowSet rowSet; +/** + * A mock operator that returns the provided {@link RowSet}s as batches. Currently it's assumed that all the {@link RowSet}s have the same schema. + */ +public class RowSetBatch implements CloseableRecordBatch { Review comment: If you want you can leave it as is for now. I can take a look on how to refactor MockRecordBatch class along with other tests which uses it, which I have to do anyways. I was also thinking of making it use RowSets instead of container. The reason for creating separate class was to use it specifically for testing different IterOutcomes in context of Lateral only whereas RowSetBatch looked more of a very simple implementation to just hold one container. Was not sure if MockRecordBatch will play well with other operators in general. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578560#comment-16578560 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r209672777 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ## @@ -42,6 +42,8 @@ private final BufferAllocator allocator; protected final List> wrappers = Lists.newArrayList(); private BatchSchema schema; + private SelectionVector2 selectionVector2; + private SelectionVector4 selectionVector4; Review comment: I agree with that and think it would be a good refactoring to move `getSelectionVector2` and `getSelectionVector4` from `VectorAccessible` interface to `RecordBatch` interface. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574065#comment-16574065 ] ASF GitHub Bot commented on DRILL-6461: --- sachouche commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208775803 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ## @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() { v.getValueVector().getMutator().setValueCount(numOutputRecords); } +outContainer.setRecordCount(numOutputRecords); Review comment: Thank you Tim for the due diligence! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574063#comment-16574063 ] ASF GitHub Bot commented on DRILL-6461: --- sachouche commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208775608 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ## @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) { // columns can be obtained from children of topColumns. private Map columnSizes = CaseInsensitiveMap.newHashMap(); + private List columnSizesList = new ArrayList<>(); Review comment: I see; can you please add a comment indicating this behavior (especially within the columnsList() API). Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574048#comment-16574048 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208770691 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ## @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) { // columns can be obtained from children of topColumns. private Map columnSizes = CaseInsensitiveMap.newHashMap(); + private List columnSizesList = new ArrayList<>(); Review comment: It's for convenience. VectorContainers expose their ValueVectors based on index, so it is handy to be able to lookup column size based on the same index as well. Otherwise I would have to get the Materialized field for the a column, extract the name, and then lookup the ColumnSize in the map. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574041#comment-16574041 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208769577 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java ## @@ -29,13 +30,28 @@ import org.apache.drill.exec.record.selection.SelectionVector2; import org.apache.drill.exec.record.selection.SelectionVector4; +import java.util.ArrayList; import java.util.Iterator; +import java.util.List; -public class RowSetBatch implements RecordBatch { - private final RowSet rowSet; +/** + * A mock operator that returns the provided {@link RowSet}s as batches. Currently it's assumed that all the {@link RowSet}s have the same schema. + */ +public class RowSetBatch implements CloseableRecordBatch { Review comment: RowSetBatch predates MockRecordBatch :/ . But I'll see if I can make MockRecordBatch play nice with RowSets. Since MockRecordBatch is used for many many tests, I will make it work with RowSets while keeping some deprecated constructors for VectorContainers so that I don't have to update every test. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573804#comment-16573804 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208721030 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ## @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() { v.getValueVector().getMutator().setValueCount(numOutputRecords); } +outContainer.setRecordCount(numOutputRecords); Review comment: - I created an issue for centralizing setting the record counts here. We can probably just make VectorContainer.setRecordCount set the record count for value vectors as well instead of just itself. https://issues.apache.org/jira/browse/DRILL-6675 - Discussed offline, I'll make the logging changes to HashAgg as part of https://issues.apache.org/jira/browse/DRILL-6253 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573767#comment-16573767 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208711677 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ## @@ -42,6 +42,8 @@ private final BufferAllocator allocator; protected final List> wrappers = Lists.newArrayList(); private BatchSchema schema; + private SelectionVector2 selectionVector2; + private SelectionVector4 selectionVector4; Review comment: This has actually been a point of confusion for me. VectorContainer and RecordBatch both implement VectorAccessible, which has the following methods: ``` SelectionVector2 getSelectionVector2(); SelectionVector4 getSelectionVector4(); ``` If the intention is that VectorContainers should never hold selection vectors, then the interfaces VectorContainer implements should reflect that. What are you're thoughts about refactoring the VectorAccessable to remove the selection vector methods? In the meantime I will remove these changes from VectorContainer. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573758#comment-16573758 ] ASF GitHub Bot commented on DRILL-6461: --- ilooner commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r208711677 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ## @@ -42,6 +42,8 @@ private final BufferAllocator allocator; protected final List> wrappers = Lists.newArrayList(); private BatchSchema schema; + private SelectionVector2 selectionVector2; + private SelectionVector4 selectionVector4; Review comment: This has actually been a point of confusion for me. VectorContainer and RecordBatch both implement VectorAccessible, which has the following methods: ``` SelectionVector2 getSelectionVector2(); SelectionVector4 getSelectionVector4(); ``` If the intention is that VectorContainers should never hold selection vectors, then the interfaces VectorContainer implements should reflect that. What are you're thoughts about refactoring the VectorAccessable to remove the selection vector methods? In the meantime I will remove the these changes from VectorContainer. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570554#comment-16570554 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r202754788 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java ## @@ -29,13 +30,28 @@ import org.apache.drill.exec.record.selection.SelectionVector2; import org.apache.drill.exec.record.selection.SelectionVector4; +import java.util.ArrayList; import java.util.Iterator; +import java.util.List; -public class RowSetBatch implements RecordBatch { - private final RowSet rowSet; +/** + * A mock operator that returns the provided {@link RowSet}s as batches. Currently it's assumed that all the {@link RowSet}s have the same schema. + */ +public class RowSetBatch implements CloseableRecordBatch { Review comment: This is similar to [MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java). I created that class for Lateral-Unnest project. The main usage for that class was to be able to simulate different IterOutcome with different type of batches while unit testing an operator. May be we can try to merge these 2 classes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570553#comment-16570553 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r207977722 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ## @@ -42,6 +42,8 @@ private final BufferAllocator allocator; protected final List> wrappers = Lists.newArrayList(); private BatchSchema schema; + private SelectionVector2 selectionVector2; + private SelectionVector4 selectionVector4; Review comment: I don't think we should include SV2/SV4 inside a VectorContainer. They are more of RecordBatch level members. VectorContainer should only have metadata and set of vectors in it which is what it is today. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570552#comment-16570552 ] ASF GitHub Bot commented on DRILL-6461: --- sohami commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r202754788 ## File path: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java ## @@ -29,13 +30,28 @@ import org.apache.drill.exec.record.selection.SelectionVector2; import org.apache.drill.exec.record.selection.SelectionVector4; +import java.util.ArrayList; import java.util.Iterator; +import java.util.List; -public class RowSetBatch implements RecordBatch { - private final RowSet rowSet; +/** + * A mock operator that returns the provided {@link RowSet}s as batches. Currently it's assumed that all the {@link RowSet}s have the same schema. + */ +public class RowSetBatch implements CloseableRecordBatch { Review comment: This is same as [MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java). I created that class for Lateral-Unnest project. The main usage for that class was to be able to simulate different IterOutcome with different type of batches while unit testing an operator. May be we can try to merge these 2 classes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565844#comment-16565844 ] ASF GitHub Bot commented on DRILL-6461: --- sachouche commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r206999463 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ## @@ -639,6 +641,8 @@ public ColumnSize getColumn(String name) { // columns can be obtained from children of topColumns. private Map columnSizes = CaseInsensitiveMap.newHashMap(); + private List columnSizesList = new ArrayList<>(); Review comment: Why do this? the columnSizes map could return a Collection (through java.util.Map.values()) efficiently (that is, in place). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565845#comment-16565845 ] ASF GitHub Bot commented on DRILL-6461: --- sachouche commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r206996050 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ## @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() { v.getValueVector().getMutator().setValueCount(numOutputRecords); } +outContainer.setRecordCount(numOutputRecords); Review comment: Noticed the old code is using INFO debug level; shouldn't it be using debug? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests
[ https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565843#comment-16565843 ] ASF GitHub Bot commented on DRILL-6461: --- sachouche commented on a change in pull request #1344: DRILL-6461: Added basic data correctness tests for hash agg, and improved operator unit testing framework. URL: https://github.com/apache/drill/pull/1344#discussion_r206995312 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ## @@ -1175,6 +1175,7 @@ public AggIterOutcome outputCurrentBatch() { v.getValueVector().getMutator().setValueCount(numOutputRecords); } +outContainer.setRecordCount(numOutputRecords); Review comment: I noticed the Drill code has three places where to set the record count; within a: - RecordBatch object (look at ScanBatch) - VectorContainer - Individual vectors Can we centralize this to a utility so that setting the count is always consistent across all operators. I am not suggesting that you do that within this PR (as I don't want to extend its scope) but this is just to get some feedback about this behavior. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Basic Data Correctness Unit Tests > - > > Key: DRILL-6461 > URL: https://issues.apache.org/jira/browse/DRILL-6461 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > There are no data correctness unit tests for HashAgg. We need to add some. -- This message was sent by Atlassian JIRA (v7.6.3#76005)