[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979578#comment-15979578
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/785


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>  Labels: ready-to-commit
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979441#comment-15979441
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/785
  
+1


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>  Labels: ready-to-commit
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976245#comment-15976245
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/785
  
Rebased onto latest master and squashed.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976201#comment-15976201
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on the issue:

https://github.com/apache/drill/pull/785
  
Thanks for the changes, LGTM.
+1


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972120#comment-15972120
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111867385
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
--- End diff --

Fixed.


> Provide test tools to create, populate and compare row sets
> 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972118#comment-15972118
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111867499
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
+  if (index == -1) {
+return null;
+  }
+  return get(index);
+}
+
+public int getIndex(String 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972115#comment-15972115
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111865207
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import 
org.apache.drill.test.rowSet.AbstractRowSetAccessor.AbstractRowIndex;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
--- End diff --

Technically that is true. Kept here to make clear that this is a 
single-batch index. (There are only three subclasses, so keeping it here means 
we have two copies.) Moving it to the parent says that even a hyper-batch 
defaults to a single row set.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972114#comment-15972114
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111865822
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972112#comment-15972112
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111865502
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972116#comment-15972116
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111868661
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/SchemaBuilder.java ---
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Builder of a row set schema expressed as a list of materialized
+ * fields. Optimized for use when creating schemas by hand in tests.
+ * 
+ * Example usage to create the following schema: 
+ * (c: INT, a: MAP(c: VARCHAR, d: INT, e: MAP(f: VARCHAR), g: INT), h: 
BIGINT)
--- End diff --

Fixed again.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972113#comment-15972113
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111867056
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBuilder.java ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+
+/**
+ * Fluent builder to quickly build up an row set (record batch)
+ * programmatically. Starting with an {@link OperatorFixture}:
+ * 
+ * OperatorFixture fixture = ...
+ * RowSet rowSet = fixture.rowSetBuilder(batchSchema)
+ *   .addRow(10, "string", new int[] {10.3, 10.4})
+ *   ...
+ *   .build();
+ */
+
+public final class RowSetBuilder {
+
+  private DirectRowSet rowSet;
+  private RowSetWriter writer;
+  private boolean withSv2;
+
+  public RowSetBuilder(BufferAllocator allocator, BatchSchema schema) {
+this(allocator, schema, 10);
+  }
+
+  public RowSetBuilder(BufferAllocator allocator, BatchSchema schema, int 
capacity) {
+rowSet = new DirectRowSet(allocator, schema);
+writer = rowSet.writer(capacity);
+  }
+
+  public RowSetBuilder add(Object...values) {
+if (! writer.valid()) {
+  throw new IllegalStateException( "Write past end of row set" );
+}
+for (int i = 0; i < values.length;  i++) {
+  writer.set(i, values[i]);
+}
+writer.save();
+return this;
--- End diff --

Good catch. Fixed.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972119#comment-15972119
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111867555
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
+  if (index == -1) {
+return null;
+  }
+  return get(index);
+}
+
+public int getIndex(String 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972110#comment-15972110
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111867303
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
--- End diff --

True. In all methods that take a column index, it is assumed that the 
caller knows the column count either because the caller created the schema, or 
asked for the column count. Since the list get method does bounds 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972117#comment-15972117
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111865581
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972111#comment-15972111
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111866420
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/HyperRowSetImpl.java 
---
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.HyperVectorWrapper;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector4;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.complex.AbstractMapVector;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.HyperRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public class HyperRowSetImpl extends AbstractRowSet implements HyperRowSet 
{
+
+  public static class HyperRowIndex extends BoundedRowIndex {
+
+private final SelectionVector4 sv4;
+
+public HyperRowIndex(SelectionVector4 sv4) {
+  super(sv4.getCount());
+  this.sv4 = sv4;
+}
+
+@Override
+public int index() {
+  return AccessorUtilities.sv4Index(sv4.get(rowIndex));
+}
+
+@Override
+public int batch( ) {
+  return AccessorUtilities.sv4Batch(sv4.get(rowIndex));
+}
+  }
+
+  /**
+   * Build a hyper row set by restructuring a hyper vector bundle into a 
uniform
+   * shape. Consider this schema: 
+   * { a: 10, b: { c: 20, d: { e: 30 } } }
+   * 
+   * The hyper container, with two batches, has this structure:
+   * 
+   * Batchab
+   * 0Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 1Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 
+   * 
+   * The above table shows that top-level scalar vectors (such as the Int 
Vector for column
+   * a) appear "end-to-end" as a hyper-vector. Maps also appear 
end-to-end. But, the
+   * contents of the map (column c) do not appear end-to-end. Instead, 
they appear as
+   * contents in the map vector. To get to c, one indexes into the map 
vector, steps inside
+   * the map to find c and indexes to the right row.
+   * 
+   * Similarly, the maps for d do not appear end-to-end, one must step to 
the right batch
+   * in b, then step to d.
+   * 
+   * Finally, to get to e, one must step
+   * into the hyper vector for b, then steps to the proper batch, steps to 
d, step to e
+   * and finally step to the row within e. This is a very complex, costly 
indexing scheme
+   * that differs depending on map nesting depth.
+   * 
+   * To simplify access, this class restructures the maps to flatten the 
scalar vectors
+   * into end-to-end hyper vectors. For example, for the above:
+   * 
+   * 
+   * Batchacd
+   * 0Int vectorInt vectorInt 
vector
+   * 1Int vectorInt vectorInt 
vector
+   * 
+   *
+   * The maps are still available as hyper vectors, but separated into map 
fields.
+   * (Scalar access no longer needs to access the maps.) The result is a 
uniform
+   * addressing scheme for both top-level and nested vectors.
+   */
+
+  public static class HyperVectorBuilder {
  

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965207#comment-15965207
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111005446
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965205#comment-15965205
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110788792
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/SchemaBuilder.java ---
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Builder of a row set schema expressed as a list of materialized
+ * fields. Optimized for use when creating schemas by hand in tests.
+ * 
+ * Example usage to create the following schema: 
+ * (c: INT, a: MAP(c: VARCHAR, d: INT, e: MAP(f: VARCHAR), g: INT), h: 
BIGINT)
--- End diff --

Still same as previous version.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965201#comment-15965201
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r111003542
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965208#comment-15965208
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110801295
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.impl.TupleWriterImpl;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+/**
+ * Implementation of a single row set with no indirection (selection)
+ * vector.
+ */
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  /**
+   * Reader index that points directly to each row in the row set.
+   * This index starts with pointing to the -1st row, so that the
+   * reader can require a next() for every row, including
+   * the first. (This is the JDBC RecordSet convention.)
+   */
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Writer index that points to each row in the row set. The index starts 
at
+   * the 0th row and advances one row on each increment. This allows 
writers to
+   * start positioned at the first row. Writes happen in the current row.
+   * Calling next() advances to the next position, effectively 
saving
+   * the current row. The most recent row can be abandoned easily simply 
by not
+   * calling next(). This means that the number of completed rows 
is
+   * the same as the row index.
+   */
+
+  private static class ExtendableRowIndex extends RowSetIndex {
+
+private final int maxSize;
+
+public ExtendableRowIndex(int maxSize) {
+  this.maxSize = maxSize;
+  rowIndex = 0;
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public boolean next() {
+  if (++rowIndex <= maxSize ) {
+return true;
+  } else {
+rowIndex--;
+return false;
+  }
+}
+
+@Override
+public int size() { return rowIndex; }
+
+@Override
+public boolean valid() { return rowIndex < maxSize; }
+
+@Override
+public int batch() { return 0; }
+  }
+
+  /**
+   * Implementation of a row set writer. Only available for newly-created,
+   * empty, direct, single row sets. Rewriting is not allowed, nor is 
writing
+   * to a hyper row set.
+   */
+
+  public class RowSetWriterImpl extends TupleWriterImpl implements 
RowSetWriter {
+
+private final ExtendableRowIndex index;
+private final ExtendableRowSet rowSet;
+
+protected RowSetWriterImpl(ExtendableRowSet rowSet, TupleSchema 
schema, 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965206#comment-15965206
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110783153
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
+  if (index == -1) {
+return null;
+  }
+  return get(index);
+}
+
+public int getIndex(String key) {

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965209#comment-15965209
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110783017
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
--- End diff --

Can throw IndexOutOfBoundsException. Check for index with columns.size()


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965204#comment-15965204
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110783456
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
+  if (index == -1) {
+return null;
+  }
+  return get(index);
+}
+
+public int getIndex(String key) {

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965203#comment-15965203
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110794477
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBuilder.java ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+
+/**
+ * Fluent builder to quickly build up an row set (record batch)
+ * programmatically. Starting with an {@link OperatorFixture}:
+ * 
+ * OperatorFixture fixture = ...
+ * RowSet rowSet = fixture.rowSetBuilder(batchSchema)
+ *   .addRow(10, "string", new int[] {10.3, 10.4})
+ *   ...
+ *   .build();
+ */
+
+public final class RowSetBuilder {
+
+  private DirectRowSet rowSet;
+  private RowSetWriter writer;
+  private boolean withSv2;
+
+  public RowSetBuilder(BufferAllocator allocator, BatchSchema schema) {
+this(allocator, schema, 10);
+  }
+
+  public RowSetBuilder(BufferAllocator allocator, BatchSchema schema, int 
capacity) {
+rowSet = new DirectRowSet(allocator, schema);
+writer = rowSet.writer(capacity);
+  }
+
+  public RowSetBuilder add(Object...values) {
+if (! writer.valid()) {
+  throw new IllegalStateException( "Write past end of row set" );
+}
+for (int i = 0; i < values.length;  i++) {
+  writer.set(i, values[i]);
+}
+writer.save();
+return this;
--- End diff --

For line 52-58. You can just call `writer.setRow(values)`


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965202#comment-15965202
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110781500
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java ---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Row set schema presented as a number of distinct "views" for various
+ * purposes:
+ * 
+ * Batch schema: the schema used by a VectorContainer.
+ * Physical schema: the schema expressed as a hierarchy of
+ * tuples with the top tuple representing the row, nested tuples
+ * representing maps.
+ * Access schema: a flattened schema with all scalar columns
+ * at the top level, and with map columns pulled out into a separate
+ * collection. The flattened-scalar view is the one used to write to,
+ * and read from, the row set.
+ * 
+ * Allows easy creation of multiple row sets from the same schema.
+ * Each schema is immutable, which is fine for tests in which we
+ * want known inputs and outputs.
+ */
+
+public class RowSetSchema {
+
+  /**
+   * Logical description of a column. A logical column is a
+   * materialized field. For maps, also includes a logical schema
+   * of the map.
+   */
+
+  public static class LogicalColumn {
+protected final String fullName;
+protected final int accessIndex;
+protected int flatIndex;
+protected final MaterializedField field;
+
+/**
+ * Schema of the map. Includes only those fields directly within
+ * the map; does not include fields from nested tuples.
+ */
+
+protected PhysicalSchema mapSchema;
+
+public LogicalColumn(String fullName, int accessIndex, 
MaterializedField field) {
+  this.fullName = fullName;
+  this.accessIndex = accessIndex;
+  this.field = field;
+}
+
+private void updateStructure(int index, PhysicalSchema children) {
+  flatIndex = index;
+  mapSchema = children;
+}
+
+public int accessIndex() { return accessIndex; }
+public int flatIndex() { return flatIndex; }
+public boolean isMap() { return mapSchema != null; }
+public PhysicalSchema mapSchema() { return mapSchema; }
+public MaterializedField field() { return field; }
+public String fullName() { return fullName; }
+  }
+
+  /**
+   * Implementation of a tuple name space. Tuples allow both indexed and
+   * named access to their members.
+   *
+   * @param  the type of object representing each column
+   */
+
+  public static class NameSpace {
+private final Map nameSpace = new HashMap<>();
+private final List columns = new ArrayList<>();
+
+public int add(String key, T value) {
+  int index = columns.size();
+  nameSpace.put(key, index);
+  columns.add(value);
+  return index;
+}
+
+public T get(int index) {
+  return columns.get(index);
+}
+
+public T get(String key) {
+  Integer index = getIndex(key);
--- End diff --

`int` rather than `Integer`


> Provide test tools to create, populate and compare row sets
> 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958084#comment-15958084
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r109563707
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/AbstractSingleRowSet.java
 ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.expr.TypeHelper;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.physical.impl.spill.RecordBatchSizer;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.complex.MapVector;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public abstract class AbstractSingleRowSet extends AbstractRowSet 
implements SingleRowSet {
+
+  public abstract static class StructureBuilder {
+protected final PhysicalSchema schema;
+protected final BufferAllocator allocator;
+protected final ValueVector[] valueVectors;
+protected final MapVector[] mapVectors;
+protected int vectorIndex;
+protected int mapIndex;
+
+public StructureBuilder(BufferAllocator allocator, RowSetSchema 
schema) {
+  this.allocator = allocator;
+  this.schema = schema.physical();
+  valueVectors = new ValueVector[schema.access().count()];
+  if (schema.access().mapCount() == 0) {
+mapVectors = null;
+  } else {
+mapVectors = new MapVector[schema.access().mapCount()];
+  }
+}
+  }
+
+  public static class VectorBuilder extends StructureBuilder {
+
+public VectorBuilder(BufferAllocator allocator, RowSetSchema schema) {
+  super(allocator, schema);
+}
+
+public ValueVector[] buildContainer(VectorContainer container) {
+  for (int i = 0; i < schema.count(); i++) {
+LogicalColumn colSchema = schema.column(i);
+@SuppressWarnings("resource")
+ValueVector v = TypeHelper.getNewVector(colSchema.field, 
allocator, null);
+container.add(v);
+if (colSchema.field.getType().getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+  container.buildSchema(SelectionVectorMode.NONE);
+  return valueVectors;
+}
+
+private void buildMap(MapVector mapVector, PhysicalSchema mapSchema) {
+  for (int i = 0; i < mapSchema.count(); i++) {
+LogicalColumn colSchema = mapSchema.column(i);
+MajorType type = colSchema.field.getType();
+Class vectorClass = 
TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
+@SuppressWarnings("resource")
+ValueVector v = mapVector.addOrGet(colSchema.field.getName(), 
type, vectorClass);
+if (type.getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+}
+  }
+
+  

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958083#comment-15958083
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110056242
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/SchemaBuilder.java ---
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Builder of a row set schema expressed as a list of materialized
+ * fields. Optimized for use when creating schemas by hand in tests.
+ * 
+ * Example usage to create the following schema: 
+ * (c: INT, a: MAP(c: VARCHAR, d: INT, e: MAP(f: VARCHAR), g: INT), h: 
BIGINT)
--- End diff --

Fixed.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958082#comment-15958082
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055747
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetPrinter.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.io.PrintStream;
+
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.test.rowSet.RowSet.RowSetReader;
+import org.apache.drill.test.rowSet.RowSetSchema.AccessSchema;
+
+public class RowSetPrinter {
+  private RowSet rowSet;
+
+  public RowSetPrinter(RowSet rowSet) {
+this.rowSet = rowSet;
+  }
+
+  public void print() {
+print(System.out);
+  }
+
+  public void print(PrintStream out) {
+SelectionVectorMode selectionMode = rowSet.getIndirectionType();
+RowSetReader reader = rowSet.reader();
+int colCount = reader.width();
+printSchema(out, selectionMode);
+while (reader.next()) {
+  printHeader(out, reader, selectionMode);
+  for (int i = 0; i < colCount; i++) {
+if (i > 0) {
+  out.print(", ");
+}
+out.print(reader.getAsString(i));
+  }
+  out.println();
+}
+  }
+
+  private void printSchema(PrintStream out, SelectionVectorMode 
selectionMode) {
+out.print("#");
+switch (selectionMode) {
+case FOUR_BYTE:
+  out.print(" (batch #, row #)");
+  break;
+case TWO_BYTE:
+  out.print(" (row #)");
+  break;
+default:
+  break;
+}
+out.print(": ");
+AccessSchema schema = rowSet.schema().access();
+for (int i = 0; i < schema.count(); i++) {
+  if (i > 0) {
--- End diff --

Access schema is a flattened view of the row set: the set of columns whose 
values can be set. If we have a schema like (a, b(c, d)) (where b is a map), 
the access schema is (a, b.c, b.d).


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958081#comment-15958081
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110056124
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetWriterImpl.java 
---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.math.BigDecimal;
+
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Period;
+
+/**
+ * Implements a row set writer on top of a {@link RowSet}
+ * container.
+ */
+
+public class RowSetWriterImpl extends AbstractRowSetAccessor implements 
RowSetWriter {
+
+  private final AbstractColumnWriter writers[];
+
+  public RowSetWriterImpl(AbstractSingleRowSet recordSet, AbstractRowIndex 
rowIndex) {
+super(rowIndex, recordSet.schema().access());
+ValueVector[] valueVectors = recordSet.vectors();
+writers = new AbstractColumnWriter[valueVectors.length];
+int posn = 0;
+for (int i = 0; i < writers.length; i++) {
+  writers[posn] = 
ColumnAccessorFactory.newWriter(valueVectors[i].getField().getType());
--- End diff --

Fixed.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958085#comment-15958085
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055632
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/HyperRowSetImpl.java 
---
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.HyperVectorWrapper;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector4;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.complex.AbstractMapVector;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.HyperRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public class HyperRowSetImpl extends AbstractRowSet implements HyperRowSet 
{
+
+  public static class HyperRowIndex extends BoundedRowIndex {
+
+private final SelectionVector4 sv4;
+
+public HyperRowIndex(SelectionVector4 sv4) {
+  super(sv4.getCount());
+  this.sv4 = sv4;
+}
+
+@Override
+public int index() {
+  return AccessorUtilities.sv4Index(sv4.get(rowIndex));
+}
+
+@Override
+public int batch( ) {
+  return AccessorUtilities.sv4Batch(sv4.get(rowIndex));
+}
+  }
+
+  /**
+   * Build a hyper row set by restructuring a hyper vector bundle into a 
uniform
+   * shape. Consider this schema: 
+   * { a: 10, b: { c: 20, d: { e: 30 } } }
+   * 
+   * The hyper container, with two batches, has this structure:
+   * 
+   * Batchab
+   * 0Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 1Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 
+   * 
+   * The above table shows that top-level scalar vectors (such as the Int 
Vector for column
+   * a) appear "end-to-end" as a hyper-vector. Maps also appear 
end-to-end. But, the
+   * contents of the map (column c) do not appear end-to-end. Instead, 
they appear as
+   * contents in the map vector. To get to c, one indexes into the map 
vector, steps inside
+   * the map to find c and indexes to the right row.
+   * 
+   * Similarly, the maps for d do not appear end-to-end, one must step to 
the right batch
+   * in b, then step to d.
+   * 
+   * Finally, to get to e, one must step
+   * into the hyper vector for b, then steps to the proper batch, steps to 
d, step to e
+   * and finally step to the row within e. This is a very complex, costly 
indexing scheme
+   * that differs depending on map nesting depth.
+   * 
+   * To simplify access, this class restructures the maps to flatten the 
scalar vectors
+   * into end-to-end hyper vectors. For example, for the above:
+   * 
+   * 
+   * Batchacd
+   * 0Int vectorInt vectorInt 
vector
+   * 1Int vectorInt vectorInt 
vector
+   * 
+   *
+   * The maps are still available as hyper vectors, but separated into map 
fields.
+   * (Scalar access no longer needs to access the maps.) The result is a 
uniform
+   * addressing scheme for both top-level and nested vectors.
+   */
+
+  public static class HyperVectorBuilder {
  

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958086#comment-15958086
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055928
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetUtilities.java 
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.accessor.ColumnAccessor.ValueType;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Duration;
+import org.joda.time.Period;
+
+public class RowSetUtilities {
+
+  private RowSetUtilities() { }
+
+  public static void reverse(SelectionVector2 sv2) {
+int count = sv2.getCount();
+for (int i = 0; i < count / 2; i++) {
+  char temp = sv2.getIndex(i);
+  int dest = count - 1 - i;
+  sv2.setIndex(i, sv2.getIndex(dest));
+  sv2.setIndex(dest, temp);
+}
+  }
+
+  /**
+   * Set a test data value from an int. Uses the type information of the
+   * column to handle interval types. Else, uses the value type of the
+   * accessor. The value set here is purely for testing; the mapping
+   * from ints to intervals has no real meaning.
+   *
+   * @param rowWriter
+   * @param index
+   * @param value
+   */
+
+  public static void setFromInt(RowSetWriter rowWriter, int index, int 
value) {
+ColumnWriter writer = rowWriter.column(index);
+if (writer.valueType() == ValueType.PERIOD) {
+  setPeriodFromInt(writer, 
rowWriter.schema().column(index).getType().getMinorType(), value);
+} else {
+  AccessorUtilities.setFromInt(writer, value);
+}
+  }
+
+  public static void setPeriodFromInt(ColumnWriter writer, MinorType 
minorType,
+  int value) {
+switch (minorType) {
+case INTERVAL:
+  writer.setPeriod(Duration.millis(value).toPeriod());
+  break;
+case INTERVALYEAR:
+  writer.setPeriod(Period.years(value / 12).withMonths(value % 12));
+  break;
+case INTERVALDAY:
+  int sec = value % 60;
+  value = value / 60;
+  int min = value % 60;
+  value = value / 60;
+  
writer.setPeriod(Period.days(value).withMinutes(min).withSeconds(sec));
--- End diff --

This is a data generator. The int has no real meaning, it is just a 
convenient way to populate a field. So, here we just slice off some values to 
put into each field. Not pretty, but convenient for testing.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948235#comment-15948235
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108768096
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetUtilities.java 
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.accessor.ColumnAccessor.ValueType;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Duration;
+import org.joda.time.Period;
+
+public class RowSetUtilities {
+
+  private RowSetUtilities() { }
+
+  public static void reverse(SelectionVector2 sv2) {
+int count = sv2.getCount();
+for (int i = 0; i < count / 2; i++) {
+  char temp = sv2.getIndex(i);
+  int dest = count - 1 - i;
+  sv2.setIndex(i, sv2.getIndex(dest));
+  sv2.setIndex(dest, temp);
+}
+  }
+
+  /**
+   * Set a test data value from an int. Uses the type information of the
+   * column to handle interval types. Else, uses the value type of the
+   * accessor. The value set here is purely for testing; the mapping
+   * from ints to intervals has no real meaning.
+   *
+   * @param rowWriter
+   * @param index
+   * @param value
+   */
+
+  public static void setFromInt(RowSetWriter rowWriter, int index, int 
value) {
+ColumnWriter writer = rowWriter.column(index);
+if (writer.valueType() == ValueType.PERIOD) {
+  setPeriodFromInt(writer, 
rowWriter.schema().column(index).getType().getMinorType(), value);
+} else {
+  AccessorUtilities.setFromInt(writer, value);
+}
+  }
+
+  public static void setPeriodFromInt(ColumnWriter writer, MinorType 
minorType,
+  int value) {
+switch (minorType) {
+case INTERVAL:
+  writer.setPeriod(Duration.millis(value).toPeriod());
+  break;
+case INTERVALYEAR:
+  writer.setPeriod(Period.years(value / 12).withMonths(value % 12));
+  break;
+case INTERVALDAY:
+  int sec = value % 60;
+  value = value / 60;
+  int min = value % 60;
+  value = value / 60;
+  
writer.setPeriod(Period.days(value).withMinutes(min).withSeconds(sec));
--- End diff --

Looks like it's missing calculation to hours and then days


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948233#comment-15948233
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108777101
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetPrinter.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.io.PrintStream;
+
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.test.rowSet.RowSet.RowSetReader;
+import org.apache.drill.test.rowSet.RowSetSchema.AccessSchema;
+
+public class RowSetPrinter {
+  private RowSet rowSet;
+
+  public RowSetPrinter(RowSet rowSet) {
+this.rowSet = rowSet;
+  }
+
+  public void print() {
+print(System.out);
+  }
+
+  public void print(PrintStream out) {
+SelectionVectorMode selectionMode = rowSet.getIndirectionType();
+RowSetReader reader = rowSet.reader();
+int colCount = reader.width();
+printSchema(out, selectionMode);
+while (reader.next()) {
+  printHeader(out, reader, selectionMode);
+  for (int i = 0; i < colCount; i++) {
+if (i > 0) {
+  out.print(", ");
+}
+out.print(reader.getAsString(i));
+  }
+  out.println();
+}
+  }
+
+  private void printSchema(PrintStream out, SelectionVectorMode 
selectionMode) {
+out.print("#");
+switch (selectionMode) {
+case FOUR_BYTE:
+  out.print(" (batch #, row #)");
+  break;
+case TWO_BYTE:
+  out.print(" (row #)");
+  break;
+default:
+  break;
+}
+out.print(": ");
+AccessSchema schema = rowSet.schema().access();
+for (int i = 0; i < schema.count(); i++) {
+  if (i > 0) {
--- End diff --

How about _maps_ inside AccessSchema ?


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948234#comment-15948234
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108808598
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/AbstractSingleRowSet.java
 ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.expr.TypeHelper;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.physical.impl.spill.RecordBatchSizer;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.complex.MapVector;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public abstract class AbstractSingleRowSet extends AbstractRowSet 
implements SingleRowSet {
+
+  public abstract static class StructureBuilder {
+protected final PhysicalSchema schema;
+protected final BufferAllocator allocator;
+protected final ValueVector[] valueVectors;
+protected final MapVector[] mapVectors;
+protected int vectorIndex;
+protected int mapIndex;
+
+public StructureBuilder(BufferAllocator allocator, RowSetSchema 
schema) {
+  this.allocator = allocator;
+  this.schema = schema.physical();
+  valueVectors = new ValueVector[schema.access().count()];
+  if (schema.access().mapCount() == 0) {
+mapVectors = null;
+  } else {
+mapVectors = new MapVector[schema.access().mapCount()];
+  }
+}
+  }
+
+  public static class VectorBuilder extends StructureBuilder {
+
+public VectorBuilder(BufferAllocator allocator, RowSetSchema schema) {
+  super(allocator, schema);
+}
+
+public ValueVector[] buildContainer(VectorContainer container) {
+  for (int i = 0; i < schema.count(); i++) {
+LogicalColumn colSchema = schema.column(i);
+@SuppressWarnings("resource")
+ValueVector v = TypeHelper.getNewVector(colSchema.field, 
allocator, null);
+container.add(v);
+if (colSchema.field.getType().getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+  container.buildSchema(SelectionVectorMode.NONE);
+  return valueVectors;
+}
+
+private void buildMap(MapVector mapVector, PhysicalSchema mapSchema) {
+  for (int i = 0; i < mapSchema.count(); i++) {
+LogicalColumn colSchema = mapSchema.column(i);
+MajorType type = colSchema.field.getType();
+Class vectorClass = 
TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
+@SuppressWarnings("resource")
+ValueVector v = mapVector.addOrGet(colSchema.field.getName(), 
type, vectorClass);
+if (type.getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+}
+  }
+
+  public 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948229#comment-15948229
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108814142
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/DirectRowSet.java ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.ValueVector;
+import 
org.apache.drill.test.rowSet.AbstractRowSetAccessor.AbstractRowIndex;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet;
+
+public class DirectRowSet extends AbstractSingleRowSet implements 
ExtendableRowSet {
+
+  private static class DirectRowIndex extends BoundedRowIndex {
+
+public DirectRowIndex(int rowCount) {
+  super(rowCount);
+}
+
+@Override
+public int index() { return rowIndex; }
+
+@Override
+public int batch() { return 0; }
--- End diff --

Can be moved to AbstractRowIndex class to return 0 by default. Override by 
derived class like HyperRowIndex to have different implementation.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948231#comment-15948231
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108761814
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/SchemaBuilder.java ---
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Builder of a row set schema expressed as a list of materialized
+ * fields. Optimized for use when creating schemas by hand in tests.
+ * 
+ * Example usage to create the following schema: 
+ * (c: INT, a: MAP(c: VARCHAR, d: INT, e: MAP(f: VARCHAR), g: INT), h: 
BIGINT)
--- End diff --

... a: MAP(_b: VARCHAR_ .. instead of _c:VARCHAR_



> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948230#comment-15948230
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108758552
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/HyperRowSetImpl.java 
---
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.HyperVectorWrapper;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector4;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.complex.AbstractMapVector;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.HyperRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public class HyperRowSetImpl extends AbstractRowSet implements HyperRowSet 
{
+
+  public static class HyperRowIndex extends BoundedRowIndex {
+
+private final SelectionVector4 sv4;
+
+public HyperRowIndex(SelectionVector4 sv4) {
+  super(sv4.getCount());
+  this.sv4 = sv4;
+}
+
+@Override
+public int index() {
+  return AccessorUtilities.sv4Index(sv4.get(rowIndex));
+}
+
+@Override
+public int batch( ) {
+  return AccessorUtilities.sv4Batch(sv4.get(rowIndex));
+}
+  }
+
+  /**
+   * Build a hyper row set by restructuring a hyper vector bundle into a 
uniform
+   * shape. Consider this schema: 
+   * { a: 10, b: { c: 20, d: { e: 30 } } }
+   * 
+   * The hyper container, with two batches, has this structure:
+   * 
+   * Batchab
+   * 0Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 1Int vectorMap Vector(Int vector, Map 
Vector(Int vector))
+   * 
+   * 
+   * The above table shows that top-level scalar vectors (such as the Int 
Vector for column
+   * a) appear "end-to-end" as a hyper-vector. Maps also appear 
end-to-end. But, the
+   * contents of the map (column c) do not appear end-to-end. Instead, 
they appear as
+   * contents in the map vector. To get to c, one indexes into the map 
vector, steps inside
+   * the map to find c and indexes to the right row.
+   * 
+   * Similarly, the maps for d do not appear end-to-end, one must step to 
the right batch
+   * in b, then step to d.
+   * 
+   * Finally, to get to e, one must step
+   * into the hyper vector for b, then steps to the proper batch, steps to 
d, step to e
+   * and finally step to the row within e. This is a very complex, costly 
indexing scheme
+   * that differs depending on map nesting depth.
+   * 
+   * To simplify access, this class restructures the maps to flatten the 
scalar vectors
+   * into end-to-end hyper vectors. For example, for the above:
+   * 
+   * 
+   * Batchacd
+   * 0Int vectorInt vectorInt 
vector
+   * 1Int vectorInt vectorInt 
vector
+   * 
+   *
+   * The maps are still available as hyper vectors, but separated into map 
fields.
+   * (Scalar access no longer needs to access the maps.) The result is a 
uniform
+   * addressing scheme for both top-level and nested vectors.
+   */
+
+  public static class HyperVectorBuilder {
+
 

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948232#comment-15948232
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r108823826
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetWriterImpl.java 
---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.math.BigDecimal;
+
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Period;
+
+/**
+ * Implements a row set writer on top of a {@link RowSet}
+ * container.
+ */
+
+public class RowSetWriterImpl extends AbstractRowSetAccessor implements 
RowSetWriter {
+
+  private final AbstractColumnWriter writers[];
+
+  public RowSetWriterImpl(AbstractSingleRowSet recordSet, AbstractRowIndex 
rowIndex) {
+super(rowIndex, recordSet.schema().access());
+ValueVector[] valueVectors = recordSet.vectors();
+writers = new AbstractColumnWriter[valueVectors.length];
+int posn = 0;
+for (int i = 0; i < writers.length; i++) {
+  writers[posn] = 
ColumnAccessorFactory.newWriter(valueVectors[i].getField().getType());
--- End diff --

we can use _"i"_ instead of _"posn"_


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925231#comment-15925231
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/785

DRILL-5323: Test tools for row sets

Provide test tools to create, populate and compare row sets

To simplify tests, we need a TestRowSet concept that wraps a
VectorContainer and provides easy ways to:

- Define a schema for the row set.
- Create a set of vectors that implement the schema.
- Populate the row set with test data via code.
- Add an SV2 to the row set.
- Pass the row set to operator components (such as generated code
blocks.)
- Examine the contents of a row set
- Compare the results of the operation with an expected result set.
- Dispose of the underling direct memory when work is done.

This code builds on that in DRILL-5324 to provide a complete row set
API. See DRILL-5318 for the spec.

Note: this code can be reviewed as-is, but cannot be committed until
after DRILL-5324 is committed: this code has compile-time dependencies
on that code. This PR will be rebased once DRILL-5324 is pulled into
master.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5323

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/785.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #785


commit 99b3ed45eed41907a143ece7082b8af926bd920b
Author: Paul Rogers 
Date:   2017-03-14T23:18:24Z

DRILL-5323: Test tools for row sets

Provide test tools to create, populate and compare row sets

To simplify tests, we need a TestRowSet concept that wraps a
VectorContainer and provides easy ways to:

- Define a schema for the row set.
- Create a set of vectors that implement the schema.
- Populate the row set with test data via code.
- Add an SV2 to the row set.
- Pass the row set to operator components (such as generated code
blocks.)
- Examine the contents of a row set
- Compare the results of the operation with an expected result set.
- Dispose of the underling direct memory when work is done.

This code builds on that in DRILL-5324 to provide a complete row set
API. See DRILL-5318 for the spec.

Note: this code can be reviewed as-is, but cannot be committed until
after DRILL-5324 is committed: this code has compile-time dependencies
on that code. This PR will be rebased once DRILL-5324 is pulled into
master.




> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underling direct memory when work is done.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)