[ https://issues.apache.org/jira/browse/DRILL-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366642#comment-16366642 ]
ASF GitHub Bot commented on DRILL-6114:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1112#discussion_r168682470

    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/record/metadata/ColumnMetadata.java ---
    @@ -15,36 +15,115 @@
      * See the License for the specific language governing permissions and
      * limitations under the License.
      */
    -package org.apache.drill.exec.record;
    +package org.apache.drill.exec.record.metadata;

     import org.apache.drill.common.types.TypeProtos.DataMode;
     import org.apache.drill.common.types.TypeProtos.MajorType;
     import org.apache.drill.common.types.TypeProtos.MinorType;
    +import org.apache.drill.exec.record.MaterializedField;

     /**
      * Metadata description of a column including names, types and structure
      * information.
      */
     public interface ColumnMetadata {
    +
    +  /**
    +   * Rough characterization of Drill types into metadata categories.
    +   * Various aspects of Drill's type system are very, very messy.
    +   * However, Drill is defined by its code, not some abstract design,
    +   * so the metadata system here does the best job it can to simplify
    +   * the messy type system while staying close to the underlying
    +   * implementation.
    +   */
    +
       enum StructureType {
    -    PRIMITIVE, LIST, TUPLE
    +
    +    /**
    +     * Primitive column (all types except List, Map and Union.)
    +     * Includes (one-dimensional) arrays of those types.
    +     */
    +
    +    PRIMITIVE,
    +
    +    /**
    +     * Map or repeated map. Also describes the row as a whole.
    +     */
    +
    +    TUPLE,
    +
    +    /**
    +     * Union or (non-repeated) list. (A non-repeated list is,
    +     * essentially, a repeated union.)
    +     */
    +
    +    VARIANT,
    +
    +    /**
    +     * A repeated list. A repeated list is not simply the repeated
    +     * form of a list, it is something else entirely. It acts as
    +     * a dimensional wrapper around any other type (except list)
    +     * and adds a non-nullable extra dimension. Hence, this type is
    +     * for 2D+ arrays.
    +     * <p>
    +     * In theory, a 2D list of, say, INT would be an INT column, but
    +     * repeated in two dimensions. Alas, that is not how it is. Also,
    +     * if we have a separate category for 2D lists, we should have
    +     * a separate category for 1D lists. But, again, that is not how
    +     * the code has evolved.
    +     */
    +
    +    MULTI_ARRAY
       }

    -  public static final int DEFAULT_ARRAY_SIZE = 10;
    +  int DEFAULT_ARRAY_SIZE = 10;
    --- End diff --

    This is an interface. The default (and only legal) form of fields is `public static final`, so the explicit modifiers are redundant. Thanks to Tim for reminding me of this.


> Complete internal metadata layer for improved batch handling
> ------------------------------------------------------------
>
>                 Key: DRILL-6114
>                 URL: https://issues.apache.org/jira/browse/DRILL-6114
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Slice of the ["batch handling" project.|https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades]
> that includes enhancements to the internal metadata system.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
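
The review remark about interface fields can be checked with a small standalone sketch. It is not Drill code: `InterfaceFieldDemo`, `ColumnMetadataSketch` and `modifierText` are hypothetical names used only for illustration, and the interface reproduces nothing from `ColumnMetadata` except the constant shown in the diff. It uses reflection to report the modifiers the compiler records for a field declared with no modifiers at all.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class InterfaceFieldDemo {

  // Hypothetical stand-in for ColumnMetadata; only the constant from the diff is reproduced.
  interface ColumnMetadataSketch {
    // No modifiers are written here; the language makes the field public static final.
    int DEFAULT_ARRAY_SIZE = 10;
  }

  // Returns the modifiers recorded for the named public field, as text.
  static String modifierText(String fieldName) {
    try {
      Field field = ColumnMetadataSketch.class.getField(fieldName);
      return Modifier.toString(field.getModifiers());
    } catch (NoSuchFieldException e) {
      throw new IllegalStateException("No public field named " + fieldName, e);
    }
  }

  public static void main(String[] args) {
    // Prints the implicit modifiers, then the constant's value.
    System.out.println(modifierText("DEFAULT_ARRAY_SIZE"));
    System.out.println(ColumnMetadataSketch.DEFAULT_ARRAY_SIZE);
  }
}
```

Running `main` should print `public static final` followed by `10`, which is why dropping the explicit modifiers in the diff leaves the constant's behavior unchanged.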