Github user bitblender commented on a diff in the pull request:
@@ -0,0 +1,551 @@
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * Implementation of a column when creating a row batch.
+ * Every column resides at an index, is defined by a schema,
+ * is backed by a value vector, and is written to by a writer.
+ * Each column also tracks the schema version in which it was added
+ * to detect schema evolution. Each column has an optional overflow
+ * vector that holds overflow record values when a batch becomes
+ * full.
+ * <p>
+ * Overflow vectors require special consideration. The vector class itself
+ * must remain constant as it is bound to the writer. To handle overflow,
+ * the implementation must replace the buffer in the vector with a new
+ * one, saving the full vector to return as part of the final row batch.
+ * This puts the column in one of three states:
+ * <ul>
+ * <li>Normal: only one vector is of concern - the vector for the active
+ * row batch.</li>
+ * <li>Overflow: a write to a vector caused overflow. For all columns,
+ * the data buffer is shifted to a harvested vector, and a new, empty
+ * buffer is put into the active vector.</li>
+ * <li>Excess: a (small) column received values for the row that will
--- End diff --
'Excess' is the LOOK_AHEAD state, correct? It would be clearer if the
comments used the same terminology as the code.
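
To illustrate the suggestion, here is a minimal sketch of state descriptions keyed by the enum constant names, so the comment wording and the code cannot drift apart. Only LOOK_AHEAD is a name taken from this thread; the class name ColumnStateSketch, the enum name State, and the constants NORMAL and OVERFLOW are assumptions made for illustration and may not match the actual code in the pull request.

```java
public class ColumnStateSketch {

    /**
     * Hypothetical state names. Only LOOK_AHEAD is taken from the review
     * comment; NORMAL and OVERFLOW are assumed for illustration.
     */
    public enum State { NORMAL, OVERFLOW, LOOK_AHEAD }

    /** Returns a description that starts with the enum constant's own name. */
    public static String describe(State state) {
        switch (state) {
            case NORMAL:
                return "NORMAL: only the vector for the active row batch is of concern.";
            case OVERFLOW:
                return "OVERFLOW: a write caused overflow; the data buffer is shifted to a "
                     + "harvested vector and a new, empty buffer is put into the active vector.";
            case LOOK_AHEAD:
                // The Javadoc text for this state is cut off in the quoted diff,
                // so only the naming relationship is recorded here.
                return "LOOK_AHEAD: the state the Javadoc currently calls 'Excess'.";
            default:
                throw new IllegalArgumentException("Unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        // Print one description line per state.
        for (State s : State.values()) {
            System.out.println(describe(s));
        }
    }
}
```

The same idea applies to the Javadoc list itself: using the constant names as the list item labels would let a reader search for one term and find both the comment and the code.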