Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/914#discussion_r150758630

--- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java ---
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.vector.accessor.writer;
+
+import java.math.BigDecimal;
+
+import org.apache.drill.exec.vector.accessor.ColumnWriterIndex;
+import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter;
+import org.joda.time.Period;
+
+/**
+ * Column writer implementation that acts as the basis for the
+ * generated, vector-specific implementations. All set methods
+ * throw an exception; subclasses simply override the supported
+ * method(s).
+ * <p>
+ * The only tricky part to this class is understanding the
+ * state of the write indexes as the write proceeds. There are
+ * two pointers to consider:
+ * <ul>
+ * <li>lastWriteIndex: The position in the vector at which the
+ * client last asked us to write data. This index is maintained
+ * in this class because it depends only on the actions of this
+ * class.</li>
+ * <li>vectorIndex: The position in the vector at which we will
+ * write if the client chooses to write a value at this time.
+ * The vector index is shared by all columns at the same repeat
+ * level. It is incremented as the client steps through the write
+ * and is observed in this class each time a write occurs.</i>
+ * </ul>
+ * A repeat level is defined as any of the following:
+ * <ul>
+ * <li>The set of top-level scalar columns, or those within a
+ * top-level, non-repeated map, or nested to any depth within
+ * non-repeated maps rooted at the top level.</li>
+ * <li>The values for a single scalar array.</li>
+ * <li>The set of scalar columns within a repeated map, or
+ * nested within non-repeated maps within a repeated map.</li>
+ * </ul>
+ * Items at a repeat level index together and share a vector
+ * index. However, the columns within a repeat level
+ * <i>do not</i> share a last write index: some can lag further
+ * behind than others.
+ * <p>
+ * Let's illustrate the states. Let's focus on one column and
+ * illustrate the three states that can occur during write:
+ * <ul>
+ * <li><b>Behind</b>: the last write index is more than one position behind
+ * the vector index. Zero-filling will be needed to catch up to
+ * the vector index.</li>
+ * <li><b>Written</b>: the last write index is the same as the vector
+ * index because the client wrote data at this position (and previous
+ * values were back-filled with nulls, empties or zeros.)</li>
+ * <li><b>Unwritten</b>: the last write index is one behind the vector
+ * index. This occurs when the column was written, then the client
+ * moved to the next row or array position.</li>
+ * <li><b>Restarted</b>: The current row is abandoned (perhaps filtered
+ * out) and is to be rewritten. The last write position moves
+ * back one position. Note that, the Restarted state is
+ * indistinguishable from the unwritten state: the only real
+ * difference is that the current slot (pointed to by the
+ * vector index) contains the previous written value that must
+ * be overwritten or back-filled. But, this is fine, because we
+ * assume that unwritten values are garbage anyway.</li>
+ * </ul>
+ * To illustrate:<pre><code>
+ *       Behind      Written    Unwritten    Restarted
+ *        |X|          |X|         |X|          |X|
+ *    lw >|X|          |X|         |X|          |X|
+ *        | |          |0|         |0|     lw > |0|
+ *     v >| |  lw, v > |X|    lw > |X|      v > |X|
+ *                             v > | |
+ * </code></pre>
+ * The illustrated state transitions are:
+ * <ul>
+ * <li>Suppose the state starts in Behind.<ul>
+ * <li>If the client writes a value, then the empty slot is
+ * back-filled and the state moves to Written.</li>
+ * <li>If the client does not write a value, the state stays
+ * at Behind, and the gap of unfilled values grows.</li></ul></li>
+ * <li>When in the Written state:<ul>
+ * <li>If the client saves the current row or array position,
+ * the vector index increments and we move to the Unwritten
+ * state.</li>
+ * <li>If the client abandons the row, the last write position
+ * moves back one to recreate the unwritten state. We've
+ * shown this state separately above just to illustrate
+ * the two transitions from Written.</li></ul></li>
+ * <li>When in the Unwritten (or Restarted) states:<ul>
+ * <li>If the client writes a value, then the writer moves back to the
+ * Written state.</li>
+ * <li>If the client skips the value, then the vector index increments
+ * again, leaving a gap, and the writer moves to the
+ * Behind state.</li></ul>
+ * </ul>
+ * <p>
+ * We've already noted that the Restarted state is identical to
+ * the Unwritten state (and was discussed just to make the flow a bit
+ * clearer.) The astute reader will have noticed that the Behind state is
+ * the same as the Unwritten state if we define the combined state as
+ * when the last write position is behind the vector index.
+ * <p>
+ * Further, if
+ * one simply treats the gap between last write and the vector indexes
+ * as the amount (which may be zero) to back-fill, then there is just
+ * one state. This is, in fact, how the code works: it always writes
+ * to the vector index (and can do so multiple times for a single row),
+ * back-filling as necessary.
+ * <p>
+ * The states, then, are more for our use in understanding the algorithm.
+ * They are also very useful when working through the logic of performing
+ * a roll-over when a vector overflows.
+ */
+
+public abstract class BaseScalarWriter extends AbstractScalarWriter {
+
+  public static final int MIN_BUFFER_SIZE = 256;
+
+  /**
+   * Indicates the position in the vector to write. Set via an object so that
+   * all writers (within the same subtree) can agree on the write position.
+   * For example, all top-level, simple columns see the same row index.
+   * All columns within a repeated map see the same (inner) index, etc.
+   */
+
+  protected ColumnWriterIndex vectorIndex;
+
+  /**
+   * Listener invoked if the vector overflows. If not provided, then the writer
+   * does not support vector overflow.
+   */
+
+  protected ColumnWriterListener listener;
+
+  /**
+   * Cached direct memory location of the start of data for the vector
+   * being written. Updated each time the buffer is reallocated.
+   */
+
+  protected long bufAddr;
--- End diff --

Very good question that requires a longer answer than can be explained here. Basically, the thought is that these accessors are the primary interface between users of vectors and the backing memory buffers. `DrillBuf`, like the `ByteBuf` from which it derives, and the `ByteBuffer` on which it is modeled, assume a serialization model. Here we assume more of a DB buffer model. The model used in the code is that `DrillBuf` handles allocation, reference counting, freeing and so on. The column accessors handle writes to, and reads from, the buffer using `PlatformDependent`.
Calling `DrillBuf` methods without bounds checks is really little different from using `PlatformDependent` directly, and avoiding those extra calls has a performance benefit. FWIW, the text reader has long used memory addresses; that work is now isolated here and removed (in a later PR) from the text reader (and other places).
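
For readers following the Javadoc in the diff, here is a minimal sketch of the "single state" view it describes: always write at the vector index, and back-fill whatever gap exists between the last write index and the vector index. This is not the code from the PR. `IntWriterSketch`, `setInt`, `nextRow` and `restartRow` are hypothetical names, the vector is a plain `int[]` rather than direct memory, and the vector index is a local field rather than a shared `ColumnWriterIndex`. The `-1` fill only stands in for "garbage" in unwritten slots.

```java
import java.util.Arrays;

class BackFillSketch {

  /** Hypothetical stand-in for a fixed-width scalar writer over one int vector. */
  static class IntWriterSketch {
    final int[] vector;
    int lastWriteIndex = -1; // nothing written yet
    int vectorIndex = 0;     // in the real writer this index is shared per repeat level

    IntWriterSketch(int capacity) {
      vector = new int[capacity];
      Arrays.fill(vector, -1); // assumption: -1 marks "garbage" unwritten slots
    }

    /** Write at the vector index, zero-filling any skipped positions first. */
    void setInt(int value) {
      for (int i = lastWriteIndex + 1; i < vectorIndex; i++) {
        vector[i] = 0;         // back-fill; the loop is a no-op when the gap is zero
      }
      vector[vectorIndex] = value;
      lastWriteIndex = vectorIndex;
    }

    /** Client saves the row: Written becomes Unwritten, Unwritten becomes Behind. */
    void nextRow() {
      vectorIndex++;
    }

    /** Client abandons the current row: the last write position moves back if needed. */
    void restartRow() {
      lastWriteIndex = Math.min(lastWriteIndex, vectorIndex - 1);
    }
  }

  public static void main(String[] args) {
    IntWriterSketch w = new IntWriterSketch(4);
    w.setInt(10);   // row 0 written
    w.nextRow();    // row 1 skipped
    w.nextRow();
    w.setInt(30);   // row 2 written, row 1 back-filled with zero
    w.restartRow(); // row 2 abandoned
    w.setInt(31);   // row 2 rewritten in place
    System.out.println(Arrays.toString(w.vector));
  }
}
```

Because the back-fill loop runs from `lastWriteIndex + 1` up to (but not including) `vectorIndex`, the Behind, Unwritten and Restarted cases all go through the same path, and writing twice to the same row simply overwrites the slot.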
---