Quanlong Huang created ORC-1132:
-----------------------------------

             Summary: [C++] EncodedStringVectorBatch allocates unused buffers
                 Key: ORC-1132
                 URL: https://issues.apache.org/jira/browse/ORC-1132
             Project: ORC
          Issue Type: Improvement
    Affects Versions: 1.6.0
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


The constructor of EncodedStringVectorBatch invokes the constructor of
StringVectorBatch with the full batch capacity:
{code:cpp}
  EncodedStringVectorBatch::EncodedStringVectorBatch(uint64_t _capacity,
                                                     MemoryPool& pool)
                      : StringVectorBatch(_capacity, pool),
                        dictionary(),
                        index(pool, _capacity) {
    // PASS
  }
{code}
This allocates the `data` and `length` buffers in StringVectorBatch, which are never used:
{code:cpp}
  StringVectorBatch::StringVectorBatch(uint64_t _capacity, MemoryPool& pool
               ): ColumnVectorBatch(_capacity, pool),
                  data(pool, _capacity),
                  length(pool, _capacity),
                  blob(pool) {
    // PASS
  }
{code}
We only use the `index` buffer and `dictionary` of EncodedStringVectorBatch:
{code:cpp}
  void StringDictionaryColumnReader::nextEncoded(ColumnVectorBatch& rowBatch,
                                                  uint64_t numValues,
                                                  char* notNull) {
    ColumnReader::next(rowBatch, numValues, notNull);
    notNull = rowBatch.hasNulls ? rowBatch.notNull.data() : nullptr;
    rowBatch.isEncoded = true;

    EncodedStringVectorBatch& batch = dynamic_cast<EncodedStringVectorBatch&>(rowBatch);
    batch.dictionary = this->dictionary;

    // Length buffer is reused to save dictionary entry ids
    rle->next(batch.index.data(), numValues, notNull);
  }
{code}
Thus we should avoid allocating the `data` and `length` buffers of the base class when constructing an EncodedStringVectorBatch.
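
One possible direction, sketched below with simplified stand-in types rather than the real ORC classes, is to pass a capacity of 0 to the base-class constructor and then restore the logical capacity in the derived constructor. CountingPool, Buffer, StringBatch, EncodedBatchEager, EncodedBatchLean and bytesSaved are all hypothetical names used only for illustration. The sketch also leaves out the `notNull` buffer of ColumnVectorBatch, which an actual fix would still need to size by the full capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory pool: it only counts requested bytes.
struct CountingPool {
  std::size_t bytesRequested = 0;
};

// Hypothetical stand-in for a data buffer that allocates `capacity`
// elements up front and reports the request to the pool.
template <typename T>
class Buffer {
 public:
  Buffer(CountingPool& pool, uint64_t capacity) : storage_(capacity) {
    pool.bytesRequested += capacity * sizeof(T);
  }
  T* data() { return storage_.data(); }
  uint64_t size() const { return storage_.size(); }

 private:
  std::vector<T> storage_;
};

// Simplified model of StringVectorBatch: two buffers sized by capacity.
struct StringBatch {
  StringBatch(uint64_t cap, CountingPool& pool)
      : capacity(cap), data(pool, cap), length(pool, cap) {}
  virtual ~StringBatch() = default;

  uint64_t capacity;
  Buffer<char*> data;
  Buffer<int64_t> length;
};

// Behaviour described in this issue: the base class allocates `data` and
// `length` even though only `index` is used.
struct EncodedBatchEager : StringBatch {
  EncodedBatchEager(uint64_t cap, CountingPool& pool)
      : StringBatch(cap, pool), index(pool, cap) {}
  Buffer<int64_t> index;
};

// Sketch of a fix: construct the base class with capacity 0 so its buffers
// stay empty, then restore the logical capacity of the batch.
struct EncodedBatchLean : StringBatch {
  EncodedBatchLean(uint64_t cap, CountingPool& pool)
      : StringBatch(0, pool), index(pool, cap) {
    capacity = cap;
  }
  Buffer<int64_t> index;
};

// Bytes saved per batch by skipping the two unused base-class buffers.
inline std::size_t bytesSaved(uint64_t cap) {
  CountingPool eagerPool, leanPool;
  EncodedBatchEager eager(cap, eagerPool);
  EncodedBatchLean lean(cap, leanPool);
  assert(lean.capacity == eager.capacity);
  return eagerPool.bytesRequested - leanPool.bytesRequested;
}
```

In this model, bytesSaved(cap) equals cap * (sizeof(char*) + sizeof(int64_t)), which is the size of the two buffers that are never read on the encoded path.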



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
