Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1057

To answer the two questions:

1. The copier is used in multiple locations, some of which include selection vectors (SVs). Sort uses a copier to merge rows coming from multiple sorted batches. The selection vector remover (SVR) compresses out SVs: a filter produces an SV2, which the SVR removes, and an in-memory sort produces an SV4. But, because of the way plans are generated, the hash join will never see a batch with an SV. (An SVR will be inserted, if needed, to remove the SV.)

2. We never write a batch using an SV. The SV is always a source indirection. Because we do indirection on the source side (and vectors are append-only), there can be no SV on the destination side.

Note also that the {{VectorContainer}} class, despite its API, knows nothing about SVs. The SV is tacked on separately by the {{RecordBatch}}. (This is a less-than-ideal design, but it is how things work at present.) FWIW, the test-oriented {{RowSet}} abstractions came about as wrappers around both the {{VectorContainer}} and the SV to provide a unified view.

Because of how we do SVs, you'll need three copy methods: one for no SV, one for an SV2, and another for an SV4. In the fullness of time, the new "column reader" and "column writer" abstractions will hide all this stuff, but it will take time before those tools come online.
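
As a rough illustration of the three copy paths, here is a minimal sketch. It is not Drill's actual copier code: plain Java arrays stand in for value vectors, and the names {{CopierSketch}}, {{copyNoSv}}, {{copySv2}}, {{copySv4}} and {{sv4Entry}} are hypothetical. It assumes that an SV2 entry is a 16-bit row index within a single batch, and that an SV4 entry packs a batch index into the upper 16 bits and a row index into the lower 16 bits. In every case the indirection is applied only when reading the source; the destination is simply appended to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: arrays stand in for Drill value vectors.
final class CopierSketch {

  // No SV: rows are read in their physical order.
  static List<Integer> copyNoSv(int[] src) {
    List<Integer> dest = new ArrayList<>();
    for (int value : src) {
      dest.add(value);            // append-only destination
    }
    return dest;
  }

  // SV2 (assumed layout): each entry is a 16-bit row index into one batch,
  // e.g. the rows that survived a filter.
  static List<Integer> copySv2(int[] src, char[] sv2) {
    List<Integer> dest = new ArrayList<>();
    for (char rowIndex : sv2) {
      dest.add(src[rowIndex]);    // indirection on the source side only
    }
    return dest;
  }

  // SV4 (assumed layout): upper 16 bits select the batch, lower 16 bits
  // select the row within that batch, e.g. the output order of a sort.
  static List<Integer> copySv4(int[][] batches, int[] sv4) {
    List<Integer> dest = new ArrayList<>();
    for (int entry : sv4) {
      int batchIndex = entry >>> 16;
      int rowIndex = entry & 0xFFFF;
      dest.add(batches[batchIndex][rowIndex]);
    }
    return dest;
  }

  // Helper that packs a (batch, row) pair into one SV4 entry.
  static int sv4Entry(int batchIndex, int rowIndex) {
    return (batchIndex << 16) | (rowIndex & 0xFFFF);
  }

  public static void main(String[] args) {
    int[] batch = {10, 20, 30, 40};
    System.out.println(copyNoSv(batch));
    System.out.println(copySv2(batch, new char[] {1, 3}));

    int[][] batches = {{5, 1}, {4, 2}};
    int[] sv4 = {sv4Entry(0, 1), sv4Entry(1, 1), sv4Entry(1, 0), sv4Entry(0, 0)};
    System.out.println(copySv4(batches, sv4));
  }
}
```

Note that none of the three methods takes an SV for the destination, which mirrors point 2 above: the only difference between them is how the source row is located.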
---