Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2706#discussion_r216885202
--- Diff:
processing/src/main/java/org/apache/carbondata/processing/loading/sort/SortStepRowHandler.java
---
@@ -570,23 +589,31 @@ public int writeRawRowAsIntermediateSortTempRowToUnsafeMemory(Object[] row,
private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer)
{
// convert dict & no-sort
for (int idx = 0; idx < this.dictNoSortDimCnt; idx++) {
+ // cannot exceed default 2MB, hence no need to call ensureArraySize
rowBuffer.putInt((int) row[this.dictNoSortDimIdx[idx]]);
}
// convert no-dict & no-sort
for (int idx = 0; idx < this.noDictNoSortDimCnt; idx++) {
byte[] bytes = (byte[]) row[this.noDictNoSortDimIdx[idx]];
+ // cannot exceed default 2MB, hence no need to call ensureArraySize
rowBuffer.putShort((short) bytes.length);
rowBuffer.put(bytes);
}
// convert varchar dims
for (int idx = 0; idx < this.varcharDimCnt; idx++) {
byte[] bytes = (byte[]) row[this.varcharDimIdx[idx]];
+ // can exceed default 2MB, hence need to call ensureArraySize
+ rowBuffer = UnsafeSortDataRows
--- End diff --
Should we call this method once per column of every row?
Since 2MB per row is enough in most scenarios, will calling the method
here for every column cause a performance decrease?
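To make the suggestion concrete, here is a minimal sketch of the per-row alternative: sum the variable-length sizes up front and grow the buffer at most once per row, instead of checking before every column write. This is not the actual CarbonData code; `ensureCapacity`, `packRow`, and the length-prefix layout are illustrative assumptions.

```java
import java.nio.ByteBuffer;

public class EnsureOncePerRow {
  // Hypothetical helper (illustrative, not CarbonData's ensureArraySize):
  // return a buffer with at least `required` bytes remaining, growing if needed.
  static ByteBuffer ensureCapacity(ByteBuffer buffer, int required) {
    if (buffer.remaining() >= required) {
      return buffer;
    }
    ByteBuffer bigger = ByteBuffer.allocate(
        Math.max(buffer.capacity() * 2, buffer.position() + required));
    buffer.flip();
    bigger.put(buffer);
    return bigger;
  }

  // Single capacity check per row: compute the total size of the
  // variable-length columns first, then pack them without further checks.
  static ByteBuffer packRow(ByteBuffer rowBuffer, byte[][] varcharValues) {
    int required = 0;
    for (byte[] bytes : varcharValues) {
      required += 4 + bytes.length; // int length prefix + payload
    }
    rowBuffer = ensureCapacity(rowBuffer, required); // once per row
    for (byte[] bytes : varcharValues) {
      rowBuffer.putInt(bytes.length);
      rowBuffer.put(bytes);
    }
    return rowBuffer;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(8); // deliberately too small
    byte[][] row = { "hello".getBytes(), "world!".getBytes() };
    buf = packRow(buf, row);
    System.out.println(buf.position()); // (4+5) + (4+6) = 19 bytes written
  }
}
```

The trade-off: one summing pass over the varchar columns buys a single branch per row instead of one per column, which matters only if the per-column check shows up in profiling.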
---