Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2644#discussion_r214316607
  
    --- Diff: streaming/src/main/java/org/apache/carbondata/streaming/CarbonStreamRecordWriter.java ---
    @@ -212,9 +271,13 @@ private void initializeAtFirstRow() throws IOException, InterruptedException {
                 byte[] col = (byte[]) columnValue;
                 output.writeShort(col.length);
                 output.writeBytes(col);
    +            dimensionStatsCollectors[dimCount].update(col);
               } else {
                 output.writeInt((int) columnValue);
    +            dimensionStatsCollectors[dimCount].update(ByteUtil.toBytes((int) columnValue));
    --- End diff --
    
    For the min/max comparison, you are converting every Int value to a byte
    array on each row, which can hurt write performance. Instead, cast the
    value to Int and compare it directly. Once all the data is loaded, convert
    the values to byte arrays based on the data type; at that point the
    conversion happens only once, for the final min/max values.
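
A minimal sketch of the suggestion above, not the actual CarbonData code. `IntMinMaxCollector` and its methods are hypothetical names, and the big-endian encoding via `ByteBuffer` is an assumption standing in for whatever `ByteUtil.toBytes` produces in the project. The point is only that the per-row work stays a primitive comparison and the byte conversion is deferred to the end.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical stats collector for an int column: compares primitives per row
 * and serializes the final min/max only when asked.
 */
final class IntMinMaxCollector {
  private int min = Integer.MAX_VALUE;
  private int max = Integer.MIN_VALUE;
  private boolean seen = false;

  /** Per-row update: two int comparisons, no allocation. */
  void update(int value) {
    if (value < min) {
      min = value;
    }
    if (value > max) {
      max = value;
    }
    seen = true;
  }

  /** True once at least one row has been recorded. */
  boolean hasValues() {
    return seen;
  }

  /** Called once after all rows are written. */
  byte[] minBytes() {
    return toBytes(min);
  }

  /** Called once after all rows are written. */
  byte[] maxBytes() {
    return toBytes(max);
  }

  // Assumption: plain 4-byte big-endian encoding; the real writer would use
  // the project's own conversion for the column's data type.
  private static byte[] toBytes(int v) {
    return ByteBuffer.allocate(4).putInt(v).array();
  }
}
```

In the writer, the per-row call would then look something like `collector.update((int) columnValue)`, with `minBytes()` and `maxBytes()` invoked only when the stats are finalized.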

