This is an automated email from the ASF dual-hosted git repository.

jackylk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
     new eff2977  [CARBONDATA-3519]Made optimizations in write step to avoid 
unnecessary memory blk allocation/free
eff2977 is described below

commit eff297769bc0513cdf4e7c3d2fd3f57c5ca60df1
Author: Venu Reddy <venugopalred...@huawei.com>
AuthorDate: Fri Dec 20 18:27:34 2019 +0530

    [CARBONDATA-3519]Made optimizations in write step to avoid unnecessary 
memory blk allocation/free
    
    Issue-1:
    Context:
    For a string column with local dictionary enabled, a column page of 
UnsafeFixLengthColumnPage with datatype DataTypes.BYTE_ARRAY is created for 
encodedPage along with regular actualPage of UnsafeVarLengthColumnPage. We have 
capacity field in the UnsafeFixLengthColumnPage. And this field indicates the 
capacity of allocated
    memoryBlock for the page. ensureMemory() method gets called while adding 
rows to check if totalLength + requestSize > capacity to allocate a new 
memoryBlock. If there is no room to add the next row, allocates a new block, 
copy the old context(prev rows) and free the old memoryBlock.
    Problem:
    While, UnsafeFixLengthColumnPage with with datatype DataTypes.BYTE_ARRAY is 
created for encodedPage, we have not assigned the capacity field with allocated 
memory block size. Hence, for each add row to tablePage, ensureMemory() check 
always fails, allocates a new column page memoryBlock, copy the old 
context(prev rows) and free the old memoryBlock.
    This allocation of new memoryBlock and free of old memoryBlock happens for 
each row row addition for the string columns with local dictionary.
    Solution:
    We can set the capacity to allocated size.
    
    Issue-2:
    Context:
    InVarLengthColumnPageBase, we have a rowOffset column page of 
UnsafeFixLengthColumnPage of datatype INT to maintain the data offset to each 
row of variable length columns. This rowOffset page allocates to be size of 
page.
    Problem:
    If we have 10 rows in the page, we need 11 rows for its rowOffset page. 
Because we always keep 0 as offset to 1st row. So an additional row is required 
for rowOffset page. Otherwise, ensureMemory() check always fails for the last 
row(10th row in this case) of data and allocates a new rowOffset page 
memoryBlock, copy the old context(prev rows) and free the old memoryBlock. This 
can happen for the string columns with local dictionary, direct dictionary 
columns, global disctionary columns.
    Solution:
    We can allocate for 1 additional row of rowOffset page.
    
    This closes #3524
---
 .../carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java     | 4 ++--
 .../carbondata/core/datastore/page/VarLengthColumnPageBase.java       | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java
index 13c382f..4ef5e5d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java
@@ -92,8 +92,8 @@ public class UnsafeFixLengthColumnPage extends ColumnPage {
     this.eachRowSize = eachRowSize;
     totalLength = 0;
     if (columnPageEncoderMeta.getStoreDataType() == DataTypes.BYTE_ARRAY) {
-      memoryBlock =
-          UnsafeMemoryManager.allocateMemoryWithRetry(taskId, (long) pageSize 
* eachRowSize);
+      capacity = pageSize * eachRowSize;
+      memoryBlock = UnsafeMemoryManager.allocateMemoryWithRetry(taskId, 
capacity);
       baseAddress = memoryBlock.getBaseObject();
       baseOffset = memoryBlock.getBaseOffset();
     }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java
index 1b1f8f6..01f1d55 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/VarLengthColumnPageBase.java
@@ -73,7 +73,7 @@ public abstract class VarLengthColumnPageBase extends 
ColumnPage {
     try {
       rowOffset = ColumnPage.newPage(
           new ColumnPageEncoderMeta(spec, DataTypes.INT, 
columnPageEncoderMeta.getCompressorName()),
-          pageSize);
+          pageSize + 1);
     } catch (MemoryException e) {
       throw new RuntimeException(e);
     }

Reply via email to