dhatchayani opened a new pull request #3262: [CARBONDATA-3415] Merge index is 
not working for partition table. Merge index for partition table is taking 
significantly longer time than normal table.
URL: https://github.com/apache/carbondata/pull/3262
 
 
   **Problem:**
   (1) Merge index is not working for partition table.
   (2) Merge index for a partition table takes significantly longer than for a 
normal carbon table.
   
   **Root cause:**
   (1) The merge index event listener was moved to preStatusUpdateEvent in 
#3221, but preStatusUpdateEvent is not triggered for partition tables. The test 
case that validates merge index on a partition table was also wrong, so the 
failure was not caught in the test builders.
   (2) Currently, the merge index job launches one task per segment. But in a 
partition table, a segment contains partitions and the index files are merged 
per partition. So each task has to iterate over the partitions in its segment 
and merge the index files inside each of them, which takes longer when the 
number of partitions is high. Number of tasks = Number of segments.
   
   **Solution:**
   (1) Correct the test case and trigger the merge index listener for partition 
tables.
   (2) Parallelize the merge by launching one task per partition. Number of 
tasks = Number of partitions in a segment.
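
To illustrate the difference in task count, here is a minimal sketch in plain Python. It is not the CarbonData implementation: the segment layout, the partition names, the file names, and the helpers `plan_tasks_per_segment`, `plan_tasks_per_partition`, `merge_partition`, and `run_parallel` are all hypothetical, and a thread pool stands in for the tasks the real job would launch.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: segment id -> partition name -> index files in it.
SEGMENTS = {
    "0": {"country=IN": ["b.carbonindex", "a.carbonindex"],
          "country=US": ["c.carbonindex"]},
    "1": {"country=IN": ["d.carbonindex"],
          "country=US": ["f.carbonindex", "e.carbonindex"]},
}


def plan_tasks_per_segment(segments):
    """Before: one task per segment; the task walks its partitions itself."""
    return [(seg_id, list(parts)) for seg_id, parts in segments.items()]


def plan_tasks_per_partition(segments):
    """After: one task per (segment, partition) pair."""
    return [(seg_id, part)
            for seg_id, parts in segments.items()
            for part in parts]


def merge_partition(segments, seg_id, partition):
    """Stand-in for merging one partition's index files (here: just sort)."""
    return (seg_id, partition, sorted(segments[seg_id][partition]))


def run_parallel(segments, max_workers=4):
    """Run one merge per partition concurrently; results keep task order."""
    tasks = plan_tasks_per_partition(segments)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: merge_partition(segments, *t), tasks))
```

With two segments of two partitions each, the per-segment plan yields 2 tasks and the per-partition plan yields 4, so the partitions of a segment no longer have to be processed one after another inside a single task.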
   
    - [ ] Any interfaces changed?
    
    - [ ] Any backward compatibility impacted?
    
    - [ ] Document update required?
   
    - [x] Testing done
           UT Changed
          
    - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
