[ 
https://issues.apache.org/jira/browse/CARBONDATA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2895.
--------------------------------------
    Resolution: Fixed

> [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp 
> files) scenario.
> ---------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-2895
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2895
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Ajantha Bhat
>            Assignee: Ajantha Bhat
>            Priority: Major
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Problem: Query result mismatch with Batch-sort in the save-to-disk (sort temp 
> files) scenario.
> Scenario:
> a) Configure batch sort, but set a batch size larger than 
> UnsafeMemoryManager.INSTANCE.getUsableMemory().
> b) Load data that is larger than the batch size. Observe that a save to disk 
> happens because UnsafeMemoryManager cannot process one batch.
> c) The load therefore happens in 2 batches.
> d) Query the results. The result contains more data rows than expected.
> Root cause:
> createSortDataRows() is called for each batch.
> Files saved to disk while sorting the previous batch were also considered for 
> the current batch.
> Solution:
> Files saved to disk while sorting the previous batch should not be 
> considered for the current batch.
> Hence, use the batch ID as the rangeID field of the sort temp files, 
> so that getFilesToMergeSort() selects only the files of the current batch.
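
The fix described above can be illustrated with a minimal, self-contained sketch. This is not the CarbonData implementation: the class name, the method signatures, and the `<table>_<rangeId>_<sequence>.sorttemp` naming scheme are assumptions made for illustration. It only shows the idea that, once the batch ID is embedded in each temp file name as the range ID, the merge step can ignore files left over from an earlier batch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and file layout are hypothetical,
// not taken from the CarbonData code base.
class BatchSortTempFiles {

    // Assumed naming scheme: <table>_<rangeId>_<sequence>.sorttemp,
    // where rangeId is set to the ID of the batch being sorted.
    static String tempFileName(String table, int rangeId, long sequence) {
        return table + "_" + rangeId + "_" + sequence + ".sorttemp";
    }

    // Select only the temp files written for the given batch (rangeId).
    // The trailing '_' in the prefix keeps batch 1 from matching batch 10.
    static List<String> filesToMergeSort(List<String> tempFiles, String table, int rangeId) {
        String prefix = table + "_" + rangeId + "_";
        List<String> selected = new ArrayList<>();
        for (String name : tempFiles) {
            if (name.startsWith(prefix) && name.endsWith(".sorttemp")) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

With temp files from batches 0 and 1 in the same directory, calling `filesToMergeSort(files, "t1", 1)` in this sketch returns only the batch 1 files, so rows spilled by batch 0 are not merged a second time.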



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
