[ https://issues.apache.org/jira/browse/CARBONDATA-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacky Li resolved CARBONDATA-3593.
----------------------------------
Fix Version/s: 2.0.0
Resolution: Fixed
> total_blocklets in query statistic always the same with valid_blocklets
> -----------------------------------------------------------------------
>
> Key: CARBONDATA-3593
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3593
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Reporter: Hong Shen
> Priority: Major
> Fix For: 2.0.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> When I run SQL on a CarbonData table with "enable.query.statistics=true",
> total_blocklets in the query statistics is always the same as valid_blocklets.
> Below is an example.
> Tables test_table_hdfs_sort_city and test_table_hdfs_no_sort have the same
> data; the only difference is that test_table_hdfs_sort_city has
> SORT_COLUMN='city_name', while test_table_hdfs_no_sort has no sort column.
> {code}
> carbon.sql("select * from test_table_hdfs_sort_city where city_name='city1' ")
> {code}
> |scan_blocks_num|total_blocklets|valid_blocklets|total_pages|scanned_pages|valid_pages|
> |              1|              1|              1|        193|            4|          4|
> {code}
> carbon.sql("select * from test_table_hdfs_no_sort where city_name='city1' ")
> {code}
> |scan_blocks_num|total_blocklets|valid_blocklets|total_pages|scanned_pages|valid_pages|
> |              1|              3|              3|        193|          193|        193|
> After reading the code, I found that both TOTAL_BLOCKLET_NUM and
> VALID_SCAN_BLOCKLET_NUM are incremented by 1 in
> BlockletFilterScanner.executeFilter(),
> BlockletFilterScanner.executeFilterForPages, and
> BlockletFullScanner.scanBlocklet.
> I think total_blocklets should be the total number of blocklets, and
> valid_blocklets should be the number of blocklets that remain after
> filtering. If this needs to be changed, I will provide a patch, since I have
> already modified it locally.
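
The counting semantics proposed in the description can be illustrated with a small, self-contained Java sketch. This is not CarbonData code: the `Blocklet` class, its min/max check, and the `count` helper are hypothetical stand-ins, and the sample values are made up. It only shows the intended distinction, where the total counter advances for every blocklet considered and the valid counter advances only for blocklets that survive the filter.

```java
import java.util.List;

public class BlockletCounterSketch {

    /** Hypothetical stand-in for one blocklet's min/max index on a single column. */
    static final class Blocklet {
        final String min;
        final String max;

        Blocklet(String min, String max) {
            this.min = min;
            this.max = max;
        }

        /** True if the [min, max] range could contain the filter value. */
        boolean mayContain(String value) {
            return min.compareTo(value) <= 0 && value.compareTo(max) <= 0;
        }
    }

    /** Returns {total_blocklets, valid_blocklets} for an equality filter on value. */
    static int[] count(List<Blocklet> blocklets, String value) {
        int total = 0;
        int valid = 0;
        for (Blocklet blocklet : blocklets) {
            total++;                          // every blocklet that is considered
            if (blocklet.mayContain(value)) {
                valid++;                      // only blocklets that pass the filter
            }
        }
        return new int[] {total, valid};
    }

    public static void main(String[] args) {
        // Made-up ranges resembling data sorted on the filter column.
        List<Blocklet> blocklets = List.of(
            new Blocklet("city1", "city3"),
            new Blocklet("city4", "city6"),
            new Blocklet("city7", "city9"));
        int[] stats = count(blocklets, "city1");
        System.out.println("total_blocklets=" + stats[0]
            + ", valid_blocklets=" + stats[1]);
    }
}
```

With the made-up ranges above, the two counters differ (3 total, 1 valid), which is the behaviour the reporter expects when the filter can prune blocklets. If both counters were incremented at the same point, as described in the issue, they would always be equal.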
--
This message was sent by Atlassian Jira
(v8.3.4#803005)