[ https://issues.apache.org/jira/browse/DRILL-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197701#comment-15197701 ]
ASF GitHub Bot commented on DRILL-4376:
---------------------------------------

Github user adeneche closed the pull request at:

    https://github.com/apache/drill/pull/422

> Wrong results when doing a count(*) on part of directories with metadata cache
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-4376
>                 URL: https://issues.apache.org/jira/browse/DRILL-4376
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.4.0
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>            Priority: Critical
>             Fix For: 1.7.0
>
>
> First create some parquet tables in multiple subfolders:
> {noformat}
> create table dfs.tmp.`test/201501` as select employee_id, full_name from cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201502` as select employee_id, full_name from cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201601` as select employee_id, full_name from cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201602` as select employee_id, full_name from cp.`employee.json` limit 2;
> {noformat}
> Running the following query gives the expected count:
> {noformat}
> select count(*) from dfs.tmp.`test/20160*`;
> +---------+
> | EXPR$0  |
> +---------+
> | 4       |
> +---------+
> {noformat}
> But once you create the metadata cache files, the query no longer returns the correct results:
> {noformat}
> refresh table metadata dfs.tmp.`test`;
> select count(*) from dfs.tmp.`test/20160*`;
> +---------+
> | EXPR$0  |
> +---------+
> | 2       |
> +---------+
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)