[ https://issues.apache.org/jira/browse/DRILL-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903712#comment-14903712 ]
Jacques Nadeau commented on DRILL-3821:
---------------------------------------
One other note on the first item above. Since we cache block locations, a full
refresh can still be beneficial after the balancer has run, and that refresh
would be skipped if we short-circuit based on the change detection code.
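
To make the trade-off concrete, here is a minimal sketch of a modification-time
short-circuit with an explicit override. It is not Drill's change detection
code: the function name, the `force` flag, and the use of the local filesystem
via `os` are all assumptions made for illustration.

```python
import os


def needs_refresh(table_dir, cache_mtime, force=False):
    """Decide whether a metadata cache should be rebuilt.

    Hypothetical helper, not Drill's implementation.
    table_dir   -- directory holding the table's files
    cache_mtime -- modification time of the existing cache file (seconds)
    force       -- rebuild even when no file or directory looks newer
    """
    if force:
        # A balancer run can move blocks without touching modification
        # times, so cached block locations need an explicit full refresh.
        return True
    for root, _dirs, files in os.walk(table_dir):
        # A directory's mtime changes when entries are added or removed.
        if os.path.getmtime(root) > cache_mtime:
            return True
        for name in files:
            if os.path.getmtime(os.path.join(root, name)) > cache_mtime:
                return True
    # Nothing is newer than the cache: short-circuit and skip the rebuild.
    return False
```

With a check like this, repeated refreshes over an unchanged table would
return early, while `force=True` keeps a way to pick up new block locations.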
> refresh table metadata command is updating the cache every single time
> ----------------------------------------------------------------------
>
> Key: DRILL-3821
> URL: https://issues.apache.org/jira/browse/DRILL-3821
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Reporter: Rahul Challapalli
> Assignee: Mehant Baid
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=3c89b30
> The lineitem folder used below contains 50K parquet files. I ran the refresh
> table metadata command multiple times. After the first run, I expected all
> subsequent runs to come back very fast since there is nothing to update. But
> the timings below suggest that Drill might actually be updating the cache file
> every single time.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> refresh table metadata dfs.`/drill/testdata/tpch100_50000files/lineitem`;
> +-------+---------------------------------------------------------------------------------------+
> | ok    | summary                                                                               |
> +-------+---------------------------------------------------------------------------------------+
> | true  | Successfully updated metadata for table /drill/testdata/tpch100_50000files/lineitem.  |
> +-------+---------------------------------------------------------------------------------------+
> 1 row selected (14.108 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> refresh table metadata dfs.`/drill/testdata/tpch100_50000files/lineitem`;
> +-------+---------------------------------------------------------------------------------------+
> | ok    | summary                                                                               |
> +-------+---------------------------------------------------------------------------------------+
> | true  | Successfully updated metadata for table /drill/testdata/tpch100_50000files/lineitem.  |
> +-------+---------------------------------------------------------------------------------------+
> 1 row selected (11.372 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> refresh table metadata dfs.`/drill/testdata/tpch100_50000files/lineitem`;
> +-------+---------------------------------------------------------------------------------------+
> | ok    | summary                                                                               |
> +-------+---------------------------------------------------------------------------------------+
> | true  | Successfully updated metadata for table /drill/testdata/tpch100_50000files/lineitem.  |
> +-------+---------------------------------------------------------------------------------------+
> 1 row selected (11.177 seconds)
> {code}
> When I checked the last modified time of the cache file on maprfs, it did
> indicate that the cache is touched every time the "refresh table metadata"
> command is run.