There is no need to refresh the metadata for every query. You only need to
generate the metadata cache once for each folder. If your data is updated
later, any subsequent query you submit will automatically refresh the
metadata cache, so you do not need to run the "REFRESH TABLE METADATA
<folder_name>" command explicitly again. Refer to [1], and ignore the
reference to "session" on that page.

[1] https://drill.apache.org/docs/optimizing-parquet-metadata-reading/
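
As a rough sketch of the one-time step, assuming a hypothetical Parquet
folder at /data/orders under the dfs storage plugin (both the path and the
plugin name are placeholders, not taken from your setup):

```sql
-- Run once per folder to generate the Parquet metadata cache file.
-- dfs.`/data/orders` is a placeholder; substitute your own table path.
REFRESH TABLE METADATA dfs.`/data/orders`;

-- Later queries against the same folder are submitted as usual,
-- without repeating the REFRESH command.
SELECT COUNT(*) FROM dfs.`/data/orders`;
```

The same statements can be submitted through any client, including the REST
API; only the first one needs to be issued up front.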

- Rahul



On Mon, Mar 6, 2017 at 7:49 AM, Chetan Kothari <[email protected]>
wrote:

> Hi All
>
>
>
> As I understand, we can trigger generation of the Parquet metadata cache
> file by using REFRESH TABLE METADATA <path to table>.
>
> It seems we need to run this command on a directory, nested or flat, once
> during the session.
>
>
>
> Why do we need to run it for every session? That implies that if I use
> the REST API to fire a query, I have to generate the metadata cache file
> as part of every REST API call.
>
> This seems to be an issue, as I have seen that generating the metadata
> cache file takes a significant amount of time.
>
>
>
> Can't we define or configure a cache expiry time so that we can keep the
> metadata in the cache for a longer duration?
>
>
>
> Any inputs on this will be helpful.
>
>
>
> Regards
>
> Chetan
>
>
>
