Hi Quanlong,
You're right. The catalog needs to handle metadata at a finer granularity.
We are actively looking into the options you mentioned as well as other
related changes (see IMPALA-3234 and IMPALA-3127) to improve the
performance and scalability of metadata management.
Thanks
Dimitris
On
Thanks, Dimitris!
At 2017-09-12 01:15:46, "Dimitris Tsirogiannis"
wrote:
>Hi Quanlong,
>
>You're pretty much correct. REFRESH can handle the majority of external
>metadata modifications (adding/dropping files/partitions, etc) and
>INVALIDATE METADATA should be used in
Hi all,
Currently, if a "describe" statement hits an incomplete table, the impalad
sends an RPC request to the catalogd to load the metadata of this table. This
takes a long time for tables with many partitions and many files. However, to
serve the "describe" statement, we just need the
Hi all,
I used to think that the REFRESH statement is just an incremental metadata
reload. It can't detect file deletion or modification. So we should use
INVALIDATE METADATA for these cases.
However, one of my friends told me that they always use the REFRESH statement in
their ETL pipeline, either
If you'd like to contribute a patch to Impala, but aren't sure what you
want to work on, you can look at Impala's newbie issues:
https://issues.apache.org/jira/issues/?filter=12341668. You can find
detailed instructions on submitting patches at
Hi Quanlong,
You're pretty much correct. REFRESH can handle the majority of external
metadata modifications (adding/dropping files/partitions, etc) and
INVALIDATE METADATA should be used in the two use cases you mention. I am
sorry you had to look at the code to figure that out. I checked our
Thanks for the feedback Quanlong. We plan on addressing many of these
catalog issues in the immediate future.
Dimitris
On Mon, Sep 11, 2017 at 10:21 PM, Quanlong Huang
wrote:
> Hi Dimitris,
>
> Thanks for your quick reply!
>
> IMPALA-3127 is a great ticket. But it still
Hi Dimitris,
Thanks for your quick reply!
IMPALA-3127 is a great ticket, but it still shows no progress and has no
assignee. Is it tracked in your internal Jira?
I hope this can be done soon, since some users may choose Presto instead of
Impala due to these usability issues.
Thanks
Quanlong
At