[ https://issues.apache.org/jira/browse/IMPALA-8121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-8121:
----------------------------------
Description:
Some new Impala features are complete but disabled by default because they
are not strictly better than the previous behaviour, e.g. the various
metadata improvements. Containerised Impala deployments are likely to be new
deployments, so it is easier to make potentially disruptive changes to
defaults now.
h2. Metadata V2 Flags
Catalogd:
--catalog_topic_mode=minimal
Impalad:
--use_local_catalog=true
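As a rough illustration of how the two Metadata V2 flags above could be applied per daemon in a container entrypoint, here is a minimal Python sketch. The flag names and values come from the description; the dictionary layout and the `build_args` helper are hypothetical and not part of Impala.

```python
# Proposed container defaults, keyed by daemon. Flag names and values are the
# ones listed in the description; the structure itself is an assumption.
METADATA_V2_DEFAULTS = {
    "catalogd": {"catalog_topic_mode": "minimal"},
    "impalad": {"use_local_catalog": "true"},
}


def build_args(daemon, overrides=None):
    """Return gflags-style arguments for `daemon`; explicit overrides win."""
    flags = dict(METADATA_V2_DEFAULTS.get(daemon, {}))
    flags.update(overrides or {})
    return ["--{}={}".format(name, value) for name, value in sorted(flags.items())]


print(build_args("catalogd"))  # ['--catalog_topic_mode=minimal']
print(build_args("impalad", {"use_local_catalog": "false"}))  # ['--use_local_catalog=false']
```

Keeping the defaults in one table and letting explicit overrides win would let an existing deployment opt out of a changed default without editing the image.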
We want to invalidate based on HMS notifications
(https://issues.apache.org/jira/browse/IMPALA-7970) and memory pressure. It's
less clear if invalidating tables based on time is really useful - for large
fact tables it would add a lot of unpredictability because reloading the tables
is expensive.
Catalogd:
--invalidate_tables_timeout_s=???
--invalidate_tables_on_memory_pressure=true
Once IMPALA-7970 goes in, we probably also want automatic invalidation by
default (TBD - how to handle older HMS that doesn't support those APIs).
Catalogd:
--hms_event_polling_interval_s=???
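The catalogd invalidation flags above mix one decided value with two that are still open ("???"). The sketch below, in Python, shows one way to keep that distinction explicit so an undecided flag is never emitted with a guessed value. The flag names are from the description; the `decided_flags` helper and the use of `None` as an "undecided" marker are assumptions made for illustration.

```python
# None marks a value the description leaves open ("???"); it is never guessed.
CATALOGD_INVALIDATION = {
    "invalidate_tables_on_memory_pressure": "true",
    "invalidate_tables_timeout_s": None,
    "hms_event_polling_interval_s": None,
}


def decided_flags(flags):
    """Split flags into emit-ready arguments and names still awaiting a value."""
    decided = [
        "--{}={}".format(name, value)
        for name, value in sorted(flags.items())
        if value is not None
    ]
    undecided = sorted(name for name, value in flags.items() if value is None)
    return decided, undecided


decided, undecided = decided_flags(CATALOGD_INVALIDATION)
print(decided)    # ['--invalidate_tables_on_memory_pressure=true']
print(undecided)  # ['hms_event_polling_interval_s', 'invalidate_tables_timeout_s']
```

Flags in the undecided list fall back to whatever the daemon's built-in default is until a value is chosen.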
We probably want to enable HDFS preads for remote reads (--use_hdfs_pread),
but I think this is going to be done automatically.
We may want to have an I/O cache enabled - tracked by IMPALA-8121
was:
There are some new features of Impala that are done but disabled by default
because they are not strictly better than the previous versions. E.g. the
various metadata improvements. Containerised Impala is likely to be new
deployments, so it is easier to make potentially disruptive changes to defaults
now.
h2. Metadata V2 Flags
Catalogd:
--catalog_topic_mode=minimal
Impalad:
--use_local_catalog=true
We want to invalidate based on HMS notifications
(https://issues.apache.org/jira/browse/IMPALA-7970) and memory pressure. It's
less clear if invalidating tables based on time is really useful - for large
fact tables it would add a lot of unpredictability because reloading the tables
is expensive.
Catalogd:
--invalidate_tables_timeout_s=???
--invalidate_tables_on_memory_pressure=true
Once IMPALA-7970 goes in, we probably also want automatic invalidation by
default (TBD - how to handle older HMS that doesn't support those APIs).
Catalogd:
--hms_event_polling_interval_s=???
We probably want to enable HDFS preads for remote reads: -use_hdfs_pread
We may want to have an I/O cache enabled - tracked by IMPALA-8121
> Pick better default flags in containers
> ---------------------------------------
>
> Key: IMPALA-8121
> URL: https://issues.apache.org/jira/browse/IMPALA-8121
> Project: IMPALA
> Issue Type: Sub-task
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: docker