Hi Jesus,
Many thanks for your inputs.
Best,
Ashwin
On Mon, Mar 5, 2018 at 12:41 PM, Jesus Camacho Rodriguez <
jcama...@apache.org> wrote:
> Hi Ashwin,
>
> 1) It is important that table/column stats are available, so Calcite can
> trigger correctly its cost-based optimizations. You can do that
Hi Ashwin,
1) It is important that table/column stats are available, so that Calcite can
correctly trigger its cost-based optimizations. You can do that either manually,
by running an ANALYZE ... COMPUTE STATISTICS FOR COLUMNS statement, or indeed by
enabling hive.stats.autogather.
2) Calcite-based
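For reference, the two ways of making statistics available mentioned in point 1
could look roughly like the sketch below. The table name my_table is a
hypothetical placeholder, not something taken from this thread, and partitioned
tables may additionally need a PARTITION clause depending on the Hive version.

```sql
-- Option A: gather statistics manually (my_table is a placeholder name).
ANALYZE TABLE my_table COMPUTE STATISTICS;              -- basic table-level stats
ANALYZE TABLE my_table COMPUTE STATISTICS FOR COLUMNS;  -- column-level stats

-- Option B: ask Hive to gather basic stats automatically when data is written.
SET hive.stats.autogather=true;
```

Option B only affects data written after the setting is in place, so tables
that already exist would presumably still need Option A once.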
and to clarify the instrumentation aspect -- are there any Calcite logs that
are typically turned off that would be good to turn on to get insights, and
are there any JVM settings, JMX hooks, HDFS parameters, etc. that would be
good to tap into to get a better picture of how the Java memory,
Hello Dev Team,
I am trying to run queries on Apache Hive with the flag *hive.cbo.enabled*
set to true and then to false, and to compare the metrics.
I have a few questions regarding this -
1. Do I need to set *hive.stats.autogather* (which gathers the table
statistics) to true as well before
1. TPC-DS seems like a great starting point to me. SSB would also be a good
addition.
2. You can set hive.cbo.enabled to false in your Hive config to turn off the
optimizer.
3. Count me interested in general although I have limited time available in
the immediate future. I'd be interested in
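
The on/off comparison discussed in this thread could be sketched as a session
like the one below. The query and the names my_table and col_a are hypothetical
placeholders; EXPLAIN is used here only as one possible way to check whether
the plan changes between the two runs.

```sql
-- Run 1: cost-based optimizer enabled.
SET hive.cbo.enabled=true;
EXPLAIN SELECT col_a, COUNT(*) FROM my_table GROUP BY col_a;  -- placeholder query

-- Run 2: optimizer disabled, same query, so the two plans can be compared.
SET hive.cbo.enabled=false;
EXPLAIN SELECT col_a, COUNT(*) FROM my_table GROUP BY col_a;
```

Dropping the EXPLAIN keyword runs the query itself, which is what a timing
comparison would need.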