[
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545844#comment-16545844
]
Timothy Farkas commented on DRILL-6453:
---------------------------------------
Notes about the side effects of the fix:
It's important to note that there is no real solution for this. Specifically,
there are two problems:
1. In order to avoid a deadlock, we are forced to assume that the configured
batch size is honored for incoming probe batches. In some cases that assumption
will be violated, probe batches will be larger, and queries will OOM. At best we
can provide a tuning factor that lets users tune their way out of such cases for
now, as the batch sizing project progresses.
2. Since we need foreknowledge of probe batch sizes in order to prevent an OOM,
and we assume that probe batches are the configured batch size (e.g. 16 MB),
there will be cases where incoming probe batches are actually smaller. In these
cases we will have spilled the build side unnecessarily to make room for probe
batches, so we will see a performance regression. From discussions with Salim,
this can be avoided for simple cases where the non-key columns in the probe
batch are fixed length. But if the non-key columns on the probe side are
variable length, then the user could decrease the configured batch size.
Sniffing both the probe and build side prevented these two issues, but as Salim
explained, this leads to deadlock due to the fundamental exchange design. So we
now have to make these two sacrifices in order to avoid a deadlock in the
exchanges.
In summary, with the fix we will have to live with the following issues for now:
1. OOMs in queries that use operators which do not yet honor batch sizing,
requiring user tuning to get the query to run.
2. A performance regression when the probe-side records are small and contain
varchar non-key columns. The user could tune the batch size to reduce the
performance impact, as sketched below.
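For illustration, and assuming the configured batch size discussed above
corresponds to the `drill.exec.memory.operator.output_batch_size` system option
(Drill's batch-sizing knob with a 16 MB default; this ticket does not name the
exact option, so treat the name as an assumption), the tuning could look roughly
like this:
{noformat}
-- Check the current batch-size related settings (option name is an assumption,
-- not something confirmed in this ticket)
select * from sys.options where name like '%batch_size%';
-- Lower the output batch size from the 16 MB default to, e.g., 8 MB
alter system set `drill.exec.memory.operator.output_batch_size` = 8388608;
{noformat}
Lowering the value shrinks the per-batch memory the hash join has to reserve for
incoming probe batches, at the cost of producing more, smaller batches, which is
the trade-off described in item 2 above.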
> TPC-DS query 72 has regressed
> -----------------------------
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.14.0
> Reporter: Khurram Faraaz
> Assignee: Timothy Farkas
> Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill,
> jstack_29173_June_10_2018.txt, jstack_29173_June_10_2018_b.txt,
> jstack_29173_June_10_2018_c.txt, jstack_29173_June_10_2018_d.txt,
> jstack_29173_June_10_2018_e.txt
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where it
> was canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On, Drill 1.14.0-SNAPSHOT
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240.
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not
> VARCHAR type
> On, Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240.
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete; I had to
> cancel it by stopping the Foreman drillbit.
> As a result, several minor fragments are reported to be in the
> CANCELLATION_REQUESTED state on the UI.
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)