[
https://issues.apache.org/jira/browse/DRILL-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043607#comment-16043607
]
Paul Rogers edited comment on DRILL-5528 at 6/9/17 12:02 AM:
-------------------------------------------------------------
Preliminary results suggest that Drill is severely CPU bound for this query. As
a baseline, ran the query with the original, "unmanaged", sort.
Running on a Mac, with 3 GB given to the sort, the thread running the query is
at 100% of CPU. But, disk I/O during the read-sort-and-spill phase is very
small, only about 40MB/s for reads, 20 MB/s for writes peak, with many valleys
in between -- on an SSD that can do 700 MB/s. Earlier tests got upwards of 70+
MB/s.
Later, during the merge phase, read performance jumps to 150 MB/s. CPU is still
around 95%.
Total run time was just over an hour for a file that formerly took just a few
minutes on previous test variations.
This suggests that, somehow, something is using far to much CPU to do the work
of the sort. This also suggests that the excessive CPU use occurs whether the
sort memory is "managed" or not.
Time summary:
{code}
Results: 1 records, 2 batches, 3,608,140 ms
00 Screen
Setup: 0 ms - 0%, 0%
Process: 4 ms - 0%, 0%
01 Project
Setup: 5 ms - 0%, 0%
Process: 1 ms - 0%, 0%
02 StreamAgg (STREAMING AGGREGATE)
Setup: 133 ms - 11%, 0%
Process: 4 ms - 0%, 0%
03 Project
Setup: 53 ms - 4%, 0%
Process: 0 ms - 0%, 0%
04 SelectionVectorRemover (SELECTION VECTOR REMOVER)
Setup: 51 ms - 4%, 0%
Process: 501 ms - 0%, 0%
05 Filter
Setup: 194 ms - 17%, 0%
Process: 14,910 ms - 0%, 0%
06 Project
Setup: 41 ms - 3%, 0%
Process: 1,414 ms - 0%, 0%
07 SelectionVectorRemover (SELECTION VECTOR REMOVER)
Setup: 6 ms - 0%, 0%
Process: 33,020 ms - 1%, 1%
08 Sort (EXTERNAL SORT)
Setup: 0 ms - 0%, 0%
Process: 1,996,522 ms - 87%, 87%
09 Project
Setup: 137 ms - 12%, 0%
Process: 24,847 ms - 1%, 1%
10 Project
Setup: 499 ms - 44%, 0%
Process: 45,571 ms - 2%, 2%
11 Scan (TEXT SUB SCAN)
Setup: 0 ms - 0%, 0%
Process: 153,725 ms - 6%, 6%
Total:
Setup: 1,119 ms
Process: 2,270,519 ms
Input batches: 0, spills: 58, merge/spills: 0
{code}
In summary form:
{code}
08 Sort (EXTERNAL SORT)
Process: 1,996,522 ms - 87%, 87%
10 Project
Process: 45,571 ms - 2%, 2%
11 Scan (TEXT SUB SCAN)
Process: 153,725 ms - 6%, 6%
{code}
The sort itself takes 87% of the time. Reading the original 18 GB of data takes
just 6% of the time. The read rate during the scan is about 18 GB / 154 s = 116
MB/s, which is pretty good. So, clearly the delay is in the sort itself
somewhere.
For a baseline, ran a query that simply reads and discards data:
{code}
select * FROM (select * from dfs.data.`250wide.tbl`)d where d.columns[0] =
'ljdfhwuehnoiueyf'
{code}
Ran in just 178 seconds. This confirms that the scan part of the query is fine.
Read rate starts at 200 MB/s, then drops to 100 MB/s. CPU is at 97% suggesting
that the query, even for a simple read, is CPU bound rather than disk bound.
Scan time is 139 seconds or about 130 MB/s read rate.
was (Author: paul-rogers):
Preliminary results suggest that Drill is severely CPU bound for this query. As
a baseline, ran the query with the original, "unmanaged", sort.
Running on a Mac, the thread running the query is at 100% of CPU. But, disk I/O
during the read-sort-and-spill phase is very small, only about 40MB/s for
reads, 20 MB/s for writes peak, with many valleys in between -- on an SSD that
can do 700 MB/s. Earlier tests got upwards of 70+ MB/s.
Later, during the merge phase, read performance jumps to 150 MB/s. CPU is still
around 95%.
Total run time was just over an hour for a file that formerly took just a few
minutes on previous test variations.
This suggests that, somehow, something is using far to much CPU to do the work
of the sort. This also suggests that the excessive CPU use occurs whether the
sort memory is "managed" or not.
Time summary:
{code}
Results: 1 records, 2 batches, 3,608,140 ms
00 Screen
Setup: 0 ms - 0%, 0%
Process: 4 ms - 0%, 0%
01 Project
Setup: 5 ms - 0%, 0%
Process: 1 ms - 0%, 0%
02 StreamAgg (STREAMING AGGREGATE)
Setup: 133 ms - 11%, 0%
Process: 4 ms - 0%, 0%
03 Project
Setup: 53 ms - 4%, 0%
Process: 0 ms - 0%, 0%
04 SelectionVectorRemover (SELECTION VECTOR REMOVER)
Setup: 51 ms - 4%, 0%
Process: 501 ms - 0%, 0%
05 Filter
Setup: 194 ms - 17%, 0%
Process: 14,910 ms - 0%, 0%
06 Project
Setup: 41 ms - 3%, 0%
Process: 1,414 ms - 0%, 0%
07 SelectionVectorRemover (SELECTION VECTOR REMOVER)
Setup: 6 ms - 0%, 0%
Process: 33,020 ms - 1%, 1%
08 Sort (EXTERNAL SORT)
Setup: 0 ms - 0%, 0%
Process: 1,996,522 ms - 87%, 87%
09 Project
Setup: 137 ms - 12%, 0%
Process: 24,847 ms - 1%, 1%
10 Project
Setup: 499 ms - 44%, 0%
Process: 45,571 ms - 2%, 2%
11 Scan (TEXT SUB SCAN)
Setup: 0 ms - 0%, 0%
Process: 153,725 ms - 6%, 6%
Total:
Setup: 1,119 ms
Process: 2,270,519 ms
Input batches: 0, spills: 58, merge/spills: 0
{code}
In summary form:
{code}
08 Sort (EXTERNAL SORT)
Process: 1,996,522 ms - 87%, 87%
10 Project
Process: 45,571 ms - 2%, 2%
11 Scan (TEXT SUB SCAN)
Process: 153,725 ms - 6%, 6%
{code}
The sort itself takes 87% of the time. Reading the original 18 GB of data takes
just 6% of the time. The read rate during the scan is about 18 GB / 154 s = 116
MB/s, which is pretty good. So, clearly the delay is in the sort itself
somewhere.
For a baseline, ran a query that simply reads and discards data:
{code}
select * FROM (select * from dfs.data.`250wide.tbl`)d where d.columns[0] =
'ljdfhwuehnoiueyf'
{code}
Ran in just 178 seconds. This confirms that the scan part of the query is fine.
Read rate starts at 200 MB/s, then drops to 100 MB/s. CPU is at 97% suggesting
that the query, even for a simple read, is CPU bound rather than disk bound.
Scan time is 139 seconds or about 130 MB/s read rate.
> Sorting 19GB data with 14GB memory in a single fragment takes ~150 minutes
> --------------------------------------------------------------------------
>
> Key: DRILL-5528
> URL: https://issues.apache.org/jira/browse/DRILL-5528
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.10.0
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Attachments: 26e2183f-2c5d-4044-0e23-4aeb45c40be5.sys.drill,
> drillbit.log, drillbit.out, drill-env.sh
>
>
> Configuration :
> {code}
> git.commit.id.abbrev=1e0a14c
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> Based on the runtime of the below query, I suspect there is a performance
> bottleneck somewhere
> {code}
> [root@qa-node190 external-sort]# /opt/drill/bin/sqlline -u
> jdbc:drill:zk=10.10.100.190:5181
> apache drill 1.11.0-SNAPSHOT
> "start your sql engine"
> 0: jdbc:drill:zk=10.10.100.190:5181> ALTER SESSION SET
> `exec.sort.disable_managed` = false;
> +-------+-------------------------------------+
> | ok | summary |
> +-------+-------------------------------------+
> | true | exec.sort.disable_managed updated. |
> +-------+-------------------------------------+
> 1 row selected (0.975 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.width.max_per_node` = 1;
> +-------+--------------------------------------+
> | ok | summary |
> +-------+--------------------------------------+
> | true | planner.width.max_per_node updated. |
> +-------+--------------------------------------+
> 1 row selected (0.371 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.disable_exchanges` = true;
> +-------+-------------------------------------+
> | ok | summary |
> +-------+-------------------------------------+
> | true | planner.disable_exchanges updated. |
> +-------+-------------------------------------+
> 1 row selected (0.292 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set
> `planner.memory.max_query_memory_per_node` = 14106127360;
> +-------+----------------------------------------------------+
> | ok | summary |
> +-------+----------------------------------------------------+
> | true | planner.memory.max_query_memory_per_node updated. |
> +-------+----------------------------------------------------+
> 1 row selected (0.316 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from (select * from
> dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d
> where d.columns[0] = 'ljdfhwuehnoiueyf';
> +---------+
> | EXPR$0 |
> +---------+
> | 0 |
> +---------+
> 1 row selected (8530.719 seconds)
> {code}
> I attached the logs and profile files. The data is too large to attach to a
> jira. Reach out to me if you need any more information
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)