[
https://issues.apache.org/jira/browse/DRILL-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073249#comment-17073249
]
Paul Rogers edited comment on DRILL-7675 at 4/1/20, 11:59 PM:
--------------------------------------------------------------
When run on an out-of-the-box Drill deployment (default memory and parallelism),
we get this error:
{noformat}
RESOURCE ERROR:
Not enough memory for internal partitioning and fallback mechanism for HashJoin
to use unbounded memory is disabled.
Either enable fallback config drill.exec.hashjoin.fallback.enabled using Alter
session/system command or increase memory limit for Drillbit
{noformat}
This seems an odd message for 2MB of data.
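The error text itself points to one escape hatch: enabling the HashJoin fallback so the operator can use unbounded memory. A sketch of that (option name taken verbatim from the error above); note this disables the HashJoin memory limit, so it masks rather than fixes the sizing problem:
{code:sql}
ALTER SYSTEM SET `drill.exec.hashjoin.fallback.enabled` = true
{code}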
A workaround is to alter parallelism:
{code:sql}
ALTER SYSTEM SET `planner.width.max_per_node`=1
{code}
Note: {{ALTER SYSTEM}} because I'm using the Web UI, which does not support
session options.
The result shows that the rows are both quite wide and contain complex types:
{noformat}
1585209600001964
164
1585209600001964
{"ready":true}
{"array":[{"assetId":"1909","supply":{"levels":{"array":[{"index":1,"price":252.7,"quantity":103},{"index":2,"price":254.5,"quantity":7},{"index":3,"price":254.7,"quantity":100},{"index":4,"price":256.1,"quantity":555},{"index":5,"price":265.0,"quantity":107}]},"cleared":false,"timestamp":1585209595003064},"demand":{"levels":{"array":[{"index":1,"price":260.0,"quantity":211},{"index":2,"price":254.7,"quantity":1},{"index":3,"price":252.7,"quantity":20},{"index":4,"price":252.5,"quantity":555},{"index":5,"price":252.0,"quantity":100}]},"cleared":false,"timestamp":1585209595003230},"managed":{"orders":{"array":[]},"stage":{"id":"cddc1aac-b9cd-483c-8549-9d53d0b18a5f","types":{"array":[]}},"cleared":false,"timestamp":1585209225712652},"external":{"orders":{"array":[]},"stage":{"id":"73e939ca-e167-4e98-99e6-36b628085986","types":{"array":[{"association":{"type":"Asset","assetId":"1909"},"type":"ASSET_ORDERS"},{"association":{"type":"Heuristic","assetId":"1909","heuristicId":"1076"},"type":"HEURISTIC_ORDERS"}]}},"cleared":false,"timestamp":1585209546760487},"assetHoldings":{"quantity":0,"cleared":false,"timestamp":1585209225711681},"last":{},"status":{"phase":"PRE_OPENING_AUCTION","state":"TRADED","cleared":false,"timestamp":1585209546765981},"auction":{},"base":{"basePrice":259.9,"maxVisibleLevels":5,"priceTicks":{"array":[{"from":0.01,"to":10.0,"size":0.001},{"from":10.0,"to":100.0,"size":0.01},{"from":100.0,"to":2500.0,"size":0.1},{"from":2500.0,"size":1.0}]},"instrumentType":"SECURITY","maxQuantity":670000.0,"minQuantity":7.0,"preOpeningCeilingPrice":350.8,"preOpeningFloorPrice":169.0,"cleared":false,"timestamp":1585209225711206},"fairValueSupply":{},"fairValueDemand":{"price":253.030096,"cleared":false,"timestamp":1585209600000642},"assetViolations":{"violations":{"array":[]},"stage":{"id":"572b7ecf-e7e1-4298-80b4-c1e97570fe51","types":{"array":[]}},"cleared":false,"timestamp":1585209226969761},"platformViolations":{"violations":{"array":[]},"stage":{"id":"cc9346dc-c089-4b24-92c4-cd0113414740","types":{"array":[]}},"cleared":false,"timestamp":1585209228726569},"systemState":{"value":"ACTIVE","cleared":false,"timestamp":1585209233708361}}]}
null
null
null
{"array":[]}
{}
{"types":{"array":[]}}
{noformat}
Wide rows are likely to hit memory issues in any operator that
over-aggressively allocates memory to each column. (Operators don't know in
advance how many rows they will process.)
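If per-batch column allocation is indeed the culprit, one knob worth trying (an assumption on my part; I have not yet tested it against this dataset) is the operator output batch size, which caps the memory an operator allocates per outgoing batch:
{code:sql}
-- Value is in bytes; the default is 16777216 (16 MB)
ALTER SYSTEM SET `drill.exec.memory.operator.output_batch_size` = 4194304
{code}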
The query profile shows that, when running with parallelization of 1, one scan
reads 11K rows and uses 13 MB of memory, while the other reads 128K rows and
also uses 13 MB. The partition sender uses 80 MB and the join 300 MB. The other
operators use minimal memory. Interestingly, query time is 9 sec. in this mode
vs. 38 sec. in the earlier test.
It is not clear, however, if the profile shows all memory; perhaps there is
additional unaccounted memory in the Parquet column readers.
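One way to cross-check (assuming access to the running Drillbit; column names are as I recall them from recent Drill versions) is to poll the allocator totals, which reflect memory that may not appear in the per-operator profile:
{code:sql}
SELECT hostname, heap_current, direct_current, jvm_direct_current
FROM sys.memory
{code}
Comparing {{direct_current}} against the profile's operator totals while the query runs would show whether the Parquet readers hold unaccounted memory.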
> Very slow performance and Memory exhaustion while querying on very small
> dataset of parquet files
> -------------------------------------------------------------------------------------------------
>
> Key: DRILL-7675
> URL: https://issues.apache.org/jira/browse/DRILL-7675
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization, Storage - Parquet
> Affects Versions: 1.18.0
> Environment: [^sample-dataset.zip]
> Reporter: Idan Sheinberg
> Assignee: Paul Rogers
> Priority: Critical
> Attachments: sample-dataset.zip
>
>
> Per our discussion in Slack/dev-list, here are all the details and a sample
> dataset to recreate the problematic query behavior:
> * We are using Drill 1.18.0-SNAPSHOT built on March 6
> * We are joining on two small Parquet datasets residing on S3 using the
> following query:
> {code:sql}
> SELECT
> CASE
> WHEN tbl1.`timestamp` IS NULL THEN tbl2.`timestamp`
> ELSE tbl1.`timestamp`
> END AS ts, *
> FROM `s3-store.state`.`/164` AS tbl1
> FULL OUTER JOIN `s3-store.result`.`/164` AS tbl2
> ON tbl1.`timestamp`*10 = tbl2.`timestamp`
> ORDER BY ts ASC
> LIMIT 500 OFFSET 0 ROWS
> {code}
> * We are running drill in a single node setup on a 16 core, 64GB ram
> machine. Drill heap size is set to 16GB, while max direct memory is set to
> 32GB.
> * As the dataset consists of really small files, Drill has been configured to
> parallelize on small item counts by tweaking the following variables:
> {code:java}
> planner.slice_target = 25
> planner.width.max_per_node = 16 (to match the core count){code}
> * Without the above parallelization, queries on Parquet files are super
> slow (tens of seconds).
> * While queries do work, we are seeing non-proportional direct-memory/heap
> utilization (up to 20 GB of direct memory used, a minimum of 12 GB of heap
> required).
> * We're still encountering the occasional out-of-memory error (we're also
> seeing heap exhaustion, but I guess that's another indication of the same
> problem). Reducing the node parallelization width to, say, 8, reduces memory
> contention, though it still reaches 8 GB of direct memory:
> {code:java}
> User Error Occurred: One or more nodes ran out of memory while executing the
> query. (null)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or
> more nodes ran out of memory while executing the query.null[Error Id:
> 67b61fc9-320f-47a1-8718-813843a10ecc ]
> at
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:338)
> at
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: null
> at
> org.apache.drill.exec.vector.complex.AbstractContainerVector.allocateNew(AbstractContainerVector.java:59)
> at
> org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.allocateOutgoingRecordBatch(PartitionerTemplate.java:380)
> at
> org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:400)
> at
> org.apache.drill.exec.test.generated.PartitionerGen5.setup(PartitionerTemplate.java:126)
> at
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createClassInstances(PartitionSenderRootExec.java:263)
> at
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createPartitioner(PartitionSenderRootExec.java:218)
> at
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:188)
> at
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
> ... 4 common frames omitted{code}
> I've attached a (real!) sample dataset to match the query above. That same
> dataset recreates the aforementioned memory behavior.
> Help, please.
> Idan
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)