[
https://issues.apache.org/jira/browse/DRILL-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654543#comment-15654543
]
Serhii Harnyk commented on DRILL-5032:
--------------------------------------
As [~jni] mentioned, today the physical plan looks like:
listOfColumns : [col1, col2, ...] -- TableLevel
Partitions : [
partition1 :
{ listOfColumns : [col1, col2, ...] -- PartitionLevel .... }
,
partition2 :
{ listOfColumns : [col1, col2, ...] -- PartitionLevel .... }
,
...
partition_n :
{ listOfColumns : [col1, col2, ...] -- PartitionLevel .... }
,
]
The listOfColumns is repeated in every partition, which seems to be
unnecessary. We should get rid of the repeated lists of columns in each
partition, as long as they are the same as the listOfColumns at table level.
So the initial idea is to remove the repeated listOfColumns from the
HivePartition physical plan serialization.
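A minimal sketch of that idea, not the actual Drill change: the class
PartitionColumnDedup and its methods columnsToSerialize and resolve are
hypothetical names chosen for illustration, and column lists are modelled as
plain lists of strings rather than Drill's or Hive's real types. A partition
whose column list equals the table-level list is written as null, and the
reader falls back to the table-level list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Drill classes.
class PartitionColumnDedup {

    // For each partition, decide what to write into the serialized plan:
    // null when the partition's columns equal the table-level columns,
    // otherwise the partition's own list.
    public static Map<String, List<String>> columnsToSerialize(
            List<String> tableColumns,
            Map<String, List<String>> partitionColumns) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : partitionColumns.entrySet()) {
            boolean sameAsTable = tableColumns.equals(e.getValue());
            out.put(e.getKey(), sameAsTable ? null : e.getValue());
        }
        return out;
    }

    // When reading the plan back, a missing (null) partition-level list
    // means "use the table-level list".
    public static List<String> resolve(List<String> tableColumns,
                                       List<String> serialized) {
        return serialized == null ? tableColumns : serialized;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("col1", "col2");

        Map<String, List<String>> partitions = new LinkedHashMap<>();
        partitions.put("partition1", Arrays.asList("col1", "col2"));
        partitions.put("partition2", Arrays.asList("col1", "col2", "col3"));

        Map<String, List<String>> serialized =
                columnsToSerialize(table, partitions);

        // partition1 carries no column list of its own; partition2 keeps its list.
        System.out.println(serialized);
        System.out.println(resolve(table, serialized.get("partition1")));
    }
}
```

With 750+ partitions that all share the table schema, only the table-level
list would be written under this scheme, so the serialized plan no longer
grows with (number of partitions) x (number of columns).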
> Drill query on hive parquet table failed with OutOfMemoryError: Java heap
> space
> -------------------------------------------------------------------------------
>
> Key: DRILL-5032
> URL: https://issues.apache.org/jira/browse/DRILL-5032
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Hive
> Affects Versions: 1.8.0
> Reporter: Serhii Harnyk
> Assignee: Serhii Harnyk
>
> The following query on a Hive parquet table failed with OOM (Java heap space):
> {code}
> select distinct(businessdate) from vmdr_trades where trade_date='2016-04-12'
> 2016-08-31 08:02:03,597 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.drill.exec.work.foreman.Foreman - Query text for query id
> 283938c3-fde8-0fc6-37e1-9a568c7f5913: select distinct(businessdate) from
> vmdr_trades where trade_date='2016-04-12'
> 2016-08-31 08:05:58,502 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning
> class:
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$2
> 2016-08-31 08:05:58,506 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze
> filter tree: 1 ms
> 2016-08-31 08:05:58,506 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for
> partition pruning.Total pruning elapsed time: 3 ms
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning
> class:
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$2
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze
> filter tree: 0 ms
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for
> partition pruning.Total pruning elapsed time: 0 ms
> 2016-08-31 08:05:58,664 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning
> class:
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$1
> 2016-08-31 08:05:58,665 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze
> filter tree: 0 ms
> 2016-08-31 08:05:58,665 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for
> partition pruning.Total pruning elapsed time: 0 ms
> 2016-08-31 08:09:42,355 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] ERROR
> o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred,
> exiting. Information message: Unable to handle out of memory condition in
> Foreman.
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:3332) ~[na:1.8.0_74]
> at
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
> ~[na:1.8.0_74]
> at
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
> ~[na:1.8.0_74]
> at
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:76)
> ~[na:1.8.0_74]
> at
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:457)
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:166)
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:76)
> ~[na:1.8.0_74]
> at
> com.google.protobuf.TextFormat$TextGenerator.write(TextFormat.java:538)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.TextFormat$TextGenerator.print(TextFormat.java:526)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:389)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
> ~[protobuf-java-2.5.0.jar:na]
> at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.TextFormat$Printer.access$400(TextFormat.java:248)
> ~[protobuf-java-2.5.0.jar:na]
> at com.google.protobuf.TextFormat.print(TextFormat.java:71)
> ~[protobuf-java-2.5.0.jar:na]
> at com.google.protobuf.TextFormat.printToString(TextFormat.java:118)
> ~[protobuf-java-2.5.0.jar:na]
> at
> com.google.protobuf.AbstractMessage.toString(AbstractMessage.java:106)
> ~[protobuf-java-2.5.0.jar:na]
> at
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:395)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at
> org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:516)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:403)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:929)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:251)
> ~[drill-java-exec-1.6.0.jar:1.6.0]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_74]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_74]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
> 2016-08-31 08:09:43,358 [Drillbit-ShutdownHook#0] INFO
> o.apache.drill.exec.server.Drillbit - Received shutdown request.
> 2016-08-31 08:09:49,385 [BitServer-3] INFO
> o.a.d.exec.rpc.control.ControlClient - Channel closed /162.111.92.29:33973
> <--> /162.111.92.29:31011.
> 2016-08-31 08:09:50,388 [pool-6-thread-2] INFO
> o.a.drill.exec.rpc.data.DataServer - closed eventLoopGroup
> io.netty.channel.epoll.EpollEventLoopGroup@268b55a9 in 1007 ms
> 2016-08-31 08:09:50,389 [pool-6-thread-2] INFO
> o.a.drill.exec.service.ServiceEngine - closed dataPool in 1008 ms
> {code}
> The Drill cluster has 16 nodes; Drill direct memory is currently 16GB
> and heap memory is 20GB. The table vmdr_trades is ~19GB and has 750+
> partitions, each with roughly one parquet file.