[ https://issues.apache.org/jira/browse/TAJO-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070824#comment-15070824 ]
Hudson commented on TAJO-1950:
------------------------------
SUCCESS: Integrated in Tajo-master-build #1038 (See
[https://builds.apache.org/job/Tajo-master-build/1038/])
TAJO-1950: Query master uses too much memory during range shuffle. (jihoonson:
rev 1f9ae1da0424731567cea18e975c47d4479b0ae9)
* tajo-core-tests/src/test/java/org/apache/tajo/ha/TestHAServiceHDFSImpl.java
* tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
* tajo-core-tests/src/test/java/org/apache/tajo/master/TestRepartitioner.java
* tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/index/bst/BSTIndex.java
* tajo-core-tests/src/test/java/org/apache/tajo/worker/TestFetcher.java
* tajo-core/src/main/java/org/apache/tajo/engine/query/TaskRequestImpl.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/DefaultTaskScheduler.java
* tajo-core/src/main/proto/ResourceProtos.proto
* tajo-core/src/main/java/org/apache/tajo/worker/TaskImpl.java
* tajo-core/src/main/java/org/apache/tajo/worker/FetchImpl.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/Stage.java
* tajo-pullserver/src/main/java/org/apache/tajo/pullserver/HttpDataServerHandler.java
* tajo-core/src/main/java/org/apache/tajo/worker/Fetcher.java
* CHANGES
* tajo-pullserver/src/main/java/org/apache/tajo/pullserver/TajoPullServerService.java
* tajo-core/src/main/java/org/apache/tajo/engine/utils/TupleUtil.java
* tajo-core/src/main/resources/webapps/worker/task.jsp
* tajo-core/src/main/java/org/apache/tajo/engine/query/TaskRequest.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/Repartitioner.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/Task.java
* tajo-core/src/main/java/org/apache/tajo/querymaster/FetchScheduleEvent.java
* tajo-core/src/main/java/org/apache/tajo/worker/ExecutionBlockContext.java
> Query master uses too much memory during range shuffle
> ------------------------------------------------------
>
> Key: TAJO-1950
> URL: https://issues.apache.org/jira/browse/TAJO-1950
> Project: Tajo
> Issue Type: Improvement
> Components: Data Shuffle, Pull Server, QueryMaster
> Reporter: Jihoon Son
> Assignee: Jihoon Son
> Priority: Critical
> Fix For: 0.12.0, 0.11.1
>
> Attachments: TAJO-1950proposal.pdf
>
>
> I ran a simple sort query on an 8TB table as follows.
> {noformat}
> tpch10tb> select * from lineitem order by l_orderkey;
> {noformat}
> After the first stage completes, the query master divides the range of the
> sort key (l_orderkey) into multiple partitions for range shuffle. This
> partitioning step took about 9 minutes.
> Here is the log.
> {noformat}
> ...
> 2015-10-26 14:23:10,782 INFO
> org.apache.tajo.engine.planner.global.ParallelExecutionQueue: Next executable
> block eb_1445835438802_0004_000002
> 2015-10-26 14:23:10,782 INFO org.apache.tajo.querymaster.Query: Scheduling
> Stage:eb_1445835438802_0004_000002
> 2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage:
> org.apache.tajo.querymaster.DefaultTaskScheduler is chosen for the task
> scheduling for eb_1445835438802_0004_000002
> 2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage:
> eb_1445835438802_0004_000002, Table's volume is approximately 663647 MB
> 2015-10-26 14:23:10,796 INFO org.apache.tajo.querymaster.Stage:
> eb_1445835438802_0004_000002, The determined number of non-leaf tasks is 10370
> 2015-10-26 14:23:10,816 INFO org.apache.tajo.querymaster.Repartitioner:
> eb_1445835438802_0004_000002, Try to divide [(6000000000), (1)) into 10370
> sub ranges (total units: 10370)
> 2015-10-26 14:24:58,996 INFO org.apache.tajo.util.JvmPauseMonitor: Detected
> pause in JVM or host machine (eg GC): pause of approximately 2440ms
> GC pool 'PS MarkSweep' had collection(s): count=1 time=2214ms
> GC pool 'PS Scavenge' had collection(s): count=1 time=622ms
> 2015-10-26 14:27:24,040 WARN org.apache.tajo.util.JvmPauseMonitor: Detected
> pause in JVM or host machine (eg GC): pause of approximately 13237ms
> GC pool 'PS MarkSweep' had collection(s): count=1 time=12635ms
> GC pool 'PS Scavenge' had collection(s): count=1 time=674ms
> 2015-10-26 14:28:51,914 WARN org.apache.tajo.util.JvmPauseMonitor: Detected
> pause in JVM or host machine (eg GC): pause of approximately 20873ms
> GC pool 'PS MarkSweep' had collection(s): count=1 time=20486ms
> GC pool 'PS Scavenge' had collection(s): count=1 time=644ms
> 2015-10-26 14:30:52,392 WARN org.apache.tajo.util.JvmPauseMonitor: Detected
> pause in JVM or host machine (eg GC): pause of approximately 30986ms
> GC pool 'PS MarkSweep' had collection(s): count=1 time=30546ms
> GC pool 'PS Scavenge' had collection(s): count=1 time=696ms
> 2015-10-26 14:32:07,550 WARN org.apache.tajo.util.JvmPauseMonitor: Detected
> pause in JVM or host machine (eg GC): pause of approximately 15449ms
> GC pool 'PS MarkSweep' had collection(s): count=1 time=14593ms
> GC pool 'PS Scavenge' had collection(s): count=1 time=1148ms
> 2015-10-26 14:32:15,807 INFO org.apache.tajo.querymaster.Stage: 10370 objects
> are scheduled
> ...
> {noformat}
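
For readers unfamiliar with range shuffle, the step described above (dividing the sort-key range into one sub range per non-leaf task) can be sketched roughly as follows. This is only an illustration and is not Tajo's Repartitioner code: the class RangeSplitSketch, the Range type, and the split method are hypothetical names, and the sketch assumes a single long sort key with bounds taken in ascending order (1 to 6000000000, with 10370 sub ranges as in the log).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: evenly divide a numeric sort-key range into sub ranges.
// Not Tajo code; names and the even-split strategy are assumptions.
public class RangeSplitSketch {

    /** Half-open range [start, end) of sort-key values. */
    public static final class Range {
        public final long start;
        public final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Splits [start, end) into exactly `parts` contiguous sub ranges whose
     * widths differ by at most one key value.
     */
    public static List<Range> split(long start, long end, int parts) {
        if (parts <= 0 || end <= start) {
            throw new IllegalArgumentException("need parts > 0 and start < end");
        }
        long width = end - start;
        long base = width / parts;       // minimum width of each sub range
        long remainder = width % parts;  // the first `remainder` ranges get one extra key

        List<Range> ranges = new ArrayList<>(parts);
        long cursor = start;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new Range(cursor, cursor + size));
            cursor += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Bounds and task count taken from the log above, in ascending order.
        List<Range> ranges = split(1L, 6_000_000_000L, 10370);
        Range first = ranges.get(0);
        System.out.println(ranges.size() + " sub ranges, first = ["
                + first.start + ", " + first.end + ")");
    }
}
```

The sketch only shows what the partitioning computes. It does not reproduce the memory behavior reported in this issue, and it says nothing about how the committed change reduces it.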