[ https://issues.apache.org/jira/browse/KYLIN-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

fengYu updated KYLIN-2929:
--------------------------
    Attachment: 0002-KYLIN-2929-speed-up-dump-performance-write-dump-file.patch

Here is my patch and the test results.
I ran the same SQL query three times and recorded the coprocessor processing time.

Before:

2017-10-13 16:53:34,986 INFO  [kylin-coproc--pool5-t70] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
5f23c28e-fbfa-493d-b42c-202dddec03c2 GTScanRequest 2cf15a85>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 19082(ms). 
Server CPU usage: 0.0, server physical mem left: 1.381179392E9, server swap mem 
left:2.075181056E9.Etc message: start latency: 34@57,agg done@18265,compress 
done@19081,server stats done@19081, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

2017-10-13 16:55:30,633 INFO  [kylin-coproc--pool5-t72] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
c1c17a13-fa25-4664-a381-e802ee403afb GTScanRequest 3ed35154>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 17371(ms). 
Server CPU usage: 0.08703703703703704, server physical mem left: 1.340674048E9, 
server swap mem left:2.075181056E9.Etc message: start latency: 12@3,agg 
done@16586,compress done@17371,server stats done@17371, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

2017-10-13 16:56:33,382 INFO  [kylin-coproc--pool5-t74] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
5596d6f0-cf22-4d57-9f24-a576f0dc01af GTScanRequest 468942c3>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 17184(ms). 
Server CPU usage: 0.0624334964886146, server physical mem left: 1.320890368E9, 
server swap mem left:2.075181056E9.Etc message: start latency: 12@1,agg 
done@16397,compress done@17184,server stats done@17184, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

After:

2017-10-13 17:01:05,660 INFO  [kylin-coproc--pool5-t76] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
c418792a-47e4-4de4-9525-8a7f5c1c4b37 GTScanRequest 5dfcfb5f>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 12253(ms). 
Server CPU usage: 0.0900900900900901, server physical mem left: 1.328091136E9, 
server swap mem left:2.075181056E9.Etc message: start latency: 33@58,agg 
done@11463,compress done@12253,server stats done@12253, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

2017-10-13 17:02:05,746 INFO  [kylin-coproc--pool5-t78] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
d8c5418d-bad7-4090-845f-f5b0488b8b62 GTScanRequest 26450566>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 11394(ms). 
Server CPU usage: 0.09580838323353294, server physical mem left: 1.10680064E9, 
server swap mem left:2.075181056E9.Etc message: start latency: 12@3,agg 
done@10605,compress done@11394,server stats done@11394, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

2017-10-13 17:03:10,659 INFO  [kylin-coproc--pool5-t80] 
v2.CubeHBaseEndpointRPC:200 : <sub-thread for Query 
c80fe74f-6ac9-4213-827c-35e60b97d867 GTScanRequest 7153f8f0>Endpoint RPC 
returned from HTable V200_NEW_KYLIN_KIJXSDW18F Shard 
\x56\x32\x30\x30\x5F\x4E\x45\x57\x5F\x4B\x59\x4C\x49\x4E\x5F\x4B\x49\x4A\x58\x53\x44\x57\x31\x38\x46\x2C\x00\x01\x2C\x31\x35\x30\x37\x37\x30\x33\x35\x32\x33\x36\x39\x35\x2E\x65\x39\x37\x63\x64\x38\x34\x32\x62\x33\x61\x63\x37\x63\x66\x30\x32\x38\x31\x64\x36\x32\x66\x38\x31\x63\x62\x36\x61\x38\x64\x39\x2E
 on host: db-53.photo.163.org.Total scanned row: 634776. Total scanned bytes: 
134956120. Total filtered/aggred row: 527872. Time elapsed in EP: 11066(ms). 
Server CPU usage: 0.053101156385187476, server physical mem left: 1.33179392E9, 
server swap mem left:2.075181056E9.Etc message: start latency: 13@1,agg 
done@10281,compress done@11065,server stats done@11066, 
debugGitTag:a08e52e24c99f312eaa63bd3f9ef4cdc53fa2a67;.Normal Complete: 
true.Compressed row size: 13954413

The first run is slower because HBase needs to load the coprocessor jar. 
Comparing the third runs, the patch cuts the time elapsed in the endpoint by 
about 35% (17184 ms to 11066 ms).
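
For anyone who wants to check the figure against the log lines above, here is a minimal Java sketch. The class and method names (EpTime, parseEpMillis, reductionPercent) are made up for illustration and are not part of Kylin. It only assumes the "Time elapsed in EP: N(ms)" text appears on a single line, as it does in the logs quoted here.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Kylin code: extracts the endpoint time from a
// coprocessor log line and computes the relative reduction between two runs.
public class EpTime {
    private static final Pattern EP =
            Pattern.compile("Time elapsed in EP: (\\d+)\\(ms\\)");

    // Returns the endpoint time in milliseconds found in the given log line.
    public static long parseEpMillis(String logLine) {
        Matcher m = EP.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no EP time in line: " + logLine);
        }
        return Long.parseLong(m.group(1));
    }

    // Percentage by which afterMs is smaller than beforeMs.
    public static double reductionPercent(long beforeMs, long afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    public static void main(String[] args) {
        // Third run before and third run after the patch, taken from the logs above.
        long before = parseEpMillis("Total filtered/aggred row: 527872. Time elapsed in EP: 17184(ms).");
        long after = parseEpMillis("Total filtered/aggred row: 527872. Time elapsed in EP: 11066(ms).");
        // Prints 35.6%
        System.out.println(String.format(Locale.ROOT, "%.1f%%", reductionPercent(before, after)));
    }
}
```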

> speed up Dump file performance
> ------------------------------
>
>                 Key: KYLIN-2929
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2929
>             Project: Kylin
>          Issue Type: Bug
>          Components: Query Engine
>    Affects Versions: v2.0.0
>            Reporter: fengYu
>            Assignee: fengYu
>              Labels: Performance
>         Attachments: 
> 0002-KYLIN-2929-speed-up-dump-performance-write-dump-file.patch
>
>
> While working on KYLIN-2926, I found that the coprocessor dumps to disk once 
> estimatedMemSize exceeds spillThreshold, and that the spilled data is far 
> smaller than estimatedMemSize: in my case the dump file is about 8 MB while 
> spillThreshold is set to 3 GB.
> So I tried keeping the spill data in memory instead of writing each file to 
> disk immediately; once the in-memory spill data reaches the threshold, all 
> spill files are written together.
> In my case, the coprocessor processing time dropped from 22 s to 16 s, an 
> improvement of nearly 30%.
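
The idea in the description can be sketched roughly as below. This is not the code in the attached patch, only an illustration under assumptions: the class BufferedSpill, its flushThresholdBytes parameter, and the use of temp files are all hypothetical. Each dump is kept in memory as a byte array, and the pending dumps are written to disk in one batch only when their total size reaches the threshold.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Kylin code: buffers small spill dumps in memory
// and writes them to disk together once their total size reaches a threshold.
public class BufferedSpill {
    private final long flushThresholdBytes;                 // assumed tuning knob
    private final List<byte[]> pending = new ArrayList<>(); // dumps not yet on disk
    private final List<File> files = new ArrayList<>();     // dumps already written
    private long pendingBytes = 0;

    public BufferedSpill(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    // Keep the dump in memory; touch the disk only when the buffer is full.
    public void spill(byte[] dump) {
        pending.add(dump);
        pendingBytes += dump.length;
        if (pendingBytes >= flushThresholdBytes) {
            flush();
        }
    }

    // Write every pending dump to its own temp file in one batch.
    public void flush() {
        try {
            for (byte[] dump : pending) {
                File f = File.createTempFile("spill-", ".dump");
                f.deleteOnExit();
                try (OutputStream out = new FileOutputStream(f)) {
                    out.write(dump);
                }
                files.add(f);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.clear();
        pendingBytes = 0;
    }

    public int pendingCount() { return pending.size(); }

    public int fileCount() { return files.size(); }
}
```

With a small dump (about 8 MB in the case above) and a much larger threshold, most calls to spill() never touch the disk, which is presumably where the saving comes from. The cost is the extra memory held by the pending dumps.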



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
