Hi Rui Zha,

From the EXPLAIN ANALYZE output, I suspect you have missing statistics for this table. Can you run ANALYZE zhongxing_2015_obj_12months and see if the performance improves?
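The suggested maintenance statement, issued from psql against the database holding the table, would look like this (a minimal sketch; the per-partition form uses the child table name quoted later in this thread):

```sql
-- Refresh optimizer statistics for the partitioned table;
-- on a wide (175-column) Parquet table this may take a while.
ANALYZE zhongxing_2015_obj_12months;

-- Statistics can also be collected for a single child partition:
ANALYZE zhongxing_2015_obj_12months_1_prt_mar;
```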
Best regards,
Aitor

On Tue, 19 Jan 2016 at 09:25 Ruilong Huo <[email protected]> wrote:

> Thanks.
>
> I tried the same query with your data but still failed to reproduce it. It
> took about 0.4 second to complete in my environment.
>
> Please create a ticket at https://issues.apache.org/jira/browse/HAWQ. It
> would be great if you can attach all data in the table in that ticket so
> that further reproduction and analysis can be done.
>
> Best regards,
> Ruilong Huo
>
> On Tue, Jan 19, 2016 at 5:05 PM, 查 锐 <[email protected]> wrote:
>
>> Hi Ruilong,
>>
>> The content of table zhongxing_2015_obj_12months is here
>>
>> (psql header: regionid | enodebid | cellid | time | v1 ... v175 | day;
>> the single sample row, reflowed by column range:)
>>
>> regionid 999 | enodebid 103019 | cellid 2 | time 1501031720
>> v1-v12:    3528539, 6306154, 2434845, 9136141, 4715281, 4166421, 2233478, 6170791, 1388084, 4372131, 9389100, 7366252
>> v13-v24:   3357084, 7406413, 1513883, 9308701, 1457033, 4298094, 9519949, 4413685, 3328848, 2421259, 4732545, 1145746
>> v25-v36:   1435813, 6076642, 2178385, 5317895, 9843957, 3265059, 4195098, 511515, 6772638, 5822423, 964529, 9478699
>> v37-v48:   758874, 4667358, 8343212, 2862168, 2756141, 6430593, 2670890, 6427843, 234592, 304208, 3881361, 3783518
>> v49-v60:   4886727, 3939864, 2170262, 155186, 1043626, 5890094, 5468745, 7126131, 2495869, 2534501, 1316005, 9849963
>> v61-v72:   7481091, 5506001, 3497212, 6354145, 8853875, 26143, 3434271, 9688694, 8137829, 9960882, 8563434, 1122292
>> v73-v84:   8429005, 8823352, 8349624, 2697861, 4860927, 7855095, 6731805, 55222, 3890166, 438327, 4163808, 7955744
>> v85-v96:   8650484, 6932470, 4922705, 5504441, 9796450, 1125121, 2595695, 4642362, 8542125, 1464853, 5203847, 5137968
>> v97-v108:  7246920, 752884, 319888, 3427429, 3773933, 6498516, 4023497, 3007079, 7893264, 123849, 9834005, 1272966
>> v109-v120: 2910334, 9769998, 1541058, 9682446, 5588857, 760941, 646078, 2988183, 366871, 9665679, 4001586, 4015608
>> v121-v132: 7885914, 9252858, 1873461, 3292256, 4699529, 1660829, 590981, 5228858, 8656938, 7423225, 1598162, 183868
>> v133-v144: 5179803, 890298, 7177048, 2106501, 8607052, 8649816, 4177679, 8215160, 7917762, 892917, 1571426, 4343883
>> v145-v156: 7120589, 8148255, 6027362, 2171403, 4300038, 5466707, 2583910, 5319403, 433249, 5816086, 7471174, 4446179
>> v157-v168: 2392407, 4053061, 3627571, 148499, 348041, 189700, 4803881, 1037155, 6246551, 9487389, 7140770, 825111
>> v169-v175: 4393547, 9104373, 3933856, 1103255, 7701196, 266738, 7412872
>> day: 2015-01-03
>>
>> On 19 Jan 2016, at 16:58, Ruilong Huo <[email protected]> wrote:
>>
>> You are correct.
>> From the explain analyze result, we can see that 11 out of 12 partitions
>> are eliminated while querying against the parent table:
>>
>>   Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months
>>
>> 1. Though there are some trivial drawbacks regarding data locality for a
>> partitioned table (i.e., it may introduce some remote reads while accessing
>> data on HDFS), that is not the major reason.
>>
>> 2. The major factor in the performance downgrade is that it is much slower
>> to fetch the first tuple, as well as to finish the scan on the target
>> partition:
>>
>> 1) For the query on the target partition of the parent table, it took about
>> 0.2 second to get the first tuple, and about 4 ~ 38 seconds to finish the scan.
>>   Max/Last(seg44:ws03.mzhen.cn/seg174:ws06.mzhen.cn) 258465/226292 rows with 245/240 ms to first row, 4100/37965 ms to end
>>
>> 2) For the query on the parent table directly, it took about 9 ~ 26 seconds
>> to get the first tuple, and about 56 ~ 117 seconds to finish the scan.
>>   Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 55956/117370 ms to end
>>
>> I tried to reproduce this issue with a simpler case in my local environment
>> but failed. Would you please dump the data in table
>> zhongxing_2015_obj_12months or zhongxing_2015 and paste it here? Even just
>> part of it would be great for further investigation.
>>
>> Best regards,
>> Ruilong Huo
>>
>> On Tue, Jan 19, 2016 at 3:40 PM, 查 锐 <[email protected]> wrote:
>>
>>> Ruilong,
>>>
>>> Thank you for the reply.
>>>
>>> The scan-related information in the query plan shows that 11 partitions
>>> are eliminated and 1 partition is chosen to be scanned. I just don't know
>>> why the performance of querying the sub-partition table
>>> (zhongxing_2015_obj_12months_1_prt_mar) directly is better than querying
>>> the root partitioned table (zhongxing_2015_obj_12months) with a predicate
>>> that points at that same sub-partition.
>>> I ran the EXPLAIN ANALYZE query on the sub-partition table
>>> (zhongxing_2015_obj_12months_1_prt_mar) and the root partitioned table
>>> (zhongxing_2015_obj_12months) separately; the results are as follows.
>>>
>>> The EXPLAIN ANALYZE results on the sub-partition table:
>>>
>>>  Limit  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>    Rows out:  50 rows with 51344 ms to first row, 51345 ms to end, start offset by 228/228 ms.
>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>          Merge Key: p1
>>>          Rows out:  50 rows at destination with 51344 ms to end, start offset by 228/228 ms.
>>>          ->  Limit  (cost=2159144.19..2159144.31 rows=1 width=324)
>>>                Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg130:ws06.mzhen.cn) 50/50 rows with 41038/42852 ms to end, start offset by 208/252 ms.
>>>                ->  Sort  (cost=2159144.19..2159144.44 rows=1 width=324)
>>>                      Sort Key (Limit): p1
>>>                      Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg130:ws06.mzhen.cn) 50/50 rows with 41038/42852 ms to end, start offset by 208/252 ms.
>>>                      Executor memory:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn).
>>>                      Work_mem used:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn). Workfile: (0 spilling, 0 reused)
>>>                      Work_mem wanted: 3K bytes avg, 3K bytes max (seg61:ws07.mzhen.cn) to lessen workfile I/O affecting 192 workers.
>>>                      ->  HashAggregate  (cost=2159061.12..2159140.87 rows=1 width=324)
>>>                            Group By: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, "?column3?"
>>>                            Rows out:  Avg 78.1 rows x 192 workers.  Max/Last(seg61:ws01.mzhen.cn/seg130:ws06.mzhen.cn) 89/77 rows with 41043/42851 ms to first row, 41044/42851 ms to end, start offset by 263/252 ms.
>>>                            Executor memory:  409K bytes avg, 473K bytes max (seg61:ws01.mzhen.cn).
>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=2159013.37..2159015.62 rows=1 width=1444)
>>>                                  Hash Key: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, unnamed_attr_3
>>>                                  Rows out:  Avg 15000.0 rows x 192 workers at destination.  Max/Last(seg61:ws01.mzhen.cn/seg130:ws06.mzhen.cn) 17088/14784 rows with 5842/5848 ms to first row, 40645/42541 ms to end, start offset by 263/252 ms.
>>>                                  ->  HashAggregate  (cost=2159013.37..2159013.62 rows=1 width=1444)
>>>                                        Group By: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, substr(zhongxing_2015_obj_12months_1_prt_mar."time"::text, 1, 4)
>>>                                        Rows out:  Avg 15000.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg174:ws06.mzhen.cn) 15000/15000 rows with 8169/40815 ms to first row, 8267/40877 ms to end, start offset by 212/301 ms.
>>>                                        Executor memory:  44329K bytes avg, 44329K bytes max (seg0:ws07.mzhen.cn).
>>>                                        ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar  (cost=0.00..2158572.02 rows=6 width=750)
>>>                                              Filter: day >= '2015-03-01'::date AND day <= '2015-03-31'::date AND "time"::text >= '1503010000'::text AND "time"::text <= '1503312345'::text AND regionid = 999
>>>                                              Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws03.mzhen.cn/seg174:ws06.mzhen.cn) 258465/226292 rows with 245/240 ms to first row, 4100/37965 ms to end, start offset by 297/301 ms.
>>>  Slice statistics:
>>>    (slice0)    Executor memory: 6915K bytes.
>>>    (slice1)    Executor memory: 60457K bytes avg x 192 workers, 60457K bytes max (seg0:ws07.mzhen.cn).
>>>    (slice2)  * Executor memory: 3091K bytes avg x 192 workers, 3155K bytes max (seg61:ws01.mzhen.cn).  Work_mem: 81K bytes max, 3K bytes wanted.
>>>  Statement statistics:
>>>    Memory used: 131072K bytes
>>>    Memory wanted: 8598K bytes
>>>  Settings:  default_segment_num=64
>>>  Dispatcher statistics:
>>>    executors used(total/cached/new connection): (384/0/384); dispatcher time(total/connection/dispatch data): (236.797 ms/219.617 ms/15.258 ms).
>>>    dispatch data time(max/min/avg): (4.910 ms/0.017 ms/0.102 ms); consume executor data time(max/min/avg): (5.029 ms/0.004 ms/0.038 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
>>>  Data locality statistics:
>>>    data locality ratio: 0.822; virtual segment number: 192; different host number: 8; virtual segment number per host(avg/min/max): (24/24/24); segment size(avg/min/max): (191623489.297/172566327/211290350); segment size with penalty(avg/min/max): (0.000/0/0); continuity(avg/min/max): (1.000/1.000/1.000).
>>>  Total runtime: 51572.484 ms
>>> (41 rows)
>>>
>>> The EXPLAIN ANALYZE results on the root partitioned table:
>>>
>>>  Limit  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>    Rows out:  50 rows with 133541 ms to first row, 133542 ms to end, start offset by 16/16 ms.
>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>          Merge Key: p1
>>>          Rows out:  50 rows at destination with 133541 ms to end, start offset by 16/16 ms.
>>>          ->  Limit  (cost=2159144.19..2159144.31 rows=1 width=324)
>>>                Rows out:  Avg 50.0 rows x 192 workers.
>>>                  Max/Last(seg0:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 50/50 rows with 116748/123223 ms to end, start offset by 6484/26 ms.
>>>                ->  Sort  (cost=2159144.19..2159144.44 rows=1 width=324)
>>>                      Sort Key (Limit): p1
>>>                      Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 50/50 rows with 116748/123223 ms to end, start offset by 6484/26 ms.
>>>                      Executor memory:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn).
>>>                      Work_mem used:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn). Workfile: (0 spilling, 0 reused)
>>>                      Work_mem wanted: 3K bytes avg, 3K bytes max (seg61:ws07.mzhen.cn) to lessen workfile I/O affecting 192 workers.
>>>                      ->  HashAggregate  (cost=2159061.12..2159140.87 rows=1 width=324)
>>>                            Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, "?column3?"
>>>                            Rows out:  Avg 78.1 rows x 192 workers.  Max/Last(seg61:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 89/81 rows with 123071/123223 ms to first row, 123072/123223 ms to end, start offset by 163/26 ms.
>>>                            Executor memory:  345K bytes avg, 409K bytes max (seg61:ws07.mzhen.cn).
>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=2159013.37..2159015.62 rows=1 width=1444)
>>>                                  Hash Key: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>>>                                  Rows out:  Avg 70154.2 rows x 192 workers at destination.  Max/Last(seg61:ws07.mzhen.cn/seg16:ws04.mzhen.cn) 79987/61081 rows with 1081/1140 ms to first row, 121468/121889 ms to end, start offset by 163/137 ms.
>>>                                  ->  HashAggregate  (cost=2159013.37..2159013.62 rows=1 width=1444)
>>>                                        Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, substr(public.zhongxing_2015_obj_12months."time"::text, 1, 4)
>>>                                        Rows out:  Avg 70154.2 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 75237/72524 rows with 21600/69653 ms to first row, 60439/121820 ms to end, start offset by 36811/67 ms.
>>>                                        Executor memory:  42856K bytes avg, 42856K bytes max (seg0:ws07.mzhen.cn).
>>>                                        ->  Result  (cost=0.00..2158572.02 rows=6 width=750)
>>>                                              Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 56780/118175 ms to end, start offset by 36811/67 ms.
>>>                                              ->  Append  (cost=0.00..2158569.54 rows=6 width=750)
>>>                                                    Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 56007/117417 ms to end, start offset by 36811/67 ms.
>>>                                                    ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months  (cost=0.00..2158569.54 rows=6 width=750)
>>>                                                          Filter: day >= '2015-03-01'::date AND day <= '2015-03-31'::date AND "time"::text >= '1503010000'::text AND "time"::text <= '1503312345'::text AND regionid = 999
>>>                                                          Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 55956/117370 ms to end, start offset by 36811/67 ms.
>>>  Slice statistics:
>>>    (slice0)    Executor memory: 7474K bytes.
>>>    (slice1)    Executor memory: 60604K bytes avg x 192 workers, 60611K bytes max (seg2:ws07.mzhen.cn).
>>>    (slice2)  * Executor memory: 3067K bytes avg x 192 workers, 3130K bytes max (seg61:ws07.mzhen.cn).  Work_mem: 81K bytes max, 3K bytes wanted.
>>>  Statement statistics:
>>>    Memory used: 131072K bytes
>>>    Memory wanted: 8700K bytes
>>>  Settings:  default_segment_num=64
>>>  Dispatcher statistics:
>>>    executors used(total/cached/new connection): (384/384/0); dispatcher time(total/connection/dispatch data): (14.710 ms/0.000 ms/13.435 ms).
>>>    dispatch data time(max/min/avg): (8.256 ms/0.048 ms/0.117 ms); consume executor data time(max/min/avg): (0.042 ms/0.004 ms/0.013 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
>>>  Data locality statistics:
>>>    data locality ratio: 0.462; virtual segment number: 192; different host number: 8; virtual segment number per host(avg/min/max): (24/24/24); segment size(avg/min/max): (2299488129.719/2237860180/2370289866); segment size with penalty(avg/min/max): (0.000/0/0); continuity(avg/min/max): (1.000/1.000/1.000).
>>>  Total runtime: 134648.912 ms
>>> (45 rows)
>>>
>>> From the results, I noticed the data locality ratio when querying the root
>>> partitioned table is 0.462, while the locality ratio when querying the
>>> sub-partition table is 0.822. Is this normal?
>>>
>>> In addition, the Append operation in the root partitioned table's query
>>> plan doesn't appear in the sub-partition table's query plan.
>>>
>>> On 19 Jan 2016, at 14:52, Ruilong Huo <[email protected]> wrote:
>>>
>>> @查锐:
>>>
>>> From the plan, there is no DPE (dynamic partition elimination) involved.
>>> That's why the full table (instead of the target partition) is scanned.
>>>
>>> Would you please attach the full output of the query plan, including
>>> settings and optimizer status, as below?
>>>
>>> test=# explain xxx
>>>                                     QUERY PLAN
>>> ------------------------------------------------------------------------------------
>>> ...
>>>  Settings:  default_segment_num=8
>>>  Optimizer status: PQO version 1.617
>>> (6 rows)
>>>
>>> Best regards,
>>> Ruilong Huo
>>>
>>> On Tue, Jan 19, 2016 at 2:43 PM, 查 锐 <[email protected]> wrote:
>>>
>>>> Even after transforming the string to the date type, the result is just
>>>> the same...
>>>>
>>>> On 2016/1/19 14:24, 陶进 wrote:
>>>>
>>>> Is it because the column day is compared to a string '2015-03-01' while
>>>> day's type is date? Just my guess!
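The "transforming string to date type" mentioned in the thread presumably means rewriting the WHERE clause with explicit date literals instead of bare strings, along these lines (a sketch only; the count(*) projection is illustrative and not part of the original query):

```sql
-- Same range predicate as in the original query, but with the
-- comparison values written as date literals rather than strings:
SELECT count(*)
FROM zhongxing_2015_obj_12months
WHERE day >= date '2015-03-01'
  AND day <= date '2015-03-31';
```

As reported above, this rewrite made no difference to the observed behavior.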
>>>> On 2016/1/19 12:36, 查 锐 wrote:
>>>>
>>>> hi all,
>>>>
>>>> I created a table with a date range partition, and my CREATE TABLE
>>>> clause is as follows:
>>>>
>>>> CREATE TABLE zhongxing_2015_obj_12months (RegionID integer, eNodeBID
>>>> bigint, CellID integer, Time varchar(10), v1 integer, v2 integer, v3
>>>> integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer,
>>>> v9 integer, v10 integer, v11 integer, v12 integer, v13 integer, v14
>>>> integer, v15 integer, v16 integer, v17 integer, v18 integer, v19
>>>> integer, v20 integer, v21 integer, v22 integer, v23 integer, v24
>>>> integer, v25 integer, v26 integer, v27 integer, v28 integer, v29
>>>> integer, v30 integer, v31 integer, v32 integer, v33 integer, v34
>>>> integer, v35 integer, v36 integer, v37 integer, v38 integer, v39
>>>> integer, v40 integer, v41 integer, v42 integer, v43 integer, v44
>>>> integer, v45 integer, v46 integer, v47 integer, v48 integer, v49
>>>> integer, v50 integer, v51 integer, v52 integer, v53 integer, v54
>>>> integer, v55 integer, v56 integer, v57 integer, v58 integer, v59
>>>> integer, v60 integer, v61 integer, v62 integer, v63 integer, v64
>>>> integer, v65 integer, v66 integer, v67 integer, v68 integer, v69
>>>> integer, v70 integer, v71 integer, v72 integer, v73 integer, v74
>>>> integer, v75 integer, v76 integer, v77 integer, v78 integer, v79
>>>> integer, v80 integer, v81 integer, v82 integer, v83 integer, v84
>>>> integer, v85 integer, v86 integer, v87 integer, v88 integer, v89
>>>> integer, v90 integer, v91 integer, v92 integer, v93 integer, v94
>>>> integer, v95 integer, v96 integer, v97 integer, v98 integer, v99
>>>> integer, v100 integer, v101 integer, v102 integer, v103 integer, v104
>>>> integer, v105 integer, v106 integer, v107 integer, v108 integer, v109
>>>> integer, v110 integer, v111 integer, v112 integer, v113 integer, v114
>>>> integer, v115 integer, v116 integer, v117 integer, v118 integer, v119
>>>> integer, v120 integer, v121 integer, v122
>>>> integer, v123 integer, v124
>>>> integer, v125 integer, v126 integer, v127 integer, v128 integer, v129
>>>> integer, v130 integer, v131 integer, v132 integer, v133 integer, v134
>>>> integer, v135 integer, v136 integer, v137 integer, v138 integer, v139
>>>> integer, v140 integer, v141 integer, v142 integer, v143 integer, v144
>>>> integer, v145 integer, v146 integer, v147 integer, v148 integer, v149
>>>> integer, v150 integer, v151 integer, v152 integer, v153 integer, v154
>>>> integer, v155 integer, v156 integer, v157 integer, v158 integer, v159
>>>> integer, v160 integer, v161 integer, v162 integer, v163 integer, v164
>>>> integer, v165 integer, v166 integer, v167 integer, v168 integer, v169
>>>> integer, v170 integer, v171 integer, v172 integer, v173 integer, v174
>>>> integer, v175 integer, day date) WITH (appendonly=true,
>>>> orientation=parquet)
>>>> DISTRIBUTED BY (Time)
>>>> PARTITION BY RANGE(day)
>>>> ( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
>>>>   PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
>>>>   PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
>>>>   PARTITION APR START (date '2015-04-01') INCLUSIVE ,
>>>>   PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
>>>>   PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
>>>>   PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
>>>>   PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
>>>>   PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
>>>>   PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
>>>>   PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
>>>>   PARTITION DEC START (date '2015-12-01') INCLUSIVE
>>>>   END (date '2016-01-01') EXCLUSIVE );
>>>>
>>>> i.e., the table has 12 partitions, with one month of data in each
>>>> partition.
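Assuming a HAWQ/Greenplum-lineage catalog, the twelve child partitions created by the DDL above can be listed through the pg_partitions view (a sketch; column names per that catalog view):

```sql
-- List each child partition of the parent table with its date range.
SELECT partitionname, partitiontablename,
       partitionrangestart, partitionrangeend
FROM pg_partitions
WHERE tablename = 'zhongxing_2015_obj_12months'
ORDER BY partitionrangestart;
```

The partitiontablename column is where child names such as zhongxing_2015_obj_12months_1_prt_mar, used later in this thread, come from.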
>>>> And I populated the table with this SQL:
>>>>
>>>> INSERT INTO zhongxing_2015_obj_12months SELECT RegionID, eNodeBID,
>>>> CellID, Time, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14,
>>>> v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27, v28, v29,
>>>> v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40, v41, v42, v43, v44,
>>>> v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55, v56, v57, v58, v59,
>>>> v60, v61, v62, v63, v64, v65, v66, v67, v68, v69, v70, v71, v72, v73, v74,
>>>> v75, v76, v77, v78, v79, v80, v81, v82, v83, v84, v85, v86, v87, v88, v89,
>>>> v90, v91, v92, v93, v94, v95, v96, v97, v98, v99, v100, v101, v102, v103,
>>>> v104, v105, v106, v107, v108, v109, v110, v111, v112, v113, v114, v115,
>>>> v116, v117, v118, v119, v120, v121, v122, v123, v124, v125, v126, v127,
>>>> v128, v129, v130, v131, v132, v133, v134, v135, v136, v137, v138, v139,
>>>> v140, v141, v142, v143, v144, v145, v146, v147, v148, v149, v150, v151,
>>>> v152, v153, v154, v155, v156, v157, v158, v159, v160, v161, v162, v163,
>>>> v164, v165, v166, v167, v168, v169, v170, v171, v172, v173, v174, v175,
>>>> to_date(substring(Time from 1 for 6),'yyMMdd') AS day
>>>> FROM zhongxing_2015;
>>>>
>>>> But when I query one partition with this SQL:
>>>>
>>>> select
>>>> sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
>>>> sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
>>>> sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
>>>> sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
>>>> sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
>>>> sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
>>>> sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
>>>> sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
>>>> sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
>>>> sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
>>>> sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
>>>> sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
>>>> sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
>>>> sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
>>>> sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
>>>> sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
>>>> sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
>>>> sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
>>>> sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
>>>> sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
>>>> sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
>>>> sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
>>>> sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
>>>> sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
>>>> sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
>>>> sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
>>>> sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
>>>> sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
>>>> sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
>>>> sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
>>>> sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
>>>> sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
>>>> sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
>>>> sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
>>>> sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
>>>> substr(time,1,4) as month,
>>>> eNodeBID,
>>>> CellID
>>>> from zhongxing_2015_obj_12months
>>>> where (day >= '2015-03-01' and day <= '2015-03-31')
>>>> and (Time between '1503010000' and '1503312345')
>>>> and RegionID=999
>>>> group by eNodeBID,CellID,month
>>>> order by P1 DESC limit 50;
>>>>
>>>> I found the performance is not good, and when I substituted the table
>>>> name 'zhongxing_2015_obj_12months' with the sub-partition table name
>>>> 'zhongxing_2015_obj_12months_1_prt_mar', I got almost a 1x performance
>>>> improvement.
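For reference, the day value that the partition predicate relies on is derived in the INSERT above from the first six characters of Time; the sample row earlier in the thread (time 1501031720, day 2015-01-03) matches this mapping, which can be spot-checked with (a sketch using the same format pattern as the INSERT):

```sql
-- '150103' parsed with the 'yyMMdd' pattern from the INSERT;
-- per the sample row above this corresponds to 2015-01-03.
SELECT to_date(substring('1501031720' from 1 for 6), 'yyMMdd') AS day;
```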
>>>> I think that for a partitioned table these two SQL statements should be
>>>> planned the same way, so I wonder how this situation can happen. I ran
>>>> the EXPLAIN query, and the result is as follows:
>>>>
>>>>  Limit  (cost=1725101.68..1725102.80 rows=50 width=324)
>>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=1725101.68..1725102.80 rows=50 width=324)
>>>>          Merge Key: p1
>>>>          ->  Limit  (cost=1725101.68..1725101.80 rows=1 width=324)
>>>>                ->  Sort  (cost=1725101.68..1725113.19 rows=24 width=324)
>>>>                      Sort Key (Limit): p1
>>>>                      ->  HashAggregate  (cost=1721149.86..1724821.55 rows=24 width=324)
>>>>                            Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, "?column3?"
>>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=1718939.94..1719055.04 rows=24 width=1444)
>>>>                                  Hash Key: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>>>>                                  ->  HashAggregate  (cost=1718939.94..1718962.96 rows=24 width=1444)
>>>>                                        Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, substr(public.zhongxing_2015_obj_12months.day::text, 1, 7)
>>>>                                        ->  Result  (cost=0.00..1698454.59 rows=240 width=716)
>>>>                                              ->  Append  (cost=0.00..1698224.41 rows=240 width=716)
>>>>                                                    ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months  (cost=0.00..1698224.41 rows=240 width=716)
>>>>                                                          Filter: day = '2015-03-01 00:00:00'::timestamp without time zone
>>>>
>>>> I can see the "Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar
>>>> zhongxing_2015_obj_12months" part. Does this mean the scanner scans ALL the
>>>> data of all
>>>> partitions in table zhongxing_2015_obj_12months, plus the one sub-table
>>>> zhongxing_2015_obj_12months_1_prt_mar (one partition of the 12)?
>>>>
>>>> I will be waiting online for the answer; this problem is very critical
>>>> and urgent for me!
>>>>
>>>> Thank you so much!!!
>>>>
>>>> Best Regards
>>>> Rui Zha

--
*Aitor Cedres*
*Global Support Services | Pivotal*
*Email: **[email protected] <[email protected]>*
*After hours contact #: +1 877.477.2269*
Case Escalations: https://support.pivotal.io/hc/en-us/articles/203809556
<https://support.pivotal.io/hc/en-us/articles/203809556-Pivotal-How-do-I-escalate-an-existing-case->
