hi all,
I created a table with date range partition, and my creating table
cause as follows:
CREATETABLEzhongxing_2015_obj_12months (RegionID integer, eNodeBID
bigint, CellID integer, Time varchar(10), v1 integer, v2 integer, v3
integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer,
v9 integer, v10 integer, v11 integer, v12 integer, v13 integer, v14
integer, v15 integer, v16 integer, v17 integer, v18 integer, v19
integer, v20 integer, v21 integer, v22 integer, v23 integer, v24
integer, v25 integer, v26 integer, v27 integer, v28 integer, v29
integer, v30 integer, v31 integer, v32 integer, v33 integer, v34
integer, v35 integer, v36 integer, v37 integer, v38 integer, v39
integer, v40 integer, v41 integer, v42 integer, v43 integer, v44
integer, v45 integer, v46 integer, v47 integer, v48 integer, v49
integer, v50 integer, v51 integer, v52 integer, v53 integer, v54
integer, v55 integer, v56 integer, v57 integer, v58 integer, v59
integer, v60 integer, v61 integer, v62 integer, v63 integer, v64
integer, v65 integer, v66 integer, v67 integer, v68 integer, v69
integer, v70 integer, v71 integer, v72 integer, v73 integer, v74
integer, v75 integer, v76 integer, v77 integer, v78 integer, v79
integer, v80 integer, v81 integer, v82 integer, v83 integer, v84
integer, v85 integer, v86 integer, v87 integer, v88 integer, v89
integer, v90 integer, v91 integer, v92 integer, v93 integer, v94
integer, v95 integer, v96 integer, v97 integer, v98 integer, v99
integer, v100 integer, v101 integer, v102 integer, v103 integer, v104
integer, v105 integer, v106 integer, v107 integer, v108 integer, v109
integer, v110 integer, v111 integer, v112 integer, v113 integer, v114
integer, v115 integer, v116 integer, v117 integer, v118 integer, v119
integer, v120 integer, v121 integer, v122 integer, v123 integer, v124
integer, v125 integer, v126 integer, v127 integer, v128 integer, v129
integer, v130 integer, v131 integer, v132 integer, v133 integer, v134
integer, v135 integer, v136 integer, v137 integer, v138 integer, v139
integer, v140 integer, v141 integer, v142 integer, v143 integer, v144
integer, v145 integer, v146 integer, v147 integer, v148 integer, v149
integer, v150 integer, v151 integer, v152 integer, v153 integer, v154
integer, v155 integer, v156 integer, v157 integer, v158 integer, v159
integer, v160 integer, v161 integer, v162 integer, v163 integer, v164
integer, v165 integer, v166 integer, v167 integer, v168 integer, v169
integer, v170 integer, v171 integer, v172 integer, v173 integer, v174
integer, v175 integer, day date) WITH(appendonly=true,
orientation=parquet)
DISTRIBUTED BY (Time)
PARTITION BY RANGE(day)
( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
PARTITION APR START (date '2015-04-01') INCLUSIVE ,
PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
PARTITION DEC START (date '2015-12-01') INCLUSIVE
END (date '2016-01-01') EXCLUSIVE );
i.e the table has 12 partitions, and 1 month data in each partition.
And I INSERT the table with the sql
INSERT INTO zhongxing_2015_obj_12months SELECT RegionID, eNodeBID,
CellID, Time, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13,
v14, v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27,
v28, v29, v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40, v41,
v42, v43, v44, v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55,
v56, v57, v58, v59, v60, v61, v62, v63, v64, v65, v66, v67, v68, v69,
v70, v71, v72, v73, v74, v75, v76, v77, v78, v79, v80, v81, v82, v83,
v84, v85, v86, v87, v88, v89, v90, v91, v92, v93, v94, v95, v96, v97,
v98, v99, v100, v101, v102, v103, v104, v105, v106, v107, v108, v109,
v110, v111, v112, v113, v114, v115, v116, v117, v118, v119, v120,
v121, v122, v123, v124, v125, v126, v127, v128, v129, v130, v131,
v132, v133, v134, v135, v136, v137, v138, v139, v140, v141, v142,
v143, v144, v145, v146, v147, v148, v149, v150, v151, v152, v153,
v154, v155, v156, v157, v158, v159, v160, v161, v162, v163, v164,
v165, v166, v167, v168, v169, v170, v171, v172, v173, v174, v175,
to_date(substring(Time from 1 for 6),'yyMMdd') AS day FROM zhongxing_2015;
but when I query one partition with the sql
select
sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
substr(time,1,4) as month,
eNodeBID,
CellID
from zhongxing_2015_obj_12months
where(day >= '2015-03-01'andday <= '2015-03-31')
and(Time between'1503010000'and'1503312345')
and RegionID=999
group by eNodeBID,CellID,month
order by P1 DESC limit 50;
I found the performance is not good, and when I substitute the table
name ‘zhongxing_2015_obj_12months’ with the sub partition table name
‘zhongxing_2015_obj_12months_1_prt_mar’, I got almost 1x performance
improvement.
I think for the partitioned table, this two sql will be parsed to the
same thing. So I wonder why this situation could be happened. I run
the explain query, and the result as follows:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1725101.68..1725102.80 rows=50 width=324)
-> Gather Motion 192:1 (slice2; segments: 192)
(cost=1725101.68..1725102.80 rows=50 width=324)
Merge Key: p1
-> Limit (cost=1725101.68..1725101.80 rows=1 width=324)
-> Sort (cost=1725101.68..1725113.19 rows=24 width=324)
Sort Key (Limit): p1
-> HashAggregate (cost=1721149.86..1724821.55 rows=24 width=324)
Group By: public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid, "?column3?"
-> Redistribute Motion 192:192 (slice1; segments: 192)
(cost=1718939.94..1719055.04 rows=24 width=1444)
Hash Key: public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
-> HashAggregate (cost=1718939.94..1718962.96 rows=24
width=1444)
Group By:
public.zhongxing_2015_obj_12months.enodebid,
public.zhongxing_2015_obj_12months.cellid,
substr(public.zhongxing_2015_obj_12months.day::text, 1, 7)
-> Result (cost=0.00..1698454.59 rows=240 width=716)
-> Append (cost=0.00..1698224.41 rows=240
width=716)
-> Parquet table Scan on
zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months
(cost=0.00..1698224.41 rows=240 width=716)
Filter: day = '2015-03-01
00:00:00'::timestamp without time zone
I can see the “Parquet table Scan on
zhongxing_2015_obj12months_1_prt_mar zhongxing_2015_obj_12months”
part, dose this mean scanner scanning ALL DATA of all partitions in
table zhongxing_2015_obj_12months plus one sub table
zhongxing_2015_obj_12months_1_prt_mar include one partition of 12?
I will waiting for the answer online, this problem is very critical
and urgent for me!
Thank you so mush!!!
Best Regards
Rui Zha