Hi Rui Zha,

From the EXPLAIN ANALYZE output, I suspect you have missing statistics for this table. Can you run ANALYZE zhongxing_2015_obj_12months and see if the performance improves?
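The suggested maintenance statement, issued from psql against the database holding the table, would look like this (a minimal sketch; the per-partition form uses the child table name quoted later in this thread):

```sql
-- Refresh optimizer statistics for the partitioned table;
-- on a wide (175-column) Parquet table this may take a while.
ANALYZE zhongxing_2015_obj_12months;

-- Statistics can also be collected for a single child partition:
ANALYZE zhongxing_2015_obj_12months_1_prt_mar;
```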
Best regards,
Aitor

On Tue, 19 Jan 2016 at 09:25 Ruilong Huo <[email protected]> wrote:

> Thanks.
>
> I tried the same query with your data but still failed to reproduce it. It
> took about 0.4 second to complete in my environment.
>
> Please create a ticket at https://issues.apache.org/jira/browse/HAWQ. It
> would be great if you can attach all data in the table in that ticket so
> that further reproduction and analysis can be done.
>
> Best regards,
> Ruilong Huo
>
> On Tue, Jan 19, 2016 at 5:05 PM, 查 锐 <[email protected]> wrote:
>
>> Hi Ruilong,
>>
>> The content of table zhongxing_2015_obj_12months is here
>>
>> (psql header: regionid | enodebid | cellid | time | v1 ... v175 | day;
>> the single sample row, reflowed by column range:)
>>
>> regionid 999 | enodebid 103019 | cellid 2 | time 1501031720
>> v1-v12:    3528539, 6306154, 2434845, 9136141, 4715281, 4166421, 2233478, 6170791, 1388084, 4372131, 9389100, 7366252
>> v13-v24:   3357084, 7406413, 1513883, 9308701, 1457033, 4298094, 9519949, 4413685, 3328848, 2421259, 4732545, 1145746
>> v25-v36:   1435813, 6076642, 2178385, 5317895, 9843957, 3265059, 4195098, 511515, 6772638, 5822423, 964529, 9478699
>> v37-v48:   758874, 4667358, 8343212, 2862168, 2756141, 6430593, 2670890, 6427843, 234592, 304208, 3881361, 3783518
>> v49-v60:   4886727, 3939864, 2170262, 155186, 1043626, 5890094, 5468745, 7126131, 2495869, 2534501, 1316005, 9849963
>> v61-v72:   7481091, 5506001, 3497212, 6354145, 8853875, 26143, 3434271, 9688694, 8137829, 9960882, 8563434, 1122292
>> v73-v84:   8429005, 8823352, 8349624, 2697861, 4860927, 7855095, 6731805, 55222, 3890166, 438327, 4163808, 7955744
>> v85-v96:   8650484, 6932470, 4922705, 5504441, 9796450, 1125121, 2595695, 4642362, 8542125, 1464853, 5203847, 5137968
>> v97-v108:  7246920, 752884, 319888, 3427429, 3773933, 6498516, 4023497, 3007079, 7893264, 123849, 9834005, 1272966
>> v109-v120: 2910334, 9769998, 1541058, 9682446, 5588857, 760941, 646078, 2988183, 366871, 9665679, 4001586, 4015608
>> v121-v132: 7885914, 9252858, 1873461, 3292256, 4699529, 1660829, 590981, 5228858, 8656938, 7423225, 1598162, 183868
>> v133-v144: 5179803, 890298, 7177048, 2106501, 8607052, 8649816, 4177679, 8215160, 7917762, 892917, 1571426, 4343883
>> v145-v156: 7120589, 8148255, 6027362, 2171403, 4300038, 5466707, 2583910, 5319403, 433249, 5816086, 7471174, 4446179
>> v157-v168: 2392407, 4053061, 3627571, 148499, 348041, 189700, 4803881, 1037155, 6246551, 9487389, 7140770, 825111
>> v169-v175: 4393547, 9104373, 3933856, 1103255, 7701196, 266738, 7412872
>> day: 2015-01-03
>>
>> On 19 Jan 2016, at 16:58, Ruilong Huo <[email protected]> wrote:
>>
>> You are correct.
>> From the explain analyze result, we can see that 11 out of 12 partitions
>> are eliminated while querying against the parent table:
>>
>>   Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months
>>
>> 1. Though there are some trivial drawbacks regarding data locality for a
>> partitioned table (i.e., it may introduce some remote reads while accessing
>> data on HDFS), that is not the major reason.
>>
>> 2. The major factor in the performance downgrade is that it is much slower
>> to fetch the first tuple, as well as to finish the scan on the target
>> partition:
>>
>> 1) For the query on the target partition of the parent table, it took about
>> 0.2 second to get the first tuple, and about 4 ~ 38 seconds to finish the scan.
>>   Max/Last(seg44:ws03.mzhen.cn/seg174:ws06.mzhen.cn) 258465/226292 rows with 245/240 ms to first row, 4100/37965 ms to end
>>
>> 2) For the query on the parent table directly, it took about 9 ~ 26 seconds
>> to get the first tuple, and about 56 ~ 117 seconds to finish the scan.
>>   Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 55956/117370 ms to end
>>
>> I tried to reproduce this issue with a simpler case in my local environment
>> but failed. Would you please dump the data in table
>> zhongxing_2015_obj_12months or zhongxing_2015 and paste it here? Even just
>> part of it would be great for further investigation.
>>
>> Best regards,
>> Ruilong Huo
>>
>> On Tue, Jan 19, 2016 at 3:40 PM, 查 锐 <[email protected]> wrote:
>>
>>> Ruilong,
>>>
>>> Thank you for the reply.
>>>
>>> The scan-related information in the query plan shows that 11 partitions
>>> are eliminated and 1 partition is chosen to be scanned. I just don't know
>>> why the performance of querying the sub-partition table
>>> (zhongxing_2015_obj_12months_1_prt_mar) directly is better than querying
>>> the root partitioned table (zhongxing_2015_obj_12months) with a predicate
>>> that points at that same sub-partition.
>>> I ran the EXPLAIN ANALYZE query on the sub-partition table
>>> (zhongxing_2015_obj_12months_1_prt_mar) and the root partitioned table
>>> (zhongxing_2015_obj_12months) separately; the results are as follows.
>>>
>>> The EXPLAIN ANALYZE results on the sub-partition table:
>>>
>>>  Limit  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>    Rows out:  50 rows with 51344 ms to first row, 51345 ms to end, start offset by 228/228 ms.
>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>          Merge Key: p1
>>>          Rows out:  50 rows at destination with 51344 ms to end, start offset by 228/228 ms.
>>>          ->  Limit  (cost=2159144.19..2159144.31 rows=1 width=324)
>>>                Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg130:ws06.mzhen.cn) 50/50 rows with 41038/42852 ms to end, start offset by 208/252 ms.
>>>                ->  Sort  (cost=2159144.19..2159144.44 rows=1 width=324)
>>>                      Sort Key (Limit): p1
>>>                      Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg130:ws06.mzhen.cn) 50/50 rows with 41038/42852 ms to end, start offset by 208/252 ms.
>>>                      Executor memory:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn).
>>>                      Work_mem used:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn). Workfile: (0 spilling, 0 reused)
>>>                      Work_mem wanted: 3K bytes avg, 3K bytes max (seg61:ws07.mzhen.cn) to lessen workfile I/O affecting 192 workers.
>>>                      ->  HashAggregate  (cost=2159061.12..2159140.87 rows=1 width=324)
>>>                            Group By: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, "?column3?"
>>>                            Rows out:  Avg 78.1 rows x 192 workers.  Max/Last(seg61:ws01.mzhen.cn/seg130:ws06.mzhen.cn) 89/77 rows with 41043/42851 ms to first row, 41044/42851 ms to end, start offset by 263/252 ms.
>>>                            Executor memory:  409K bytes avg, 473K bytes max (seg61:ws01.mzhen.cn).
>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=2159013.37..2159015.62 rows=1 width=1444)
>>>                                  Hash Key: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, unnamed_attr_3
>>>                                  Rows out:  Avg 15000.0 rows x 192 workers at destination.  Max/Last(seg61:ws01.mzhen.cn/seg130:ws06.mzhen.cn) 17088/14784 rows with 5842/5848 ms to first row, 40645/42541 ms to end, start offset by 263/252 ms.
>>>                                  ->  HashAggregate  (cost=2159013.37..2159013.62 rows=1 width=1444)
>>>                                        Group By: zhongxing_2015_obj_12months_1_prt_mar.enodebid, zhongxing_2015_obj_12months_1_prt_mar.cellid, substr(zhongxing_2015_obj_12months_1_prt_mar."time"::text, 1, 4)
>>>                                        Rows out:  Avg 15000.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg174:ws06.mzhen.cn) 15000/15000 rows with 8169/40815 ms to first row, 8267/40877 ms to end, start offset by 212/301 ms.
>>>                                        Executor memory:  44329K bytes avg, 44329K bytes max (seg0:ws07.mzhen.cn).
>>>                                        ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar  (cost=0.00..2158572.02 rows=6 width=750)
>>>                                              Filter: day >= '2015-03-01'::date AND day <= '2015-03-31'::date AND "time"::text >= '1503010000'::text AND "time"::text <= '1503312345'::text AND regionid = 999
>>>                                              Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws03.mzhen.cn/seg174:ws06.mzhen.cn) 258465/226292 rows with 245/240 ms to first row, 4100/37965 ms to end, start offset by 297/301 ms.
>>>  Slice statistics:
>>>    (slice0)    Executor memory: 6915K bytes.
>>>    (slice1)    Executor memory: 60457K bytes avg x 192 workers, 60457K bytes max (seg0:ws07.mzhen.cn).
>>>    (slice2)  * Executor memory: 3091K bytes avg x 192 workers, 3155K bytes max (seg61:ws01.mzhen.cn).  Work_mem: 81K bytes max, 3K bytes wanted.
>>>  Statement statistics:
>>>    Memory used: 131072K bytes
>>>    Memory wanted: 8598K bytes
>>>  Settings:  default_segment_num=64
>>>  Dispatcher statistics:
>>>    executors used(total/cached/new connection): (384/0/384); dispatcher time(total/connection/dispatch data): (236.797 ms/219.617 ms/15.258 ms).
>>>    dispatch data time(max/min/avg): (4.910 ms/0.017 ms/0.102 ms); consume executor data time(max/min/avg): (5.029 ms/0.004 ms/0.038 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
>>>  Data locality statistics:
>>>    data locality ratio: 0.822; virtual segment number: 192; different host number: 8; virtual segment number per host(avg/min/max): (24/24/24); segment size(avg/min/max): (191623489.297/172566327/211290350); segment size with penalty(avg/min/max): (0.000/0/0); continuity(avg/min/max): (1.000/1.000/1.000).
>>>  Total runtime: 51572.484 ms
>>> (41 rows)
>>>
>>> The EXPLAIN ANALYZE results on the root partitioned table:
>>>
>>>  Limit  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>    Rows out:  50 rows with 133541 ms to first row, 133542 ms to end, start offset by 16/16 ms.
>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=2159144.19..2159145.31 rows=50 width=324)
>>>          Merge Key: p1
>>>          Rows out:  50 rows at destination with 133541 ms to end, start offset by 16/16 ms.
>>>          ->  Limit  (cost=2159144.19..2159144.31 rows=1 width=324)
>>>                Rows out:  Avg 50.0 rows x 192 workers.
>>>                  Max/Last(seg0:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 50/50 rows with 116748/123223 ms to end, start offset by 6484/26 ms.
>>>                ->  Sort  (cost=2159144.19..2159144.44 rows=1 width=324)
>>>                      Sort Key (Limit): p1
>>>                      Rows out:  Avg 50.0 rows x 192 workers.  Max/Last(seg0:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 50/50 rows with 116748/123223 ms to end, start offset by 6484/26 ms.
>>>                      Executor memory:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn).
>>>                      Work_mem used:  81K bytes avg, 81K bytes max (seg0:ws07.mzhen.cn). Workfile: (0 spilling, 0 reused)
>>>                      Work_mem wanted: 3K bytes avg, 3K bytes max (seg61:ws07.mzhen.cn) to lessen workfile I/O affecting 192 workers.
>>>                      ->  HashAggregate  (cost=2159061.12..2159140.87 rows=1 width=324)
>>>                            Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, "?column3?"
>>>                            Rows out:  Avg 78.1 rows x 192 workers.  Max/Last(seg61:ws07.mzhen.cn/seg69:ws01.mzhen.cn) 89/81 rows with 123071/123223 ms to first row, 123072/123223 ms to end, start offset by 163/26 ms.
>>>                            Executor memory:  345K bytes avg, 409K bytes max (seg61:ws07.mzhen.cn).
>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=2159013.37..2159015.62 rows=1 width=1444)
>>>                                  Hash Key: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>>>                                  Rows out:  Avg 70154.2 rows x 192 workers at destination.  Max/Last(seg61:ws07.mzhen.cn/seg16:ws04.mzhen.cn) 79987/61081 rows with 1081/1140 ms to first row, 121468/121889 ms to end, start offset by 163/137 ms.
>>>                                  ->  HashAggregate  (cost=2159013.37..2159013.62 rows=1 width=1444)
>>>                                        Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, substr(public.zhongxing_2015_obj_12months."time"::text, 1, 4)
>>>                                        Rows out:  Avg 70154.2 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 75237/72524 rows with 21600/69653 ms to first row, 60439/121820 ms to end, start offset by 36811/67 ms.
>>>                                        Executor memory:  42856K bytes avg, 42856K bytes max (seg0:ws07.mzhen.cn).
>>>                                        ->  Result  (cost=0.00..2158572.02 rows=6 width=750)
>>>                                              Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 56780/118175 ms to end, start offset by 36811/67 ms.
>>>                                              ->  Append  (cost=0.00..2158569.54 rows=6 width=750)
>>>                                                    Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 56007/117417 ms to end, start offset by 36811/67 ms.
>>>                                                    ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months  (cost=0.00..2158569.54 rows=6 width=750)
>>>                                                          Filter: day >= '2015-03-01'::date AND day <= '2015-03-31'::date AND "time"::text >= '1503010000'::text AND "time"::text <= '1503312345'::text AND regionid = 999
>>>                                                          Rows out:  Avg 234302.7 rows x 192 workers.  Max/Last(seg44:ws07.mzhen.cn/seg47:ws01.mzhen.cn) 258465/254465 rows with 9326/36830 ms to first row, 55956/117370 ms to end, start offset by 36811/67 ms.
>>>  Slice statistics:
>>>    (slice0)    Executor memory: 7474K bytes.
>>>    (slice1)    Executor memory: 60604K bytes avg x 192 workers, 60611K bytes max (seg2:ws07.mzhen.cn).
>>>    (slice2)  * Executor memory: 3067K bytes avg x 192 workers, 3130K bytes max (seg61:ws07.mzhen.cn).  Work_mem: 81K bytes max, 3K bytes wanted.
>>>  Statement statistics:
>>>    Memory used: 131072K bytes
>>>    Memory wanted: 8700K bytes
>>>  Settings:  default_segment_num=64
>>>  Dispatcher statistics:
>>>    executors used(total/cached/new connection): (384/384/0); dispatcher time(total/connection/dispatch data): (14.710 ms/0.000 ms/13.435 ms).
>>>    dispatch data time(max/min/avg): (8.256 ms/0.048 ms/0.117 ms); consume executor data time(max/min/avg): (0.042 ms/0.004 ms/0.013 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
>>>  Data locality statistics:
>>>    data locality ratio: 0.462; virtual segment number: 192; different host number: 8; virtual segment number per host(avg/min/max): (24/24/24); segment size(avg/min/max): (2299488129.719/2237860180/2370289866); segment size with penalty(avg/min/max): (0.000/0/0); continuity(avg/min/max): (1.000/1.000/1.000).
>>>  Total runtime: 134648.912 ms
>>> (45 rows)
>>>
>>> From the results, I noticed the data locality ratio when querying the root
>>> partitioned table is 0.462, while the locality ratio when querying the
>>> sub-partition table is 0.822. Is this normal?
>>>
>>> In addition, the Append operation in the root partitioned table's query
>>> plan doesn't appear in the sub-partition table's query plan.
>>>
>>> On 19 Jan 2016, at 14:52, Ruilong Huo <[email protected]> wrote:
>>>
>>> @查锐:
>>>
>>> From the plan, there is no DPE (dynamic partition elimination) involved.
>>> That's why the full table (instead of the target partition) is scanned.
>>>
>>> Would you please attach the full output of the query plan, including
>>> settings and optimizer status, as below?
>>>
>>> test=# explain xxx
>>>                                     QUERY PLAN
>>> ------------------------------------------------------------------------------------
>>> ...
>>>  Settings:  default_segment_num=8
>>>  Optimizer status: PQO version 1.617
>>> (6 rows)
>>>
>>> Best regards,
>>> Ruilong Huo
>>>
>>> On Tue, Jan 19, 2016 at 2:43 PM, 查 锐 <[email protected]> wrote:
>>>
>>>> Even after transforming the string to the date type, the result is just
>>>> the same...
>>>>
>>>> On 2016/1/19 14:24, 陶进 wrote:
>>>>
>>>> Is it because the column day is compared to a string '2015-03-01' while
>>>> day's type is date? Just my guess!
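The "transforming string to date type" mentioned in the thread presumably means rewriting the WHERE clause with explicit date literals instead of bare strings, along these lines (a sketch only; the count(*) projection is illustrative and not part of the original query):

```sql
-- Same range predicate as in the original query, but with the
-- comparison values written as date literals rather than strings:
SELECT count(*)
FROM zhongxing_2015_obj_12months
WHERE day >= date '2015-03-01'
  AND day <= date '2015-03-31';
```

As reported above, this rewrite made no difference to the observed behavior.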
>>>> On 2016/1/19 12:36, 查 锐 wrote:
>>>>
>>>> hi all,
>>>>
>>>> I created a table with a date range partition, and my CREATE TABLE
>>>> clause is as follows:
>>>>
>>>> CREATE TABLE zhongxing_2015_obj_12months (RegionID integer, eNodeBID
>>>> bigint, CellID integer, Time varchar(10), v1 integer, v2 integer, v3
>>>> integer, v4 integer, v5 integer, v6 integer, v7 integer, v8 integer,
>>>> v9 integer, v10 integer, v11 integer, v12 integer, v13 integer, v14
>>>> integer, v15 integer, v16 integer, v17 integer, v18 integer, v19
>>>> integer, v20 integer, v21 integer, v22 integer, v23 integer, v24
>>>> integer, v25 integer, v26 integer, v27 integer, v28 integer, v29
>>>> integer, v30 integer, v31 integer, v32 integer, v33 integer, v34
>>>> integer, v35 integer, v36 integer, v37 integer, v38 integer, v39
>>>> integer, v40 integer, v41 integer, v42 integer, v43 integer, v44
>>>> integer, v45 integer, v46 integer, v47 integer, v48 integer, v49
>>>> integer, v50 integer, v51 integer, v52 integer, v53 integer, v54
>>>> integer, v55 integer, v56 integer, v57 integer, v58 integer, v59
>>>> integer, v60 integer, v61 integer, v62 integer, v63 integer, v64
>>>> integer, v65 integer, v66 integer, v67 integer, v68 integer, v69
>>>> integer, v70 integer, v71 integer, v72 integer, v73 integer, v74
>>>> integer, v75 integer, v76 integer, v77 integer, v78 integer, v79
>>>> integer, v80 integer, v81 integer, v82 integer, v83 integer, v84
>>>> integer, v85 integer, v86 integer, v87 integer, v88 integer, v89
>>>> integer, v90 integer, v91 integer, v92 integer, v93 integer, v94
>>>> integer, v95 integer, v96 integer, v97 integer, v98 integer, v99
>>>> integer, v100 integer, v101 integer, v102 integer, v103 integer, v104
>>>> integer, v105 integer, v106 integer, v107 integer, v108 integer, v109
>>>> integer, v110 integer, v111 integer, v112 integer, v113 integer, v114
>>>> integer, v115 integer, v116 integer, v117 integer, v118 integer, v119
>>>> integer, v120 integer, v121 integer, v122
>>>> integer, v123 integer, v124
>>>> integer, v125 integer, v126 integer, v127 integer, v128 integer, v129
>>>> integer, v130 integer, v131 integer, v132 integer, v133 integer, v134
>>>> integer, v135 integer, v136 integer, v137 integer, v138 integer, v139
>>>> integer, v140 integer, v141 integer, v142 integer, v143 integer, v144
>>>> integer, v145 integer, v146 integer, v147 integer, v148 integer, v149
>>>> integer, v150 integer, v151 integer, v152 integer, v153 integer, v154
>>>> integer, v155 integer, v156 integer, v157 integer, v158 integer, v159
>>>> integer, v160 integer, v161 integer, v162 integer, v163 integer, v164
>>>> integer, v165 integer, v166 integer, v167 integer, v168 integer, v169
>>>> integer, v170 integer, v171 integer, v172 integer, v173 integer, v174
>>>> integer, v175 integer, day date) WITH (appendonly=true,
>>>> orientation=parquet)
>>>> DISTRIBUTED BY (Time)
>>>> PARTITION BY RANGE(day)
>>>> ( PARTITION JAN START (date '2015-01-01') INCLUSIVE ,
>>>>   PARTITION FEB START (date '2015-02-01') INCLUSIVE ,
>>>>   PARTITION MAR START (date '2015-03-01') INCLUSIVE ,
>>>>   PARTITION APR START (date '2015-04-01') INCLUSIVE ,
>>>>   PARTITION MAY START (date '2015-05-01') INCLUSIVE ,
>>>>   PARTITION JUN START (date '2015-06-01') INCLUSIVE ,
>>>>   PARTITION JUL START (date '2015-07-01') INCLUSIVE ,
>>>>   PARTITION AUG START (date '2015-08-01') INCLUSIVE ,
>>>>   PARTITION SEP START (date '2015-09-01') INCLUSIVE ,
>>>>   PARTITION OCT START (date '2015-10-01') INCLUSIVE ,
>>>>   PARTITION NOV START (date '2015-11-01') INCLUSIVE ,
>>>>   PARTITION DEC START (date '2015-12-01') INCLUSIVE
>>>>   END (date '2016-01-01') EXCLUSIVE );
>>>>
>>>> i.e., the table has 12 partitions, with one month of data in each
>>>> partition.
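Assuming a HAWQ/Greenplum-lineage catalog, the twelve child partitions created by the DDL above can be listed through the pg_partitions view (a sketch; column names per that catalog view):

```sql
-- List each child partition of the parent table with its date range.
SELECT partitionname, partitiontablename,
       partitionrangestart, partitionrangeend
FROM pg_partitions
WHERE tablename = 'zhongxing_2015_obj_12months'
ORDER BY partitionrangestart;
```

The partitiontablename column is where child names such as zhongxing_2015_obj_12months_1_prt_mar, used later in this thread, come from.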
>>>> And I populated the table with this SQL:
>>>>
>>>> INSERT INTO zhongxing_2015_obj_12months SELECT RegionID, eNodeBID,
>>>> CellID, Time, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14,
>>>> v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27, v28, v29,
>>>> v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40, v41, v42, v43, v44,
>>>> v45, v46, v47, v48, v49, v50, v51, v52, v53, v54, v55, v56, v57, v58, v59,
>>>> v60, v61, v62, v63, v64, v65, v66, v67, v68, v69, v70, v71, v72, v73, v74,
>>>> v75, v76, v77, v78, v79, v80, v81, v82, v83, v84, v85, v86, v87, v88, v89,
>>>> v90, v91, v92, v93, v94, v95, v96, v97, v98, v99, v100, v101, v102, v103,
>>>> v104, v105, v106, v107, v108, v109, v110, v111, v112, v113, v114, v115,
>>>> v116, v117, v118, v119, v120, v121, v122, v123, v124, v125, v126, v127,
>>>> v128, v129, v130, v131, v132, v133, v134, v135, v136, v137, v138, v139,
>>>> v140, v141, v142, v143, v144, v145, v146, v147, v148, v149, v150, v151,
>>>> v152, v153, v154, v155, v156, v157, v158, v159, v160, v161, v162, v163,
>>>> v164, v165, v166, v167, v168, v169, v170, v171, v172, v173, v174, v175,
>>>> to_date(substring(Time from 1 for 6),'yyMMdd') AS day
>>>> FROM zhongxing_2015;
>>>>
>>>> But when I query one partition with this SQL:
>>>>
>>>> select
>>>> sum(v1)+sum(v2)+sum(v3)+sum(v4)+sum(v5) as P1,
>>>> sum(v6)+sum(v7)+sum(v8)+sum(v9)+sum(v10) as P2,
>>>> sum(v11)+sum(v12)+sum(v13)+sum(v14)+sum(v15) as P3,
>>>> sum(v16)+sum(v17)+sum(v18)+sum(v19)+sum(v20) as P4,
>>>> sum(v21)+sum(v22)+sum(v23)+sum(v24)+sum(v25) as P5,
>>>> sum(v26)+sum(v27)+sum(v28)+sum(v29)+sum(v30) as P6,
>>>> sum(v31)+sum(v32)+sum(v33)+sum(v34)+sum(v35) as P7,
>>>> sum(v36)+sum(v37)+sum(v38)+sum(v39)+sum(v40) as P8,
>>>> sum(v41)+sum(v42)+sum(v43)+sum(v44)+sum(v45) as P9,
>>>> sum(v46)+sum(v47)+sum(v48)+sum(v49)+sum(v50) as P10,
>>>> sum(v51)+sum(v52)+sum(v53)+sum(v54)+sum(v55) as P11,
>>>> sum(v56)+sum(v57)+sum(v58)+sum(v59)+sum(v60) as P12,
>>>> sum(v61)+sum(v62)+sum(v63)+sum(v64)+sum(v65) as P13,
>>>> sum(v66)+sum(v67)+sum(v68)+sum(v69)+sum(v70) as P14,
>>>> sum(v71)+sum(v72)+sum(v73)+sum(v74)+sum(v75) as P15,
>>>> sum(v76)+sum(v77)+sum(v78)+sum(v79)+sum(v80) as P16,
>>>> sum(v81)+sum(v82)+sum(v83)+sum(v84)+sum(v85) as P17,
>>>> sum(v86)+sum(v87)+sum(v88)+sum(v89)+sum(v90) as P18,
>>>> sum(v91)+sum(v92)+sum(v93)+sum(v94)+sum(v95) as P19,
>>>> sum(v96)+sum(v97)+sum(v98)+sum(v99)+sum(v100) as P20,
>>>> sum(v101)+sum(v102)+sum(v103)+sum(v104)+sum(v105) as P21,
>>>> sum(v106)+sum(v107)+sum(v108)+sum(v109)+sum(v110) as P22,
>>>> sum(v111)+sum(v112)+sum(v113)+sum(v114)+sum(v115) as P23,
>>>> sum(v116)+sum(v117)+sum(v118)+sum(v119)+sum(v120) as P24,
>>>> sum(v121)+sum(v122)+sum(v123)+sum(v124)+sum(v125) as P25,
>>>> sum(v126)+sum(v127)+sum(v128)+sum(v129)+sum(v130) as P26,
>>>> sum(v131)+sum(v132)+sum(v133)+sum(v134)+sum(v135) as P27,
>>>> sum(v136)+sum(v137)+sum(v138)+sum(v139)+sum(v140) as P28,
>>>> sum(v141)+sum(v142)+sum(v143)+sum(v144)+sum(v145) as P29,
>>>> sum(v146)+sum(v147)+sum(v148)+sum(v149)+sum(v150) as P30,
>>>> sum(v151)+sum(v152)+sum(v153)+sum(v154)+sum(v155) as P31,
>>>> sum(v156)+sum(v157)+sum(v158)+sum(v159)+sum(v160) as P32,
>>>> sum(v161)+sum(v162)+sum(v163)+sum(v164)+sum(v165) as P33,
>>>> sum(v166)+sum(v167)+sum(v168)+sum(v169)+sum(v170) as P34,
>>>> sum(v171)+sum(v172)+sum(v173)+sum(v174)+sum(v175) as P35,
>>>> substr(time,1,4) as month,
>>>> eNodeBID,
>>>> CellID
>>>> from zhongxing_2015_obj_12months
>>>> where (day >= '2015-03-01' and day <= '2015-03-31')
>>>> and (Time between '1503010000' and '1503312345')
>>>> and RegionID=999
>>>> group by eNodeBID,CellID,month
>>>> order by P1 DESC limit 50;
>>>>
>>>> I found the performance is not good, and when I substituted the table
>>>> name 'zhongxing_2015_obj_12months' with the sub-partition table name
>>>> 'zhongxing_2015_obj_12months_1_prt_mar', I got almost a 1x performance
>>>> improvement.
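For reference, the day value that the partition predicate relies on is derived in the INSERT above from the first six characters of Time; the sample row earlier in the thread (time 1501031720, day 2015-01-03) matches this mapping, which can be spot-checked with (a sketch using the same format pattern as the INSERT):

```sql
-- '150103' parsed with the 'yyMMdd' pattern from the INSERT;
-- per the sample row above this corresponds to 2015-01-03.
SELECT to_date(substring('1501031720' from 1 for 6), 'yyMMdd') AS day;
```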
>>>> I think that for a partitioned table these two SQL statements should be
>>>> planned the same way, so I wonder how this situation can happen. I ran
>>>> the EXPLAIN query, and the result is as follows:
>>>>
>>>>  Limit  (cost=1725101.68..1725102.80 rows=50 width=324)
>>>>    ->  Gather Motion 192:1  (slice2; segments: 192)  (cost=1725101.68..1725102.80 rows=50 width=324)
>>>>          Merge Key: p1
>>>>          ->  Limit  (cost=1725101.68..1725101.80 rows=1 width=324)
>>>>                ->  Sort  (cost=1725101.68..1725113.19 rows=24 width=324)
>>>>                      Sort Key (Limit): p1
>>>>                      ->  HashAggregate  (cost=1721149.86..1724821.55 rows=24 width=324)
>>>>                            Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, "?column3?"
>>>>                            ->  Redistribute Motion 192:192  (slice1; segments: 192)  (cost=1718939.94..1719055.04 rows=24 width=1444)
>>>>                                  Hash Key: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, unnamed_attr_3
>>>>                                  ->  HashAggregate  (cost=1718939.94..1718962.96 rows=24 width=1444)
>>>>                                        Group By: public.zhongxing_2015_obj_12months.enodebid, public.zhongxing_2015_obj_12months.cellid, substr(public.zhongxing_2015_obj_12months.day::text, 1, 7)
>>>>                                        ->  Result  (cost=0.00..1698454.59 rows=240 width=716)
>>>>                                              ->  Append  (cost=0.00..1698224.41 rows=240 width=716)
>>>>                                                    ->  Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar zhongxing_2015_obj_12months  (cost=0.00..1698224.41 rows=240 width=716)
>>>>                                                          Filter: day = '2015-03-01 00:00:00'::timestamp without time zone
>>>>
>>>> I can see the "Parquet table Scan on zhongxing_2015_obj_12months_1_prt_mar
>>>> zhongxing_2015_obj_12months" part. Does this mean the scanner scans ALL the
>>>> data of all
>>>> partitions in table zhongxing_2015_obj_12months, plus the one sub-table
>>>> zhongxing_2015_obj_12months_1_prt_mar (one partition of the 12)?
>>>>
>>>> I will be waiting online for the answer; this problem is very critical
>>>> and urgent for me!
>>>>
>>>> Thank you so much!!!
>>>>
>>>> Best Regards
>>>> Rui Zha

--
*Aitor Cedres*
*Global Support Services | Pivotal*
*Email: **[email protected] <[email protected]>*
*After hours contact #: +1 877.477.2269*
Case Escalations: https://support.pivotal.io/hc/en-us/articles/203809556
<https://support.pivotal.io/hc/en-us/articles/203809556-Pivotal-How-do-I-escalate-an-existing-case->
