Hi tony, Yes, confirmed it's a bug. Filed the bug in apache hawq jira https://issues.apache.org/jira/browse/HAWQ-1470. Will fix it later.
Thanks, Zhenglin 2017-05-22 10:31 GMT+08:00 tao tony <[email protected]>: > OK.this bug appears when using gpfdist external table in select list.I > create some test data to reproduce this bug in hawq docker. > > 1.create a external table > > CREATE EXTERNAL TABLE testext ( > a int, > b character varying(255) > ) LOCATION ( > 'gpfdist://172.19.0.2:8087/test.csv' > ) FORMAT 'text' (delimiter E',' null E'' escape E'OFF'); > > test.csv file contains: > > cat gpdata/test.csv > 1,abc > 2,bce > 3,ced > > 2.create a internal table: > > create table test1(c int); > > insert into test1 values(1); > > insert into test1 values(2); > > insert into test1 values(3); > > insert into test1 values(4); > > 3.run query,could not get testext.b values: > > select c,(select s.b from testext s where t.c=s.a) from test1 t; > c | ?column? > ---+---------- > 1 | > 2 | > 3 | > 4 | > (4 rows) > 4.create an internal table from testext,and run the query again,ti > returns the correct result > > create table test as select * from testext; > > select c,(select s.b from test s where t.c=s.a) from test1 t; > c | ?column? > ---+---------- > 1 | abc > 2 | bce > 3 | ced > 4 | > (4 rows) > > you could compare the 2 explains,in step 3 ,testext was not broadcasted. > > > > On 05/22/2017 09:25 AM, 陶征霖 wrote: > > Hi tony, > > > > Could you please provide the simple reproduce steps so that we can easily > > debug in our own env.? > > > > Thanks, > > Zhenglin > > > > 2017-05-18 14:17 GMT+08:00 tao tony <[email protected]>: > > > >> it seems a bug for querying on an external table because I found > >> ucloud_pay_tenanid ulist was a external table which using gpfdist. > >> > >> CREATE EXTERNAL TABLE ucloud_pay_tenanid ( > >> customercode character varying(255), > >> customername character varying(255), > >> prefixflag character varying(255), > >> customertype character varying(255), > >> comments character varying(255) > >> ) LOCATION ( > >> 'gpfdist://hdptest02.hddomain.cn:8087/ucloud_pay_tenanid_*.csv' > >> ) FORMAT 'text' (delimiter E';' null E'' escape E'OFF') > >> ENCODING 'UTF8'; > >> > >> then I create an internal table using: > >> > >> create table ucloudtest as select * from ucloud_pay_tenanid; > >> > >> run the explain,it lookks like the right query plan: > >> > >> hdb=# explain select cp.tenantid, > >> hdb-# (select ulist.customertype from ucloudtest ulist where > >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp; > >> QUERY PLAN > >> ------------------------------------------------------------ > >> --------------------------------------------- > >> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..219.50 rows=100 > >> width=42) > >> -> Append-only Scan on cptest cp (cost=0.00..219.50 rows=100 > >> width=42) > >> SubPlan 1 > >> -> Result (cost=2.18..2.19 rows=1 width=7) > >> Filter: ulist.customercode::text = $0::text > >> -> Materialize (cost=2.18..2.19 rows=1 width=7) > >> -> Broadcast Motion 1:1 (slice1; segments: 1) > >> (cost=0.00..2.17 rows=1 width=7) > >> -> Append-only Scan on ucloudtest ulist > >> (cost=0.00..2.17 rows=1 width=7) > >> Settings: default_hash_table_bucket_number=18; optimizer=off > >> Optimizer status: legacy query optimizer > >> (10 rows) > >> > >> run the query,get the correct result: > >> > >> hdb=# select cp.tenantid, > >> (select ulist.customertype from ucloudtest ulist where > >> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp; > >> tenantid | ?column? | orderuuid > >> ----------+----------+---------------------------------- > >> sxve7r6c | 便利 | e6d9b57a0c55484392448ea908c1fe49 > >> sxve7r6c | 便利 | 22a80697bfc74d63b7f28eee246c4368 > >> 3e7rph46 | 专卖 | 420ad3e45762459e91860b975e9f2751 > >> 3e7rph46 | 专卖 | 0634e7e3539a4116b9917a7493838f51 > >> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b > >> 6xydfh4y | 便利 | 7a55e97119784623a53f6e65ef9680c7 > >> sxve7r6c | 便利 | 227f3d22aec14723bb51efc4e2a6f0b4 > >> 3e7rph46 | 专卖 | f3d02cc77a2348829be2f72ce24bf846 > >> 6xydfh4y | 便利 | bab722ac7d5748408d3ad2973d292ab5 > >> > >> > >> On 05/18/2017 11:11 AM, tao tony wrote: > >>> hi guys, > >>> > >>> The different explains as below make me confused these days,could you > >>> please help me to explain why ulist.customertype is null. > >>> > >>> explain : > >>> > >>> hdb=# explain select cp.tenantid, > >>> (select ulist.customertype from ucloud_pay_tenanid ulist where > >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp; > >>> QUERY PLAN > >>> ------------------------------------------------------------ > >> -------------------------------------- > >>> Gather Motion 18:1 (slice1; segments: 18) (cost=0.00..1350002.00 > >>> rows=100 width=42) > >>> -> Append-only Scan on cptest cp (cost=0.00..1350002.00 rows=6 > >>> width=42) > >>> SubPlan 1 > >>> -> External Scan on ucloud_pay_tenanid ulist > >>> (cost=0.00..13500.00 rows=56 width=516) > >>> Filter: customercode::text = $0::text > >>> Settings: default_hash_table_bucket_number=18; optimizer=off > >>> Optimizer status: legacy query optimizer > >>> (7 rows) > >>> > >>> > >>> hdb=# explain select a,(select d from test1 s where t.b=s.e) from test2 > >> t; > >>> QUERY PLAN > >>> ------------------------------------------------------------ > >> --------------------------------------------- > >>> Gather Motion 1:1 (slice2; segments: 1) (cost=0.00..10.20 rows=9 > >>> width=6) > >>> -> Append-only Scan on test2 t (cost=0.00..10.20 rows=9 > width=6) > >>> SubPlan 1 > >>> -> Result (cost=1.01..1.02 rows=1 width=4) > >>> Filter: $0::text = s.e::text > >>> -> Materialize (cost=1.01..1.02 rows=1 width=4) > >>> -> Broadcast Motion 1:1 (slice1; segments: > 1) > >>> (cost=0.00..1.01 rows=1 width=4) > >>> -> Append-only Scan on test1 s > >>> (cost=0.00..1.01 rows=1 width=4) > >>> Settings: default_hash_table_bucket_number=18; optimizer=off > >>> Optimizer status: legacy query optimizer > >>> (10 rows) > >>> > >>> test data: > >>> > >>> query1: > >>> > >>> hdb-# (select ulist.customertype from ucloud_pay_tenanid ulist where > >>> ulist.customercode=cp.tenantid),cp.orderuuid from cptest cp; > >>> tenantid | ?column? | orderuuid > >>> ----------+----------+---------------------------------- > >>> sxve7r6c | | 901cf777a3144907899226de86ab8cfe > >>> k5r9qcfj | | 54c8cbdbe64f4dcca5cacb40b7ab52d6 > >>> 3e7rph46 | | f5e777a5189e409da607182d059ec0d5 > >>> 3e7rph46 | | 478d8a0921ab444ca80dc03ce97d3a94 > >>> 3e7rph46 | | 25808c4a66e34ecaad02d2fc183920ca > >>> 3e7rph46 | | 0a62832067ea4db68641cce35385aae7 > >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c > >>> 6xydfh4y | | b2fa1af123384452b23bbc5e0794da56 > >>> 6xydfh4y | | ade5731d97044a0eb9d74836ea7dbf48 > >>> pejg93wh | | da2d037765bb4262a82764715cb4453d > >>> sxve7r6c | | b02df6c998ff4e448b575e3b1fae8e35 > >>> 3e7rph46 | | 4c9e91233af947518c5c5767a2bd8b3c > >>> > >>> query 2: > >>> > >>> hdb=# select a,(select d from test1 s where t.b=s.e) from test2 t; > >>> a | ?column? > >>> ---+---------- > >>> 1 | 1 > >>> 2 | 2 > >>> 3 | 3 > >>> 4 | 4 > >>> 5 | 5 > >>> 6 | 7 > >>> 7 | 8 > >>> 8 | 9 > >>> 9 | 10 > >>> (9 rows) > >>> > >>> another query equal to query1,customertype is not null: > >>> > >>> hdb=# select tenantid,ulist.customertype,cp.orderuuid from cptest > >>> cp,ucloud_pay_tenanid ulist > >>> hdb-# where cp.tenantid = ulist.customercode; > >>> tenantid | customertype | orderuuid > >>> ----------+--------------+---------------------------------- > >>> 7jvfka5m | 专卖 | f10771c9eb4e434dadcc82dc7cda0d4d > >>> 7jvfka5m | 专卖 | 246a13ed45174ee083fd9f743532ab4c > >>> 7jvfka5m | 专卖 | e2687954d3e94ab0bf1366369bdc9887 > >>> 7jvfka5m | 专卖 | 018ef9e35db94901b4822509494bd6b5 > >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8 > >>> 7jvfka5m | 专卖 | 9054d2d5d7264a8aa4b41baddf1198e8 > >>> 7jvfka5m | 专卖 | a7b96194fe9f48379e2711ac6000191b > >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b > >>> u642nk35 | 便利 | d60c5d3ac0f44ef1a278d50d84847c2f > >>> u642nk35 | 便利 | df6ca3d8fb444f8fba604db8611c6292 > >>> u642nk35 | 便利 | df4ef8599c154c978d021b6765edb55b > >>> > >> > >
