## Version:

phoenix: 4.13.2-cdh5.11.2

hive: 1.1.0-cdh5.11.2


There is an ColumnNotFoundException when joining a hive internal table with an 
Phoenix external table.



## Table1: phoenix external table "ext_tmp":

+-----------+------------+--------------------+--+

| col_name  | data_type  |      comment       |

+-----------+------------+--------------------+--+

| cola      | string     | from deserializer  |

| colb      | string     | from deserializer  |

+-----------+------------+--------------------+--+


### Backend Phoenix table "TMP":

select * from TMP;

+-------+------------+

| cola  |    colb    |

+-------+------------+

| a     | aaaa       |

| b     | bbaaaa     |

| ccc   | cccbbaaaa  |

+-------+------------+


### Hive external table creation statement:


+----------------------------------------------------+--+

|                   createtab_stmt                   |

+----------------------------------------------------+--+

| CREATE EXTERNAL TABLE `ext_tmp`(                   |

|   `cola` string COMMENT 'from deserializer',       |

|   `colb` string COMMENT 'from deserializer')       |

| ROW FORMAT SERDE                                   |

|   'org.apache.phoenix.hive.PhoenixSerDe'           |

| STORED BY                                          |

|   'org.apache.phoenix.hive.PhoenixStorageHandler'  |

| WITH SERDEPROPERTIES (                             |

|   'serialization.format'='1')                      |

| LOCATION                                           |

|   'hdfs://st:8020/data/user/hive/warehouse/ext_tmp'|

| TBLPROPERTIES (                                    |

|   'phoenix.column.mapping'='cola:cola,colb:colb',  |

|   'phoenix.rowkeys'='cola',                        |

|   'phoenix.table.name'='tmp',                      |

|   'phoenix.zookeeper.client.port'='2181',          |

|   'phoenix.zookeeper.quorum'='st1,st2,st3',        |

|   'phoenix.zookeeper.znode.parent'='/hbase',       |

|   'transient_lastDdlTime'='1523607352')            |

+----------------------------------------------------+--+



## Table2: hive internal table "native1":

+-----------+------------+----------+--+

| col_name  | data_type  | comment  |

+-----------+------------+----------+--+

| cola      | string     |          |

| colb      | string     |          |

+-----------+------------+----------+--+



## When join the two tables:


select *  from native1  join  ext_tmp t on native1.cola= t.cola;



Exception:

org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined 
column. columnName=TMP


Detailed exception as below. There is an empty column "" as readcolumn name, 
which is not exsit in Pheonix table at all. so the phoenix query will never be 
correct: select /*+ NO_CACHE  */ "","cola","colb" from tmp where "cola" is not 
null


Any thoughts?



---------

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: 
[HiveServer2-Background-Pool: Thread-304]: hive.io.file.readcolumn.ids=

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: 
[HiveServer2-Background-Pool: Thread-304]: 
hive.io.file.readcolumn.names=,cola,colb

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: 
[HiveServer2-Background-Pool: Thread-304]: Generating splits

2018-04-14 21:13:40,924 INFO  
org.apache.phoenix.hive.query.PhoenixQueryBuilder: 
[HiveServer2-Background-Pool: Thread-304]: Input query : select /*+ NO_CACHE  
*/ "","cola","colb" from tmp where "cola" is not null

2018-04-14 21:13:40,932 ERROR 
org.apache.phoenix.hive.mapreduce.PhoenixInputFormat: 
[HiveServer2-Background-Pool: Thread-304]: Failed to get the query plan with 
error [ERROR 504 (42703): Undefined column. columnName=TMP]


Thanks,

Wei

Reply via email to