[ 
https://issues.apache.org/jira/browse/HIVE-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-21940:
--------------------------------
    Description: 
The issue is reproducible on a cluster with the following statements:
{code}
USE default;
drop table if exists my_table;
create external table my_table (col1 int, col3 int) partitioned by (col2 
string) STORED AS TEXTFILE;
insert into my_table VALUES(11,201,"F");
SELECT pp.* FROM sys.partition_params pp join sys.partitions p on p.part_id = 
pp.part_id join sys.tbls t on t.tbl_id = p.tbl_id where t.tbl_name = "my_table";
{code}

Seems like (propably) since HIVE-20833/HIVE-20221 there is an inconvenience 
while using PARTITION_PARAMS/PARAM_VALUE, because in postgres there is no such 
type as CLOB, and metastore simply saves large object ids in this field. More 
interesting is that the large object can be resolved in some codepath. In case 
of a describe for partition:
{code}
describe formatted my_table_for_sqoop partition (col2='F');

...

| Partition Parameters:             | NULL                                      
         | NULL                                               |
|                                   | COLUMN_STATS_ACCURATE                     
         | 
{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"col1\":\"true\",\"col3\":\"true\"}}
 |
|                                   | numFiles                                  
         | 1                                                  |
|                                   | numRows                                   
         | 1                                                  |
|                                   | rawDataSize                               
         | 6                                                  |
|                                   | totalSize                                 
         | 7                                                  |
|                                   | transient_lastDdlTime                     
         | 1561976024                                         |
|                                   | NULL                                      
         | NULL                                               |

{code}

But in case for direct metastore query (from hive's sys schema, but the same 
result for direct postgres) query:
{code}
+-------------+------------------------+-----------------+
| pp.part_id  |      pp.param_key      | pp.param_value  |
+-------------+------------------------+-----------------+
| 151         | rawDataSize            | 28629           |
| 151         | numRows                | 28628           |
| 151         | transient_lastDdlTime  | 28627           |
| 151         | COLUMN_STATS_ACCURATE  | 28626           |
| 151         | numFiles               | 28625           |
| 151         | totalSize              | 28622           |
+-------------+------------------------+-----------------+
{code}


> Metastore: Postgres text <-> clob mismatch for PARTITION_PARAMS/PARAM_VALUE
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-21940
>                 URL: https://issues.apache.org/jira/browse/HIVE-21940
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Laszlo Bodor
>            Assignee: Laszlo Bodor
>            Priority: Major
>
> The issue is reproducible on a cluster with the following statements:
> {code}
> USE default;
> drop table if exists my_table;
> create external table my_table (col1 int, col3 int) partitioned by (col2 
> string) STORED AS TEXTFILE;
> insert into my_table VALUES(11,201,"F");
> SELECT pp.* FROM sys.partition_params pp join sys.partitions p on p.part_id = 
> pp.part_id join sys.tbls t on t.tbl_id = p.tbl_id where t.tbl_name = 
> "my_table";
> {code}
> Seems like (propably) since HIVE-20833/HIVE-20221 there is an inconvenience 
> while using PARTITION_PARAMS/PARAM_VALUE, because in postgres there is no 
> such type as CLOB, and metastore simply saves large object ids in this field. 
> More interesting is that the large object can be resolved in some codepath. 
> In case of a describe for partition:
> {code}
> describe formatted my_table_for_sqoop partition (col2='F');
> ...
> | Partition Parameters:             | NULL                                    
>            | NULL                                               |
> |                                   | COLUMN_STATS_ACCURATE                   
>            | 
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"col1\":\"true\",\"col3\":\"true\"}}
>  |
> |                                   | numFiles                                
>            | 1                                                  |
> |                                   | numRows                                 
>            | 1                                                  |
> |                                   | rawDataSize                             
>            | 6                                                  |
> |                                   | totalSize                               
>            | 7                                                  |
> |                                   | transient_lastDdlTime                   
>            | 1561976024                                         |
> |                                   | NULL                                    
>            | NULL                                               |
> {code}
> But in case for direct metastore query (from hive's sys schema, but the same 
> result for direct postgres) query:
> {code}
> +-------------+------------------------+-----------------+
> | pp.part_id  |      pp.param_key      | pp.param_value  |
> +-------------+------------------------+-----------------+
> | 151         | rawDataSize            | 28629           |
> | 151         | numRows                | 28628           |
> | 151         | transient_lastDdlTime  | 28627           |
> | 151         | COLUMN_STATS_ACCURATE  | 28626           |
> | 151         | numFiles               | 28625           |
> | 151         | totalSize              | 28622           |
> +-------------+------------------------+-----------------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to