[
https://issues.apache.org/jira/browse/HIVE-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877612#comment-16877612
]
Laszlo Bodor edited comment on HIVE-21940 at 7/3/19 8:05 AM:
-------------------------------------------------------------
local repro against posgres with [^HIVE-21940.repro.patch]
{code}
export QTEST_LEAVE_FILES=true
docker kill metastore-test-postgres-install
docker rm metastore-test-postgres-install
cd standalone-metastore
mvn verify -DskipITests=false -Dit.test=ITestPostgres#install -Dtest=nosuch
-Dmetastore.itest.no.stop.container=true
cd ..
mvn test -Dtest.output.overwrite=true -Pitests,hadoop-2 -pl itests/qtest
-Dtest=TestCliDriver -Dqfile=partition_params_postgres.q
-Dhive.metastore.rawstore.impl=org.apache.hadoop.hive.metastore.ObjectStore
{code}
was (Author: abstractdog):
local repro against posgres with [^HIVE-21940.repro.patch]
{code}
export QTEST_LEAVE_FILES=true
docker kill metastore-test-postgres-install
docker rm metastore-test-postgres-install
cd standalone-metastore
mvn verify -DskipITests=false -Dit.test=ITestPostgres#install -Dtest=nosuch
-Dmetastore.itest.no.stop.container=true
cd ..
mvn test -Dtest.output.overwrite=true -Pitests,hadoop-2 -pl itests/qtest
-Dtest=TestCliDriver -Dqfile=partition_params_postgres.q
-Dhive.metastore.rawstore.impl=org.apache.hadoop.hive.metastore.ObjectStore
{code}
> Metastore: Postgres text <-> clob mismatch for PARTITION_PARAMS/PARAM_VALUE
> ---------------------------------------------------------------------------
>
> Key: HIVE-21940
> URL: https://issues.apache.org/jira/browse/HIVE-21940
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.2.0
> Reporter: Laszlo Bodor
> Assignee: Laszlo Bodor
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21940.repro.patch
>
>
> The issue is reproducible on a cluster with postgres metastore db by the
> following statements:
> {code}
> USE default;
> drop table if exists my_table;
> create external table my_table (col1 int, col3 int) partitioned by (col2
> string) STORED AS TEXTFILE;
> insert into my_table VALUES(11,201,"F");
> SELECT pp.* FROM sys.partition_params pp join sys.partitions p on p.part_id =
> pp.part_id join sys.tbls t on t.tbl_id = p.tbl_id where t.tbl_name =
> "my_table";
> {code}
> sys query results in:
> {code}
> +-------------+------------------------+-----------------+
> | pp.part_id | pp.param_key | pp.param_value |
> +-------------+------------------------+-----------------+
> | 151 | rawDataSize | 28629 |
> | 151 | numRows | 28628 |
> | 151 | transient_lastDdlTime | 28627 |
> | 151 | COLUMN_STATS_ACCURATE | 28626 |
> | 151 | numFiles | 28625 |
> | 151 | totalSize | 28622 |
> +-------------+------------------------+-----------------+
> {code}
> Seems like (propably) since HIVE-20833/HIVE-20221 there is an inconvenience
> while using PARTITION_PARAMS/PARAM_VALUE, because in postgres there is no
> such type as CLOB, and metastore simply saves large object ids into this
> field. More interesting is that the large object can be resolved in some
> codepaths. In case of a describe for partition it works correctly:
> {code}
> describe formatted my_table partition (col2='F');
> ...
> | Partition Parameters: | NULL
> | NULL |
> | | COLUMN_STATS_ACCURATE
> |
> {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"col1\":\"true\",\"col3\":\"true\"}}
> |
> | | numFiles
> | 1 |
> | | numRows
> | 1 |
> | | rawDataSize
> | 6 |
> | | totalSize
> | 7 |
> | | transient_lastDdlTime
> | 1561976024 |
> | | NULL
> | NULL |
> {code}
> But in case of a direct metastore query (from hive's sys schema, but the same
> result for direct postgres), it shows the result above (see sys query
> output). This is an issue when hive treats these ids as they were real
> values, but they are obviously not correct, and this causes various failures
> (e.g. using serde parameter serialization.format=28392)
> param_value values above are large object ids, according to pg_dump
> | 151 | COLUMN_STATS_ACCURATE | 28626 |
> {code}
> SELECT pg_catalog.lo_open('28626', 131072);
> SELECT pg_catalog.lowrite(0,
> '\x7b2242415349435f5354415453223a2274727565222c22434f4c554d4e5f5354415453223a7b22636f6c31223a2274727565222c22636f6c33223a2274727565227d7d');
> SELECT pg_catalog.lo_close(0);
> {code}
> decoded large object value:
> {code}
> {"BASIC_STATS":"true","COLUMN_STATS":{"col1":"true","col3":"true"}}
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)