[
https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014828#comment-16014828
]
Naveen Gangam commented on HIVE-16667:
--------------------------------------
[~rusanu] I created a new database {{apachehive}} as a new user {{apachehive}}
within postgres using {{create user apachehive createdb createuser password
'apachehive'; create database apachehive owner apachehive}}. I then installed
the hive schema for 3.0.0.
I can now see that the text columns are storing integers.
{code}
apachehive=# select * from "TBLS";
TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID
| TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
IS_REWRITE_ENABLED
--------+-------------+-------+------------------+---------+-----------+-------+------------------+----------------+--------------------+--------------------+--------------------
1 | 1495055598 | 1 | 0 | ngangam | 0 | 1
| srcpart | MANAGED_TABLE | | |
f
2 | 1495055786 | 1 | 0 | ngangam | 0 | 3
| largeserdeparams | EXTERNAL_TABLE | | |
f
3 | 1495055859 | 1 | 0 | ngangam | 0 | 4
| largetblparams | EXTERNAL_TABLE | | |
f
(3 rows)
apachehive=# select * from "TABLE_PARAMS" where "TBL_ID"=2;
TBL_ID | PARAM_KEY | PARAM_VALUE
--------+-----------------------+-------------
2 | EXTERNAL | 24881
2 | transient_lastDdlTime | 24880
(2 rows)
{code}
However, I am able to retrieve the values correctly using both JDO and
DirectSQL, without any changes to the code. Please see the results and the hive
CLI output in the attached files. I am wondering if somehow your JDO mapping
file does NOT reflect this change to CLOB so JDO layer is treating it as a
String while the value is CLOB.
Would you be able to retry by starting with a fresh DB and installing just
3.0.0 schema (instead of upgrading from an older hive release)? In my prior
tests, I was using an upgraded schema (initially installed hive1.1.0 schema and
then upgraded to hive3.0.0 using the upgrade scripts). In that database, the
{{text}} values for columns still seem to be stored in-line. I am wondering if
this was due to the fact that it was initally a {{character}} type and then
converted to {{text}}.
Thanks
> PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and
> other field is incorrect
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-16667
> URL: https://issues.apache.org/jira/browse/HIVE-16667
> Project: Hive
> Issue Type: Bug
> Reporter: Remus Rusanu
> Assignee: Naveen Gangam
>
> The CLOB JDO type introduced with HIVE-12274 does not work correctly with
> PostgreSQL. The value is written out-of-band and the LOB handle is written,as
> an INT, into the table. SELECTs return the INT value, which should had been
> read via the {{lo_get}} PG built-in, and then cast into string.
> Furthermore, the behavior is different between fields upgraded from earlier
> metastore versions (they retain their string storage) vs. values inserted
> after the upgrade (inserted as LOB roots).
> Teh code in
> {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects
> the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does
> not happen, the value is a Java String containing the int which is the LOB
> root saved by PG.
> This manifests at runtime with errors like:
> {code}
> hive> select * from srcpart;
> Failed with exception java.io.IOException:java.lang.IllegalArgumentException:
> Error: type expected at the position 0 of '24030:24031' but '24030' is found.
> {code}
> the 24030:24031 should be 'string:string'.
> repro:
> {code}
> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT
> 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO
> TABLE srcpart PARTITION (ds="2008-04-09", hr="11");
> select * from srcpart;
> {code}
> I did not see the issue being hit by non-partitioned/textfile tables, but
> that is just the luck of the path taken by the code. Inspection of my PG
> metastore shows all the CLOB fields suffering from this issue.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)