[
https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014175#comment-16014175
]
Naveen Gangam commented on HIVE-16667:
--------------------------------------
[~rusanu] I have run some tests with my postgres installation running on
{{Linux <snip> 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015
x86_64 x86_64 x86_64 GNU/Linux}} and the CLOB values inserted into the database
are inline.
{code}
hive1=> \d "TABLE_PARAMS"
Table "public.TABLE_PARAMS"
Column | Type | Modifiers
-------------+------------------------+---------------------------------
TBL_ID | bigint | not null
PARAM_KEY | character varying(256) | not null
PARAM_VALUE | text | default NULL::character varying
Indexes:
"TABLE_PARAMS_pkey" PRIMARY KEY, btree ("TBL_ID", "PARAM_KEY")
"TABLE_PARAMS_N49" btree ("TBL_ID")
Foreign-key constraints:
"TABLE_PARAMS_TBL_ID_fkey" FOREIGN KEY ("TBL_ID") REFERENCES
"TBLS"("TBL_ID") DEFERRABLE
hive1=> select "PARAM_VALUE" from "TABLE_PARAMS";
1
true
0
46055
1494964727
0
1
46069
true
1494964728
0
0
1
true
15812
1494964729
0
0
1494964729
STRING,myField2 : STRING, myField3 : STRING, myField4 : STRING, myField5 :
STRING, myField6 : STRING, myField7 : STRING, myField8 : STRING, myField9 :
STRING, myField10 : STRING, myField11 : STRING, my
Field12 : STRING, myField13 : STRING, myField14 : STRING, myField15 : STRING,
myField16 : STRING, myField17 : STRING, myField18 : STRING, myField19 : STRING,
myField20 : STRING, myField21 : STRING, myFi
eld22 : STRING, myField23 : STRING, myField24 : STRING, myField25 : STRING,
{code}
The last value has been snipped for brevity but it is a long value beyond the
prior 4000 character limit. I am not sure why my "text" values are inline
whereas your values show an integer. Could this be a database config setting?
Thanks
> PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and
> other field is incorrect
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-16667
> URL: https://issues.apache.org/jira/browse/HIVE-16667
> Project: Hive
> Issue Type: Bug
> Reporter: Remus Rusanu
> Assignee: Naveen Gangam
>
> The CLOB JDO type introduced with HIVE-12274 does not work correctly with
> PostgreSQL. The value is written out-of-band and the LOB handle is written,as
> an INT, into the table. SELECTs return the INT value, which should had been
> read via the {{lo_get}} PG built-in, and then cast into string.
> Furthermore, the behavior is different between fields upgraded from earlier
> metastore versions (they retain their string storage) vs. values inserted
> after the upgrade (inserted as LOB roots).
> Teh code in
> {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects
> the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does
> not happen, the value is a Java String containing the int which is the LOB
> root saved by PG.
> This manifests at runtime with errors like:
> {code}
> hive> select * from srcpart;
> Failed with exception java.io.IOException:java.lang.IllegalArgumentException:
> Error: type expected at the position 0 of '24030:24031' but '24030' is found.
> {code}
> the 24030:24031 should be 'string:string'.
> repro:
> {code}
> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT
> 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO
> TABLE srcpart PARTITION (ds="2008-04-09", hr="11");
> select * from srcpart;
> {code}
> I did not see the issue being hit by non-partitioned/textfile tables, but
> that is just the luck of the path taken by the code. Inspection of my PG
> metastore shows all the CLOB fields suffering from this issue.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)