Remus Rusanu created HIVE-16667:
-----------------------------------

             Summary: PostgreSQL metastore handling of CLOB types for 
COLUMNS_V2.TYPE_NAME and other field is incorrect
                 Key: HIVE-16667
                 URL: https://issues.apache.org/jira/browse/HIVE-16667
             Project: Hive
          Issue Type: Bug
            Reporter: Remus Rusanu


The CLOB JDO type introduced with HIVE-12274 does not work correctly with 
PostgreSQL. The value is written out-of-band and the LOB handle is written,as 
an INT, into the table. SELECTs return the INT value, which should had been 
read via the {{lo_get}} PG built-in, and then cast into string.

Furthermore, the behavior is different between fields upgraded from earlier 
metastore versions (they retain their string storage) vs. values inserted after 
the upgrade (inserted as LOB roots).

Teh code in {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} 
expects the underlying JDO/Datanucleus to map the column to a {{Clob}} but that 
does not happen, the value is a Java String containing the int which is the LOB 
root saved by PG.

This manifests at runtime with errors like:
{code}
hive> select * from srcpart;
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: 
Error: type expected at the position 0 of '24030:24031' but '24030' is found.
{code}

the 24030:24031 should be 'string:string'.

repro:

{code}
CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 
'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO TABLE 
srcpart PARTITION (ds="2008-04-09", hr="11");
select * from srcpart;
{code}

I did not see the issue being hit by non-partitioned/textfile tables, but that 
is just the luck of the path taken by the code. Inspection of my PG metastore 
shows all the CLOB fields suffering from this issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to