[
https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009447#comment-16009447
]
Remus Rusanu commented on HIVE-12274:
-------------------------------------
[~ngangam] I'm having problems with PostgreSQL metastore after these changes.
The CLOB fields is saved by the PG driver as an int handle (see
https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgPreparedStatement.javaL1239):
{code}
metastore=# select *, CAST(lo_get(CAST("TYPE_NAME" as bigint)) as TEXT) from
"COLUMNS_V2" LIMIT 1;
CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX | lo_get
-------+---------+-------------+-----------+-------------+----------
2 | | customer | 21664 | 0 | \x696e74
(1 row)
metastore=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.2 on x86_64-apple-darwin15.6.0, compiled by Apple LLVM version
8.0.0 (clang-800.0.42.1), 64-bit
{code}
This causes runtime failures on a partitioned table:
{code}
hive> select * from srcpart;
Failed with exception java.io.IOException:java.lang.IllegalArgumentException:
Error: type expected at the position 0 of '24030:24031' but '24030' is found.
{code}
the {{24030:24031}} should be {{string:string}} but is not translated through
lo_get by {{MetasoreDirectSql.getPartitionsFromPartitionIds}}
{code}
// Get FieldSchema stuff if any.
if (!colss.isEmpty()) {
// We are skipping the CDS table here, as it seems to be totally useless.
queryText = "select \"CD_ID\", \"COMMENT\", \"COLUMN_NAME\",
\"TYPE_NAME\""
+ " from \"COLUMNS_V2\" where \"CD_ID\" in (" + colIds + ") and
\"INTEGER_IDX\" >= 0"
+ " order by \"CD_ID\" asc, \"INTEGER_IDX\" asc";
loopJoinOrderedResult(colss, queryText, 0, new
ApplyFunc<List<FieldSchema>>() {
@Override
public void apply(List<FieldSchema> t, Object[] fields) {
t.add(new FieldSchema((String)fields[2], extractSqlClob(fields[3]),
(String)fields[1]));
}});
{code}
> Increase width of columns used for general configuration in the metastore.
> --------------------------------------------------------------------------
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 2.0.0
> Reporter: Elliot West
> Assignee: Naveen Gangam
> Labels: metastore
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch,
> HIVE-12274.4.patch, HIVE-12274.5.patch, HIVE-12274.example.ddl.hql,
> HIVE-12274.patch
>
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit
> when processing JSON data that has a large nested schema. The struct
> definition is truncated when inserted into the metastore database column
> {{COLUMNS_V2.YPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined
> configuration values it seems rather limiting to impose such a relatively low
> length bound. One can imagine that valid use cases will arise where
> reasonable parameter/property values exceed the current limit.
> h2. Context
> These limitations were in by the [patch
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
> to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}.
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype.
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support upto 65535 bytes for the entire row. So long as the
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good.
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be upto 32672 bytes.
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}}
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated
> virtual columns causes the index key to exceed index key length limits.
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}},
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns
> * So the limitation could be raised to 32672 bytes, with the caveat that
> MySQL and SQL Server limit the row length to 65535 bytes, so that should also
> be validated to provide consistency.
> Finally, will this limitation persist in the work resulting from HIVE-9452?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)