[
https://issues.apache.org/jira/browse/KUDU-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Khazar Mammadli resolved KUDU-3401.
-----------------------------------
Fix Version/s: 1.15.0
Resolution: Fixed
> Unable to query Kudu tables from Hive with Kudu HMS Integration enabled
> -----------------------------------------------------------------------
>
> Key: KUDU-3401
> URL: https://issues.apache.org/jira/browse/KUDU-3401
> Project: Kudu
> Issue Type: Bug
> Components: hms
> Reporter: Khazar Mammadli
> Assignee: Khazar Mammadli
> Priority: Major
> Fix For: 1.15.0
>
>
> When Kudu HMS integration is enabled, a table created from Impala with
> "stored as kudu" is missing several fields in its Hive Metastore entry. As a
> result, querying the table from Hive fails with a ClassNotFoundException:
>
> {code:java}
> ERROR : Failed
> org.apache.hadoop.hive.metastore.api.MetaException: java.lang.ClassNotFoundException Class not found
> at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:98) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
> at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:331) ~[hive-exec-3.1.3000.7.1.7.1000-141.jar:3.1.3000.7.1.7.1000-141]
> {code}
>
> Running the following sample query in Impala with Kudu HMS integration
> enabled creates the table with the InputFormat, OutputFormat and SerDe
> Library fields missing:
>
> {code:java}
> create table default.kudu_test (
> col1 string comment 'col1',
> col2 string comment 'col2',
> primary key (col1)
> )
> comment 'kudu_test'
> stored as kudu;{code}
>
> |SerDe Library:| |NULL|
> |InputFormat:| |NULL|
> |OutputFormat:| |NULL|
> Hive Metastore log for the table creation:
> INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-5-thread-124]:
> 134: source:172.25.35.0 create_table: Table(tableName:kudu_test,
> dbName:default, owner:root, createTime:0, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:col1, type:string, comment:col1),
> FieldSchema(name:col2, type:string, comment:col2)], location:, inputFormat:,
> outputFormat:, compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:,
> serializationLib:, parameters:{}), bucketCols:[], sortCols:[],
> parameters:{}), partitionKeys:[],
> parameters:{kudu.table_name=default.kudu_test,
> kudu.table_id=5ac46856863f402fb69941ce7b967945, comment=,
> kudu.master_addresses=c3549-node2.coelab.cloudera.com:7051,
> storage_handler=org.apache.hadoop.hive.kudu.KuduStorageHandler,
> kudu.cluster_id=65c8dfbc8b75485db1328ab42f55fa07}, viewOriginalText:,
> viewExpandedText:, tableType:MANAGED_TABLE, temporary:false, ownerType:USER)
> Running the same query in Impala with Kudu HMS integration disabled, on the
> other hand, populates these fields when the table is created:
> |SerDe Library:|org.apache.hadoop.hive.kudu.KuduSerDe|NULL|
> |InputFormat:|org.apache.hadoop.hive.kudu.KuduInputFormat|NULL|
> |OutputFormat:|org.apache.hadoop.hive.kudu.KuduOutputFormat|NULL|
> Hive Metastore log for table creation:
> INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-5-thread-173]:
> 183: source:172.25.35.0 create_table_req: Table(tableName:kudu_test,
> dbName:default, owner:root, createTime:0, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:col1, type:string, comment:col1),
> FieldSchema(name:col2, type:string, comment:col2)], location:null,
> inputFormat:org.apache.hadoop.hive.kudu.KuduInputFormat,
> outputFormat:org.apache.hadoop.hive.kudu.KuduOutputFormat, compressed:false,
> numBuckets:0, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.kudu.KuduSerDe, parameters:{}),
> bucketCols:[], sortCols:[], parameters:null), partitionKeys:[],
> parameters:{comment=kudu_test_lbodor_no_hms_integration,
> kudu.master_addresses=c3549-node2.coelab.cloudera.com,
> storage_handler=org.apache.hadoop.hive.kudu.KuduStorageHandler,
> kudu.table_name=impala::default.kudu_test}, viewOriginalText:null,
> viewExpandedText:null, tableType:MANAGED_TABLE, catName:hive, ownerType:USER,
> accessType:8)
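
The difference between the two log entries above can be checked mechanically. The following is a minimal sketch, assuming the `name:value` layout shown in those entries; the helper name `missing_sd_fields` and the abbreviated log excerpts are illustrative and not part of Kudu, Impala, or Hive.

```python
import re
from typing import List

# Storage-descriptor fields that Hive needs in order to load a table.
REQUIRED_FIELDS = ("inputFormat", "outputFormat", "serializationLib")

# Matches "name:value" pairs as printed in the create_table log entry.
# The value may be empty, as in "inputFormat:,".
FIELD_RE = re.compile(r"\b(inputFormat|outputFormat|serializationLib):([^,)\s]*)")


def missing_sd_fields(log_entry: str) -> List[str]:
    """Return the required fields that are absent, empty, or "null"."""
    found = dict(FIELD_RE.findall(log_entry))
    return [f for f in REQUIRED_FIELDS if found.get(f, "") in ("", "null")]


# Abbreviated excerpts of the two log entries (column list shortened).
hms_enabled = (
    "sd:StorageDescriptor(cols:[], location:, inputFormat:, outputFormat:, "
    "compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:, "
    "serializationLib:, parameters:{}))"
)
hms_disabled = (
    "sd:StorageDescriptor(cols:[], location:null, "
    "inputFormat:org.apache.hadoop.hive.kudu.KuduInputFormat, "
    "outputFormat:org.apache.hadoop.hive.kudu.KuduOutputFormat, "
    "compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, "
    "serializationLib:org.apache.hadoop.hive.kudu.KuduSerDe, parameters:{}))"
)

print(missing_sd_fields(hms_enabled))   # ['inputFormat', 'outputFormat', 'serializationLib']
print(missing_sd_fields(hms_disabled))  # []
```
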
> --------------------------------
> Code path for table creation when Kudu HMS integration is enabled (Kudu
> code path):
> Quick recap of the steps when creating a Kudu table:
> HMSCatalog::CreateTable() -> hive::Table declared and passed to
> PopulateTable(… , &table) -> Thrift client Execute call ->
> HMSClient::CreateTable(table (the one that was just populated),
> envcontext (default)) ->
> hms_client.create_table_with_environment_context(table, envcontext).
> CreateTable
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_catalog.cc#L146]
> ->
> Populate the fields of table
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_catalog.cc#L367]
> Hms client call
> [https://github.com/apache/kudu/blob/master/src/kudu/hms/hms_client.cc#L280]
> -----------------------------
> Code path for table creation when Kudu HMS integration is disabled (Impala
> code path):
> CreateTable -> CreateMetaStoreTable
> [https://github.com/apache/impala/blob/da3d6fc7f7c656b118bb3570cedf7d7c3158bd0b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L3191]
> -> line 3248: tbl.setSd(createSd(params));
> CreateSd
> [https://github.com/apache/impala/blob/da3d6fc7f7c656b118bb3570cedf7d7c3158bd0b/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L3260|https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/fe/src/main/java/org/apache/impala/catalog/HiveStorageDescriptorFactory.java#L36]
>
> Comparing the two code paths shows that, when the table is created without
> Kudu HMS integration (through Impala), CreateSd fills in the missing fields
> with default values.
> When Kudu HMS integration is enabled and the table is created through the
> Kudu code path, these fields are left unset.
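
The behaviour described above, where unset storage-descriptor fields receive Kudu defaults, can be sketched as follows. This is only an illustration under stated assumptions, not the actual change made in Kudu (whose HMS catalog code is C++): the `StorageDescriptor` dataclass and `populate_kudu_defaults` are hypothetical stand-ins, and the class names are the ones shown for the table created with HMS integration disabled.

```python
from dataclasses import dataclass

# Class names taken from the table created with HMS integration disabled.
KUDU_INPUT_FORMAT = "org.apache.hadoop.hive.kudu.KuduInputFormat"
KUDU_OUTPUT_FORMAT = "org.apache.hadoop.hive.kudu.KuduOutputFormat"
KUDU_SERDE_LIB = "org.apache.hadoop.hive.kudu.KuduSerDe"


@dataclass
class StorageDescriptor:
    """Hypothetical stand-in holding only the three fields discussed here."""
    input_format: str = ""
    output_format: str = ""
    serialization_lib: str = ""


def populate_kudu_defaults(sd: StorageDescriptor) -> StorageDescriptor:
    """Fill each unset field with its Kudu default; keep values already set."""
    return StorageDescriptor(
        input_format=sd.input_format or KUDU_INPUT_FORMAT,
        output_format=sd.output_format or KUDU_OUTPUT_FORMAT,
        serialization_lib=sd.serialization_lib or KUDU_SERDE_LIB,
    )


# An empty descriptor, as produced by the Kudu code path in this report.
fixed = populate_kudu_defaults(StorageDescriptor())
print(fixed.serialization_lib)  # org.apache.hadoop.hive.kudu.KuduSerDe
```

Keeping values that are already set means a descriptor populated elsewhere, for example by Impala, passes through unchanged.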