[ 
https://issues.apache.org/jira/browse/IMPALA-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980636#comment-16980636
 ] 

Anurag Mantripragada commented on IMPALA-9188:
----------------------------------------------

I think the bug is at this line: 
[https://github.com/apache/impala/blob/e716e76cccf59c2780571429b1b945d6bbc61b8d/fe/src/main/java/org/apache/impala/analysis/TableDef.java#L497]

For a composite primary key like (id, year), we generate a unique constraint 
name for each column, whereas all columns of the key should share the same 
constraint name. In Hive, the comparator sorts first by constraint name and 
then by key_seq when the constraint names are equal. This is why the Hive 
comparator gives different results. We should generate a new name only when 
key_seq is 1; otherwise, we should reuse the existing constraint name. We 
already do something similar for foreign keys.

[https://github.com/apache/impala/blob/e716e76cccf59c2780571429b1b945d6bbc61b8d/fe/src/main/java/org/apache/impala/analysis/TableDef.java#L565]
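
To illustrate the ordering problem described above, here is a minimal, 
self-contained sketch. It is not the Impala or Hive code: PkColumn, 
PkConstraintNames, build() and sortedColumns() are hypothetical names made up 
for this example, and the constraint names are fixed strings standing in for 
generated ones. It only assumes a comparator that orders by constraint name 
first and key_seq second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for one column of a primary key definition.
final class PkColumn {
  final String column;
  final int keySeq;            // 1-based position of the column within the key
  final String constraintName;

  PkColumn(String column, int keySeq, String constraintName) {
    this.column = column;
    this.keySeq = keySeq;
    this.constraintName = constraintName;
  }
}

final class PkConstraintNames {
  // Builds the key columns. With shareName=false every column gets a fresh
  // name (the behaviour suspected of causing the bug); with shareName=true a
  // name is generated only when key_seq is 1 and reused for later columns.
  static List<PkColumn> build(List<String> cols, Supplier<String> newName,
      boolean shareName) {
    List<PkColumn> out = new ArrayList<>();
    String name = null;
    for (int i = 0; i < cols.size(); i++) {
      int keySeq = i + 1;
      if (keySeq == 1 || !shareName) name = newName.get();
      out.add(new PkColumn(cols.get(i), keySeq, name));
    }
    return out;
  }

  // Sorts by constraint name, then by key_seq, and returns the column order.
  static List<String> sortedColumns(List<PkColumn> key) {
    List<PkColumn> copy = new ArrayList<>(key);
    copy.sort(Comparator.comparing((PkColumn c) -> c.constraintName)
        .thenComparingInt(c -> c.keySeq));
    List<String> cols = new ArrayList<>();
    for (PkColumn c : copy) cols.add(c.column);
    return cols;
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("id", "year");

    // Fixed names standing in for generated ones; "pk_b" sorts after "pk_a".
    Iterator<String> perColumn = Arrays.asList("pk_b", "pk_a").iterator();
    System.out.println(sortedColumns(build(cols, perColumn::next, false)));
    // prints [year, id]

    Iterator<String> shared = Arrays.asList("pk_b", "pk_a").iterator();
    System.out.println(sortedColumns(build(cols, shared::next, true)));
    // prints [id, year]
  }
}
```

With one name per column, the column order after sorting depends on how the 
generated names happen to compare, which would explain a key reported as 
(year, id) when (id, year) was declared. With a shared name, key_seq alone 
decides the order.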

> Dataload is failing when USE_CDP_HIVE=true
> ------------------------------------------
>
>                 Key: IMPALA-9188
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9188
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Sahil Takiar
>            Assignee: Anurag Mantripragada
>            Priority: Critical
>
> When USE_CDP_HIVE=true, Impala builds are failing during dataload when 
> creating tables with PK/FK constraints.
> The error is:
> {code:java}
> ERROR: CREATE EXTERNAL TABLE IF NOT EXISTS 
> functional_seq_record_snap.child_table (
> seq int, id int, year string, a int, primary key(seq) DISABLE NOVALIDATE 
> RELY, foreign key
> (id, year) references functional_seq_record_snap.parent_table(id, year) 
> DISABLE NOVALIDATE
> RELY, foreign key(a) references functional_seq_record_snap.parent_table_2(a) 
> DISABLE
> NOVALIDATE RELY)
> row format delimited fields terminated by ','
> LOCATION '/test-warehouse/child_table'
> Traceback (most recent call last):
>   File "Impala/bin/load-data.py", line 208, in exec_impala_query_from_file
>     result = impala_client.execute(query)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 187, in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 362, in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
>   File "Impala/tests/beeswax/impala_beeswax.py", line 356, in 
> execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
>   File "Impala/tests/beeswax/impala_beeswax.py", line 519, in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> ImpalaBeeswaxException: ImpalaBeeswaxException:
>  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
>  MESSAGE: ImpalaRuntimeException: Error making 'createTable' RPC to Hive 
> Metastore:
> CAUSED BY: MetaException: Foreign key references id:int;year:string; but no 
> corresponding primary key or unique key exists. Possible keys: 
> [year:string;id:int;]{code}
> The corresponding error in HMS is:
> {code:java}
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] metastore.HiveMetaStore: 
> 18: source:127.0.0.1 create_table_req: Table(tableName:child_table, 
> dbName:functional_seq_record_gzip, owner:jenkins, createTime:0, 
> lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), 
> FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, 
> type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], 
> location:hdfs://localhost:20500/test-warehouse/child_table, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=,, field.delim=,}), bucketCols:null, 
> sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
> OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, 
> viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, 
> ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] HiveMetaStore.audit: 
> ugi=jenkins      ip=127.0.0.1    cmd=source:127.0.0.1 create_table_req: 
> Table(tableName:child_table, dbName:functional_seq_record_gzip, 
> owner:jenkins, createTime:0, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), 
> FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, 
> type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], 
> location:hdfs://localhost:20500/test-warehouse/child_table, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=,, field.delim=,}), bucketCols:null, 
> sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
> OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, 
> viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, 
> ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] 
> metastore.MetastoreDefaultTransformer: Starting translation for CreateTable 
> for processor Impala3.4.0-SNAPSHOT@localhost with [EXTWRITE, EXTREAD, 
> HIVEMANAGEDINSERTREAD, HIVEMANAGEDINSERTWRITE, HIVESQL, HIVEMQT, HIVEBUCKET2] 
> on table child_table
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] 
> metastore.MetastoreDefaultTransformer: Table to be created is of type 
> EXTERNAL_TABLE but not MANAGED_TABLE
> 2019-11-22T06:36:59,937  INFO [pool-10-thread-13] 
> metastore.MetastoreDefaultTransformer: Transformer returning 
> table:Table(tableName:child_table, dbName:functional_seq_record_gzip, 
> owner:jenkins, createTime:0, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:seq, type:int, comment:null), 
> FieldSchema(name:id, type:int, comment:null), FieldSchema(name:year, 
> type:string, comment:null), FieldSchema(name:a, type:int, comment:null)], 
> location:hdfs://localhost:20500/test-warehouse/child_table, 
> inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> parameters:{serialization.format=,, field.delim=,}), bucketCols:null, 
> sortCols:null, parameters:null), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
> OBJCAPABILITIES=EXTREAD,EXTWRITE}, viewOriginalText:null, 
> viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, 
> ownerType:USER, accessType:8)
> 2019-11-22T06:36:59,945 ERROR [pool-10-thread-13] 
> metastore.RetryingHMSHandler: MetaException(message:Foreign key references 
> id:int;year:string; but no corresponding primary key or unique key exists. 
> Possible keys: [year:string;id:int;])
>         at 
> org.apache.hadoop.hive.metastore.ObjectStore.addForeignKeys(ObjectStore.java:4968)
>         at 
> org.apache.hadoop.hive.metastore.ObjectStore.createTableWithConstraints(ObjectStore.java:1289)
>         at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>         at com.sun.proxy.$Proxy27.createTableWithConstraints(Unknown Source)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2220)
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2404)
>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>         at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>         at com.sun.proxy.$Proxy34.create_table_req(Unknown Source)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:16107)
>         at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:16091)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>         at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>         at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Looks like this was caused by IMPALA-9104.


