[ https://issues.apache.org/jira/browse/IMPALA-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845053#comment-17845053 ]
ASF subversion and git services commented on IMPALA-10451:
----------------------------------------------------------

Commit d1d28c0eec52c212b4b8cd09080327c542f24bfa in impala's branch refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d1d28c0ee ]

IMPALA-10451: Fix avro table loading failures caused by HIVE-24157

HIVE-24157 introduces a restriction to prohibit casting DATE/TIMESTAMP types
to and from NUMERIC. It's enabled by default and can be turned off by
set hive.strict.timestamp.conversion=false. This restriction breaks the data
loading on avro_coldef and avro_extra_coldef tables, which results in empty
data set and finally fails TestAvroSchemaResolution.test_avro_schema_resolution.

This patch explicitly disables the restriction in loading these two avro
tables.

The Hive version currently used for development does not have HIVE-24157,
but upstream Hive does have it. Adding hive.strict.timestamp.conversion does
not cause problems for Hive versions that don't have HIVE-24157.

Tests:
 - Run the data loading and test_avro_schema_resolution locally using
   a Hive that has HIVE-24157.
 - Run CORE tests
 - Run data loading with a Hive that doesn't have HIVE-24157.

Change-Id: I3e2a47d60d4079fece9c04091258215f3d6a7b52
Reviewed-on: http://gerrit.cloudera.org:8080/21413
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Joe McDonnell <joemcdonn...@cloudera.com>

> TestAvroSchemaResolution.test_avro_schema_resolution fails when bumping Hive to have HIVE-24157
> -----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10451
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10451
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Assignee: Joe McDonnell
>            Priority: Major
>
> TestAvroSchemaResolution.test_avro_schema_resolution recently fails when
> building against a Hive version with HIVE-24157.
> {code:java}
> query_test.test_avro_schema_resolution.TestAvroSchemaResolution.test_avro_schema_resolution[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: avro/snap/block] (from pytest)
> query_test/test_avro_schema_resolution.py:36: in test_avro_schema_resolution
>     self.run_test_case('QueryTest/avro-schema-resolution', vector, unique_database)
> common/impala_test_suite.py:690: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:523: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:456: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     10 != 0
> {code}
> The failed query is
> {code:sql}
> select count(*) from functional_avro_snap.avro_coldef
> {code}
> The cause is that data loading for avro_coldef failed. The DML is
> {code:sql}
> INSERT OVERWRITE TABLE avro_coldef PARTITION(year=2014, month=1)
> SELECT bool_col, tinyint_col, smallint_col, int_col, bigint_col,
> float_col, double_col, date_string_col, string_col, timestamp_col
> FROM (select * from functional.alltypes order by id limit 5) a;
> {code}
> The failure (found in HS2) is:
> {code}
> 2021-01-24T01:52:16,340 ERROR [9433ee64-d706-4fa4-a146-18d71bf17013 HiveServer2-Handler-Pool: Thread-4946] parse.CalcitePlanner: CBO failed, skipping CBO.
> org.apache.hadoop.hive.ql.exec.UDFArgumentException: Casting DATE/TIMESTAMP types to NUMERIC is prohibited (hive.strict.timestamp.conversion)
>     at org.apache.hadoop.hive.ql.udf.TimestampCastRestrictorResolver.getEvalMethod(TimestampCastRestrictorResolver.java:62) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:168) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:149) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:260) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:292) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDescWithUdfData(TypeCheckProcFactory.java:987) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.ParseUtils.createConversionCast(ParseUtils.java:163) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8551) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7908) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11100) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10972) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11901) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11771) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:593) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12678) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:423) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:288) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:221) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:194) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:607) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:553) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:547) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:274) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:565) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:551) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:567) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.3000.7.1.6.0-169.jar:3.1.3000.7.1.6.0-169]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_144]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_144]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> {code}
> This check is introduced in HIVE-24157. Describe on the table shows the
> timestamp_col is bigint:
> {code:sql}
> 0: jdbc:hive2://localhost:11050> desc avro_coldef;
> INFO : Compiling command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2): desc avro_coldef
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
> INFO : Completed compiling command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2); Time taken: 0.016 seconds
> INFO : Executing command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2): desc avro_coldef
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing command(queryId=systest_20210125012100_83dadafd-8e20-4a45-8dd2-54d3a6f4b6e2); Time taken: 0.008 seconds
> INFO : OK
> +--------------------------+------------+----------+
> | col_name                 | data_type  | comment  |
> +--------------------------+------------+----------+
> | bool_col                 | boolean    |          |
> | tinyint_col              | int        |          |
> | smallint_col             | int        |          |
> | int_col                  | int        |          |
> | bigint_col               | bigint     |          |
> | float_col                | float      |          |
> | double_col               | double     |          |
> | date_string_col          | string     |          |
> | string_col               | string     |          |
> | timestamp_col            | bigint     |          |
> | year                     | int        |          |
> | month                    | int        |          |
> |                          | NULL       | NULL     |
> | # Partition Information  | NULL       | NULL     |
> | # col_name               | data_type  | comment  |
> | year                     | int        |          |
> | month                    | int        |          |
> +--------------------------+------------+----------+
> {code}
> This hits the restriction.
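
For illustration, the workaround named in the commit message could look roughly like the following when applied to the avro_coldef load quoted in the issue. This is only a sketch: it assumes the setting is issued as a session-level SET statement directly before the INSERT, and this message does not show where the actual patch places it. The same assumption would apply to the avro_extra_coldef load, whose DML is not quoted here.

```sql
-- Assumption: issued in the same Hive session, ahead of the load statement.
-- Per the commit message, Hive versions without HIVE-24157 are not affected
-- by this setting.
SET hive.strict.timestamp.conversion=false;

-- Load DML as quoted in the issue description. timestamp_col is a TIMESTAMP
-- in functional.alltypes but a bigint in avro_coldef, so Hive adds an
-- implicit TIMESTAMP-to-NUMERIC cast here.
INSERT OVERWRITE TABLE avro_coldef PARTITION(year=2014, month=1)
SELECT bool_col, tinyint_col, smallint_col, int_col, bigint_col,
       float_col, double_col, date_string_col, string_col, timestamp_col
FROM (select * from functional.alltypes order by id limit 5) a;
```

With the restriction left at its default, the same INSERT fails during semantic analysis with the UDFArgumentException shown in the HS2 log, which leaves the table empty and makes the count(*) check return 0 instead of 10.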