[jira] [Resolved] (IMPALA-10225) Bump Impyla version
[ https://issues.apache.org/jira/browse/IMPALA-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-10225.
--
Fix Version/s: Impala 4.0
Resolution: Fixed

> Bump Impyla version
> ---
>
> Key: IMPALA-10225
> URL: https://issues.apache.org/jira/browse/IMPALA-10225
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Fix For: Impala 4.0
>
> There are a couple of new Impyla releases that we can test out in Impala's
> end-to-end test environment - https://pypi.org/project/impyla/0.17a1/#history

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10225) Bump Impyla version
[ https://issues.apache.org/jira/browse/IMPALA-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211523#comment-17211523 ]

ASF subversion and git services commented on IMPALA-10225:
--

Commit b8a2b754669eb7f8d164e8112e594ac413e436ef in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b8a2b75 ]

IMPALA-10225: bump impyla version to 0.17a1

Update a couple of tests with the new improved error messages.

Change-Id: I70a0e883275f3c29e2b01fd5bab7725857c8a1ed
Reviewed-on: http://gerrit.cloudera.org:8080/16562
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
[jira] [Commented] (IMPALA-9792) Split Kudu scan ranges into smaller chunks for greater parallelism
[ https://issues.apache.org/jira/browse/IMPALA-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211507#comment-17211507 ]

ASF subversion and git services commented on IMPALA-9792:
--

Commit 2fd6f5bc5aa6b50e36547e52657c1117637384b6 in impala's branch refs/heads/master from Bikramjeet Vig
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2fd6f5b ]

IMPALA-9792: Add ability to split kudu scan ranges

This patch adds the ability to split Kudu scan tokens via the provided
Kudu Java API. A query option "TARGETED_KUDU_SCAN_RANGE_LENGTH" has been
added to set the scan range length used in this implementation.

Potential benefit: This helps increase parallelism during scanning, which
can result in more efficient use of CPU with higher mt_dop.

Limitations:
- The scan range length sent to Kudu is just a hint and does not guarantee
  that the token will be split at that limit.
- Comes at the added cost of an RPC to a tablet server per token in order
  to split it. A slow tablet server, which can already slow down scanning
  during execution, can now also potentially slow down planning.
- Also adds the cost of an RPC per token to open a new scanner for it on
  the Kudu side. Therefore, scanning many smaller split tokens can slow
  down scanning, and we can also lose the benefit of scanning a single
  large token sequentially with a single scanner.

Testing:
- Added an e2e test

Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8
Reviewed-on: http://gerrit.cloudera.org:8080/16385
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Split Kudu scan ranges into smaller chunks for greater parallelism
> ---
>
> Key: IMPALA-9792
> URL: https://issues.apache.org/jira/browse/IMPALA-9792
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Bikramjeet Vig
> Priority: Major
> Labels: kudu, multithreading
>
> We currently use one thread to scan each tablet, which may underparallelise
> queries in many cases. Kudu added an API in KUDU-2437 and KUDU-2670 to split
> tokens at a finer granularity. See
> https://github.com/apache/kudu/commit/22a6faa44364dec3a171ec79c15b814ad9277d8f#diff-a4afa9dba99c7612b2cb9176134ff2b0
> The major downside is that the planner has to do an extra RPC to a tserver
> for each tablet being scanned in order to figure out key range splits. Maybe
> we can tie this to mt_dop >= 2, or use some heuristics to avoid these RPCs
> for smaller tables.
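The splitting idea described above can be illustrated with a small sketch. This is not the actual Kudu scan-token API (which splits tokens by key range via a tablet-server RPC); it is a hypothetical pure-Python helper, with made-up names (`split_scan_range`, `target_len`), that just shows how one large range chopped against a target length yields more units of work for parallel scanners, and why the last chunk may fall short of the hint:

```python
def split_scan_range(start, end, target_len):
    """Split the half-open range [start, end) into chunks of at most
    target_len units. The last chunk may be shorter, mirroring how a
    scan range length hint does not guarantee equal-sized splits."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    chunks = []
    lo = start
    while lo < end:
        hi = min(lo + target_len, end)  # clamp the final chunk at `end`
        chunks.append((lo, hi))
        lo = hi
    return chunks

# A 1000-unit range with a 300-unit hint yields four chunks, each of
# which could in principle be handed to a separate scanner thread.
print(split_scan_range(0, 1000, 300))
```

Each extra chunk is also an extra scanner to open, which is the cost/benefit trade-off the commit message calls out.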
[jira] [Updated] (IMPALA-10230) column stats num_nulls less than -1
[ https://issues.apache.org/jira/browse/IMPALA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

logan zheng updated IMPALA-10230:
--

Description:
After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, running "increment stats default.test partition(xx=)" fails with:

{noformat}
ERROR: TableLoadingException: Failed to load metadata for table: default.test
CAUSED BY: IllegalStateException: ColumnStats{avgSize_=13.0, avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}
{noformat}

The table default.test already existed in Impala 3.2.0, had been running for a long time, and had stats computed.

was:
After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, running "increment stats default.test partition(xx=)" fails with:

{noformat}
ERROR: TableLoadingException: Failed to load metadata for table: default.test
CAUSED BY: IllegalStateException: ColumnStats{avgSize_=13.0, avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}
{noformat}

The table default.test already existed in Impala 3.2.0, had been running for a long time, and had stats computed.

> column stats num_nulls less than -1
> ---
>
> Key: IMPALA-10230
> URL: https://issues.apache.org/jira/browse/IMPALA-10230
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 3.4.0
> Reporter: logan zheng
> Priority: Critical
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, running
> "increment stats default.test partition(xx=)" fails with:
> {noformat}
> ERROR: TableLoadingException: Failed to load metadata for table: default.test
> CAUSED BY: IllegalStateException: ColumnStats{avgSize_=13.0,
> avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}{noformat}
> The table default.test already existed in Impala 3.2.0, had been running
> for a long time, and had stats computed.
[jira] [Updated] (IMPALA-9812) Remove --unlock_mt_dop and --mt_dop_auto_fallback
[ https://issues.apache.org/jira/browse/IMPALA-9812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-9812:
--
Parent: IMPALA-8965
Issue Type: Sub-task (was: Task)

> Remove --unlock_mt_dop and --mt_dop_auto_fallback
> ---
>
> Key: IMPALA-9812
> URL: https://issues.apache.org/jira/browse/IMPALA-9812
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Tim Armstrong
> Priority: Minor
>
> These flags will become ineffective when DML is supported. We should clean up
> all references and move them to the flag graveyard.
[jira] [Commented] (IMPALA-10230) column stats num_nulls less than -1
[ https://issues.apache.org/jira/browse/IMPALA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211486#comment-17211486 ]

Tim Armstrong commented on IMPALA-10230:
--

[~logan zheng] also if you can give us standalone steps to reproduce the issue, that would probably help. I tried reproducing on my system but wasn't able to - I assume the data or partition layouts are somehow different.
[jira] [Updated] (IMPALA-10230) column stats num_nulls less than -1
[ https://issues.apache.org/jira/browse/IMPALA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-10230:
--
Target Version: Impala 4.0
[jira] [Commented] (IMPALA-10230) column stats num_nulls less than -1
[ https://issues.apache.org/jira/browse/IMPALA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211481#comment-17211481 ]

Tim Armstrong commented on IMPALA-10230:
--

[~logan zheng] do you have the full IllegalStateException stacktrace from the catalogd logs?
[jira] [Commented] (IMPALA-8751) Kudu tables cannot be found after created
[ https://issues.apache.org/jira/browse/IMPALA-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211428#comment-17211428 ]

Grant Henke commented on IMPALA-8751:
--

A Kudu side fix was merged and could be pulled in to fix these test failures:
https://github.com/apache/kudu/commit/6b20440f4c51a6b69c1382db51139bf8d3467b05

> Kudu tables cannot be found after created
> ---
>
> Key: IMPALA-8751
> URL: https://issues.apache.org/jira/browse/IMPALA-8751
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.3.0
> Reporter: Yongzhi Chen
> Priority: Major
>
> For example in some kudu integration test as:
> TestKuduHMSIntegration.test_drop_db_cascade in custom_cluster/test_kudu.py
> It failed with:
> custom_cluster/test_kudu.py:239: in test_drop_db_cascade
> assert not kudu_client.table_exists(kudu_table.name)
> /usr/lib/python2.7/contextlib.py:35: in __exit__
> self.gen.throw(type, value, traceback)
> common/kudu_test_suite.py:165: in temp_kudu_table
> kudu.delete_table(name)
> kudu/client.pyx:392: in kudu.client.Client.delete_table (kudu/client.cpp:7106)
> ???
> kudu/errors.pyx:56: in kudu.errors.check_status (kudu/errors.cpp:904)
> ???
> E KuduNotFound: failed to drop Hive Metastore table: TException - service
> has thrown: NoSuchObjectException(message=s7mo1z)
> And when trying to add default capabilities to kudu tables, it is sometimes
> effective, sometimes not:
> For example, after enabling default kudu OBJCAPABILITIES,
> ./run-tests.py metadata/test_ddl.py -k "create_kudu" will succeed while the
> same test in ./run-tests.py custom_cluster/test_kudu.py fails:
> {noformat}
> TestKuduHMSIntegration.test_create_managed_kudu_tables[protocol: beeswax |
> exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> text/none]
> custom_cluster/test_kudu.py:147: in test_create_managed_kudu_tables
> self.run_test_case('QueryTest/kudu_create', vector, use_db=unique_database)
> common/impala_test_suite.py:563: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:500: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:798: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:184: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:362: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:356: in execute_query_async
> handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:519: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION:
> EMESSAGE: AnalysisException: Write not supported. Table
> test_create_managed_kudu_tables_a8d11828.add access type is: READONLY
> {noformat}
[jira] [Comment Edited] (IMPALA-9728) Data load failed with EOFException writing functional_orc_def.complextypestbl_medium
[ https://issues.apache.org/jira/browse/IMPALA-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211388#comment-17211388 ]

Tim Armstrong edited comment on IMPALA-9728 at 10/9/20, 8:50 PM:
--

Hit again here
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/12331/artifact/Impala/logs_static/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-orc-def-block.sql.log/*view*/

was (Author: tarmstrong):
Hit again here

> Data load failed with EOFException writing
> functional_orc_def.complextypestbl_medium
> ---
>
> Key: IMPALA-9728
> URL: https://issues.apache.org/jira/browse/IMPALA-9728
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Reporter: Tim Armstrong
> Priority: Major
> Labels: flaky
> Attachments:
> load-functional-query-exhaustive-hive-generated-orc-def-block.sql.log,
> load-functional-query-exhaustive-hive-generated-orc-def-block.sql.log
>
> {noformat}
> INFO : Compiling
> command(queryId=ubuntu_20200506012349_3c5cedc8-49d6-4e72-b4a5-e06cb82d1707):
> INSERT OVERWRITE TABLE functional_orc_def.complextypestbl_medium SELECT c.*
> FROM functional_parquet.complextypestbl c join functional.alltypes sort by id
> INFO : Warning: Map Join MAPJOIN[9][bigTable=alltypes] in task 'Map 2' is a
> cross product
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:c.id,
> type:bigint, comment:null), FieldSchema(name:c.int_array, type:array,
> comment:null), FieldSchema(name:c.int_array_array, type:array>,
> comment:null), FieldSchema(name:c.int_map, type:map,
> comment:null), FieldSchema(name:c.int_map_array, type:array>,
> comment:null), FieldSchema(name:c.nested_struct,
> type:struct,c:struct>>>,g:map,
> comment:null)], properties:null)
> INFO : Completed compiling
> command(queryId=ubuntu_20200506012349_3c5cedc8-49d6-4e72-b4a5-e06cb82d1707);
> Time taken: 0.063 seconds
> INFO : Executing
> command(queryId=ubuntu_20200506012349_3c5cedc8-49d6-4e72-b4a5-e06cb82d1707):
> INSERT OVERWRITE TABLE functional_orc_def.complextypestbl_medium SELECT c.*
> FROM functional_parquet.complextypestbl c join functional.alltypes sort by id
> INFO : Query ID = ubuntu_20200506012349_3c5cedc8-49d6-4e72-b4a5-e06cb82d1707
> INFO : Total jobs = 1
> INFO : Launching Job 1 out of 1
> INFO : Starting task [Stage-1:MAPRED] in serial mode
> INFO : Subscribed to counters: [] for queryId:
> ubuntu_20200506012349_3c5cedc8-49d6-4e72-b4a5-e06cb82d1707
> INFO : Session is already open
> INFO : Dag name: INSERT OVERWRITE TABLE functional_orc_d...id (Stage-1)
> INFO : Setting tez.task.scale.memory.reserve-fraction to 0.3001192092896
> INFO : Status: Running (Executing on YARN cluster with App id
> application_1588725973781_0033)
> ...
> Getting log thread is interrupted, since query is done!
> ERROR : Job Commit failed with exception
> 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)'
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
> at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1470)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:620)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:335)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
> at
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
> at
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
> at
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
[jira] [Commented] (IMPALA-9728) Data load failed with EOFException writing functional_orc_def.complextypestbl_medium
[ https://issues.apache.org/jira/browse/IMPALA-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211388#comment-17211388 ]

Tim Armstrong commented on IMPALA-9728:
--

Hit again here
[jira] [Updated] (IMPALA-9728) Data load failed with EOFException writing functional_orc_def.complextypestbl_medium
[ https://issues.apache.org/jira/browse/IMPALA-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-9728:
--
Attachment: load-functional-query-exhaustive-hive-generated-orc-def-block.sql.log
[jira] [Assigned] (IMPALA-9485) Enable file handle cache for EC files
[ https://issues.apache.org/jira/browse/IMPALA-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar reassigned IMPALA-9485:
--
Assignee: Sahil Takiar

> Enable file handle cache for EC files
> ---
>
> Key: IMPALA-9485
> URL: https://issues.apache.org/jira/browse/IMPALA-9485
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
>
> Now that HDFS-14308 has been fixed, we can re-enable the file handle cache
> for EC files.
[jira] [Resolved] (IMPALA-9485) Enable file handle cache for EC files
[ https://issues.apache.org/jira/browse/IMPALA-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar resolved IMPALA-9485.
--
Fix Version/s: Impala 4.0
Resolution: Fixed
[jira] [Updated] (IMPALA-10230) column stats num_nulls less than -1
[ https://issues.apache.org/jira/browse/IMPALA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

logan zheng updated IMPALA-10230:
--

Description:
After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, running "increment stats default.test partition(xx=)" fails with:

{noformat}
ERROR: TableLoadingException: Failed to load metadata for table: default.test
CAUSED BY: IllegalStateException: ColumnStats{avgSize_=13.0, avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}
{noformat}

The table default.test already existed in Impala 3.2.0, had been running for a long time, and had stats computed.

was:
After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, running "increment stats default.test partition(xx=)" fails with:

ERROR: TableLoadingException: Failed to load metadata for table: default.test
CAUSED BY: IllegalStateException: ColumnStats\{avgSize_=13.0, avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}
[jira] [Created] (IMPALA-10230) column stats num_nulls less than -1
logan zheng created IMPALA-10230:

Summary: column stats num_nulls less than -1
Key: IMPALA-10230
URL: https://issues.apache.org/jira/browse/IMPALA-10230
Project: IMPALA
Issue Type: Bug
Components: Catalog
Affects Versions: Impala 3.4.0
Reporter: logan zheng

After upgrading from Impala 3.2.0 (CDH 6.3.2) to ASF 3.4.0, executing "increment stats default.test partition(xx=)" fails with:

ERROR: TableLoadingException: Failed to load metadata for table: default.test
CAUSED BY: IllegalStateException: ColumnStats\{avgSize_=13.0, avgSerializedSize_=25.0, maxSize_=19, numDistinct_=12, numNulls_=-2}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
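The IllegalStateException above comes from a sanity check on loaded column stats: -1 is the sentinel for "unknown", so any smaller value (like numNulls_=-2) is corrupt. A minimal Python sketch of that invariant; names are hypothetical, the real check lives in Impala's Java ColumnStats class:

```python
# Sentinel meaning "statistic unknown"; any other negative value is invalid.
STATS_UNKNOWN = -1

def validate_num_nulls(num_nulls: int) -> int:
    """Strict check: reject corrupt stats such as numNulls_=-2."""
    if num_nulls < STATS_UNKNOWN:
        raise ValueError(
            "column stats num_nulls must be >= -1, got %d" % num_nulls)
    return num_nulls

def sanitize_num_nulls(num_nulls: int) -> int:
    """Lenient alternative: clamp bad values to 'unknown' so the table
    still loads instead of failing with a metadata-loading error."""
    return num_nulls if num_nulls >= STATS_UNKNOWN else STATS_UNKNOWN
```

The lenient variant illustrates one possible fix direction: tolerate stats written by older versions rather than failing the whole table load.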
[jira] [Resolved] (IMPALA-8304) Generate JUnitXML symptom for compilation/CMake failures
[ https://issues.apache.org/jira/browse/IMPALA-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell resolved IMPALA-8304. --- Fix Version/s: Impala 4.0 Resolution: Fixed > Generate JUnitXML symptom for compilation/CMake failures > > > Key: IMPALA-8304 > URL: https://issues.apache.org/jira/browse/IMPALA-8304 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > Fix For: Impala 4.0 > > > When compilation or another CMake command fails, it should generate JUnitXML > containing the output of the command that failed to allow faster triage. All > of the information is currently available in the Jenkins log, but due to the > parallel nature of the build, the failure can be buried in logging. Some > builds are extremely verbose (e.g. clang tidy) and can hide errors in > megabytes of logs. > This should apply to both frontend and backend compilation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8178) Tests failing with “Could not allocate memory while trying to increase reservation” on EC filesystem
[ https://issues.apache.org/jira/browse/IMPALA-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211130#comment-17211130 ] ASF subversion and git services commented on IMPALA-8178: - Commit 3382759664fe99317f27200b3e52a1e967f0a042 in impala's branch refs/heads/master from Sahil Takiar [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3382759 ] IMPALA-9485: Enable file handle cache for EC files This is essentially a revert of IMPALA-8178. HDFS-14308 added CanUnbuffer support to the EC input stream APIs in the HDFS client lib. This patch enables file handle caching for EC files. Testing: * Ran core tests against an EC build (ERASURE_CODING=true) Change-Id: Ieb455eeed02a229a4559d3972dfdac7df32cdb99 Reviewed-on: http://gerrit.cloudera.org:8080/16567 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Tests failing with “Could not allocate memory while trying to increase > reservation” on EC filesystem > > > Key: IMPALA-8178 > URL: https://issues.apache.org/jira/browse/IMPALA-8178 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Joe McDonnell >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.2.0 > > > In tests run against an Erasure Coding filesystem, multiple tests failed with > memory allocation errors. 
> In total 10 tests failed:
> * query_test.test_scanners.TestParquet.test_decimal_encodings
> * query_test.test_scanners.TestTpchScanRangeLengths.test_tpch_scan_ranges
> * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 0]
> * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 1]
> * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_scan_node
> * query_test.test_scanners.TestParquet.test_def_levels
> * query_test.test_scanners.TestTextSplitDelimiters.test_text_split_across_buffers_delimiter
> * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_filters
> * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_inline_views
> * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_top_n
> The first failure looked like this on the client side:
> {quote}
> F query_test/test_scanners.py::TestParquet::()::test_decimal_encodings[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'debug_action': '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]
> query_test/test_scanners.py:717: in test_decimal_encodings
>     self.run_test_case('QueryTest/parquet-decimal-formats', vector, unique_database)
> common/impala_test_suite.py:472: in run_test_case
>     result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:699: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:174: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:183: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:360: in __execute_query
>     self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:381: in wait_for_finished
>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E
ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:ExecQueryFInstances rpc > query_id=6e44c3c949a31be2:f973c7ff failed: Failed to get minimum > memory reservation of 8.00 KB on daemon xxx.com:22001 for query > 6e44c3c949a31be2:f973c7ff due to following error: Memory limit > exceeded: Could not allocate memory while trying to increase reservation. > E Query(6e44c3c949a31be2:f973c7ff) could not allocate 8.00 KB > without exceeding limit. > E Error occurred on backend xxx.com:22001 > E Memory left in process limit: 1.19 GB > E Query(6e44c3c949a31be2:f973c7ff): Reservation=0 > ReservationLimit=9.60 GB OtherMemory=0 Total=0 Peak=0 > E Memory is likely oversubscribed. Reducing query concurrency or > configuring admission control may help avoid this error. > {quote} > On the server side log: > {quote} > I0207 18:25:19.329311 5562 impala-server.cc:1063] > 6e44c3c949a31be2:f973c7ff] Registered query > query_id=6e44c3c949a31be2:f973c7ff > session_id=93497065f69e9d01:8a3bd06faff3da5 > I0207 18:25:19.329434 5562 Frontend.java:1242] > 6e44c3c949a31be2:f973c7ff] Analyzing query: select score from > decimal_stored_as_int32 > I0207 18:25:19.329583 5562 FeSupport.java:285] >
[jira] [Commented] (IMPALA-9485) Enable file handle cache for EC files
[ https://issues.apache.org/jira/browse/IMPALA-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211129#comment-17211129 ] ASF subversion and git services commented on IMPALA-9485: - Commit 3382759664fe99317f27200b3e52a1e967f0a042 in impala's branch refs/heads/master from Sahil Takiar [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3382759 ] IMPALA-9485: Enable file handle cache for EC files This is essentially a revert of IMPALA-8178. HDFS-14308 added CanUnbuffer support to the EC input stream APIs in the HDFS client lib. This patch enables file handle caching for EC files. Testing: * Ran core tests against an EC build (ERASURE_CODING=true) Change-Id: Ieb455eeed02a229a4559d3972dfdac7df32cdb99 Reviewed-on: http://gerrit.cloudera.org:8080/16567 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Enable file handle cache for EC files > - > > Key: IMPALA-9485 > URL: https://issues.apache.org/jira/browse/IMPALA-9485 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Sahil Takiar >Priority: Major > > Now that HDFS-14308 has been fixed, we can re-enable the file handle cache > for EC files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
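The commit above re-enables handle caching for EC files now that the HDFS client's EC input streams support CanUnbuffer (HDFS-14308). A toy LRU file-handle cache (all names hypothetical, not Impala's actual C++ implementation) illustrating why unbuffer support is a precondition for caching:

```python
from collections import OrderedDict

class FileHandleCache:
    """Toy LRU cache of open file handles.

    A handle kept in the cache must be able to release its read buffers
    (the HDFS CanUnbuffer interface); otherwise every cached handle pins
    buffer memory indefinitely -- the reason EC files were excluded from
    the cache before HDFS-14308.
    """
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._handles = OrderedDict()  # path -> open handle, LRU order

    def get(self, path, open_fn):
        if path in self._handles:
            self._handles.move_to_end(path)      # mark most recently used
            return self._handles[path]
        handle = open_fn(path)
        self._handles[path] = handle
        if len(self._handles) > self.capacity:   # evict least recently used
            _, evicted = self._handles.popitem(last=False)
            evicted.close()
        return handle

    def release(self, path):
        """Caller is done reading: keep the handle cached but drop buffers."""
        handle = self._handles.get(path)
        if handle is not None:
            handle.unbuffer()
```

The `release` path is the key design point: without `unbuffer()`, caching a handle would also cache its buffered data.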
[jira] [Commented] (IMPALA-8304) Generate JUnitXML symptom for compilation/CMake failures
[ https://issues.apache.org/jira/browse/IMPALA-8304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211131#comment-17211131 ] ASF subversion and git services commented on IMPALA-8304: - Commit 1f3160b4c07c8a5a146067222e6591d44bfa3c7d in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=1f3160b ] IMPALA-8304: Generate JUnitXML if a command run by CMake fails This wraps each command executed by CMake with a wrapper that generates a JUnitXML file if the command fails. If the command succeeds, the wrapper does nothing. The wrapper applies to C++ compilation, linking, and custom shell commands (such as building the frontend via maven). It does not apply to failures coming from CMake itself. It can be disabled by setting DISABLE_CMAKE_JUNITXML. The command output can include Unicode (e.g. smart quotes for g++), so this also updates generate_junitxml.py to handle Unicode. The wrapper interacts poorly with add_custom_command/add_custom_target CMake commands that use 'cd directory && do_something', so this switches those locations (in /docker) to use CMake's WORKING_DIRECTORY. Testing: - Verified it does not impact a successful build (including with ccache and/or distcc). - Verified it generates JUnitXML for C++ and Java compilation failures. - Verified it doesn't use the wrapper when DISABLE_CMAKE_JUNITXML is set. 
Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Reviewed-on: http://gerrit.cloudera.org:8080/12668 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > Generate JUnitXML symptom for compilation/CMake failures > > > Key: IMPALA-8304 > URL: https://issues.apache.org/jira/browse/IMPALA-8304 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > > When compilation or another CMake command fails, it should generate JUnitXML > containing the output of the command that failed to allow faster triage. All > of the information is currently available in the Jenkins log, but due to the > parallel nature of the build, the failure can be buried in logging. Some > builds are extremely verbose (e.g. clang tidy) and can hide errors in > megabytes of logs. > This should apply to both frontend and backend compilation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-10171) Create query options for convert_legacy_hive_parquet_utc_timestamps and use_local_tz_for_unix_timestamp_conversions
[ https://issues.apache.org/jira/browse/IMPALA-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer resolved IMPALA-10171.
--
Resolution: Implemented

> Create query options for convert_legacy_hive_parquet_utc_timestamps and use_local_tz_for_unix_timestamp_conversions
> ---
>
> Key: IMPALA-10171
> URL: https://issues.apache.org/jira/browse/IMPALA-10171
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Csaba Ringhofer
> Priority: Major
>
> convert_legacy_hive_parquet_utc_timestamps and use_local_tz_for_unix_timestamp_conversions are flags that can be set on all coordinators and executors. Possible inconsistencies could be avoided by always using the flag's value on the coordinator, or by adding query options for these settings.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
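The implemented approach (query options for these flags) comes down to a simple precedence rule; a minimal sketch, assuming a per-query option that overrides the daemon's startup flag when explicitly set:

```python
def effective_setting(startup_flag: bool, query_option=None) -> bool:
    """Resolve a timestamp-conversion setting for one query.

    The query option, when set, wins; otherwise fall back to the
    coordinator's startup flag. Resolving once on the coordinator and
    shipping the resolved value with the query avoids the inconsistency
    of executors started with different flag values.
    """
    return startup_flag if query_option is None else query_option
```

Usage: `effective_setting(startup_flag=True, query_option=False)` yields the query-level override, while omitting the option falls back to the flag.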
[jira] [Work started] (IMPALA-10224) Add startup flag not to expose debug web url via PingImpalaService/PingImpalaHS2Service RPC
[ https://issues.apache.org/jira/browse/IMPALA-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-10224 started by Attila Jeges. - > Add startup flag not to expose debug web url via > PingImpalaService/PingImpalaHS2Service RPC > --- > > Key: IMPALA-10224 > URL: https://issues.apache.org/jira/browse/IMPALA-10224 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: Attila Jeges >Assignee: Attila Jeges >Priority: Major > > PingImpalaService/PingImpalaHS2Service RPC calls expose the coordinator's > debug web url to clients like impala shell. Since the debug web UI is not > something that end-users will necessarily have access to, we should have a > server option to send an empty string instead of the real url to the impala > client signalling that the debug web ui is not available. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
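The proposed server option can be sketched as a flag-gated field in the ping response; the flag and field names below are hypothetical, not Impala's actual RPC schema:

```python
def ping_response(webserver_address: str, hide_debug_web_url: bool) -> dict:
    """Build a PingImpalaService-style response dict.

    When the hiding flag is set, the webserver address is replaced with
    an empty string, signalling to clients such as impala-shell that the
    debug web UI is not available to them.
    """
    return {
        "webserver_address": "" if hide_debug_web_url else webserver_address,
    }
```

The empty string (rather than omitting the field) keeps the response shape stable for old clients that always read the field.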
[jira] [Resolved] (IMPALA-9952) Invalid offset index in Parquet file
[ https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy resolved IMPALA-9952. --- Fix Version/s: Impala 4.0 Resolution: Fixed > Invalid offset index in Parquet file > - > > Key: IMPALA-9952 > URL: https://issues.apache.org/jira/browse/IMPALA-9952 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: guojingfeng >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: Parquet > Fix For: Impala 4.0 > > > When reading parquet file in impala 3.4, encountered the following error: > {code:java} > I0714 16:11:48.307806 1075820 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b018b] Error from query > 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > I0714 16:11:48.834901 1075838 status.cc:126] > 8c43203adb2d4fc8:0478df9b02c0] Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > @ 0xbf4ef9 > @ 0x1748c41 > @ 0x174e170 > @ 0x1750e58 > @ 0x17519f0 > @ 0x1748559 > @ 0x1510b41 > @ 0x1512c8f > @ 0x137488a > @ 0x1375759 > @ 0x1b48a19 > @ 0x7f34509f5e24 > @ 0x7f344d5ed35c > I0714 16:11:48.835763 1075838 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b02c0] Error from query > 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > I0714 16:11:48.893784 1075820 status.cc:126] > 8c43203adb2d4fc8:0478df9b018b] Top level rows aren't in sync during page > filtering in file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. 
> @ 0xbf4ef9
> @ 0x1749104
> @ 0x17494cc
> @ 0x1751aee
> @ 0x1748559
> @ 0x1510b41
> @ 0x1512c8f
> @ 0x137488a
> @ 0x1375759
> @ 0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> {code}
> Corresponding source code:
> {code:java}
> Status HdfsParquetScanner::CheckPageFiltering() {
>   if (candidate_ranges_.empty() || scalar_readers_.empty()) return Status::OK();
>   int64_t current_row = scalar_readers_[0]->LastProcessedRow();
>   for (int i = 1; i < scalar_readers_.size(); ++i) {
>     if (current_row != scalar_readers_[i]->LastProcessedRow()) {
>       DCHECK(false);
>       return Status(Substitute(
>           "Top level rows aren't in sync during page filtering in file $0.",
>           filename()));
>     }
>   }
>   return Status::OK();
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-9952) Invalid offset index in Parquet file
[ https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210740#comment-17210740 ] Zoltán Borók-Nagy commented on IMPALA-9952: --- Thanks for the verification, [~guojingfeng]! I think I'm closing this Jira as the patch resolves the crash mentioned in the description. We can use IMPALA-10186 to track the write side problem. > Invalid offset index in Parquet file > - > > Key: IMPALA-9952 > URL: https://issues.apache.org/jira/browse/IMPALA-9952 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.4.0 >Reporter: guojingfeng >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: Parquet > > When reading parquet file in impala 3.4, encountered the following error: > {code:java} > I0714 16:11:48.307806 1075820 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b018b] Error from query > 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > I0714 16:11:48.834901 1075838 status.cc:126] > 8c43203adb2d4fc8:0478df9b02c0] Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > @ 0xbf4ef9 > @ 0x1748c41 > @ 0x174e170 > @ 0x1750e58 > @ 0x17519f0 > @ 0x1748559 > @ 0x1510b41 > @ 0x1512c8f > @ 0x137488a > @ 0x1375759 > @ 0x1b48a19 > @ 0x7f34509f5e24 > @ 0x7f344d5ed35c > I0714 16:11:48.835763 1075838 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b02c0] Error from query > 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. > I0714 16:11:48.893784 1075820 status.cc:126] > 8c43203adb2d4fc8:0478df9b018b] Top level rows aren't in sync during page > filtering in file > hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq. 
> @ 0xbf4ef9
> @ 0x1749104
> @ 0x17494cc
> @ 0x1751aee
> @ 0x1748559
> @ 0x1510b41
> @ 0x1512c8f
> @ 0x137488a
> @ 0x1375759
> @ 0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> {code}
> Corresponding source code:
> {code:java}
> Status HdfsParquetScanner::CheckPageFiltering() {
>   if (candidate_ranges_.empty() || scalar_readers_.empty()) return Status::OK();
>   int64_t current_row = scalar_readers_[0]->LastProcessedRow();
>   for (int i = 1; i < scalar_readers_.size(); ++i) {
>     if (current_row != scalar_readers_[i]->LastProcessedRow()) {
>       DCHECK(false);
>       return Status(Substitute(
>           "Top level rows aren't in sync during page filtering in file $0.",
>           filename()));
>     }
>   }
>   return Status::OK();
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-10164) Support HadoopCatalog for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangSheng resolved IMPALA-10164.
Resolution: Fixed

> Support HadoopCatalog for Iceberg table
> ---
>
> Key: IMPALA-10164
> URL: https://issues.apache.org/jira/browse/IMPALA-10164
> Project: IMPALA
> Issue Type: Improvement
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Minor
> Labels: impala-iceberg
>
> Impala currently supports only the HadoopTables API for creating Iceberg tables, which is not enough, so we are preparing to support HadoopCatalog. The main design is to add a new table property named 'iceberg.catalog' whose default value is 'hadoop.tables'; we implement 'hadoop.catalog' to support the HadoopCatalog API. We may also support 'hive.catalog' in the future.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
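The design described above reads naturally as a dispatch on the new 'iceberg.catalog' table property, defaulting to 'hadoop.tables'. A sketch under those assumptions; the registry and return values are illustrative, the real implementation lives in Impala's Java frontend against the Iceberg API:

```python
# Catalog implementations keyed by the 'iceberg.catalog' table property.
# Each entry maps table properties to a catalog-specific loader.
CATALOG_IMPLS = {
    "hadoop.tables": lambda props: "HadoopTables(%s)" % props["location"],
    "hadoop.catalog": lambda props: "HadoopCatalog(%s)" % props["location"],
    # "hive.catalog" could be registered here in the future.
}

def load_iceberg_table(props: dict):
    """Pick the catalog implementation named by the table property."""
    kind = props.get("iceberg.catalog", "hadoop.tables")  # default per design
    impl = CATALOG_IMPLS.get(kind)
    if impl is None:
        raise ValueError("unsupported iceberg.catalog value: %r" % kind)
    return impl(props)
```

Keeping the property a plain string with a registry lookup is what makes adding 'hive.catalog' later a one-line change.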
[jira] [Work started] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-10159 started by WangSheng.
--

> Support ORC file format for Iceberg table
> ---
>
> Key: IMPALA-10159
> URL: https://issues.apache.org/jira/browse/IMPALA-10159
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Minor
> Labels: impala-iceberg
>
> Impala can now query the PARQUET file format for Iceberg tables. Since we have already done some of this work in IMPALA-9741, we can continue with ORC file format support in this Jira.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-9741) Support query iceberg table by impala
[ https://issues.apache.org/jira/browse/IMPALA-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangSheng resolved IMPALA-9741.
---
Resolution: Fixed

> Support query iceberg table by impala
> -
>
> Key: IMPALA-9741
> URL: https://issues.apache.org/jira/browse/IMPALA-9741
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Major
> Labels: impala-iceberg
> Attachments: select-iceberg.jpg
>
> Since we submitted a patch supporting the creation of Iceberg tables through Impala in IMPALA-9688, we are preparing to implement querying Iceberg tables through Impala. We need to read the Impala and Iceberg code deeply to determine how to do this.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-9967) Scan orc failed when table contains timestamp column
[ https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangSheng reassigned IMPALA-9967: - Assignee: (was: WangSheng) > Scan orc failed when table contains timestamp column > > > Key: IMPALA-9967 > URL: https://issues.apache.org/jira/browse/IMPALA-9967 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0 >Reporter: WangSheng >Priority: Minor > Labels: impala-iceberg > Attachments: 00031-31-26ff2064-c8f2-467f-ab7e-1949cb30d151-0.orc, > 00031-31-334beaba-ef4b-4d13-b338-e715cdf0ef85-0.orc > > > Recently, when I test impala query orc table, I found that scanning failed > when table contains timestamp column, here is there exception: > {code:java} > I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f720002] > Encountered parse error in tail of ORC file > hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc: > Unknown type kind > @ 0x1c9f753 impala::Status::Status() > @ 0x27aa049 impala::HdfsOrcScanner::ProcessFileTail() > @ 0x27a7fb3 impala::HdfsOrcScanner::Open() > @ 0x27365fe > impala::HdfsScanNodeBase::CreateAndOpenScannerHelper() > @ 0x28cb379 impala::HdfsScanNode::ProcessSplit() > @ 0x28caa7d impala::HdfsScanNode::ScannerThread() > @ 0x28c9de5 > _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv > @ 0x28cc19e > _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE > @ 0x205 boost::function0<>::operator()() > @ 0x2675d93 impala::Thread::SuperviseThread() > @ 0x267dd30 boost::_bi::list5<>::operator()<>() > @ 0x267dc54 boost::_bi::bind_t<>::operator()() > @ 0x267dc15 boost::detail::thread_data<>::run() > @ 0x3e3c3c1 thread_proxy > @ 0x7f32360336b9 start_thread > @ 0x7f3232bfe41c clone > I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] > 
68436a6e0883be84:53877f720002] Error preparing scanner for scan range
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc(0:582).
> Encountered parse error in tail of ORC file
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc:
> Unknown type kind
> {code}
> When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data is generated by Spark.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work stopped] (IMPALA-9967) Scan orc failed when table contains timestamp column
[ https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9967 stopped by WangSheng. - > Scan orc failed when table contains timestamp column > > > Key: IMPALA-9967 > URL: https://issues.apache.org/jira/browse/IMPALA-9967 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0 >Reporter: WangSheng >Assignee: WangSheng >Priority: Minor > Labels: impala-iceberg > Attachments: 00031-31-26ff2064-c8f2-467f-ab7e-1949cb30d151-0.orc, > 00031-31-334beaba-ef4b-4d13-b338-e715cdf0ef85-0.orc > > > Recently, when I test impala query orc table, I found that scanning failed > when table contains timestamp column, here is there exception: > {code:java} > I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f720002] > Encountered parse error in tail of ORC file > hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc: > Unknown type kind > @ 0x1c9f753 impala::Status::Status() > @ 0x27aa049 impala::HdfsOrcScanner::ProcessFileTail() > @ 0x27a7fb3 impala::HdfsOrcScanner::Open() > @ 0x27365fe > impala::HdfsScanNodeBase::CreateAndOpenScannerHelper() > @ 0x28cb379 impala::HdfsScanNode::ProcessSplit() > @ 0x28caa7d impala::HdfsScanNode::ScannerThread() > @ 0x28c9de5 > _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv > @ 0x28cc19e > _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE > @ 0x205 boost::function0<>::operator()() > @ 0x2675d93 impala::Thread::SuperviseThread() > @ 0x267dd30 boost::_bi::list5<>::operator()<>() > @ 0x267dc54 boost::_bi::bind_t<>::operator()() > @ 0x267dc15 boost::detail::thread_data<>::run() > @ 0x3e3c3c1 thread_proxy > @ 0x7f32360336b9 start_thread > @ 0x7f3232bfe41c clone > I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] > 
68436a6e0883be84:53877f720002] Error preparing scanner for scan range
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc(0:582).
> Encountered parse error in tail of ORC file
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc:
> Unknown type kind
> {code}
> When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data is generated by Spark.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-9967) Scan orc failed when table contains timestamp column
[ https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-9967 started by WangSheng. - > Scan orc failed when table contains timestamp column > > > Key: IMPALA-9967 > URL: https://issues.apache.org/jira/browse/IMPALA-9967 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.0 >Reporter: WangSheng >Assignee: WangSheng >Priority: Minor > Labels: impala-iceberg > Attachments: 00031-31-26ff2064-c8f2-467f-ab7e-1949cb30d151-0.orc, > 00031-31-334beaba-ef4b-4d13-b338-e715cdf0ef85-0.orc > > > Recently, when I test impala query orc table, I found that scanning failed > when table contains timestamp column, here is there exception: > {code:java} > I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f720002] > Encountered parse error in tail of ORC file > hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc: > Unknown type kind > @ 0x1c9f753 impala::Status::Status() > @ 0x27aa049 impala::HdfsOrcScanner::ProcessFileTail() > @ 0x27a7fb3 impala::HdfsOrcScanner::Open() > @ 0x27365fe > impala::HdfsScanNodeBase::CreateAndOpenScannerHelper() > @ 0x28cb379 impala::HdfsScanNode::ProcessSplit() > @ 0x28caa7d impala::HdfsScanNode::ScannerThread() > @ 0x28c9de5 > _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv > @ 0x28cc19e > _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE > @ 0x205 boost::function0<>::operator()() > @ 0x2675d93 impala::Thread::SuperviseThread() > @ 0x267dd30 boost::_bi::list5<>::operator()<>() > @ 0x267dc54 boost::_bi::bind_t<>::operator()() > @ 0x267dc15 boost::detail::thread_data<>::run() > @ 0x3e3c3c1 thread_proxy > @ 0x7f32360336b9 start_thread > @ 0x7f3232bfe41c clone > I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] > 
68436a6e0883be84:53877f720002] Error preparing scanner for scan range
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc(0:582).
> Encountered parse error in tail of ORC file
> hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc:
> Unknown type kind
> {code}
> When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data is generated by Spark.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-10221) Use 'iceberg.file_format' to replace 'iceberg_file_format'
[ https://issues.apache.org/jira/browse/IMPALA-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangSheng resolved IMPALA-10221.
Resolution: Fixed

> Use 'iceberg.file_format' to replace 'iceberg_file_format'
> --
>
> Key: IMPALA-10221
> URL: https://issues.apache.org/jira/browse/IMPALA-10221
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Minor
> Labels: impala-iceberg
>
> We introduced several new table properties in IMPALA-10164, such as 'iceberg.catalog'; to keep these property names consistent, we rename 'iceberg_file_format' to 'iceberg.file_format'.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-9688) Support create iceberg table by impala
[ https://issues.apache.org/jira/browse/IMPALA-9688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangSheng resolved IMPALA-9688.
---
Resolution: Fixed

> Support create iceberg table by impala
> --
>
> Key: IMPALA-9688
> URL: https://issues.apache.org/jira/browse/IMPALA-9688
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Major
> Labels: impala-iceberg
>
> This sub-task covers creating Iceberg tables through Impala.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org