[jira] [Commented] (IMPALA-10785) when union kudu table and hdfs table, union passthrough does not take effect
[ https://issues.apache.org/jira/browse/IMPALA-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384646#comment-17384646 ]

pengdou1990 commented on IMPALA-10785:
--------------------------------------

The union output tuple layout depends on its first child node: [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java#L599]

Each slot's isNullable_ value and null indicator depend on the corresponding slots of all child nodes: [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java#L644]

In a Kudu table tuple, the primary key slot's isNullable_ = false.
In an HDFS table tuple, every slot's isNullable_ = true.

When a Kudu table is unioned with an HDFS table, the tuple memory layout may follow the HDFS table's tuple memory layout (string type length is 12 bytes, without padding), while the isNullable_ values of the slots follow the Kudu table's. Neither the HDFS table nor the Kudu table can pass the isChildPassthrough check, so passthrough does not take effect.

> when union kudu table and hdfs table, union passthrough does not take effect
> ----------------------------------------------------------------------------
>
> Key: IMPALA-10785
> URL: https://issues.apache.org/jira/browse/IMPALA-10785
> Project: IMPALA
> Issue Type: Improvement
> Reporter: pengdou1990
> Priority: Major
>
> IMPALA-3586 already supports union passthrough and brings great performance
> improvements for unions, but there are still some problems with a union between
> an HDFS table and a Kudu table. Several points cause the problem:
> # in the Kudu scanner node output TupleDescriptor, a string slot is 16B, while in
> the HDFS scanner node output TupleDescriptor, a string slot is 12B, causing a tuple
> memory layout mismatch
> # in the Kudu scanner node output TupleDescriptor, a string slot is 16B, while in
> the Union output TupleDescriptor, a string slot is 12B, causing a tuple memory layout
> mismatch
> # in the Kudu scan node, the row key slot is not null, while in the HDFS node, not-null
> slots can't be obtained from the metadata, causing a tuple memory layout mismatch
> I have resolved the 1st and 2nd points; how should I handle the 3rd point?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10627) Use standard Iceberg table properties
[ https://issues.apache.org/jira/browse/IMPALA-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384539#comment-17384539 ]

ASF subversion and git services commented on IMPALA-10627:
----------------------------------------------------------

Commit fabe994d1fb011afb88d1f0f5bf078113775c9db in impala's branch refs/heads/master from Attila Jeges
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fabe994 ]

IMPALA-10627: Use standard parquet-related Iceberg table properties

This patch adds support for the following standard Iceberg properties:

write.parquet.compression-codec:
  Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY
  (default value), LZ4, ZSTD. The table property will be ignored if
  COMPRESSION_CODEC query option is set.
write.parquet.compression-level:
  Parquet compression level. Used with ZSTD compression only.
  Supported range is [1, 22]. Default value is 3. The table property
  will be ignored if COMPRESSION_CODEC query option is set.
write.parquet.row-group-size-bytes:
  Parquet row group size in bytes. Supported range is [8388608,
  2146435072] (8MB - 2047MB). The table property will be ignored if
  PARQUET_FILE_SIZE query option is set.
  If neither the table property nor the PARQUET_FILE_SIZE query option
  is set, the way Impala calculates row group size will remain
  unchanged.
write.parquet.page-size-bytes:
  Parquet page size in bytes. Used for PLAIN encoding. Supported range
  is [65536, 1073741824] (64KB - 1GB).
  If the table property is unset, the way Impala calculates page size
  will remain unchanged.
write.parquet.dict-size-bytes:
  Parquet dictionary page size in bytes. Used for dictionary encoding.
  Supported range is [65536, 1073741824] (64KB - 1GB).
  If the table property is unset, the way Impala calculates dictionary
  page size will remain unchanged.

This patch also renames the 'iceberg.file_format' table property to
'write.format.default', which is the standard Iceberg name for the
table property.

Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23
Reviewed-on: http://gerrit.cloudera.org:8080/17654
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Use standard Iceberg table properties
> -------------------------------------
>
> Key: IMPALA-10627
> URL: https://issues.apache.org/jira/browse/IMPALA-10627
> Project: IMPALA
> Issue Type: Bug
> Reporter: Zoltán Borók-Nagy
> Assignee: Attila Jeges
> Priority: Major
> Labels: impala-iceberg
>
> Iceberg lists the following properties:
> [https://iceberg.apache.org/configuration/]
> We should also use these properties if possible, e.g. write.format.default,
> write..compression-codec
> Currently Impala uses the table property 'iceberg.file_format' to determine
> the data file format for reads/writes. In the future, read operations should
> automatically detect the file formats (IMPALA-10610), but for writes we
> should use 'write.format.default'.
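The value ranges listed in the IMPALA-10627 commit message can be summarized as a small validation table. The following is only an illustrative sketch, not Impala's code: the function name `validate_parquet_properties` is hypothetical, and treating codec names case-insensitively is an assumption the commit message does not confirm. The property names and numeric ranges are copied from the commit message.

```python
# Illustrative sketch only; names other than the property keys are hypothetical.
PARQUET_CODECS = {"NONE", "GZIP", "SNAPPY", "LZ4", "ZSTD"}

# Inclusive ranges as stated in the commit message.
INT_RANGES = {
    "write.parquet.compression-level": (1, 22),
    "write.parquet.row-group-size-bytes": (8388608, 2146435072),
    "write.parquet.page-size-bytes": (65536, 1073741824),
    "write.parquet.dict-size-bytes": (65536, 1073741824),
}


def validate_parquet_properties(props):
    """Return a list of error strings for unsupported property values."""
    errors = []
    codec = props.get("write.parquet.compression-codec")
    # Assumption: codec names are compared case-insensitively.
    if codec is not None and codec.upper() not in PARQUET_CODECS:
        errors.append("unsupported codec: %s" % codec)
    for key, (low, high) in INT_RANGES.items():
        if key not in props:
            continue  # unset properties keep the existing behavior
        try:
            value = int(props[key])
        except ValueError:
            errors.append("%s is not an integer: %r" % (key, props[key]))
            continue
        if not low <= value <= high:
            errors.append("%s=%d outside [%d, %d]" % (key, value, low, high))
    return errors
```

An empty result means every supplied property is within the documented range; properties that are absent are skipped, which mirrors the "remain unchanged" wording for unset properties.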
[jira] [Commented] (IMPALA-10754) test_overlap_min_max_filters_on_sorted_columns failed during GVO
[ https://issues.apache.org/jira/browse/IMPALA-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384538#comment-17384538 ]

ASF subversion and git services commented on IMPALA-10754:
----------------------------------------------------------

Commit 147b4b9e583098f9611fe28fc9ff1f8451f63e4b in impala's branch refs/heads/master from Qifan Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=147b4b9 ]

IMPALA-10754: test_overlap_min_max_filters_on_sorted_columns failed during GVO

This patch addresses a failure in ubuntu-16.04 dockerised test. The test
involved is found in overlap_min_max_filters_on_sorted_columns.test as
follows.

  set minmax_filter_fast_code_path=on;
  set MINMAX_FILTER_THRESHOLD=0.0;
  SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS;
  select straight_join count(a.timestamp_col) from
  alltypes_timestamp_col_only a join [SHUFFLE] alltypes_limited b
  where a.timestamp_col = b.timestamp_col and b.tinyint_col = 4;
  RUNTIME_PROFILE
  aggregation(SUM, NumRuntimeFilteredPages)> 57

The patch reduces the threshold from 58 to 50.

Testing:
  Ran the unit test successfully.

Change-Id: Icb4cc7d533139c4a2b46a872234a47d46cb8a17c
Reviewed-on: http://gerrit.cloudera.org:8080/17696
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> test_overlap_min_max_filters_on_sorted_columns failed during GVO
> -----------------------------------------------------------------
>
> Key: IMPALA-10754
> URL: https://issues.apache.org/jira/browse/IMPALA-10754
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Zoltán Borók-Nagy
> Assignee: Qifan Chen
> Priority: Major
> Labels: broken-build
> Fix For: Impala 4.1
>
> test_overlap_min_max_filters_on_sorted_columns failed in the following build:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4338/testReport/
> *Stack trace:*
> {noformat}
> query_test/test_runtime_filters.py:296: in test_overlap_min_max_filters_on_sorted_columns
>     test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})
> common/impala_test_suite.py:734: in run_test_case
>     update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:653: in verify_runtime_profile
>     % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over NumRuntimeFilteredPages did not match expected results.
> E   EXPECTED VALUE:
> E   58
> E
> E
> E   ACTUAL VALUE:
> E   59
> {noformat}
[jira] [Created] (IMPALA-10815) Ignore events on non-default hive catalogs
Vihang Karajgaonkar created IMPALA-10815:
-----------------------------------------

Summary: Ignore events on non-default hive catalogs
Key: IMPALA-10815
URL: https://issues.apache.org/jira/browse/IMPALA-10815
Project: IMPALA
Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar

Hive-3 introduces a new object called catalog, which is like a namespace for databases and tables. Currently, Impala does not support Hive catalogs. However, if there are events on such non-default catalogs, the events processor applies these events to catalogd whenever the database and table names match. Until we support custom catalogs in Hive, we should ignore the events coming from such non-default catalog objects.
[jira] [Resolved] (IMPALA-10468) DROP events which are generated while a batch is being processed may add table incorrectly
[ https://issues.apache.org/jira/browse/IMPALA-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar resolved IMPALA-10468.
------------------------------------------
Fix Version/s: Impala 4.1
Resolution: Duplicate

> DROP events which are generated while a batch is being processed may add
> table incorrectly
> -------------------------------------------------------------------------
>
> Key: IMPALA-10468
> URL: https://issues.apache.org/jira/browse/IMPALA-10468
> Project: IMPALA
> Issue Type: Bug
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Fix For: Impala 4.1
>
> One of the problems with CREATE/DROP events is that they may occur while a
> batch is being processed, and hence EventsProcessor may not be aware of them.
> For example, consider the following sequence of statements:
> create table foo (c1 int);
> drop table foo;
> create table foo (c2 int);
> drop table foo;
> These statements will generate the event sequence CREATE_TABLE, DROP_TABLE,
> CREATE_TABLE, DROP_TABLE. Generally, if all 4 events are fetched in one
> batch, then the first and third CREATE_TABLE events are ignored because each
> is followed by a DROP_TABLE in the sequence, and the DROP_TABLE events
> have no effect since the table doesn't exist in catalogd anymore.
> However, if the events processor fetches these events in 2 batches (3 and 1),
> then after the first batch of CREATE_TABLE, DROP_TABLE, CREATE_TABLE is
> processed, the third event will add the table foo to catalogd. The
> subsequent batch's DROP_TABLE will be processed and remove the table, but
> between the two batches, catalogd will say that a table called foo exists.
> This can lead to statements failing. E.g. a statement like create
> table foo (c3 int) after the above statements will fail with a
> TableAlreadyExists error.
> The problem happens for databases too. So far I have not been able to
> reproduce this for Partitions, but I don't see why it would not happen with
> Partitions also.
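The batching problem described in IMPALA-10468 can be reproduced with a toy model. The sketch below is an assumption-laden simplification, not the EventsProcessor implementation: `process_batch` and `replay` are hypothetical names, the catalog is a plain set of table names, and a CREATE_TABLE is skipped only when a later DROP_TABLE for the same table is visible in the same batch.

```python
# Toy model of batch-wise event processing; not Impala's actual code.
EVENTS = [
    ("CREATE_TABLE", "foo"),
    ("DROP_TABLE", "foo"),
    ("CREATE_TABLE", "foo"),
    ("DROP_TABLE", "foo"),
]


def process_batch(catalog, batch):
    """Apply one batch of (event_type, table) pairs to the catalog set."""
    for index, (event_type, table) in enumerate(batch):
        if event_type == "CREATE_TABLE":
            # A CREATE is ignored only if a later DROP is in the SAME batch.
            dropped_later = any(
                later_type == "DROP_TABLE" and later_table == table
                for later_type, later_table in batch[index + 1:]
            )
            if not dropped_later:
                catalog.add(table)
        elif event_type == "DROP_TABLE":
            catalog.discard(table)


def replay(events, batch_sizes):
    """Return the catalog contents observed after each batch."""
    catalog = set()
    snapshots = []
    position = 0
    for size in batch_sizes:
        process_batch(catalog, events[position:position + size])
        position += size
        snapshots.append(sorted(catalog))
    return snapshots
```

With one batch of 4 events, `replay(EVENTS, [4])` never exposes the table. With batches of 3 and 1, the first snapshot contains `foo`, which is the window in which a concurrent `create table foo` would hit TableAlreadyExists.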
[jira] [Resolved] (IMPALA-10490) truncate table fails with IllegalStateException
[ https://issues.apache.org/jira/browse/IMPALA-10490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vihang Karajgaonkar resolved IMPALA-10490.
------------------------------------------
Fix Version/s: Impala 4.1
Resolution: Fixed

> truncate table fails with IllegalStateException
> -----------------------------------------------
>
> Key: IMPALA-10490
> URL: https://issues.apache.org/jira/browse/IMPALA-10490
> Project: IMPALA
> Issue Type: Bug
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Fix For: Impala 4.1
>
> This is a problem when events processing is turned on. I can reproduce it
> with the following steps.
> 1. start impala without events processing
> 2. create table, load data, compute stats on the table.
> 3. restart impala with events processing turned on
> 4. Run truncate table command.
> I can see the truncate table command fail with the following error.
> [localhost:21050] default> truncate t5;
> Query: truncate t5
> ERROR: CatalogException: Failed to truncate table: default.t5.
> Table may be in a partially truncated state.
> CAUSED BY: IllegalStateException: Table parameters must have catalog service
> identifier before adding it to partition parameters
[jira] [Commented] (IMPALA-10785) when union kudu table and hdfs table, union passthrough does not take effect
[ https://issues.apache.org/jira/browse/IMPALA-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384435#comment-17384435 ]

Qifan Chen commented on IMPALA-10785:
-------------------------------------

For 3), SlotDescriptor in FE has a field called isNullable_ (https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java#L66). It seems that isNullable_ of the union slot should be set to the isNullable_ field of the column in the HDFS table when the corresponding column in the Kudu table is not nullable (isNullable_ is false).

> when union kudu table and hdfs table, union passthrough does not take effect
> ----------------------------------------------------------------------------
>
> Key: IMPALA-10785
> URL: https://issues.apache.org/jira/browse/IMPALA-10785
> Project: IMPALA
> Issue Type: Improvement
> Reporter: pengdou1990
> Priority: Major
>
> IMPALA-3586 already supports union passthrough and brings great performance
> improvements for unions, but there are still some problems with a union between
> an HDFS table and a Kudu table. Several points cause the problem:
> # in the Kudu scanner node output TupleDescriptor, a string slot is 16B, while in
> the HDFS scanner node output TupleDescriptor, a string slot is 12B, causing a tuple
> memory layout mismatch
> # in the Kudu scanner node output TupleDescriptor, a string slot is 16B, while in
> the Union output TupleDescriptor, a string slot is 12B, causing a tuple memory layout
> mismatch
> # in the Kudu scan node, the row key slot is not null, while in the HDFS node, not-null
> slots can't be obtained from the metadata, causing a tuple memory layout mismatch
> I have resolved the 1st and 2nd points; how should I handle the 3rd point?
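The nullability suggestion in this comment can be illustrated with a toy model. This is a hedged sketch only: it reduces a tuple to a list of per-slot isNullable_ flags, and both `union_slot_nullability` and `nullability_matches` are hypothetical helpers that ignore slot sizes, offsets, and everything else the real isChildPassthrough check in Impala considers.

```python
# Toy model: a tuple is represented only by its per-slot nullable flags.

def union_slot_nullability(children):
    """A union slot is nullable if the matching slot of any child is nullable."""
    return [any(flags) for flags in zip(*children)]


def nullability_matches(child_flags, union_flags):
    """Simplified stand-in for the nullability part of a passthrough check."""
    return list(child_flags) == list(union_flags)


# Assumed example: first slot is the Kudu primary key (not nullable).
kudu_flags = [False, True]
hdfs_flags = [True, True]
union_flags = union_slot_nullability([kudu_flags, hdfs_flags])
```

In this model the union takes the HDFS table's flags, so the HDFS child matches the union while the Kudu child still differs on its primary key slot.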
[jira] [Updated] (IMPALA-10814) Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
[ https://issues.apache.org/jira/browse/IMPALA-10814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenzhe Zhou updated IMPALA-10814:
---------------------------------
Labels: broken-build (was: )

> Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
> -------------------------------------------------------------------------
>
> Key: IMPALA-10814
> URL: https://issues.apache.org/jira/browse/IMPALA-10814
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.1
> Reporter: Wenzhe Zhou
> Priority: Major
> Labels: broken-build
>
> Saw this build failure in asf-master-core-s3 build:
> [https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/61/]
>
> Error Message
> DCHECK found in log file:
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/ee_tests/impalad.FATAL
> h3. Standard Error
> Log file created at: 2021/07/19 18:41:06
> Running on machine:
> [impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com|http://impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com/]
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> F0719 18:41:06.730994 4601 decimal-util.h:129] fb4b98709a88f345:b51bf00b0002] Check failed: fixed_len_size > 0 (-15 vs. 0)
> F0719 18:41:08.161149 4711 decimal-util.h:129] e5432b6d3730539d:cf6c2d310002] Check failed: fixed_len_size > 0 (-15 vs. 0)
> From timestamp, the issue seems happened in test
> query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_uncompressed_parquet_orc
[jira] [Updated] (IMPALA-10814) Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
[ https://issues.apache.org/jira/browse/IMPALA-10814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenzhe Zhou updated IMPALA-10814:
---------------------------------
Description:
Saw this build failure in asf-master-core-s3 build:
[https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/61/]

*Error Message*
DCHECK found in log file:
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/ee_tests/impalad.FATAL
h3. Standard Error
Log file created at: 2021/07/19 18:41:06
Running on machine: [impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com|http://impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com/]
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid [file:line|file:///line]] msg
F0719 18:41:06.730994 4601 decimal-util.h:129] fb4b98709a88f345:b51bf00b0002] Check failed: fixed_len_size > 0 (-15 vs. 0)
F0719 18:41:08.161149 4711 decimal-util.h:129] e5432b6d3730539d:cf6c2d310002] Check failed: fixed_len_size > 0 (-15 vs. 0)

From timestamp, the issue seems happened in test:
query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_uncompressed_parquet_orc

was:
Saw this build failure in asf-master-core-s3 build:
[https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/61/]

Error Message
DCHECK found in log file:
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/ee_tests/impalad.FATAL
h3. Standard Error
Log file created at: 2021/07/19 18:41:06
Running on machine: [impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com|http://impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com/]
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
F0719 18:41:06.730994 4601 decimal-util.h:129] fb4b98709a88f345:b51bf00b0002] Check failed: fixed_len_size > 0 (-15 vs. 0)
F0719 18:41:08.161149 4711 decimal-util.h:129] e5432b6d3730539d:cf6c2d310002] Check failed: fixed_len_size > 0 (-15 vs. 0)
From timestamp, the issue seems happened in test
query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_uncompressed_parquet_orc

> Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
> -------------------------------------------------------------------------
>
> Key: IMPALA-10814
> URL: https://issues.apache.org/jira/browse/IMPALA-10814
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.1
> Reporter: Wenzhe Zhou
> Priority: Major
> Labels: broken-build
>
> Saw this build failure in asf-master-core-s3 build:
> [https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/61/]
>
> *Error Message*
> DCHECK found in log file:
> /data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/ee_tests/impalad.FATAL
> h3. Standard Error
> Log file created at: 2021/07/19 18:41:06
> Running on machine:
> [impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com|http://impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com/]
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid [file:line|file:///line]] msg
> F0719 18:41:06.730994 4601 decimal-util.h:129] fb4b98709a88f345:b51bf00b0002] Check failed: fixed_len_size > 0 (-15 vs. 0)
> F0719 18:41:08.161149 4711 decimal-util.h:129] e5432b6d3730539d:cf6c2d310002] Check failed: fixed_len_size > 0 (-15 vs. 0)
>
> From timestamp, the issue seems happened in test:
> query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_uncompressed_parquet_orc
[jira] [Created] (IMPALA-10814) Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
Wenzhe Zhou created IMPALA-10814:
---------------------------------

Summary: Hit DCHECK in DecimalUtil::DecodeFromFixedLenByteArray for core-s3 build
Key: IMPALA-10814
URL: https://issues.apache.org/jira/browse/IMPALA-10814
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 4.1
Reporter: Wenzhe Zhou

Saw this build failure in asf-master-core-s3 build:
[https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3/61/]

Error Message
DCHECK found in log file:
/data/jenkins/workspace/impala-asf-master-core-s3/repos/Impala/logs/ee_tests/impalad.FATAL
h3. Standard Error
Log file created at: 2021/07/19 18:41:06
Running on machine: [impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com|http://impala-ec2-centos74-m5-4xlarge-ondemand-072f.vpc.cloudera.com/]
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
F0719 18:41:06.730994 4601 decimal-util.h:129] fb4b98709a88f345:b51bf00b0002] Check failed: fixed_len_size > 0 (-15 vs. 0)
F0719 18:41:08.161149 4711 decimal-util.h:129] e5432b6d3730539d:cf6c2d310002] Check failed: fixed_len_size > 0 (-15 vs. 0)
From timestamp, the issue seems happened in test
query_test/test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_uncompressed_parquet_orc
[jira] [Commented] (IMPALA-10502) delayed 'Invalidated objects in cache' cause 'Table already exists'
[ https://issues.apache.org/jira/browse/IMPALA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384392#comment-17384392 ]

ASF subversion and git services commented on IMPALA-10502:
----------------------------------------------------------

Commit 565d0bfa1d12df583ab6d2725ac6ecf2644cd50d in impala's branch refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=565d0bf ]

IMPALA-10502: Fetch events in batches (Addendum)

The earlier change for IMPALA-10502 passes in a batch size of -1 to
fetch all the events from a given event id during a DDL execution.
While this works when the HMS backing database is Postgres, it doesn't
work well when the HMS backend is a MySQL database due to HIVE-20226.

This change works around the Hive bug by fetching the events in batches
of 1000 instead of fetching all the events in one RPC during the DDL
execution.

Testing:
1. Added a unit test for the new changes introduced.
2. Ran the previously failing tests on MySQL HMS backend.

Change-Id: I34bb8984aeb91b37439f77722746f638d8774478
Reviewed-on: http://gerrit.cloudera.org:8080/17698
Reviewed-by: Impala Public Jenkins
Reviewed-by: Zoltan Borok-Nagy
Tested-by: Zoltan Borok-Nagy

> delayed 'Invalidated objects in cache' cause 'Table already exists'
> --------------------------------------------------------------------
>
> Key: IMPALA-10502
> URL: https://issues.apache.org/jira/browse/IMPALA-10502
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Clients, Frontend
> Affects Versions: Impala 3.4.0
> Reporter: Adriano
> Assignee: Vihang Karajgaonkar
> Priority: Critical
> Fix For: Impala 4.1
>
> In a fast-paced environment where the interval between steps 1 and 2 is
> < 100ms (a simplified pipeline looks like):
> 0- catalog 'on demand' in use and disableHmsSync (enabled or disabled: no
> difference)
> 1- open session to coord A -> DROP TABLE X -> close session
> 2- open session to coord A -> CREATE TABLE X -> close session
> Results: step -2- can fail with table already exists.
> During the internal investigation it was discovered that IMPALA-9913 will
> regress the issue in almost all scenarios.
> However, considering that the investigation is still ongoing internally, it is
> nice to have the event tracked here as well.
> Once we are sure that IMPALA-9913 fixes these events, we can close this as a
> duplicate; otherwise, carry on the investigation.
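The workaround described in the IMPALA-10502 addendum (fetch in batches of 1000 instead of one unbounded RPC) follows a common paging pattern. The sketch below is not the catalogd code and does not use the Hive Metastore client API: `fetch_page` is a hypothetical callable that returns up to `max_events` event ids greater than `last_event_id`, and events are modelled as plain integer ids.

```python
def fetch_all_events(fetch_page, from_event_id, batch_size=1000):
    """Collect all events after from_event_id using bounded-size requests.

    fetch_page(last_event_id, max_events) is a hypothetical callable that
    returns a list of event ids greater than last_event_id, oldest first.
    """
    events = []
    last_event_id = from_event_id
    while True:
        page = fetch_page(last_event_id, batch_size)
        events.extend(page)
        # A short (or empty) page means there is nothing more to fetch.
        if len(page) < batch_size:
            return events
        last_event_id = page[-1]


def make_fake_source(max_event_id, calls):
    """Build an in-memory stand-in for the event source, for illustration."""
    def fetch_page(last_event_id, max_events):
        calls.append((last_event_id, max_events))
        return list(range(last_event_id + 1, max_event_id + 1))[:max_events]
    return fetch_page
```

For 2500 pending events this issues three requests (1000, 1000, 500). When the number of pending events is an exact multiple of the batch size, one extra request that returns an empty page ends the loop.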
[jira] [Commented] (IMPALA-10761) Provide query option for illegal UTF-8 characters handling
[ https://issues.apache.org/jira/browse/IMPALA-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384390#comment-17384390 ]

ASF subversion and git services commented on IMPALA-10761:
----------------------------------------------------------

Commit 4df03a31ec77b54138aba2805ff5e376463c8e23 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4df03a3 ]

IMPALA-2019(Part-2): Provide UTF-8 support in instr() and locate()

Similar to the previous patch, this patch adds UTF-8 support in instr()
and locate() builtin functions so they can have consistent behaviors
with Hive's. These two string functions both have an optional argument
as position:
  INSTR(STRING str, STRING substr[, BIGINT position[, BIGINT occurrence]])
  LOCATE(STRING substr, STRING str[, INT pos])
Their return values are positions of the matched substring.

In UTF-8 mode (turned on by set UTF8_MODE=true), these positions are
counted by UTF-8 characters instead of bytes.

Error handling:
Malformed UTF-8 characters are counted as one byte per character. This
is consistent with Hive since Hive replaces those bytes to U+FFFD
(REPLACEMENT CHARACTER). E.g. GenericUDFInstr calls Text#toString(),
which performs the replacement. We can provide more behaviors on error
handling like ignoring them or reporting errors. IMPALA-10761 will focus
on this.

Tests:
 - Add BE unit tests and e2e tests
 - Add random tests to make sure malformed UTF-8 characters won't crash
   us.

Change-Id: Ic13c3d04649c1aea56c1aaa464799b5e4674f662
Reviewed-on: http://gerrit.cloudera.org:8080/17580
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Provide query option for illegal UTF-8 characters handling
> ----------------------------------------------------------
>
> Key: IMPALA-10761
> URL: https://issues.apache.org/jira/browse/IMPALA-10761
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Quanlong Huang
> Priority: Major
>
> There are 3 ways to handle illegal UTF-8 characters:
> * Replacing them with U+FFFD (REPLACEMENT CHARACTER)
> * Ignoring them (removing them in the string)
> * Reporting errors
> We should provide a query option for this.
[jira] [Commented] (IMPALA-10799) Analysis slowdown with inline views and thousands of column
[ https://issues.apache.org/jira/browse/IMPALA-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384388#comment-17384388 ]

ASF subversion and git services commented on IMPALA-10799:
--

Commit bd9b7459d0ab453fa185ba6728e5c571835ffa3e in impala's branch refs/heads/master from xqhe
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bd9b745 ]

IMPALA-10799: Analysis slowdown with inline views and thousands of column

If there are thousands of columns in the inline view, analysis is very slow. Most of the cost is in the get() calls used to find expressions in the local substitution map when checking whether a column is ambiguous.

The fix is to:
1. Use a LinkedHashMap to check whether the alias has already been seen.
2. Run checkComposedFrom() only when the log level is TRACE, since the code has been mature for a while.

Testing:
Performance testing with a query with 1 expressions of the following form:
with a as (select c1 c1, c1 c2, c1 c3, ... from t) select c1, c2, c3, ... from a;
The analysis time of the repro query went from 7.5 sec to less than 1 sec.

Change-Id: I43da47dddfdb3db6d0e2073ae974a0a4d1b3ad7c
Reviewed-on: http://gerrit.cloudera.org:8080/17688
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Analysis slowdown with inline views and thousands of column
> ---
>
> Key: IMPALA-10799
> URL: https://issues.apache.org/jira/browse/IMPALA-10799
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.2.0
> Reporter: Xianqing He
> Assignee: Xianqing He
> Priority: Major
> Fix For: Impala 4.1
>
> If there are thousands of columns in the inline view, analysis is very slow.
> For example, this SQL takes almost 4s in analysis if the inline view has
> tens of thousands of columns:
> {code:java}
> select c1 from (select c1, c2... c10001 from T) T
> Query Compilation: 3s880ms
>   - Translate start: 968.000ns (968.000ns)
>   - Translate finished: 4.318ms (4.317ms)
>   - Metadata of all 1 tables cached: 42.219ms (37.900ms)
>   - Analysis finished: 3s776ms (3s734ms)
>   - Value transfer graph computed: 3s806ms (30.163ms)
>   - Single node plan created: 3s869ms (62.556ms)
>   - Runtime filters computed: 3s874ms (5.603ms)
>   - Distributed plan created: 3s874ms (128.086us)
>   - Planning finished: 3s880ms (5.836ms)
> {code}
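The IMPALA-10799 commit message above attributes the slowdown to repeated lookups while checking for ambiguous column aliases, and replaces them with a hash-based check. The following is only a minimal sketch of that general idea in Python, not the Impala frontend code: the function names are hypothetical, and a Python dict (which keeps insertion order) stands in for Java's LinkedHashMap.

```python
def find_ambiguous_linear(aliases):
    """Quadratic variant: every alias is compared against all earlier ones."""
    seen = []        # list membership test is O(n)
    ambiguous = []
    for alias in aliases:
        if alias in seen and alias not in ambiguous:
            ambiguous.append(alias)
        seen.append(alias)
    return ambiguous


def find_ambiguous_hashed(aliases):
    """Linear variant: an insertion-ordered hash map remembers seen aliases."""
    seen = {}        # alias -> first position; lookup is O(1) on average
    ambiguous = {}   # dict used as an ordered set of repeated aliases
    for pos, alias in enumerate(aliases):
        if alias in seen:
            ambiguous.setdefault(alias, pos)
        else:
            seen[alias] = pos
    return list(ambiguous)


if __name__ == "__main__":
    cols = ["c1", "c2", "c1", "c3", "c2", "c1"]
    print(find_ambiguous_linear(cols))
    print(find_ambiguous_hashed(cols))
```

Both functions report the same aliases in the same order; only the cost per lookup differs, which is what matters once the select list has thousands of entries.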
[jira] [Commented] (IMPALA-2019) Proper UTF-8 support in string functions
[ https://issues.apache.org/jira/browse/IMPALA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384389#comment-17384389 ]

ASF subversion and git services commented on IMPALA-2019:
-

Commit 4df03a31ec77b54138aba2805ff5e376463c8e23 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4df03a3 ]

IMPALA-2019(Part-2): Provide UTF-8 support in instr() and locate()

Similar to the previous patch, this patch adds UTF-8 support to the instr() and locate() builtin functions so that they behave consistently with Hive's. Both string functions take an optional position argument:

INSTR(STRING str, STRING substr[, BIGINT position[, BIGINT occurrence]])
LOCATE(STRING substr, STRING str[, INT pos])

Their return values are positions of the matched substring. In UTF-8 mode (turned on by set UTF8_MODE=true), these positions are counted in UTF-8 characters instead of bytes.

Error handling:
In malformed UTF-8 sequences, each byte is counted as one character. This is consistent with Hive, since Hive replaces those bytes with U+FFFD (REPLACEMENT CHARACTER). E.g. GenericUDFInstr calls Text#toString(), which performs the replacement. We can provide more error-handling behaviors, such as ignoring malformed bytes or reporting errors. IMPALA-10761 will focus on this.

Tests:
- Add BE unit tests and e2e tests
- Add random tests to make sure malformed UTF-8 characters won't crash us.

Change-Id: Ic13c3d04649c1aea56c1aaa464799b5e4674f662
Reviewed-on: http://gerrit.cloudera.org:8080/17580
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Proper UTF-8 support in string functions
> 
>
> Key: IMPALA-2019
> URL: https://issues.apache.org/jira/browse/IMPALA-2019
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Affects Versions: Impala 2.1, Impala 2.2
> Reporter: Andrés Cordero
> Assignee: Quanlong Huang
> Priority: Critical
> Labels: sql-language
>
> As documented here:
> https://impala.apache.org/docs/build/html/topics/impala_string.html
> Impala does not properly handle non-ASCII UTF-8 characters, and string
> functions such as length return results that are inconsistent with Hive.
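To illustrate the difference between byte positions and UTF-8 character positions described in the IMPALA-2019 commit message above, here is a small Python sketch. It is not Impala's implementation: the helper names are hypothetical, only the two-argument form of instr() is modelled, an empty substring is not handled specially, and the validity check is simplified (it does not reject overlong or surrogate encodings). A byte that does not start a well-formed sequence is counted as one character, as the commit message describes.

```python
def utf8_chars(data: bytes):
    """Split bytes into UTF-8 characters; a malformed byte counts as one character."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            length = 1
        elif 0xC2 <= lead <= 0xDF:
            length = 2
        elif 0xE0 <= lead <= 0xEF:
            length = 3
        elif 0xF0 <= lead <= 0xF4:
            length = 4
        else:
            length = 1  # stray continuation byte or invalid lead byte
        seq = data[i:i + length]
        # Truncated sequence or bad continuation byte: fall back to a single byte.
        if len(seq) < length or any((c & 0xC0) != 0x80 for c in seq[1:]):
            seq = data[i:i + 1]
        chars.append(seq)
        i += len(seq)
    return chars


def instr_bytes(s: bytes, sub: bytes) -> int:
    """1-based byte position of sub in s, or 0 if absent."""
    return s.find(sub) + 1


def instr_utf8(s: bytes, sub: bytes) -> int:
    """1-based character position of sub in s, or 0 if absent."""
    chars, sub_chars = utf8_chars(s), utf8_chars(sub)
    m = len(sub_chars)
    for i in range(len(chars) - m + 1):
        if chars[i:i + m] == sub_chars:
            return i + 1
    return 0


if __name__ == "__main__":
    text = "你好世界".encode("utf-8")
    needle = "世".encode("utf-8")
    print(instr_bytes(text, needle), instr_utf8(text, needle))
```

Each of the four characters in the sample string occupies three bytes, so the byte-based and character-based positions of the same match differ.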
[jira] [Created] (IMPALA-10813) Invalidate external table from catalog cache for truncate table HMS api
Sourabh Goyal created IMPALA-10813:
--

Summary: Invalidate external table from catalog cache for truncate table HMS api
Key: IMPALA-10813
URL: https://issues.apache.org/jira/browse/IMPALA-10813
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Sourabh Goyal

In IMPALA-10648, we started invalidating external tables when certain HMS endpoints are accessed from the catalog Metastore server. We missed doing the same for the truncate_table API.
[jira] [Commented] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384368#comment-17384368 ]

Amogh Margoor commented on IMPALA-10811:

DOC Jira for the same: IMPALA-10812

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Priority: Major
> Attachments: profile+(13).txt
>
> The initial RPC to submit a query and fetch the query handle can take quite a
> long time to return, because it can perform various planning and submission
> operations that involve executing catalog operations such as Rename or Alter
> Table Recover Partitions, which can take time on tables with many partitions
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
> Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One side effect of such an RPC taking a long time is that clients such as
> impala-shell that connect through AWS NLB can get stuck forever. The reason is
> that NLB tracks connections and closes them after 350s of idle time, and this
> timeout cannot be configured. After closing the connection, NLB doesn't send a
> TCP RST to the client. Only when the client tries to send data or packets does
> NLB send back a TCP RST to indicate that the connection is no longer alive.
> Documentation is here:
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
> Hence the impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for the RPCs to return the query
> handle after
> # Creating Driver:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and execute the later parts of the RPC asynchronously in a different thread
> without blocking the RPC. That way clients can get the query handle and poll
> it for state and results.
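The proposal in IMPALA-10811 above (return the query handle early, run the remaining work in another thread, and let the client poll) can be sketched as follows. This is an illustrative Python model only, not Impala's Beeswax or HS2 code: QueryServer, submit, get_state and wait_for are hypothetical names, and the "work" callable stands in for the slow planning and catalog operations.

```python
import threading
import time
import uuid


class QueryServer:
    """Hypothetical server that hands out a handle before the slow work finishes."""

    def __init__(self):
        self._states = {}
        self._lock = threading.Lock()

    def submit(self, work):
        # Register the query and return immediately; 'work' runs in the background.
        handle = uuid.uuid4().hex
        with self._lock:
            self._states[handle] = ("RUNNING", None)
        threading.Thread(target=self._run, args=(handle, work), daemon=True).start()
        return handle

    def _run(self, handle, work):
        try:
            outcome = ("FINISHED", work())
        except Exception as exc:  # record the failure instead of losing it
            outcome = ("ERROR", exc)
        with self._lock:
            self._states[handle] = outcome

    def get_state(self, handle):
        with self._lock:
            return self._states[handle]


def wait_for(server, handle, poll_interval=0.01, timeout=5.0):
    """Client side: each poll is a short call, so the client never blocks on one long RPC."""
    deadline = time.monotonic() + timeout
    while True:
        state, value = server.get_state(handle)
        if state != "RUNNING":
            return state, value
        if time.monotonic() >= deadline:
            raise TimeoutError("query did not finish in time")
        time.sleep(poll_interval)


if __name__ == "__main__":
    def slow_ddl():
        time.sleep(0.05)  # stands in for a long catalog operation
        return "done"

    server = QueryServer()
    handle = server.submit(slow_ddl)
    print(wait_for(server, handle))
```

In a real deployment each poll would be a separate short RPC, so the client sends data regularly; as long as the poll interval stays well below the 350s limit mentioned above, the connection would not sit idle long enough for the load balancer to drop it silently.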
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amogh Margoor updated IMPALA-10811:
---
Attachment: profile+(13).txt

> RPC to submit query getting stuck for AWS NLB forever.
> --
>
> Key: IMPALA-10811
> URL: https://issues.apache.org/jira/browse/IMPALA-10811
> Project: IMPALA
> Issue Type: Bug
> Reporter: Amogh Margoor
> Priority: Major
> Attachments: profile+(13).txt
>
> The initial RPC to submit a query and fetch the query handle can take quite a
> long time to return, because it can perform various planning and submission
> operations that involve executing catalog operations such as Rename or Alter
> Table Recover Partitions, which can take time on tables with many partitions
> ([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]).
> Attached is the profile of one such DDL query (with a few fields hidden).
> These RPCs are:
> 1. Beeswax:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57]
> 2. HS2:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462]
>
> One side effect of such an RPC taking a long time is that clients such as
> impala-shell that connect through AWS NLB can get stuck forever. The reason is
> that NLB tracks connections and closes them after 350s of idle time, and this
> timeout cannot be configured. After closing the connection, NLB doesn't send a
> TCP RST to the client. Only when the client tries to send data or packets does
> NLB send back a TCP RST to indicate that the connection is no longer alive.
> Documentation is here:
> [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout].
> Hence the impala-shell waiting for the RPC to return gets stuck indefinitely.
> Hence, we may need to evaluate techniques for the RPCs to return the query
> handle after
> # Creating Driver:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150]
> # Register Query:
> [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]
> and execute the later parts of the RPC asynchronously in a different thread
> without blocking the RPC. That way clients can get the query handle and poll
> it for state and results.
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10811: --- Description: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query (with few fields hidden). These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150] # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168] and execute later parts of RPC asynchronously in different thread without blocking the RPC. 
That way clients can get query handle and poll for it for state and results. was: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150 # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168] and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. 
> RPC to submit query getting stuck for AWS NLB forever. > -- > > Key: IMPALA-10811 > URL: https://issues.apache.org/jira/browse/IMPALA-10811 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return as it can do various operations for planning and submission > that involve executing Catalog Operations like Rename, Alter Table Recover > partition that can take time on tables with many > partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). > Attached is the profile of one such DDL query (with few fields hidden). > These RPCs are: > 1. Beeswax: >
[jira] [Resolved] (IMPALA-5628) Parquet support for additional valid decimal representations
[ https://issues.apache.org/jira/browse/IMPALA-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy resolved IMPALA-5628. --- Fix Version/s: Impala 4.1 Resolution: Fixed > Parquet support for additional valid decimal representations > > > Key: IMPALA-5628 > URL: https://issues.apache.org/jira/browse/IMPALA-5628 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Tim Armstrong >Assignee: Zoltán Borók-Nagy >Priority: Major > Labels: ramp-up > Fix For: Impala 4.1 > > > This is an umbrella JIRA to implement valid representations of DECIMAL that > Impala doesn't currently support. > See https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10812) [DOCS] RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10812: --- Description: We would need to document the behaviour of IMPALA-10811 as a limitation with AWS NLB. Problem description: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence clients like impala-shell waiting for RPC to return gets stuck indefinitely. 
was: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150 # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168] and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. > [DOCS] RPC to submit query getting stuck for AWS NLB forever. 
> - > > Key: IMPALA-10812 > URL: https://issues.apache.org/jira/browse/IMPALA-10812 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > We would need to document the behaviour of IMPALA-10811 as a limitation with > AWS NLB. Problem description: > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return as it can do various operations for planning and submission > that involve executing Catalog Operations like Rename, Alter Table Recover > partition that can take time on tables with many > partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). > Attached is the profile of one such DDL query. > These RPCs are: > 1. Beeswax: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] > 2. HS2: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] > > One of the side effects of such RPC taking long time is that clients such as > impala-shell using AWS NLB
[jira] [Created] (IMPALA-10812) [DOCS] RPC to submit query getting stuck for AWS NLB forever.
Amogh Margoor created IMPALA-10812: -- Summary: [DOCS] RPC to submit query getting stuck for AWS NLB forever. Key: IMPALA-10812 URL: https://issues.apache.org/jira/browse/IMPALA-10812 Project: IMPALA Issue Type: Bug Reporter: Amogh Margoor Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. 
Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150 # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168] and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10811: --- Description: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver: https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1150 # Register Query: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168] and execute later parts of RPC asynchronously in different thread without blocking the RPC. 
That way clients can get query handle and poll for it for state and results. was: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions(https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver, # Register Query ([https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]) and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. > RPC to submit query getting stuck for AWS NLB forever. 
> -- > > Key: IMPALA-10811 > URL: https://issues.apache.org/jira/browse/IMPALA-10811 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return as it can do various operations for planning and submission > that involve executing Catalog Operations like Rename, Alter Table Recover > partition that can take time on tables with many > partitions([https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92]). > Attached is the profile of one such DDL query. > These RPCs are: > 1. Beeswax: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] > 2. HS2: >
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10811: --- Description: Initial RPC to submit a query and fetch the query handle can take quite long time to return as it can do various operations for planning and submission that involve executing Catalog Operations like Rename, Alter Table Recover partition that can take time on tables with many partitions(https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92). Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver, # Register Query ([https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]) and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. 
was: Initial RPC to submit a query and fetch the query handle can take quite long time to return due to expensive Catalog Operations like Rename, Alter Table Recover partition on tables with many partitions. Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver, # Register Query ([https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]) and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. > RPC to submit query getting stuck for AWS NLB forever. 
> -- > > Key: IMPALA-10811 > URL: https://issues.apache.org/jira/browse/IMPALA-10811 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return as it can do various operations for planning and submission > that involve executing Catalog Operations like Rename, Alter Table Recover > partition that can take time on tables with many > partitions(https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/be/src/exec/catalog-op-executor.cc#L92). > Attached is the profile of one such DDL query. > These RPCs are: > 1. Beeswax: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] > 2. HS2: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] > > One of the side effects of such RPC taking long time is that clients such as > impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks > and closes connections after 350s and cannot be configured. But after closing > the connection it doesn;t send TCP
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10811: --- Description: Initial RPC to submit a query and fetch the query handle can take quite long time to return due to expensive Catalog Operations like Rename, Alter Table Recover partition on tables with many partitions. Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle after # Creating Driver, # Register Query ([https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-server.cc#L1168]) and execute later parts of RPC asynchronously in different thread without blocking the RPC. That way clients can get query handle and poll for it for state and results. was: Initial RPC to submit a query and fetch the query handle can take quite long time to return due to expensive Catalog Operations like Rename, Alter Table Recover partition on tables with many partitions. Attached is the profile of one such DDL query. These RPCs are: 1. 
Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One of the side effects of such RPC taking long time is that clients such as impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks and closes connections after 350s and cannot be configured. But after closing the connection it doesn;t send TCP RST to the client. Only when client tries to send data or packets NLB issues back TCP RST to indicate connection is not alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence the impala-shell waiting for RPC to return gets stuck indefinitely. Hence, we may need to evaluate techniques for RPCs to return query handle sooner right after the Query Registration () and execute later parts of RPC asynchronously so that clients can get query handle and poll for it for results. > RPC to submit query getting stuck for AWS NLB forever. > -- > > Key: IMPALA-10811 > URL: https://issues.apache.org/jira/browse/IMPALA-10811 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return due to expensive Catalog Operations like Rename, Alter Table > Recover partition on tables with many partitions. Attached is the profile of > one such DDL query. > These RPCs are: > 1. Beeswax: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] > 2. 
HS2: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] > > One of the side effects of such RPC taking long time is that clients such as > impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks > and closes connections after 350s and cannot be configured. But after closing > the connection it doesn;t send TCP RST to the client. Only when client tries > to send data or packets NLB issues back TCP RST to indicate connection is not > alive. Documentation is here: > [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. > Hence the impala-shell waiting for RPC to return gets stuck indefinitely. > Hence, we may need to evaluate techniques for RPCs to return query handle > after > # Creating Driver, > # Register Query >
[jira] [Updated] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB forever.
[ https://issues.apache.org/jira/browse/IMPALA-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amogh Margoor updated IMPALA-10811: --- Summary: RPC to submit query getting stuck for AWS NLB forever. (was: RPC to submit query getting stuck for AWS NLB for ever.) > RPC to submit query getting stuck for AWS NLB forever. > -- > > Key: IMPALA-10811 > URL: https://issues.apache.org/jira/browse/IMPALA-10811 > Project: IMPALA > Issue Type: Bug >Reporter: Amogh Margoor >Priority: Major > > Initial RPC to submit a query and fetch the query handle can take quite long > time to return due to expensive Catalog Operations like Rename, Alter Table > Recover partition on tables with many partitions. Attached is the profile of > one such DDL query. > These RPCs are: > 1. Beeswax: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] > 2. HS2: > [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] > > One of the side effects of such RPC taking long time is that clients such as > impala-shell using AWS NLB can get stuck for ever. The reason is NLB tracks > and closes connections after 350s and cannot be configured. But after closing > the connection it doesn;t send TCP RST to the client. Only when client tries > to send data or packets NLB issues back TCP RST to indicate connection is not > alive. Documentation is here: > [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. > Hence the impala-shell waiting for RPC to return gets stuck indefinitely. > Hence, we may need to evaluate techniques for RPCs to return query handle > sooner right after the Query Registration () and execute later parts of RPC > asynchronously so that clients can get query handle and poll for it for > results. 
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10811) RPC to submit query getting stuck for AWS NLB for ever.
Amogh Margoor created IMPALA-10811: -- Summary: RPC to submit query getting stuck for AWS NLB for ever. Key: IMPALA-10811 URL: https://issues.apache.org/jira/browse/IMPALA-10811 Project: IMPALA Issue Type: Bug Reporter: Amogh Margoor The initial RPC that submits a query and fetches the query handle can take a long time to return because of expensive Catalog operations such as Rename or Alter Table Recover Partitions on tables with many partitions. Attached is the profile of one such DDL query. These RPCs are: 1. Beeswax: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-beeswax-server.cc#L57] 2. HS2: [https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/be/src/service/impala-hs2-server.cc#L462] One side effect of such a long-running RPC is that clients such as impala-shell that connect through an AWS NLB can get stuck forever. The NLB tracks connections and closes them after 350s, and this timeout cannot be configured. After closing the connection, however, it doesn't send a TCP RST to the client. Only when the client tries to send data or packets does the NLB reply with a TCP RST to indicate that the connection is no longer alive. Documentation is here: [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#connection-idle-timeout]. Hence impala-shell, which is waiting for the RPC to return, gets stuck indefinitely. We may therefore need to evaluate techniques for the RPCs to return the query handle sooner, right after the Query Registration (), and to execute the later parts of the RPC asynchronously so that clients can get the query handle and poll it for results.
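The idea in this report (hand back the query handle right after registration, run the slow part asynchronously, and let the client poll) can be sketched roughly as follows. This is a minimal Python illustration and not Impala code: QueryServer, submit, poll and wait_for_result are hypothetical names, and the slow catalog operation is simulated by a caller-supplied function. It only shows the shape of the protocol, in which no single request has to stay open while the slow work runs.

```python
import threading
import time
import uuid


class QueryServer:
    """Toy server (hypothetical): submit() registers a query and returns at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}  # handle -> (state, result)

    def submit(self, sql, slow_work):
        # Register the query and hand back a handle before any slow work starts.
        handle = uuid.uuid4().hex
        with self._lock:
            self._states[handle] = ("RUNNING", None)
        worker = threading.Thread(
            target=self._run, args=(handle, sql, slow_work), daemon=True)
        worker.start()
        return handle

    def _run(self, handle, sql, slow_work):
        # The expensive part (planning, catalog operations) runs off the RPC thread.
        try:
            outcome = ("FINISHED", slow_work(sql))
        except Exception as exc:
            outcome = ("ERROR", str(exc))
        with self._lock:
            self._states[handle] = outcome

    def poll(self, handle):
        # Short request: returns the current state without blocking on the work.
        with self._lock:
            return self._states[handle]


def wait_for_result(server, handle, interval_s=0.01, timeout_s=5.0):
    """Client side: poll with short requests instead of one long blocking call."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state, result = server.poll(handle)
        if state != "RUNNING":
            return state, result
        time.sleep(interval_s)
    raise TimeoutError("query %s did not finish in time" % handle)
```

Because each poll is its own short request, the client sends traffic regularly; under the behaviour described in this report, that is also the moment at which a closed connection would be noticed.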
[jira] [Updated] (IMPALA-10810) Bump json-smart from 2.3 to at least 2.4.1
[ https://issues.apache.org/jira/browse/IMPALA-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy updated IMPALA-10810: --- Component/s: Frontend > Bump json-smart from 2.3 to at least 2.4.1 > -- > > Key: IMPALA-10810 > URL: https://issues.apache.org/jira/browse/IMPALA-10810 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Zoltán Borók-Nagy >Priority: Major > > I noticed that our json-smart dependency is stale and we could pick up a > newer version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-10810) Bump json-smart from 2.3 to at least 2.4.1
Zoltán Borók-Nagy created IMPALA-10810: -- Summary: Bump json-smart from 2.3 to at least 2.4.1 Key: IMPALA-10810 URL: https://issues.apache.org/jira/browse/IMPALA-10810 Project: IMPALA Issue Type: Bug Reporter: Zoltán Borók-Nagy I noticed that our json-smart dependency is stale and we could pick up a newer version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10808) Crash of illegal decimal schema in test_fuzz_decimal_tbl
[ https://issues.apache.org/jira/browse/IMPALA-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-10808: Affects Version/s: Impala 4.1 > Crash of illegal decimal schema in test_fuzz_decimal_tbl > > > Key: IMPALA-10808 > URL: https://issues.apache.org/jira/browse/IMPALA-10808 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.1 >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Blocker > > Recently saw two unrelated jobs failed by the same crash: > * [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14369] > * [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14381] > For example in the second job, the test that crashes impalad is {code} > query_test/test_scanners_fuzz.py::TestScannersFuzzing::()::test_fuzz_decimal_tbl[protocol:beeswax|exec_option:{'debug_action':'-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5';'abort_on_error':False;'mem_limit':'512m';'num_nodes':0}|table_format:parquet/none > {code} > The failure is > {code:java} > I0720 03:34:53.168516 126039 runtime-state.cc:196] > 8a42e69ff49106c8:d2096a71] Error from query > 8a42e69ff49106c8:d2096a70: File > 'hdfs://localhost:20500/test-warehouse/test_fuzz_decimal_tbl_4a8e12be.db/decimal_tbl/d6=1/copy1_6b48619353a75ffb-66460f74_973668612_data.0.parq' > column 'd1' does not have the decimal precision set. > F0720 03:34:53.168567 126039 types.h:282] 8a42e69ff49106c8:d2096a71] > Check failed: precision > 0 (0 vs. 0) > {code} > CC [~boroknagyz] who owns the first job. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-10808) Crash of illegal decimal schema in test_fuzz_decimal_tbl
[ https://issues.apache.org/jira/browse/IMPALA-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang reassigned IMPALA-10808: --- Assignee: Quanlong Huang > Crash of illegal decimal schema in test_fuzz_decimal_tbl > > > Key: IMPALA-10808 > URL: https://issues.apache.org/jira/browse/IMPALA-10808 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Blocker > > Recently saw two unrelated jobs failed by the same crash: > * [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14369] > * [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14381] > For example in the second job, the test that crashes impalad is {code} > query_test/test_scanners_fuzz.py::TestScannersFuzzing::()::test_fuzz_decimal_tbl[protocol:beeswax|exec_option:{'debug_action':'-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5';'abort_on_error':False;'mem_limit':'512m';'num_nodes':0}|table_format:parquet/none > {code} > The failure is > {code:java} > I0720 03:34:53.168516 126039 runtime-state.cc:196] > 8a42e69ff49106c8:d2096a71] Error from query > 8a42e69ff49106c8:d2096a70: File > 'hdfs://localhost:20500/test-warehouse/test_fuzz_decimal_tbl_4a8e12be.db/decimal_tbl/d6=1/copy1_6b48619353a75ffb-66460f74_973668612_data.0.parq' > column 'd1' does not have the decimal precision set. > F0720 03:34:53.168567 126039 types.h:282] 8a42e69ff49106c8:d2096a71] > Check failed: precision > 0 (0 vs. 0) > {code} > CC [~boroknagyz] who owns the first job. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-10809) improve the performance of unnest operation
[ https://issues.apache.org/jira/browse/IMPALA-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengdou1990 updated IMPALA-10809: - Description:
h2. Current situation
Impala's support for complex data types is not particularly friendly. For example, to expand rows that contain ARRAY fields, you first need to unnest the array fields and then do a nested loop join. To expand multiple array fields, you need multiple unnests and multiple nested loop joins, which puts a lot of computational pressure on the executor.
DDL:
{code:java}
CREATE TABLE rawdata.users2 (
  day INT,
  sampling_group INT,
  user_id BIGINT,
  time TIMESTAMP,
  _offset BIGINT,
  event_id INT,
  month_id INT,
  week_id INT,
  distinct_id STRING,
  event_bucket INT,
  adresses_list_string ARRAY<STRING>,
  count_list_bigint ARRAY<BIGINT>
)
WITH SERDEPROPERTIES ('serialization.format'='1')
STORED AS PARQUET
LOCATION 'hdfs://localhost:20500/test-warehouse/rawdata.db/users2'
{code}
Query SQL:
{code:java}
SELECT `day`, list1.item, list2.item
FROM rawdata.users2, rawdata.users2.adresses_list_string list1, rawdata.users2.count_list_bigint list2
{code}
Simplified Plan:
{code:java}
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|
07:EXCHANGE [UNPARTITIONED]
|
01:SUBPLAN
|
|--06:NESTED LOOP JOIN [CROSS JOIN]
|  |
|  |--04:UNNEST [users2.count_list_bigint clist]
|  |
|  05:NESTED LOOP JOIN [CROSS JOIN]
|  |
|  |--02:SINGULAR ROW SRC
|  |
|  03:UNNEST [users2.adresses_list_string list]
|
00:SCAN HDFS [rawdata.users2, RANDOM]
{code}
h2. Proposed Solution
In practice, I found that changing the computation logic of unnest greatly improves performance. First, the FE constructs a new plan node type, named explode node, which forms a pipeline with its child node. Then, in the BE, each row is exploded locally, and the fields are laid out as in the child node. The query SQL and the plan are greatly simplified.
Query SQL:
{code:java}
SELECT `day`, explode(adresses_list_string), explode(count_list_bigint) from rawdata.users2
{code}
The simplified plan looks like this:
{code:java}
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|
02:EXCHANGE [UNPARTITIONED]
|
01:EXPLODE NODE [UNPARTITIONED]
|
00:SCAN HDFS [rawdata.users2, RANDOM]
{code}
was: h2. Current situation Impala's support for complex data types is not particularly friendly. For example, to expand rows that contain ARRAY fields, you first need to unnest the array fields and then do a nested loop join. To expand multiple array fields, you need multiple unnests and multiple nested loop joins, which puts a lot of computational pressure on the executor. DDL: CREATE TABLE rawdata.users2 ( day INT, sampling_group INT, user_id BIGINT, time TIMESTAMP, _offset BIGINT, event_id INT, month_id INT, week_id INT, distinct_id STRING, event_bucket INT, adresses_list_string ARRAY<STRING>, count_list_bigint ARRAY<BIGINT> ) WITH SERDEPROPERTIES ('serialization.format'='1') STORED AS PARQUET LOCATION 'hdfs://localhost:20500/test-warehouse/rawdata.db/users2' Query SQL: SELECT `day`, list1.item, list2.item FROM rawdata.users2, rawdata.users2.adresses_list_string list1, rawdata.users2.count_list_bigint list2 Simplified Plan: F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | 07:EXCHANGE [UNPARTITIONED] | 01:SUBPLAN |
[jira] [Created] (IMPALA-10809) improve the performance of unnest operation
pengdou1990 created IMPALA-10809: Summary: improve the performance of unnest operation Key: IMPALA-10809 URL: https://issues.apache.org/jira/browse/IMPALA-10809 Project: IMPALA Issue Type: Improvement Reporter: pengdou1990
h2. Current situation
Impala's support for complex data types is not particularly friendly. For example, to expand rows that contain ARRAY fields, you first need to unnest the array fields and then do a nested loop join. To expand multiple array fields, you need multiple unnests and multiple nested loop joins, which puts a lot of computational pressure on the executor.
DDL:
CREATE TABLE rawdata.users2 (
  day INT,
  sampling_group INT,
  user_id BIGINT,
  time TIMESTAMP,
  _offset BIGINT,
  event_id INT,
  month_id INT,
  week_id INT,
  distinct_id STRING,
  event_bucket INT,
  adresses_list_string ARRAY<STRING>,
  count_list_bigint ARRAY<BIGINT>
)
WITH SERDEPROPERTIES ('serialization.format'='1')
STORED AS PARQUET
LOCATION 'hdfs://localhost:20500/test-warehouse/rawdata.db/users2'
Query SQL:
SELECT `day`, list1.item, list2.item
FROM rawdata.users2, rawdata.users2.adresses_list_string list1, rawdata.users2.count_list_bigint list2
Simplified Plan:
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|
07:EXCHANGE [UNPARTITIONED]
|
01:SUBPLAN
|
|--06:NESTED LOOP JOIN [CROSS JOIN]
|  |
|  |--04:UNNEST [users2.count_list_bigint clist]
|  |
|  05:NESTED LOOP JOIN [CROSS JOIN]
|  |
|  |--02:SINGULAR ROW SRC
|  |
|  03:UNNEST [users2.adresses_list_string list]
|
00:SCAN HDFS [rawdata.users2, RANDOM]
h2. Proposed Solution
In practice, I found that changing the computation logic of unnest greatly improves performance. First, the FE constructs a new plan node type, named explode node, which forms a pipeline with its child node. Then, in the BE, each row is exploded locally, and the fields are laid out as in the child node. The query SQL and the plan are greatly simplified.
Query SQL:
SELECT `day`, explode(adresses_list_string), explode(count_list_bigint) from rawdata.users2
The simplified plan looks like this:
F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|
02:EXCHANGE [UNPARTITIONED]
|
01:EXPLODE NODE [UNPARTITIONED]
|
00:SCAN HDFS [rawdata.users2, RANDOM]
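To make the intended row-local expansion concrete, here is a small Python sketch of what a single-pass "explode" over two array columns might compute. It is an illustration under assumptions, not the proposed Impala operator: it assumes that exploding two arrays yields their per-row cross product (which is what the cross-join query in this report produces), and explode_row_wise is a hypothetical name. Column names are taken from the DDL in the report.

```python
from itertools import product


def explode_row_wise(rows):
    """Yield one (day, address, count) tuple per pair of array elements.

    Each input row is expanded on its own, in one pass, without building an
    intermediate join result. A row with an empty array yields nothing,
    matching cross-join semantics (an assumption about the proposal).
    """
    for row in rows:
        pairs = product(row["adresses_list_string"], row["count_list_bigint"])
        for address, count in pairs:
            yield (row["day"], address, count)


rows = [
    {"day": 1, "adresses_list_string": ["a", "b"], "count_list_bigint": [10, 20]},
    {"day": 2, "adresses_list_string": [], "count_list_bigint": [7]},
]
expanded = list(explode_row_wise(rows))
```

For the two sample rows, only the first contributes output (four tuples), because the second row has an empty array.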
[jira] [Commented] (IMPALA-9659) Document supported distros for Impala 4.0
[ https://issues.apache.org/jira/browse/IMPALA-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383843#comment-17383843 ] ASF subversion and git services commented on IMPALA-9659: - Commit 602eec3b6e712c54cb2e78f534991aced74b7d33 in impala's branch refs/heads/master from stiga-huang [ https://gitbox.apache.org/repos/asf?p=impala.git;h=602eec3 ] IMPALA-9659: [DOCS] Document supported distros Our Requirements document points to the README.md about supported distros: https://impala.apache.org/docs/build/html/topics/impala_prereqs.html However, README.md doesn't mention them. This patch adds a section for this. Change-Id: I7104c24112d3ee298a9c9edd07e267b39bc77fa6 Reviewed-on: http://gerrit.cloudera.org:8080/17583 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Document supported distros for Impala 4.0 > - > > Key: IMPALA-9659 > URL: https://issues.apache.org/jira/browse/IMPALA-9659 > Project: IMPALA > Issue Type: Task > Components: Docs >Reporter: Tim Armstrong >Assignee: Quanlong Huang >Priority: Blocker > > We don't appear to document which distributions Impala is actually supported > on. We should clarify this going forward in Impala 4.0. We already sent out a > mail to the user list with a proposal: > https://mail-archives.apache.org/mod_mbox/impala-user/202004.mbox/browser > I think de-facto it is Ubuntu 16.04 and 18.04, CentOS/RHEL7 and soon 8 (and > compatible variants) and maybe SLES12
[jira] [Created] (IMPALA-10808) Crash of illegal decimal schema in test_fuzz_decimal_tbl
Quanlong Huang created IMPALA-10808: --- Summary: Crash of illegal decimal schema in test_fuzz_decimal_tbl Key: IMPALA-10808 URL: https://issues.apache.org/jira/browse/IMPALA-10808 Project: IMPALA Issue Type: Bug Reporter: Quanlong Huang
Two unrelated jobs recently failed with the same crash:
* [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14369]
* [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/14381]
For example, in the second job the test that crashes impalad is
{code}
query_test/test_scanners_fuzz.py::TestScannersFuzzing::()::test_fuzz_decimal_tbl[protocol:beeswax|exec_option:{'debug_action':'-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5';'abort_on_error':False;'mem_limit':'512m';'num_nodes':0}|table_format:parquet/none
{code}
The failure is
{code:java}
I0720 03:34:53.168516 126039 runtime-state.cc:196] 8a42e69ff49106c8:d2096a71] Error from query 8a42e69ff49106c8:d2096a70: File 'hdfs://localhost:20500/test-warehouse/test_fuzz_decimal_tbl_4a8e12be.db/decimal_tbl/d6=1/copy1_6b48619353a75ffb-66460f74_973668612_data.0.parq' column 'd1' does not have the decimal precision set.
F0720 03:34:53.168567 126039 types.h:282] 8a42e69ff49106c8:d2096a71] Check failed: precision > 0 (0 vs. 0)
{code}
CC [~boroknagyz] who owns the first job.
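The two log lines read as if an error about the missing precision was recorded and a type with precision 0 was then constructed anyway, tripping the "precision > 0" check. The following Python sketch illustrates the general pattern of validating untrusted file metadata before building the type, so that bad input surfaces as a query error rather than a failed assertion. It is not Impala's code and not necessarily the eventual fix: DecimalType and decimal_type_from_metadata are hypothetical names, and the upper bound of 38 is Impala's documented maximum DECIMAL precision.

```python
MAX_PRECISION = 38  # documented maximum DECIMAL precision in Impala


class DecimalType:
    """Hypothetical type object whose constructor enforces hard invariants."""

    def __init__(self, precision, scale):
        # Analogous to the failed check in the log: precision must be > 0.
        if not 0 < precision <= MAX_PRECISION:
            raise AssertionError("Check failed: precision > 0")
        if not 0 <= scale <= precision:
            raise AssertionError("Check failed: scale within [0, precision]")
        self.precision = precision
        self.scale = scale


def decimal_type_from_metadata(column, precision, scale):
    """Return (type, None) for valid metadata, or (None, error message).

    The metadata comes from a data file and may be corrupt (the test in this
    report is a fuzz test), so it is validated before the constructor runs.
    """
    if precision is None or precision <= 0:
        return None, "column '%s' does not have the decimal precision set." % column
    if precision > MAX_PRECISION:
        return None, "column '%s' has unsupported precision %d." % (column, precision)
    if scale is None or not 0 <= scale <= precision:
        return None, "column '%s' has an invalid decimal scale." % column
    return DecimalType(precision, scale), None
```

With this ordering, a file whose column reports precision 0 produces an error message and no type object, so the constructor's invariant check is never reached with bad input.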