[jira] [Commented] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_

2023-09-07 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762954#comment-17762954
 ] 

Maxwell Guo commented on IMPALA-12402:
--

[~MikaelSmith] thanks for your reply. I think it would be better to make the 
Guava cache's concurrencyLevel (and possibly other parameters as well) 
configurable instead of relying on the default value of 4.
For clusters with many tables I think the value should be higher than 4, 
e.g. 128 or 256. When we looked at the jstack output for impalad at the 
startup stage, we found the threads were all waiting for the lock; see 
https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L432
A lower value leads to thread contention.
In this cache the concurrency level is used as the number of segments 
(buckets), so more buckets mean less thread contention (assuming the keys 
hash randomly enough).
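Guava documents concurrencyLevel as a hint for the number of internal segments that may be updated concurrently. As a rough stdlib-only sketch of that lock-striping idea (this is not Guava's actual implementation; the class and method names below are made up for illustration):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative lock striping: a higher stripe count spreads keys over
// more independent locks, so concurrent loads contend less often.
public class StripedCacheSketch {
    private final ReentrantLock[] stripes;

    public StripedCacheSketch(int concurrencyLevel) {
        stripes = new ReentrantLock[concurrencyLevel];
        for (int i = 0; i < concurrencyLevel; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    // Map a key to the stripe guarding it, as a segmented cache would.
    public int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), stripes.length);
    }

    public static void main(String[] args) {
        StripedCacheSketch small = new StripedCacheSketch(4);
        StripedCacheSketch large = new StripedCacheSketch(256);
        Set<Integer> smallUsed = new HashSet<>();
        Set<Integer> largeUsed = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            String key = "db.table_" + i;
            smallUsed.add(small.stripeFor(key));
            largeUsed.add(large.stripeFor(key));
        }
        // 1000 distinct table keys collapse onto at most 4 locks versus
        // at most 256 locks, which is the contention difference at issue.
        System.out.println("stripes used: " + smallUsed.size()
            + " vs " + largeUsed.size());
    }
}
```

The tradeoff is memory and bookkeeping per segment, which is presumably why Guava keeps the default small.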

> Add some configurations for CatalogdMetaProvider's cache_
> -
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many DBs and tables, such as more than 
> 10 tables, and we restart the impalad, CatalogdMetaProvider's local 
> cache_ needs to go through a loading process.
> As we know, Google's Guava cache's concurrencyLevel is set to 4 by 
> default,
> but if there are many tables the loading process will take more time and 
> increase the probability of lock contention; see 
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
>  
> So we propose to add some configurations here, the first being the 
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-11498) Change port range of TEZ's web UI server after TEZ-4347

2023-09-07 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao resolved IMPALA-11498.
--
Resolution: Fixed

Resolve the issue since the fix has been merged.

> Change port range of TEZ's web UI server after TEZ-4347
> ---
>
> Key: IMPALA-11498
> URL: https://issues.apache.org/jira/browse/IMPALA-11498
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Major
>
> After TEZ-4347, by default TEZ would attempt to start a web UI server before 
> opening a session. The default port range for the server specified in 
> [TezConfiguration.java|https://github.infra.cloudera.com/CDH/tez/blob/cdw-master/tez-api/src/main/java/org/apache/tez/dag/api/TezConfiguration.java#L1823]
>  (in the TEZ repository) is "5-50050", which does not seem to be a good 
> choice in Impala's testing environment in that there are always some other 
> client programs holding those ports when TEZ attempts to start its web UI 
> server. As a result, TEZ could not bind to a port in that range to start 
> its web UI server, resulting in the TEZ session not being created.
> We should specify a better port range for TEZ once we start using a TEZ 
> dependency that includes TEZ-4347.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-12248) Add required Ranger configuration properties after RANGER-2895

2023-09-07 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao resolved IMPALA-12248.
--
Resolution: Fixed

Resolve the issue since the fix has been merged.

> Add required Ranger configuration properties after RANGER-2895
> --
>
> Key: IMPALA-12248
> URL: https://issues.apache.org/jira/browse/IMPALA-12248
> Project: IMPALA
>  Issue Type: Task
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Major
>
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e]
>  added and removed some configuration properties.
> [Three new configuration properties were 
> added|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05].
>  We found that once we bump up the build number to include RANGER-2895 and if 
> those new properties do not exist in 
> [ranger-admin-default-site.xml.template|https://github.com/apache/impala/blob/master/testdata/cluster/ranger/ranger-admin-default-site.xml.template]
>  or 
> [ranger-admin-site.xml.template|https://github.com/apache/impala/blob/master/testdata/cluster/ranger/ranger-admin-site.xml.template]
>  then the produced site files for Ranger will not contain those new 
> properties, resulting in some error message like the following in 
> catalina.log. As a result, Ranger's HTTP server could not be properly started.
> {code:java}
> 23/06/25 04:46:01 ERROR context.ContextLoader: Context initialization failed
> org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean 
> definition with name 'defaultDataSource' defined in ServletContext resource 
> [/META-INF/applicationContext.xml]: Could not resolve placeholder 
> 'ranger.jpa.jdbc.idletimeout' in value "${ranger.jpa.jdbc.idletimeout}"; 
> nested exception is java.lang.IllegalArgumentException: Could not resolve 
> placeholder 'ranger.jpa.jdbc.idletimeout' in value 
> "${ranger.jpa.jdbc.idletimeout}"
>   at
> {code}
> There are also some configuration properties removed in RANGER-2895, e.g., 
> [ranger.jpa.jdbc.idleconnectiontestperiod|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05L136].
>  In this regard, we could probably add these 3 new properties first and then 
> remove the unnecessary properties once we have bumped up the build number 
> that includes RANGER-2895.
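For illustration only: the property name below comes from the error message above, but the value is a placeholder, not a recommended setting (the actual defaults should be taken from RANGER-2895). A missing entry in the site templates would look roughly like:

```xml
<!-- Hypothetical template entry; the real default value should be
     copied from RANGER-2895, not from this sketch. -->
<property>
  <name>ranger.jpa.jdbc.idletimeout</name>
  <value>300</value>
</property>
```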



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-12250) Remove deprecated Ranger configuration properties after RANGER-2895

2023-09-07 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao reopened IMPALA-12250:
--

Sorry I meant to close IMPALA-12248.

> Remove deprecated Ranger configuration properties after RANGER-2895
> ---
>
> Key: IMPALA-12250
> URL: https://issues.apache.org/jira/browse/IMPALA-12250
> Project: IMPALA
>  Issue Type: Task
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Major
>
> In IMPALA-12248, we added 3 new Ranger configuration properties that will be 
> required after we start using a build that includes 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05]
>  in order to start Ranger's HTTP server.
> Recall that a Ranger configuration property was deprecated in 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05],
>  i.e., 
> [ranger.jpa.jdbc.idleconnectiontestperiod|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-9669116dca1e5c9fffdb2c81d4d9ac57b489131e90b89ff17b56801131bad5a6L419].
>  Thus, we should also remove it from 
> [ranger-admin-default-site.xml.template|https://github.com/apache/impala/blob/master/testdata/cluster/ranger/ranger-admin-default-site.xml.template]
>  after starting using a build that includes 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-12250) Remove deprecated Ranger configuration properties after RANGER-2895

2023-09-07 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao resolved IMPALA-12250.
--
Resolution: Fixed

Resolve the issue since the fix has been merged.

> Remove deprecated Ranger configuration properties after RANGER-2895
> ---
>
> Key: IMPALA-12250
> URL: https://issues.apache.org/jira/browse/IMPALA-12250
> Project: IMPALA
>  Issue Type: Task
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Major
>
> In IMPALA-12248, we added 3 new Ranger configuration properties that will be 
> required after we start using a build that includes 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05]
>  in order to start Ranger's HTTP server.
> Recall that a Ranger configuration property was deprecated in 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05],
>  i.e., 
> [ranger.jpa.jdbc.idleconnectiontestperiod|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-9669116dca1e5c9fffdb2c81d4d9ac57b489131e90b89ff17b56801131bad5a6L419].
>  Thus, we should also remove it from 
> [ranger-admin-default-site.xml.template|https://github.com/apache/impala/blob/master/testdata/cluster/ranger/ranger-admin-default-site.xml.template]
>  after starting using a build that includes 
> [RANGER-2895|https://github.com/apache/ranger/commit/846031985cae70f7a8c5e92faf186948a302260e#diff-dcab4376623684e416c7e60162c7af7a7d3789fe1d61a2cfdaef794334426f05].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-12311) Extra newlines are produced when an end-to-end test is run with update_results

2023-09-07 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao resolved IMPALA-12311.
--
Resolution: Fixed

Resolve the issue since the fix has been merged.

> Extra newlines are produced when an end-to-end test is run with 
> update_results 
> ---
>
> Key: IMPALA-12311
> URL: https://issues.apache.org/jira/browse/IMPALA-12311
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.1.2
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Minor
>  Labels: test-infra
>
> We found that extra newlines are produced in the updated golden file when the 
> actual results do not match the expected results specified in the original 
> golden file.
> Take 
> [TestDecimalExprs::test_exprs()|https://github.com/apache/impala/blob/master/tests/query_test/test_decimal_queries.py#L75]
>  for example, this test runs the test cases in 
> [decimal-exprs.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test].
> Suppose that we modify the expected error message at 
> [https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test#L107]
>  from "UDF WARNING: Decimal expression overflowed, returning NULL" to the 
> following (the original string with an additional "x").
> {noformat}
> UDF WARNING: Decimal expression overflowed, returning NULLx
> {noformat}
> Then we run this test using the following command with the command line 
> argument '--update_results'.
> {code:java}
> $IMPALA_HOME/bin/impala-py.test \
> --update_results \
> --junitxml=$IMPALA_EE_TEST_LOGS_DIR/results/test_decimal.xml \
> $IMPALA_HOME/tests/query_test/test_decimal_queries.py::TestDecimalExprs::test_exprs
> {code}
> In $IMPALA_HOME/logs/ee_tests/QueryTest_decimal-exprs.test, we will find 
> the following subsection corresponding to the query. There are 3 additional 
> newlines in the 'ERRORS' subsection.
> {noformat}
>  ERRORS
> UDF WARNING: Decimal expression overflowed, returning NULL
> 
> {noformat}
> One of the newlines was produced in 
> [join_section_lines()|https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py#L298].
>  This function is called when the actual results do not match the expected 
> results in the following 4 places.
>  # [test_section['ERRORS'] = 
> join_section_lines(actual_errors)|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L398].
>  # [test_section['TYPES'] = join_section_lines(\[', 
> '.join(actual_types)\])|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L429].
>  # [test_section['LABELS'] = join_section_lines(\[', 
> '.join(actual_labels)\])|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L451].
>  # [test_section[result_section] = 
> join_section_lines(actual.result_list)|https://github.com/apache/impala/blob/master/tests/common/test_result_verifier.py#L489].
> Thus, we also have the same issue for subsections like TYPES, LABELS, and 
> RESULTS in such a scenario (actual results do not match expected ones). It 
> would be good if a user/developer does not have to manually remove those 
> extra newlines when trying to generate the golden files for new test files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-2422) % escaping does not work correctly in a LIKE clause

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-2422:
--
Target Version:   (was: Impala 4.3.0)

> % escaping does not work correctly in a LIKE clause
> ---
>
> Key: IMPALA-2422
> URL: https://issues.apache.org/jira/browse/IMPALA-2422
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Frontend
>Affects Versions: Impala 2.2.4, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, 
> Impala 2.6.0, Impala 2.7.0
>Reporter: Huaisi Xu
>Priority: Critical
>  Labels: 2023Q1, correctness, downgraded, incompatibility
>
> {code:java}
> [localhost:21000] > select '%' like "\%";
> Query: select '%' like "\%"
> +---+
> | '%' like '\%' |
> +---+
> | false   |   -> should return true.
> +---+
> Fetched 1 row(s) in 0.01s
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6294) Concurrent hung with lots of spilling make slow progress due to blocking in DataStreamRecvr and DataStreamSender

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-6294:
--
Target Version:   (was: Impala 4.3.0)

> Concurrent hung with lots of spilling make slow progress due to blocking in 
> DataStreamRecvr and DataStreamSender
> 
>
> Key: IMPALA-6294
> URL: https://issues.apache.org/jira/browse/IMPALA-6294
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Mostafa Mokhtar
>Assignee: Michael Ho
>Priority: Critical
> Attachments: IMPALA-6285 TPCDS Q3 slow broadcast, 
> slow_broadcast_q3_reciever.txt, slow_broadcast_q3_sender.txt
>
>
> While running a highly concurrent spilling workload on a large cluster, 
> queries start running slower; even lightweight queries that are not 
> spilling are affected by this slowdown.
> {code}
>   EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 
> 100.00%)
>  - ConvertRowBatchTime: 999.990us
>  - PeakMemoryUsage: 0
>  - RowsReturned: 108.00K (108001)
>  - RowsReturnedRate: 593.00 /sec
> DataStreamReceiver:
>   BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43 
> KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB, 
> 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80 
> MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB, 
> 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91 
> MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
>- BytesReceived: 11.73 MB (12301692)
>- DeserializeRowBatchTimer: 957.990ms
>- FirstBatchArrivalWaitTime: 0.000ns
>- PeakMemoryUsage: 644.44 KB (659904)
>- SendersBlockedTimer: 0.000ns
>- SendersBlockedTotalTimer(*): 0.000ns
> {code}
> {code}
> DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % 
> non-child: 100.00%)
>- BytesSent: 234.64 MB (246033840)
>- NetworkThroughput(*): 139.58 MB/sec
>- OverallThroughput: 128.92 MB/sec
>- PeakMemoryUsage: 33.12 KB (33920)
>- RowsReturned: 108.00K (108001)
>- SerializeBatchTime: 133.998ms
>- TransmitDataRPCTime: 1s680ms
>- UncompressedRowBatchSize: 446.42 MB (468102200)
> {code}
> The timeouts seen in IMPALA-6285 are caused by this issue:
> {code}
> I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client 
> foo-17.domain.com:22000 timed-out during recv call.
> @   0x957a6a  impala::Status::Status()
> @  0x11dd5fe  
> impala::DataStreamSender::Channel::DoTransmitDataRpc()
> @  0x11ddcd4  
> impala::DataStreamSender::Channel::TransmitDataHelper()
> @  0x11de080  impala::DataStreamSender::Channel::TransmitData()
> @  0x11e1004  impala::ThreadPool<>::WorkerThread()
> @   0xd10063  impala::Thread::SuperviseThread()
> @   0xd107a4  boost::detail::thread_data<>::run()
> @  0x128997a  (unknown)
> @ 0x7f68c5bc7e25  start_thread
> @ 0x7f68c58f534d  __clone
> {code}
> A similar behavior was also observed with KRPC enabled IMPALA-6048



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6890) split-hbase.sh: Can't get master address from ZooKeeper; znode data == null

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-6890:
--
Target Version:   (was: Impala 4.3.0)

> split-hbase.sh: Can't get master address from ZooKeeper; znode data == null
> ---
>
> Key: IMPALA-6890
> URL: https://issues.apache.org/jira/browse/IMPALA-6890
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: Vuk Ercegovac
>Assignee: Joe McDonnell
>Priority: Critical
>
> {noformat}
> 20:57:13 FAILED (Took: 7 min 58 sec)
> 20:57:13 
> '/data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh'
>  failed. Tail of log:
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:43 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 Wed Apr 18 20:49:44 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> ...
> 20:57:13 Wed Apr 18 20:57:13 PDT 2018, 
> RpcRetryingCaller{globalStartTime=1524109783051, pause=100, retries=31}, 
> org.apache.hadoop.hbase.MasterNotRunningException: java.io.IOException: Can't 
> get master address from ZooKeeper; znode data == null
> 20:57:13 
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:157)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4329)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4321)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2952)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.(HBaseTestDataRegionAssigment.java:74)
> 20:57:13  at 
> org.apache.impala.datagenerator.HBaseTestDataRegionAssigment.main(HBaseTestDataRegionAssigment.java:310)
> 20:57:13 Caused by: org.apache.hadoop.hbase.MasterNotRunningException: 
> java.io.IOException: Can't get master address from ZooKeeper; znode data == 
> null
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1698)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1718)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1875)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
> 20:57:13  ... 5 more
> 20:57:13 Caused by: java.io.IOException: Can't get master address from 
> ZooKeeper; znode data == null
> 20:57:13  at 
> org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:154)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1648)
> 20:57:13  at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1689)
> 20:57:13  ... 9 more
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/testdata/bin/split-hbase.sh
>  at line 41: "$JAVA" ${JAVA_KERBEROS_MAGIC} \
> 20:57:13 Error in 
> /data/jenkins/workspace/impala-cdh5-2.12.0_5.15.0-exhaustive-thrift/repos/Impala/bin/run-all-tests.sh
>  at line 48: # Run End-to-end Tests{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7083) AnalysisException for GROUP BY and ORDER BY expressions that are folded to constants from 2.9 onwards

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-7083:
--
Target Version:   (was: Impala 4.3.0)

> AnalysisException for GROUP BY and ORDER BY expressions that are folded to 
> constants from 2.9 onwards
> -
>
> Key: IMPALA-7083
> URL: https://issues.apache.org/jira/browse/IMPALA-7083
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.9.0
>Reporter: Eric Lin
>Priority: Critical
>  Labels: regression
>
> To reproduce, please run the Impala query below:
> {code}
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (a int);
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end);
> {code}
> It fails with the error below:
> {code}
> ERROR: AnalysisException: ORDER BY expression not produced by aggregation 
> output (missing from GROUP BY clause?): (CASE WHEN TRUE THEN 1 ELSE a END)
> {code}
> However, if I replace the column name "a" with a constant value, it works:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE 2
> end);
> {code}
> This issue was identified in CDH 5.12.x (Impala 2.9) and does not occur in 
> 5.11.x (Impala 2.8).
> We know that it can be worked around by rewriting the query as below:
> {code}
> SELECT   ( 
> CASE 
>WHEN (1 =1) 
>THEN 1
>ELSE a
> end) AS b
> FROM  test 
> GROUP BY 1 
> ORDER BY 1;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7471) Impala can hit dcheck in corrupted Parquet files

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-7471:
--
Target Version:   (was: Impala 4.3.0)

> Impala can hit dcheck in corrupted Parquet files
> 
>
> Key: IMPALA-7471
> URL: https://issues.apache.org/jira/browse/IMPALA-7471
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: complextype, correctness, crash, parquet
> Attachments: test_users_131786401297925138_0.parquet
>
>
> From 
> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-bug-with-nested-arrays-of-structures-where-some-of/m-p/78507/highlight/false#M4779
> {quote}We found a case where Impala returns incorrect values from a simple 
> query. Our data contains a nested array of structures, and the structures 
> contain other structures.
> We generated minimal sample data allowing the issue to be reproduced.
>  
> SQL to create a table:
> {quote}
> {code}
> CREATE TABLE plat_test.test_users (
>   id INT,
>   name STRING,   
>   devices ARRAY<
> STRUCT<
>   id:STRING,
>   device_info:STRUCT<
> model:STRING
>   >
> >
>   >
> )
> STORED AS PARQUET
> {code}
> {quote}
> Please put the attached parquet file in the location of the table and 
> refresh the table.
> In the sample data we have 2 users, one with 2 devices and one with 3. 
> Some of the devices.device_info.model fields are NULL.
>  
> When I issue a query:
> {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u,
> u.devices d;
> {code}
>  {quote}
> I'm expecting to get 5 records in the results, but I am getting only one 
> (screenshot 1.png).
> If I change query to:
>  {quote}
> {code}
> SELECT u.name, d.device_info.model as model
> FROM test_users u
> LEFT OUTER JOIN u.devices d;
>  {code}
> {quote}
> I'm getting two records in the results, but still not as many as there 
> should be.
> We found a workaround for this problem: if we add device.id to the result 
> columns, we get all the records from the parquet file:
> {quote}
> {code}
> SELECT u.name, d.id, d.device_info.model as model
> FROM test_users u
> , u.devices d
>  {code}
> {quote}
> The result is shown in screenshot 3.png.
>  
> But we can't rely on this workaround, because we don't need device.id in 
> all queries; Impala optimizes it away, and as a result we get 
> unpredictable results.
>  
> I tested the Hive query on this table and it returns the expected results:
> {quote}
> {code}
> SELECT u.name, d.device_info.model
> FROM test_users u
> lateral view outer inline (u.devices) d;
>  {code}
> {quote}
> The results are shown in screenshot 4.png.
> Please advise whether it's a problem in the Impala engine or we made a 
> mistake in our query.
>  
> Best regards,
> Come2Play team.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8521) Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-8521:
--
Target Version:   (was: Impala 4.3.0)

> Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client
> 
>
> Key: IMPALA-8521
> URL: https://issues.apache.org/jira/browse/IMPALA-8521
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Critical
>
> I'm looking at some job logs and seeing a bunch of errors like this. I don't 
> know if it's benign or if it's something more serious.
> {noformat}
> I0507 07:34:53.934693 20195 scan-range.cc:607] 
> dd4d6eb8d2ad9587:6b44fe1b0002] Cache read failed for scan range: 
> file=hdfs://localhost:20500/test-warehouse/f861f1a3/nation.tbl disk_id=0 
> offset=1024  exclusive_hdfs_fh=0xec09220 num_remote_bytes=0 cancel_status= 
> buffer_queue=0 num_buffers_in_readers=0 unused_iomgr_buffers=0 
> unused_iomgr_buffer_bytes=0 blocked_on_buffer=0. Switching to disk read path.
> W0507 07:34:53.934787 20195 DFSInputStream.java:668] 
> dd4d6eb8d2ad9587:6b44fe1b0002] closing file 
> /test-warehouse/f861f1a3/nation.tbl, but there are still unreleased 
> ByteBuffers allocated by read().  Please release 
> java.nio.DirectByteBufferR[pos=1024 lim=2048 cap=2199].
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8521) Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762889#comment-17762889
 ] 

Michael Smith commented on IMPALA-8521:
---

Was an issue ever filed with HDFS?

> Lots of "unreleased ByteBuffers allocated by read()" errors from HDFS client
> 
>
> Key: IMPALA-8521
> URL: https://issues.apache.org/jira/browse/IMPALA-8521
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Critical
>
> I'm looking at some job logs and seeing a bunch of errors like this. I don't 
> know if it's benign or if it's something more serious.
> {noformat}
> I0507 07:34:53.934693 20195 scan-range.cc:607] 
> dd4d6eb8d2ad9587:6b44fe1b0002] Cache read failed for scan range: 
> file=hdfs://localhost:20500/test-warehouse/f861f1a3/nation.tbl disk_id=0 
> offset=1024  exclusive_hdfs_fh=0xec09220 num_remote_bytes=0 cancel_status= 
> buffer_queue=0 num_buffers_in_readers=0 unused_iomgr_buffers=0 
> unused_iomgr_buffer_bytes=0 blocked_on_buffer=0. Switching to disk read path.
> W0507 07:34:53.934787 20195 DFSInputStream.java:668] 
> dd4d6eb8d2ad9587:6b44fe1b0002] closing file 
> /test-warehouse/f861f1a3/nation.tbl, but there are still unreleased 
> ByteBuffers allocated by read().  Please release 
> java.nio.DirectByteBufferR[pos=1024 lim=2048 cap=2199].
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9486) Creating a Kudu table via JDBC fails with "IllegalArgumentException"

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-9486:
--
Target Version:   (was: Impala 4.3.0)

> Creating a Kudu table via JDBC fails with "IllegalArgumentException"
> 
>
> Key: IMPALA-9486
> URL: https://issues.apache.org/jira/browse/IMPALA-9486
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Grant Henke
>Assignee: Fang-Yu Rao
>Priority: Blocker
>
> A Kudu user reported that creating tables via impala-shell or Hue works, but 
> when using an external tool connected via JDBC the create statement fails 
> with the following:
> {noformat}
> [ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, 
> SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, 
> errorMessage:ImpalaRuntimeException: Error creating Kudu table 
> 'impala::default.foo' CAUSED BY: IllegalArgumentException: table owner must 
> not be null or empty ), Query: …
> {noformat}
>  
> Debugging the issue further, it looks like the call to set the owner on 
> the Kudu table should not be made if an owner is not explicitly set:
> [https://github.com/apache/impala/blob/497a17dbdc0669abd47c2360b8ca94de8b54d413/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java#L252]
>  
> A possible fix could be to guard the call with _isSetOwner_:
> {code:java}
> if (msTbl.isSetOwner()) { 
>tableOpts.setOwner(msTbl.getOwner()); 
> }
> {code}
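A self-contained illustration of why the guard helps; the MsTable and TableOpts classes below are stubs standing in for the metastore table and Kudu table options, not Impala's or Kudu's actual types:

```java
public class OwnerGuardSketch {
    // Stub for the metastore table object (real code uses the Hive metastore Table).
    static class MsTable {
        private String owner;
        boolean isSetOwner() { return owner != null && !owner.isEmpty(); }
        String getOwner() { return owner; }
        void setOwner(String o) { owner = o; }
    }

    // Stub for Kudu table options: rejects a null/empty owner, mimicking the
    // "table owner must not be null or empty" IllegalArgumentException.
    static class TableOpts {
        String owner;
        void setOwner(String o) {
            if (o == null || o.isEmpty()) {
                throw new IllegalArgumentException("table owner must not be null or empty");
            }
            owner = o;
        }
    }

    public static void main(String[] args) {
        MsTable msTbl = new MsTable(); // owner never set, as when created over JDBC
        TableOpts tableOpts = new TableOpts();
        // The guard skips setOwner() entirely, avoiding the exception.
        if (msTbl.isSetOwner()) {
            tableOpts.setOwner(msTbl.getOwner());
        }
        System.out.println(tableOpts.owner == null); // prints true
    }
}
```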






[jira] [Updated] (IMPALA-10040) Crash on UnionNode when codegen is disabled

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10040:
---
Target Version:   (was: Impala 4.3.0)

> Crash on UnionNode when codegen is disabled
> ---
>
> Key: IMPALA-10040
> URL: https://issues.apache.org/jira/browse/IMPALA-10040
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: crash
>
> Saw a crash when ran a UNION query with codegen disabled:
> {code}
> F0803 15:37:44.551749 24805 union-node-ir.cc:26] 
> fd41196430b5c449:0a195a250006] Check failed: !dst_batch->AtCapacity() 
> *** Check failure stack trace: *** 
> @  0x514aa8c  google::LogMessage::Fail()
> @  0x514c37c  google::LogMessage::SendToLog()
> @  0x514a3ea  google::LogMessage::Flush()
> @  0x514dfe8  google::LogMessageFatal::~LogMessageFatal()
> @  0x286c323  impala::UnionNode::MaterializeExprs()
> @  0x286c983  impala::UnionNode::MaterializeBatch()
> @  0x286798a  impala::UnionNode::GetNextMaterialized()
> @  0x2868ac4  impala::UnionNode::GetNext()
> @  0x225f77c  impala::FragmentInstanceState::ExecInternal()
> @  0x225be20  impala::FragmentInstanceState::Exec()
> @  0x2285c35  impala::QueryState::ExecFInstance()
> @  0x2284037  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x22877d6  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2053061  boost::function0<>::operator()()
> @  0x2676bff  impala::Thread::SuperviseThread()
> @  0x267eb9c  boost::_bi::list5<>::operator()<>()
> @  0x267eac0  boost::_bi::bind_t<>::operator()()
> @  0x267ea81  boost::detail::thread_data<>::run()
> @  0x3e514e1  thread_proxy
> @ 0x7f6575c326b9  start_thread
> @ 0x7f65727fe4dc  clone
> {code}
> The query is
> {code}
> I0803 15:37:44.273838 24616 Frontend.java:1508] 
> fd41196430b5c449:0a195a25] Analyzing query: create table my_bigstrs 
> stored as parquet as
> select *, repeat(string_col, 10) as bigstr
> from functional.alltypes
> order by id
> limit 10
> union all
> select *, repeat(string_col, 1000) as bigstr
> from functional.alltypes
> order by id
> limit 10 db: default
> {code}






[jira] [Updated] (IMPALA-10236) Queries stuck if catalog topic update compression fails

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10236:
---
Target Version:   (was: Impala 4.3.0)

> Queries stuck if catalog topic update compression fails
> ---
>
> Key: IMPALA-10236
> URL: https://issues.apache.org/jira/browse/IMPALA-10236
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: Shant Hovsepian
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: hang, supportability
>
> If a to-be-compressed catalog object doesn't fit into a 2GB buffer, an error 
> is thrown. 
>  
> {code:java}
> /// Compresses a serialized catalog object using LZ4 and stores it back in 
> 'dst'. Stores
> /// the size of the uncompressed catalog object in the first sizeof(uint32_t) 
> bytes of
> /// 'dst'. The compression fails if the uncompressed data size exceeds 
> 0x7E00 bytes.
> Status CompressCatalogObject(const uint8_t* src, uint32_t size, std::string* 
> dst)
> WARN_UNUSED_RESULT;
> {code}
>  
> CatalogServer::AddPendingTopicItem() calls CompressCatalogObject()
>  
> {code:java}
> // Add a catalog update to pending_topic_updates_.
> extern "C"
> JNIEXPORT jboolean JNICALL
> Java_org_apache_impala_service_FeSupport_NativeAddPendingTopicItem(JNIEnv* 
> env,
> jclass caller_class, jlong native_catalog_server_ptr, jstring key, jlong 
> version,
> jbyteArray serialized_object, jboolean deleted) {
>   std::string key_string;
>   {
> JniUtfCharGuard key_str;
> if (!JniUtfCharGuard::create(env, key, &key_str).ok()) {
>   return static_cast<jboolean>(false);
> }
> key_string.assign(key_str.get());
>   }
>   JniScopedArrayCritical obj_buf;
>   if (!JniScopedArrayCritical::Create(env, serialized_object, &obj_buf)) {
> return static_cast<jboolean>(false);
>   }
>   reinterpret_cast<CatalogServer*>(native_catalog_server_ptr)->
>   AddPendingTopicItem(std::move(key_string), version, obj_buf.get(),
>   static_cast<uint32_t>(obj_buf.size()), deleted);
>   return static_cast<jboolean>(true);
> }
> {code}
> However, the JNI call to AddPendingTopicItem discards the return value.
> The return value was recently propagated as part of IMPALA-10076:
> {code:java}
> -if (!FeSupport.NativeAddPendingTopicItem(nativeCatalogServerPtr, 
> v1Key,
> -obj.catalog_version, data, delete)) {
> +int actualSize = 
> FeSupport.NativeAddPendingTopicItem(nativeCatalogServerPtr,
> +v1Key, obj.catalog_version, data, delete);
> +if (actualSize < 0) {
>LOG.error("NativeAddPendingTopicItem failed in BE. key=" + v1Key + 
> ", delete="
>+ delete + ", data_size=" + data.length);
> +} else if (summary != null && obj.type == HDFS_PARTITION) {
> +  summary.update(true, delete, obj.hdfs_partition.partition_name,
> +  obj.catalog_version, data.length, actualSize);
>  }
>}
> {code}
> CatalogServiceCatalog::addCatalogObject() now produces an error message but 
> the Catalog update doesn't go through.
> {code:java}
>   if (topicMode_ == TopicMode.FULL || topicMode_ == TopicMode.MIXED) {
> String v1Key = CatalogServiceConstants.CATALOG_TOPIC_V1_PREFIX + key;
> byte[] data = serializer.serialize(obj);
> int actualSize = 
> FeSupport.NativeAddPendingTopicItem(nativeCatalogServerPtr,
> v1Key, obj.catalog_version, data, delete);
> if (actualSize < 0) {
>   LOG.error("NativeAddPendingTopicItem failed in BE. key=" + v1Key + 
> ", delete="
>   + delete + ", data_size=" + data.length);
> } else if (summary != null && obj.type == HDFS_PARTITION) {
>   summary.update(true, delete, obj.hdfs_partition.partition_name,
>   obj.catalog_version, data.length, actualSize);
> }
>   }
> {code}
> Not sure what the right behavior would be; we could handle the compression 
> failure by trying more aggressive compression, or unblock the catalog update.
>  
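As a sketch of the size check involved: a caller could test the uncompressed size against the compressor's input limit before attempting compression, instead of discovering the failure afterwards. The limit constant below is a hypothetical illustration, not the value from Impala's C++ CompressCatalogObject():

```java
public class CompressLimitSketch {
    // Hypothetical stand-in for the LZ4 uncompressed-input limit enforced by
    // the backend; the real check lives in C++ in CompressCatalogObject().
    static final long MAX_UNCOMPRESSED = 0x7E000000L;

    // Returns whether a catalog object of this serialized size can be compressed.
    static boolean fitsCompressionLimit(long uncompressedSize) {
        return uncompressedSize >= 0 && uncompressedSize <= MAX_UNCOMPRESSED;
    }

    public static void main(String[] args) {
        System.out.println(fitsCompressionLimit(1024));            // prints true
        System.out.println(fitsCompressionLimit(MAX_UNCOMPRESSED + 1)); // prints false
    }
}
```

Checking up front would let the catalog skip or split an oversized object rather than leaving queries stuck on a silently dropped topic update.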






[jira] [Updated] (IMPALA-4741) ORDER BY behavior with UNION is incorrect

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-4741:
--
Priority: Critical  (was: Major)

> ORDER BY behavior with UNION is incorrect
> -
>
> Key: IMPALA-4741
> URL: https://issues.apache.org/jira/browse/IMPALA-4741
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Greg Rahn
>Priority: Critical
>  Labels: correctness, incompatibility, ramp-up, sql-language, 
> tpc-ds
> Attachments: query36a.sql, query49.sql
>
>
> When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY 
> clause must be specified at the end of the statement and the results of the 
> combined queries are sorted.  ORDER BY clauses are not allowed in individual 
> branches unless the branch is enclosed by parentheses.
> There are two bugs currently:
> # An ORDER BY is allowed in a branch of a UNION that is not enclosed in 
> parentheses
> # The final ORDER BY of a UNION is attached to the nearest branch when it 
> should be sorting the combined results of the UNION(s)
> For example, this is not valid syntax but is allowed in Impala
> {code}
> select * from t1 order by 1
> union all
> select * from t2
> {code}
> And for queries like this, the ORDER BY should order the unioned result, not 
> just the nearest branch, which is the current behavior.
> {code}
> select * from t1
> union all
> select * from t2
> order by 1
> {code}
> If one wants ordering within a branch, the query block must be enclosed in 
> parentheses, like so:
> {code}
> (select * from t1 order by 1)
> union all
> (select * from t2 order by 2)
> {code}
> Here is an example where incorrect results are returned.
> Impala
> {code}
> [impalad:21000] > select r_regionkey, r_name from region union all select 
> r_regionkey, r_name from region order by 1 limit 2;
> +-+-+
> | r_regionkey | r_name  |
> +-+-+
> | 0   | AFRICA  |
> | 1   | AMERICA |
> | 2   | ASIA|
> | 3   | EUROPE  |
> | 4   | MIDDLE EAST |
> | 0   | AFRICA  |
> | 1   | AMERICA |
> +-+-+
> Fetched 7 row(s) in 0.12s
> {code}
> PostgreSQL
> {code}
> tpch=# select r_regionkey, r_name from region union all select r_regionkey, 
> r_name from region order by 1 limit 2;
>  r_regionkey |  r_name
> -+---
>0 | AFRICA
>0 | AFRICA
> (2 rows) 
> {code}
> see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5






[jira] [Assigned] (IMPALA-10338) TestAdmissionController.test_queue_reasons_slots flaky because of slow/hanging fragment

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reassigned IMPALA-10338:
--

Assignee: Andrew Sherman

> TestAdmissionController.test_queue_reasons_slots flaky because of 
> slow/hanging fragment
> ---
>
> Key: IMPALA-10338
> URL: https://issues.apache.org/jira/browse/IMPALA-10338
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Assignee: Andrew Sherman
>Priority: Major
>  Labels: broken-build, flaky, hang
> Attachments: failure-output.txt, impalad.ERROR, impalad.INFO
>
>
> This is on an s3 debug build, commit 5a00a4c06f8ec40a8867dcbc036cf5bb47b8a3be
> {noformat}
> custom_cluster.test_admission_controller.TestAdmissionController.test_queue_reasons_slots
>  (from pytest)
> Failing for the past 1 build (Since Failed#673 )
> Took 1 min 58 sec.
> add description
> Error Message
> Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> Stacktrace
> custom_cluster/test_admission_controller.py:967: in test_queue_reasons_slots
> TIMEOUT_S, config_options={"mt_dop": 4})
> custom_cluster/test_admission_controller.py:277: in 
> _execute_and_collect_profiles
> state = self.wait_for_any_state(handle, expected_states, timeout_s)
> common/impala_test_suite.py:1081: in wait_for_any_state
> actual_state))
> E   Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> {noformat}
> Those numbers are beeswax QueryStates:
> {noformat}
> enum QueryState {
>   CREATED = 0
>   INITIALIZED = 1
>   COMPILED = 2
>   RUNNING = 3
>   FINISHED = 4
>   EXCEPTION = 5
> }
> {noformat}
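The test's wait_for_any_state amounts to polling the query state until it reaches one of the expected values or a timeout expires; a minimal sketch of that loop (names and numeric states are illustrative, not Impala's Python test API):

```java
import java.util.Set;
import java.util.function.IntSupplier;

public class WaitForStateSketch {
    // Poll currentState until it returns a value in 'expected' or timeoutMs elapses.
    static int waitForAnyState(IntSupplier currentState, Set<Integer> expected, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int state = currentState.getAsInt();
        while (!expected.contains(state)) {
            if (System.currentTimeMillis() >= deadline) {
                // Mirrors the test failure: expected states not reached in time.
                throw new IllegalStateException(
                    "did not reach one of " + expected + ", last known state " + state);
            }
            Thread.sleep(10);
            state = currentState.getAsInt();
        }
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        final int[] polls = {0};
        // Simulated query: RUNNING (3) for a few polls, then FINISHED (4).
        IntSupplier query = () -> (polls[0]++ >= 3) ? 4 : 3;
        System.out.println(waitForAnyState(query, Set.of(4), 1000)); // prints 4
    }
}
```

The reported flake is the timeout branch: the query stays in RUNNING (3) past the deadline instead of reaching FINISHED (4).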
> I.e. it appears to have run for > 60 seconds.
> The timing of that query:
> {noformat}
> I1116 13:25:08.449323 32665 impala-server.cc:1242] 
> 504b3a2511f3cd0e:e27bec6b] Registered query 
> query_id=504b3a2511f3cd0e:e27bec6b 
> session_id=874c2c9eaf2ad730:0004c1bb0ba7b4a7
> I1116 13:25:08.449626 32665 Frontend.java:1532] 
> 504b3a2511f3cd0e:e27bec6b] Analyzing query: select 
> min(ss_wholesale_cost) from tpcds_parquet.store_sales db: default
> ...
> I1116 13:25:08.567667   367 admission-controller.cc:1532] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling query 
> 504b3a2511f3cd0e:e27bec6b with membership version 2
> I1116 13:25:08.567767   367 admission-controller.cc:1590] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling for executor group: 
> default-pool-group1 with 3 executors
> I1116 13:25:08.643026   367 admission-controller.cc:1640] 
> 504b3a2511f3cd0e:e27bec6b] Trying to admit query to pool default-pool 
> in executor group default-pool-group1 (3 executors)
> ...
> I1116 13:25:49.184185 32432 admission-controller.cc:1811] Admitting from 
> queue: query=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184196 32432 admission-controller.cc:1903] For Query 
> 504b3a2511f3cd0e:e27bec6b per_backend_mem_limit set to: -1.00 B 
> per_backend_m
> em_to_admit set to: 114.02 MB coord_backend_mem_limit set to: -1.00 B 
> coord_backend_mem_to_admit set to: 114.02 MB
> I1116 13:25:49.184350   367 admission-controller.cc:1288] 
> 504b3a2511f3cd0e:e27bec6b] Admitted queued query 
> id=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184370   367 admission-controller.cc:1289] 
> 504b3a2511f3cd0e:e27bec6b] Final: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved
> =17.25 KB,  local_host(local_mem_admitted=342.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=17.25 KB, topN_query_stats: queries=[0a42
> 32658ea48bc5:c0156469, 5b470ebea7782154:ea22bdb0, 
> 554d88d8f812e22d:efbaf752, aa4a301189a4d144:fdc17ff7], 
> total_mem_consum
> ed=17.25 KB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, 
> min=0, max=17.25 KB, pool_total_mem=17.25 KB, average_per_query=4.31 KB)
> I1116 13:25:49.185214   367 impala-server.cc:2062] 
> 504b3a2511f3cd0e:e27bec6b] Registering query locations
> I1116 13:25:49.185261   367 coordinator.cc:149] 
> 504b3a2511f3cd0e:e27bec6b] Exec() 
> query_id=504b3a2511f3cd0e:e27bec6b stmt=select min(ss_w
> holesale_cost) from tpcds_parquet.store_sales
> I1116 13:25:49.186172   367 coordinator.cc:473] 
> 504b3a2511f3cd0e:e27bec6b] starting execution on 3 backends for 
> query_id=504b3a2511f3cd0e:e27bec6
> b
> I1116 13:25:49.189028 32071 control-service.cc:142] 
> 504b3a2511f3cd0e:e27bec6b] ExecQueryFInstances(): 
> query_id=504b3a2511f3cd0e:e27bec6b 
> coord=impala-ec2-centos74-m5-4xlarge-ondemand-018e.vpc.cloudera.com:27000 
> 

[jira] [Comment Edited] (IMPALA-10338) TestAdmissionController.test_queue_reasons_slots flaky because of slow/hanging fragment

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762887#comment-17762887
 ] 

Michael Smith edited comment on IMPALA-10338 at 9/7/23 10:09 PM:
-

Hasn't been seen in a while, lowering priority and removing target version.

Update: nevermind, it still pops up occasionally.


was (Author: JIRAUSER288956):
Hasn't been seen in a while, lowering priority and removing target version.

> TestAdmissionController.test_queue_reasons_slots flaky because of 
> slow/hanging fragment
> ---
>
> Key: IMPALA-10338
> URL: https://issues.apache.org/jira/browse/IMPALA-10338
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Assignee: Andrew Sherman
>Priority: Major
>  Labels: broken-build, flaky, hang
> Attachments: failure-output.txt, impalad.ERROR, impalad.INFO
>
>
> This is on an s3 debug build, commit 5a00a4c06f8ec40a8867dcbc036cf5bb47b8a3be
> {noformat}
> custom_cluster.test_admission_controller.TestAdmissionController.test_queue_reasons_slots
>  (from pytest)
> Failing for the past 1 build (Since Failed#673 )
> Took 1 min 58 sec.
> add description
> Error Message
> Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> Stacktrace
> custom_cluster/test_admission_controller.py:967: in test_queue_reasons_slots
> TIMEOUT_S, config_options={"mt_dop": 4})
> custom_cluster/test_admission_controller.py:277: in 
> _execute_and_collect_profiles
> state = self.wait_for_any_state(handle, expected_states, timeout_s)
> common/impala_test_suite.py:1081: in wait_for_any_state
> actual_state))
> E   Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> {noformat}
> Those numbers are beeswax QueryStates:
> {noformat}
> enum QueryState {
>   CREATED = 0
>   INITIALIZED = 1
>   COMPILED = 2
>   RUNNING = 3
>   FINISHED = 4
>   EXCEPTION = 5
> }
> {noformat}
> I.e. it appears to have run for > 60 seconds.
> The timing of that query:
> {noformat}
> I1116 13:25:08.449323 32665 impala-server.cc:1242] 
> 504b3a2511f3cd0e:e27bec6b] Registered query 
> query_id=504b3a2511f3cd0e:e27bec6b 
> session_id=874c2c9eaf2ad730:0004c1bb0ba7b4a7
> I1116 13:25:08.449626 32665 Frontend.java:1532] 
> 504b3a2511f3cd0e:e27bec6b] Analyzing query: select 
> min(ss_wholesale_cost) from tpcds_parquet.store_sales db: default
> ...
> I1116 13:25:08.567667   367 admission-controller.cc:1532] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling query 
> 504b3a2511f3cd0e:e27bec6b with membership version 2
> I1116 13:25:08.567767   367 admission-controller.cc:1590] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling for executor group: 
> default-pool-group1 with 3 executors
> I1116 13:25:08.643026   367 admission-controller.cc:1640] 
> 504b3a2511f3cd0e:e27bec6b] Trying to admit query to pool default-pool 
> in executor group default-pool-group1 (3 executors)
> ...
> I1116 13:25:49.184185 32432 admission-controller.cc:1811] Admitting from 
> queue: query=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184196 32432 admission-controller.cc:1903] For Query 
> 504b3a2511f3cd0e:e27bec6b per_backend_mem_limit set to: -1.00 B 
> per_backend_m
> em_to_admit set to: 114.02 MB coord_backend_mem_limit set to: -1.00 B 
> coord_backend_mem_to_admit set to: 114.02 MB
> I1116 13:25:49.184350   367 admission-controller.cc:1288] 
> 504b3a2511f3cd0e:e27bec6b] Admitted queued query 
> id=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184370   367 admission-controller.cc:1289] 
> 504b3a2511f3cd0e:e27bec6b] Final: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved
> =17.25 KB,  local_host(local_mem_admitted=342.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=17.25 KB, topN_query_stats: queries=[0a42
> 32658ea48bc5:c0156469, 5b470ebea7782154:ea22bdb0, 
> 554d88d8f812e22d:efbaf752, aa4a301189a4d144:fdc17ff7], 
> total_mem_consum
> ed=17.25 KB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, 
> min=0, max=17.25 KB, pool_total_mem=17.25 KB, average_per_query=4.31 KB)
> I1116 13:25:49.185214   367 impala-server.cc:2062] 
> 504b3a2511f3cd0e:e27bec6b] Registering query locations
> I1116 13:25:49.185261   367 coordinator.cc:149] 
> 504b3a2511f3cd0e:e27bec6b] Exec() 
> query_id=504b3a2511f3cd0e:e27bec6b stmt=select min(ss_w
> holesale_cost) from tpcds_parquet.store_sales
> I1116 13:25:49.186172   367 coordinator.cc:473] 
> 504b3a2511f3cd0e:e27bec6b] starting execution on 3 backends for 
> 

[jira] [Updated] (IMPALA-10338) TestAdmissionController.test_queue_reasons_slots flaky because of slow/hanging fragment

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10338:
---
Priority: Major  (was: Critical)

> TestAdmissionController.test_queue_reasons_slots flaky because of 
> slow/hanging fragment
> ---
>
> Key: IMPALA-10338
> URL: https://issues.apache.org/jira/browse/IMPALA-10338
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: broken-build, flaky, hang
> Attachments: failure-output.txt, impalad.ERROR, impalad.INFO
>
>
> This is on an s3 debug build, commit 5a00a4c06f8ec40a8867dcbc036cf5bb47b8a3be
> {noformat}
> custom_cluster.test_admission_controller.TestAdmissionController.test_queue_reasons_slots
>  (from pytest)
> Failing for the past 1 build (Since Failed#673 )
> Took 1 min 58 sec.
> add description
> Error Message
> Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> Stacktrace
> custom_cluster/test_admission_controller.py:967: in test_queue_reasons_slots
> TIMEOUT_S, config_options={"mt_dop": 4})
> custom_cluster/test_admission_controller.py:277: in 
> _execute_and_collect_profiles
> state = self.wait_for_any_state(handle, expected_states, timeout_s)
> common/impala_test_suite.py:1081: in wait_for_any_state
> actual_state))
> E   Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> {noformat}
> Those numbers are beeswax QueryStates:
> {noformat}
> enum QueryState {
>   CREATED = 0
>   INITIALIZED = 1
>   COMPILED = 2
>   RUNNING = 3
>   FINISHED = 4
>   EXCEPTION = 5
> }
> {noformat}
> I.e. it appears to have run for > 60 seconds.
> The timing of that query:
> {noformat}
> I1116 13:25:08.449323 32665 impala-server.cc:1242] 
> 504b3a2511f3cd0e:e27bec6b] Registered query 
> query_id=504b3a2511f3cd0e:e27bec6b 
> session_id=874c2c9eaf2ad730:0004c1bb0ba7b4a7
> I1116 13:25:08.449626 32665 Frontend.java:1532] 
> 504b3a2511f3cd0e:e27bec6b] Analyzing query: select 
> min(ss_wholesale_cost) from tpcds_parquet.store_sales db: default
> ...
> I1116 13:25:08.567667   367 admission-controller.cc:1532] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling query 
> 504b3a2511f3cd0e:e27bec6b with membership version 2
> I1116 13:25:08.567767   367 admission-controller.cc:1590] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling for executor group: 
> default-pool-group1 with 3 executors
> I1116 13:25:08.643026   367 admission-controller.cc:1640] 
> 504b3a2511f3cd0e:e27bec6b] Trying to admit query to pool default-pool 
> in executor group default-pool-group1 (3 executors)
> ...
> I1116 13:25:49.184185 32432 admission-controller.cc:1811] Admitting from 
> queue: query=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184196 32432 admission-controller.cc:1903] For Query 
> 504b3a2511f3cd0e:e27bec6b per_backend_mem_limit set to: -1.00 B 
> per_backend_m
> em_to_admit set to: 114.02 MB coord_backend_mem_limit set to: -1.00 B 
> coord_backend_mem_to_admit set to: 114.02 MB
> I1116 13:25:49.184350   367 admission-controller.cc:1288] 
> 504b3a2511f3cd0e:e27bec6b] Admitted queued query 
> id=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184370   367 admission-controller.cc:1289] 
> 504b3a2511f3cd0e:e27bec6b] Final: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved
> =17.25 KB,  local_host(local_mem_admitted=342.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=17.25 KB, topN_query_stats: queries=[0a42
> 32658ea48bc5:c0156469, 5b470ebea7782154:ea22bdb0, 
> 554d88d8f812e22d:efbaf752, aa4a301189a4d144:fdc17ff7], 
> total_mem_consum
> ed=17.25 KB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, 
> min=0, max=17.25 KB, pool_total_mem=17.25 KB, average_per_query=4.31 KB)
> I1116 13:25:49.185214   367 impala-server.cc:2062] 
> 504b3a2511f3cd0e:e27bec6b] Registering query locations
> I1116 13:25:49.185261   367 coordinator.cc:149] 
> 504b3a2511f3cd0e:e27bec6b] Exec() 
> query_id=504b3a2511f3cd0e:e27bec6b stmt=select min(ss_w
> holesale_cost) from tpcds_parquet.store_sales
> I1116 13:25:49.186172   367 coordinator.cc:473] 
> 504b3a2511f3cd0e:e27bec6b] starting execution on 3 backends for 
> query_id=504b3a2511f3cd0e:e27bec6
> b
> I1116 13:25:49.189028 32071 control-service.cc:142] 
> 504b3a2511f3cd0e:e27bec6b] ExecQueryFInstances(): 
> query_id=504b3a2511f3cd0e:e27bec6b 
> coord=impala-ec2-centos74-m5-4xlarge-ondemand-018e.vpc.cloudera.com:27000 
> #instances=5
> I1116 13:25:49.192133 

[jira] [Updated] (IMPALA-10338) TestAdmissionController.test_queue_reasons_slots flaky because of slow/hanging fragment

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10338:
---
Target Version:   (was: Impala 4.3.0)

> TestAdmissionController.test_queue_reasons_slots flaky because of 
> slow/hanging fragment
> ---
>
> Key: IMPALA-10338
> URL: https://issues.apache.org/jira/browse/IMPALA-10338
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, flaky, hang
> Attachments: failure-output.txt, impalad.ERROR, impalad.INFO
>
>
> This is on an s3 debug build, commit 5a00a4c06f8ec40a8867dcbc036cf5bb47b8a3be
> {noformat}
> custom_cluster.test_admission_controller.TestAdmissionController.test_queue_reasons_slots
>  (from pytest)
> Failing for the past 1 build (Since Failed#673 )
> Took 1 min 58 sec.
> add description
> Error Message
> Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> Stacktrace
> custom_cluster/test_admission_controller.py:967: in test_queue_reasons_slots
> TIMEOUT_S, config_options={"mt_dop": 4})
> custom_cluster/test_admission_controller.py:277: in 
> _execute_and_collect_profiles
> state = self.wait_for_any_state(handle, expected_states, timeout_s)
> common/impala_test_suite.py:1081: in wait_for_any_state
> actual_state))
> E   Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> {noformat}
> Those numbers are beeswax QueryStates:
> {noformat}
> enum QueryState {
>   CREATED = 0
>   INITIALIZED = 1
>   COMPILED = 2
>   RUNNING = 3
>   FINISHED = 4
>   EXCEPTION = 5
> }
> {noformat}
> I.e. it appears to have run for > 60 seconds.
> The timing of that query:
> {noformat}
> I1116 13:25:08.449323 32665 impala-server.cc:1242] 
> 504b3a2511f3cd0e:e27bec6b] Registered query 
> query_id=504b3a2511f3cd0e:e27bec6b 
> session_id=874c2c9eaf2ad730:0004c1bb0ba7b4a7
> I1116 13:25:08.449626 32665 Frontend.java:1532] 
> 504b3a2511f3cd0e:e27bec6b] Analyzing query: select 
> min(ss_wholesale_cost) from tpcds_parquet.store_sales db: default
> ...
> I1116 13:25:08.567667   367 admission-controller.cc:1532] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling query 
> 504b3a2511f3cd0e:e27bec6b with membership version 2
> I1116 13:25:08.567767   367 admission-controller.cc:1590] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling for executor group: 
> default-pool-group1 with 3 executors
> I1116 13:25:08.643026   367 admission-controller.cc:1640] 
> 504b3a2511f3cd0e:e27bec6b] Trying to admit query to pool default-pool 
> in executor group default-pool-group1 (3 executors)
> ...
> I1116 13:25:49.184185 32432 admission-controller.cc:1811] Admitting from 
> queue: query=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184196 32432 admission-controller.cc:1903] For Query 
> 504b3a2511f3cd0e:e27bec6b per_backend_mem_limit set to: -1.00 B 
> per_backend_m
> em_to_admit set to: 114.02 MB coord_backend_mem_limit set to: -1.00 B 
> coord_backend_mem_to_admit set to: 114.02 MB
> I1116 13:25:49.184350   367 admission-controller.cc:1288] 
> 504b3a2511f3cd0e:e27bec6b] Admitted queued query 
> id=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184370   367 admission-controller.cc:1289] 
> 504b3a2511f3cd0e:e27bec6b] Final: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved
> =17.25 KB,  local_host(local_mem_admitted=342.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=17.25 KB, topN_query_stats: queries=[0a42
> 32658ea48bc5:c0156469, 5b470ebea7782154:ea22bdb0, 
> 554d88d8f812e22d:efbaf752, aa4a301189a4d144:fdc17ff7], 
> total_mem_consum
> ed=17.25 KB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, 
> min=0, max=17.25 KB, pool_total_mem=17.25 KB, average_per_query=4.31 KB)
> I1116 13:25:49.185214   367 impala-server.cc:2062] 
> 504b3a2511f3cd0e:e27bec6b] Registering query locations
> I1116 13:25:49.185261   367 coordinator.cc:149] 
> 504b3a2511f3cd0e:e27bec6b] Exec() 
> query_id=504b3a2511f3cd0e:e27bec6b stmt=select min(ss_w
> holesale_cost) from tpcds_parquet.store_sales
> I1116 13:25:49.186172   367 coordinator.cc:473] 
> 504b3a2511f3cd0e:e27bec6b] starting execution on 3 backends for 
> query_id=504b3a2511f3cd0e:e27bec6
> b
> I1116 13:25:49.189028 32071 control-service.cc:142] 
> 504b3a2511f3cd0e:e27bec6b] ExecQueryFInstances(): 
> query_id=504b3a2511f3cd0e:e27bec6b 
> coord=impala-ec2-centos74-m5-4xlarge-ondemand-018e.vpc.cloudera.com:27000 
> #instances=5
> I1116 

[jira] [Commented] (IMPALA-10338) TestAdmissionController.test_queue_reasons_slots flaky because of slow/hanging fragment

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762887#comment-17762887
 ] 

Michael Smith commented on IMPALA-10338:


Hasn't been seen in a while, lowering priority and removing target version.

> TestAdmissionController.test_queue_reasons_slots flaky because of 
> slow/hanging fragment
> ---
>
> Key: IMPALA-10338
> URL: https://issues.apache.org/jira/browse/IMPALA-10338
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: broken-build, flaky, hang
> Attachments: failure-output.txt, impalad.ERROR, impalad.INFO
>
>
> This is on an s3 debug build, commit 5a00a4c06f8ec40a8867dcbc036cf5bb47b8a3be
> {noformat}
> custom_cluster.test_admission_controller.TestAdmissionController.test_queue_reasons_slots
>  (from pytest)
> Failing for the past 1 build (Since Failed#673 )
> Took 1 min 58 sec.
> add description
> Error Message
> Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> Stacktrace
> custom_cluster/test_admission_controller.py:967: in test_queue_reasons_slots
> TIMEOUT_S, config_options={"mt_dop": 4})
> custom_cluster/test_admission_controller.py:277: in 
> _execute_and_collect_profiles
> state = self.wait_for_any_state(handle, expected_states, timeout_s)
> common/impala_test_suite.py:1081: in wait_for_any_state
> actual_state))
> E   Timeout: query 504b3a2511f3cd0e:e27bec6b did not reach one of the 
> expected states [4], last known state 3
> {noformat}
> Those numbers are beeswax QueryStates:
> {noformat}
> enum QueryState {
>   CREATED = 0
>   INITIALIZED = 1
>   COMPILED = 2
>   RUNNING = 3
>   FINISHED = 4
>   EXCEPTION = 5
> }
> {noformat}
> I.e. it appears to have run for > 60 seconds.
> The timing of that query:
> {noformat}
> I1116 13:25:08.449323 32665 impala-server.cc:1242] 
> 504b3a2511f3cd0e:e27bec6b] Registered query 
> query_id=504b3a2511f3cd0e:e27bec6b 
> session_id=874c2c9eaf2ad730:0004c1bb0ba7b4a7
> I1116 13:25:08.449626 32665 Frontend.java:1532] 
> 504b3a2511f3cd0e:e27bec6b] Analyzing query: select 
> min(ss_wholesale_cost) from tpcds_parquet.store_sales db: default
> ...
> I1116 13:25:08.567667   367 admission-controller.cc:1532] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling query 
> 504b3a2511f3cd0e:e27bec6b with membership version 2
> I1116 13:25:08.567767   367 admission-controller.cc:1590] 
> 504b3a2511f3cd0e:e27bec6b] Scheduling for executor group: 
> default-pool-group1 with 3 executors
> I1116 13:25:08.643026   367 admission-controller.cc:1640] 
> 504b3a2511f3cd0e:e27bec6b] Trying to admit query to pool default-pool 
> in executor group default-pool-group1 (3 executors)
> ...
> I1116 13:25:49.184185 32432 admission-controller.cc:1811] Admitting from 
> queue: query=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184196 32432 admission-controller.cc:1903] For Query 
> 504b3a2511f3cd0e:e27bec6b per_backend_mem_limit set to: -1.00 B 
> per_backend_m
> em_to_admit set to: 114.02 MB coord_backend_mem_limit set to: -1.00 B 
> coord_backend_mem_to_admit set to: 114.02 MB
> I1116 13:25:49.184350   367 admission-controller.cc:1288] 
> 504b3a2511f3cd0e:e27bec6b] Admitted queued query 
> id=504b3a2511f3cd0e:e27bec6b
> I1116 13:25:49.184370   367 admission-controller.cc:1289] 
> 504b3a2511f3cd0e:e27bec6b] Final: agg_num_running=1, 
> agg_num_queued=0, agg_mem_reserved
> =17.25 KB,  local_host(local_mem_admitted=342.05 MB, num_admitted_running=1, 
> num_queued=0, backend_mem_reserved=17.25 KB, topN_query_stats: queries=[0a42
> 32658ea48bc5:c0156469, 5b470ebea7782154:ea22bdb0, 
> 554d88d8f812e22d:efbaf752, aa4a301189a4d144:fdc17ff7], 
> total_mem_consum
> ed=17.25 KB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=4, 
> min=0, max=17.25 KB, pool_total_mem=17.25 KB, average_per_query=4.31 KB)
> I1116 13:25:49.185214   367 impala-server.cc:2062] 
> 504b3a2511f3cd0e:e27bec6b] Registering query locations
> I1116 13:25:49.185261   367 coordinator.cc:149] 
> 504b3a2511f3cd0e:e27bec6b] Exec() 
> query_id=504b3a2511f3cd0e:e27bec6b stmt=select min(ss_w
> holesale_cost) from tpcds_parquet.store_sales
> I1116 13:25:49.186172   367 coordinator.cc:473] 
> 504b3a2511f3cd0e:e27bec6b] starting execution on 3 backends for 
> query_id=504b3a2511f3cd0e:e27bec6
> b
> I1116 13:25:49.189028 32071 control-service.cc:142] 
> 504b3a2511f3cd0e:e27bec6b] ExecQueryFInstances(): 
> query_id=504b3a2511f3cd0e:e27bec6b 
> 
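The traceback quoted above times out inside a state-polling helper (`wait_for_any_state`). A minimal Python sketch of that pattern is below; the names and intervals are illustrative, not Impala's actual test code:

```python
import time

# Minimal sketch of the wait_for_any_state pattern from the quoted
# traceback: poll a query's state until it reaches one of the expected
# beeswax states or the timeout elapses. Not Impala's real test helper.
RUNNING, FINISHED = 3, 4

def wait_for_any_state(get_state, expected_states, timeout_s, poll_interval_s=0.01):
    """Return the state once it is in expected_states; raise on timeout."""
    deadline = time.time() + timeout_s
    state = None
    while time.time() < deadline:
        state = get_state()
        if state in expected_states:
            return state
        time.sleep(poll_interval_s)
    raise TimeoutError(
        "query did not reach one of the expected states %s, last known state %s"
        % (expected_states, state))

# Example: a query that transitions RUNNING -> FINISHED after a few polls.
states = iter([RUNNING, RUNNING, FINISHED])
result = wait_for_any_state(lambda: next(states), [FINISHED], timeout_s=1.0)
```

In the flaky run above, the query stayed in state 3 (RUNNING) for the whole timeout window, so the helper raised instead of returning.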

[jira] [Resolved] (IMPALA-10567) Failed close open session in flight on impalad web UI

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-10567.

Resolution: Duplicate

> Failed close open session in flight on impalad web UI
> -
>
> Key: IMPALA-10567
> URL: https://issues.apache.org/jira/browse/IMPALA-10567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Jimi
>Priority: Major
> Attachments: image-2021-03-09-11-12-01-488.png, 
> image-2021-03-09-11-12-10-922.png, image-2021-03-09-11-12-14-683.png
>
>
> When the fe_service_threads limit is reached, new connections from the JDBC 
> client hang. I then closed the in-flight session on the impalad web UI, but 
> it still hangs. My Impala version is 3.2.0-SNAPSHOT.
> After closing the in-flight session on the impalad web UI, the metrics were 
> as follows:
> !image-2021-03-09-11-12-01-488.png|width=1509,height=246!
> !image-2021-03-09-11-12-10-922.png|width=1563,height=221!!image-2021-03-09-11-12-14-683.png|width=1826,height=223!
> {code:java}
> // code placeholder
> impala-server.num-fragments-in-flight:63
> impala-server.num-open-hiveserver2-sessions:63
> impala-server.num-queries-registered:63
> impala.thrift-server.hiveserver2-frontend.connections-in-use:64
> {code}
>  
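The metrics quoted above (64 connections in use against the frontend pool) are consistent with a saturated fixed-size worker pool: work submitted past the pool size just queues, so the extra client appears to hang until a worker frees up. An illustrative Python sketch (not Impala code, which uses a Thrift thread pool in C++/Java):

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Illustration of the reported behavior: with a fixed worker pool
# (analogous to fe_service_threads), a connection submitted past the
# pool size waits in the queue instead of being served.
def handle_connection(conn_id):
    time.sleep(0.05)          # simulate a long-lived session holding a worker
    return conn_id

with ThreadPoolExecutor(max_workers=2) as pool:   # pool limit = 2
    futures = [pool.submit(handle_connection, i) for i in range(3)]
    time.sleep(0.01)
    # Two connections run immediately; the third sits in the queue,
    # neither running nor done -- from the client's side it "hangs".
    queued = sum(1 for f in futures if not f.running() and not f.done())

results = [f.result() for f in futures]
```

The bug report is that closing a session from the web UI did not release its worker, so the queued connection never got served.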



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10567) Failed close open session in flight on impalad web UI

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10567:
---
Target Version:   (was: Impala 4.3.0)

> Failed close open session in flight on impalad web UI
> -
>
> Key: IMPALA-10567
> URL: https://issues.apache.org/jira/browse/IMPALA-10567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Jimi
>Priority: Blocker
> Attachments: image-2021-03-09-11-12-01-488.png, 
> image-2021-03-09-11-12-10-922.png, image-2021-03-09-11-12-14-683.png
>
>
> When the fe_service_threads limit is reached, new connections from the JDBC 
> client hang. I then closed the in-flight session on the impalad web UI, but 
> it still hangs. My Impala version is 3.2.0-SNAPSHOT.
> After closing the in-flight session on the impalad web UI, the metrics were 
> as follows:
> !image-2021-03-09-11-12-01-488.png|width=1509,height=246!
> !image-2021-03-09-11-12-10-922.png|width=1563,height=221!!image-2021-03-09-11-12-14-683.png|width=1826,height=223!
> {code:java}
> // code placeholder
> impala-server.num-fragments-in-flight:63
> impala-server.num-open-hiveserver2-sessions:63
> impala-server.num-queries-registered:63
> impala.thrift-server.hiveserver2-frontend.connections-in-use:64
> {code}
>  






[jira] [Updated] (IMPALA-10567) Failed close open session in flight on impalad web UI

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10567:
---
Priority: Major  (was: Blocker)

> Failed close open session in flight on impalad web UI
> -
>
> Key: IMPALA-10567
> URL: https://issues.apache.org/jira/browse/IMPALA-10567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Jimi
>Priority: Major
> Attachments: image-2021-03-09-11-12-01-488.png, 
> image-2021-03-09-11-12-10-922.png, image-2021-03-09-11-12-14-683.png
>
>
> When the fe_service_threads limit is reached, new connections from the JDBC 
> client hang. I then closed the in-flight session on the impalad web UI, but 
> it still hangs. My Impala version is 3.2.0-SNAPSHOT.
> After closing the in-flight session on the impalad web UI, the metrics were 
> as follows:
> !image-2021-03-09-11-12-01-488.png|width=1509,height=246!
> !image-2021-03-09-11-12-10-922.png|width=1563,height=221!!image-2021-03-09-11-12-14-683.png|width=1826,height=223!
> {code:java}
> // code placeholder
> impala-server.num-fragments-in-flight:63
> impala-server.num-open-hiveserver2-sessions:63
> impala-server.num-queries-registered:63
> impala.thrift-server.hiveserver2-frontend.connections-in-use:64
> {code}
>  






[jira] [Closed] (IMPALA-10570) FE tests get stuck and eventually time out during UBSAN build

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith closed IMPALA-10570.
--
Resolution: Cannot Reproduce

Seems to have become a non-issue in the intervening 2 years.

> FE tests get stuck and eventually time out during UBSAN build
> -
>
> Key: IMPALA-10570
> URL: https://issues.apache.org/jira/browse/IMPALA-10570
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0
>Reporter: Laszlo Gaal
>Priority: Blocker
>  Labels: broken-build
>
> During UBSAN builds on private infrastructure (using CentOS 7.4 as the OS 
> platform) FE tests get stuck, then eventually get killed by the timeout 
> mechanism in buildall.sh. Unfortunately output buffering makes the Jenkins 
> console log slightly confusing, so it is not easy to tell which exact FE test 
> is getting stuck: the timeout and build shutdown sequences seem to be 
> artificially inserted into the middle of the FE build result summary.
> Representative log section:
> {code}
> 01:44:28.496 [INFO] Tests run: 99, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.305 s - in org.apache.impala.analysis.ParserTest
> 01:44:28.496 [INFO] Running org.apache.impala.analysis.ToSqlTest
> 01:44:28.496 [INFO] Tests run: 41, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.88 s - in org.apache.impala.analysis.ToSqlTest
> 01:44:28.496 [INFO] Running org.apache.impala.analysis.Ana
> 01:44:28.496 
> 20:55:34.940  run-all-tests.sh TIMED OUT! 
> 20:55:34.943 
> 20:55:34.943 
> 20:55:34.943  Generating backtrace of impalad with process id: 15971 to 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/logs/timeout_stacktrace/impalad_15971_20210308-093840.txt
>  
> [... lots of debug and stack trace output elided for brevity's sake]
> [ complete log section for build shutdown elided.]
> 20:57:13.167 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/common/thrift/MetricDefs.thrift
>  created.
> 20:57:13.167 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/impala_schema.mdl 
> created.
> 20:57:13.167 + pg_dump -U hiveuser 
> HMS_data_jenkins_workspace_impala_cdpd_master_core_ubsan_re_cdp
> 20:57:13.167 + exit 1
> 20:57:13.502 Process leaked file descriptors. See 
> https://jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors 
> for more information
> 20:57:23.505 Build step 'Execute shell' marked build as failure
> 20:57:23.576 lyzeStmtsTest
> 20:57:27.934 [INFO] Tests run: 67, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 1.999 s - in org.apache.impala.analysis.AnalyzeStmtsTest
> 20:57:27.934 [INFO] Running org.apache.impala.analysis.ExprRewriteRulesTest
> 20:57:27.934 [INFO] Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.121 s - in org.apache.impala.analysis.ExprRewriteRulesTest
> [ rest of the FE test result summary follows.]
> {code}






[jira] [Updated] (IMPALA-10570) FE tests get stuck and eventually time out during UBSAN build

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10570:
---
Target Version:   (was: Impala 4.3.0)

> FE tests get stuck and eventually time out during UBSAN build
> -
>
> Key: IMPALA-10570
> URL: https://issues.apache.org/jira/browse/IMPALA-10570
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0
>Reporter: Laszlo Gaal
>Priority: Blocker
>  Labels: broken-build
>
> During UBSAN builds on private infrastructure (using CentOS 7.4 as the OS 
> platform) FE tests get stuck, then eventually get killed by the timeout 
> mechanism in buildall.sh. Unfortunately output buffering makes the Jenkins 
> console log slightly confusing, so it is not easy to tell which exact FE test 
> is getting stuck: the timeout and build shutdown sequences seem to be 
> artificially inserted into the middle of the FE build result summary.
> Representative log section:
> {code}
> 01:44:28.496 [INFO] Tests run: 99, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.305 s - in org.apache.impala.analysis.ParserTest
> 01:44:28.496 [INFO] Running org.apache.impala.analysis.ToSqlTest
> 01:44:28.496 [INFO] Tests run: 41, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.88 s - in org.apache.impala.analysis.ToSqlTest
> 01:44:28.496 [INFO] Running org.apache.impala.analysis.Ana
> 01:44:28.496 
> 20:55:34.940  run-all-tests.sh TIMED OUT! 
> 20:55:34.943 
> 20:55:34.943 
> 20:55:34.943  Generating backtrace of impalad with process id: 15971 to 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/logs/timeout_stacktrace/impalad_15971_20210308-093840.txt
>  
> [... lots of debug and stack trace output elided for brevity's sake]
> [ complete log section for build shutdown elided.]
> 20:57:13.167 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/common/thrift/MetricDefs.thrift
>  created.
> 20:57:13.167 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/impala_schema.mdl 
> created.
> 20:57:13.167 + pg_dump -U hiveuser 
> HMS_data_jenkins_workspace_impala_cdpd_master_core_ubsan_re_cdp
> 20:57:13.167 + exit 1
> 20:57:13.502 Process leaked file descriptors. See 
> https://jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors 
> for more information
> 20:57:23.505 Build step 'Execute shell' marked build as failure
> 20:57:23.576 lyzeStmtsTest
> 20:57:27.934 [INFO] Tests run: 67, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 1.999 s - in org.apache.impala.analysis.AnalyzeStmtsTest
> 20:57:27.934 [INFO] Running org.apache.impala.analysis.ExprRewriteRulesTest
> 20:57:27.934 [INFO] Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time 
> elapsed: 0.121 s - in org.apache.impala.analysis.ExprRewriteRulesTest
> [ rest of the FE test result summary follows.]
> {code}






[jira] [Updated] (IMPALA-10575) Expired sessions not closed in Impala

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10575:
---
Priority: Critical  (was: Blocker)

> Expired sessions not closed in Impala
> -
>
> Key: IMPALA-10575
> URL: https://issues.apache.org/jira/browse/IMPALA-10575
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Jimi
>Priority: Critical
>  Labels: impala
> Attachments: image-2021-03-10-00-17-41-487.png
>
>
> jdbc query option:
> {code:java}
> jdbc:impala://ip:port/cdp;idle_session_timeout=10;QUERY_TIMEOUT_S=10
> {code}
> but the expired session is not closed, like this:
> !image-2021-03-10-00-17-41-487.png|width=1367,height=199!
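The behavior the report expects can be sketched as a periodic idle-session sweep. This is a hypothetical illustration with invented names (`expire_idle_sessions`), not Impala's implementation:

```python
import time

# Hypothetical sketch of idle-session expiry: a periodic sweep closes
# sessions whose idle time exceeds idle_session_timeout, freeing their
# connections. Not Impala's actual session-expiration code.
def expire_idle_sessions(sessions, idle_session_timeout_s, now=None):
    """Partition {session_id: last_active_ts} into (live, expired)."""
    now = time.time() if now is None else now
    live, expired = {}, []
    for sid, last_active in sessions.items():
        if now - last_active > idle_session_timeout_s:
            expired.append(sid)   # should be closed, releasing its connection
        else:
            live[sid] = last_active
    return live, expired

# s1 has been idle for 100s, s2 for 5s, with a 10s timeout.
sessions = {"s1": 100.0, "s2": 195.0}
live, expired = expire_idle_sessions(sessions, idle_session_timeout_s=10, now=200.0)
```

The screenshot above suggests sessions past `idle_session_timeout` were marked expired but never actually closed.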






[jira] [Updated] (IMPALA-10575) Expired sessions not closed in Impala

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-10575:
---
Target Version:   (was: Impala 4.3.0)

> Expired sessions not closed in Impala
> -
>
> Key: IMPALA-10575
> URL: https://issues.apache.org/jira/browse/IMPALA-10575
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Jimi
>Priority: Critical
>  Labels: impala
> Attachments: image-2021-03-10-00-17-41-487.png
>
>
> jdbc query option:
> {code:java}
> jdbc:impala://ip:port/cdp;idle_session_timeout=10;QUERY_TIMEOUT_S=10
> {code}
> but the expired session is not closed, like this:
> !image-2021-03-10-00-17-41-487.png|width=1367,height=199!






[jira] [Commented] (IMPALA-10966) query_test.test_scanners.TestIceberg.test_iceberg_query multiple failures in an ASAN run

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762883#comment-17762883
 ] 

Michael Smith commented on IMPALA-10966:


Preparing for a release. [~tmate] do you think this is still an issue?

> query_test.test_scanners.TestIceberg.test_iceberg_query multiple failures in 
> an ASAN run
> 
>
> Key: IMPALA-10966
> URL: https://issues.apache.org/jira/browse/IMPALA-10966
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Laszlo Gaal
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: broken-build, iceberg
>
> The actual failures look pretty similar.
> Pattern 1:
> {code}
> query_test/test_scanners.py:357: in test_iceberg_query 
> self.run_test_case('QueryTest/iceberg-query', vector) 
> common/impala_test_suite.py:713: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:549: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:469: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc','460B',''
>  E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/00014-14-dc56d2c8-e285-428d-b81e-f3d07ec53c12-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/00014-14-dc56d2c8-e285-428d-b81e-f3d07ec53c12-0.orc','460B',''
> [. matching result lines elided.]
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text',regex:.*,''
>  != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v3.metadata.json','2.21KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v4.metadata.json','2.44KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v5.metadata.json','2.66KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text','1B',''
>  
> E Number of rows returned (expected vs actual): 25 != 28
> {code}
> Pattern 2:
> {code}
> query_test/test_scanners.py:357: in test_iceberg_query
> self.run_test_case('QueryTest/iceberg-query', vector)
> common/impala_test_suite.py:713: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:549: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:469: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc','460B',''
> [.matching result lines elided...]
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text',regex:.*,''
>  != 
> 

[jira] [Assigned] (IMPALA-10966) query_test.test_scanners.TestIceberg.test_iceberg_query multiple failures in an ASAN run

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reassigned IMPALA-10966:
--

Assignee: Tamas Mate

> query_test.test_scanners.TestIceberg.test_iceberg_query multiple failures in 
> an ASAN run
> 
>
> Key: IMPALA-10966
> URL: https://issues.apache.org/jira/browse/IMPALA-10966
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Laszlo Gaal
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: broken-build, iceberg
>
> The actual failures look pretty similar.
> Pattern 1:
> {code}
> query_test/test_scanners.py:357: in test_iceberg_query 
> self.run_test_case('QueryTest/iceberg-query', vector) 
> common/impala_test_suite.py:713: in run_test_case 
> self.__verify_results_and_errors(vector, test_section, result, use_db) 
> common/impala_test_suite.py:549: in __verify_results_and_errors 
> replace_filenames_with_placeholder) common/test_result_verifier.py:469: in 
> verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
> common/test_result_verifier.py:278: in verify_query_result_is_equal 
> assert expected_results == actual_results E   assert Comparing 
> QueryTestResults (expected vs actual): E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc','460B',''
>  E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/00014-14-dc56d2c8-e285-428d-b81e-f3d07ec53c12-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/00014-14-dc56d2c8-e285-428d-b81e-f3d07ec53c12-0.orc','460B',''
> [. matching result lines elided.]
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text',regex:.*,''
>  != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v3.metadata.json','2.21KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v4.metadata.json','2.44KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/v5.metadata.json','2.66KB',''
>  
> E None != 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text','1B',''
>  
> E Number of rows returned (expected vs actual): 25 != 28
> {code}
> Pattern 2:
> {code}
> query_test/test_scanners.py:357: in test_iceberg_query
> self.run_test_case('QueryTest/iceberg-query', vector)
> common/impala_test_suite.py:713: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:549: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:469: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc',regex:.*,''
>  == 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/data/action=click/4-4-0982a5d3-48c0-4dd0-ab87-d24190894251-0.orc','460B',''
> [.matching result lines elided...]
> E 
> 'hdfs://localhost:20500/test-warehouse/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/version-hint.text',regex:.*,''
>  != 
> 

[jira] [Assigned] (IMPALA-11284) INSERT query with concat operator fails with 'Function not set in thrift node' error

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reassigned IMPALA-11284:
--

Assignee: Riza Suminto  (was: Abhishek Rawat)

> INSERT query with concat operator fails with 'Function not set in thrift 
> node' error
> 
>
> Key: IMPALA-11284
> URL: https://issues.apache.org/jira/browse/IMPALA-11284
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.1.0
>Reporter: Abhishek Rawat
>Assignee: Riza Suminto
>Priority: Critical
>
> *Steps to Reproduce:*
> {code:java}
> DROP TABLE t2;
> CREATE TABLE t2(c0 BOOLEAN, c1 STRING) STORED AS ICEBERG; 
> INSERT INTO t2(c0, c1) VALUES ( TRUE, ( 'abc' ||('927160245' || 'Q') ) );
> Error: Function not set in thrift node{code}
> Looks like a regression introduced by IMPALA-6590.
> fn_ was previously serialized during rewrite in:
> {code:java}
> treeToThriftHelper:FunctionCallExpr(Expr).treeToThriftHelper(TExpr) line: 866
> FunctionCallExpr(Expr).treeToThrift() line: 844 
> FeSupport.EvalExprWithoutRowBounded(Expr, TQueryCtx, int) line: 188
> LiteralExpr.createBounded(Expr, TQueryCtx, int) line: 210
> FoldConstantsRule.apply(Expr, Analyzer) line: 66
> ExprRewriter.applyRuleBottomUp(Expr, ExprRewriteRule, Analyzer) line: 85
> ExprRewriter.applyRuleRepeatedly(Expr, ExprRewriteRule, Analyzer) line: 71
> ExprRewriter.rewrite(Expr, Analyzer) line: 55   
> SelectList.rewriteExprs(ExprRewriter, Analyzer) line: 100
> SelectStmt.rewriteExprs(ExprRewriter) line: 1189
> ValuesStmt(SetOperationStmt).rewriteExprs(ExprRewriter) line: 467
> InsertStmt.rewriteExprs(ExprRewriter) line: 1119
> AnalysisContext.analyze(StmtMetadataLoader$StmtTableCache, 
> AuthorizationContext) line: 537       {code}
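The stack above runs during constant folding, where an all-literal expression tree is collapsed into a single literal before execution. A plain-Python sketch of that rewrite (not Impala's `FoldConstantsRule`, which also has to serialize the resolved function to thrift, which is where the bug bites):

```python
# Sketch of constant folding: a concat expression whose leaves are all
# string literals is evaluated at analysis time and replaced by one
# literal. Illustrative only; Impala does this via expression rewrite
# rules plus backend evaluation over thrift.
def fold(expr):
    """expr is either a str literal or ('concat', left, right)."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    assert op == "concat"
    return fold(left) + fold(right)   # both sides fold to literals first

# 'abc' || ('927160245' || 'Q') from the reproduction above
tree = ("concat", "abc", ("concat", "927160245", "Q"))
folded = fold(tree)
```

The reported regression is that the function pointer (`fn_`) was no longer set on the expression when it was serialized for this backend evaluation, producing "Function not set in thrift node".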






[jira] [Commented] (IMPALA-11725) Query result incorrect when querying and filtering NULL values of sub-query

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762882#comment-17762882
 ] 

Michael Smith commented on IMPALA-11725:


Is this a regression introduced in 4.1.0?

> Query result incorrect when querying and filtering NULL values of sub-query 
> 
>
> Key: IMPALA-11725
> URL: https://issues.apache.org/jira/browse/IMPALA-11725
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec, Frontend
>Affects Versions: Impala 4.1.0
>Reporter: Yuchen Fan
>Assignee: Yuchen Fan
>Priority: Critical
>
> We found that Impala can't filter NULL values from a sub-query. For example, 
> prepare two test tables err_tbl1 and err_tbl2 (id INT, dt STRING):
> {noformat}
> +----+------------+
> | id | dt         |
> +----+------------+
> | 14 | 2022-11-13 |
> | 15 | 2022-11-13 |
> | 13 | 2022-11-13 |
> +----+------------+
> +----+------------+
> | id | dt         |
> +----+------------+
> | 14 | 2022-11-13 |
> | 16 | 2022-11-13 |
> | 13 | 2022-11-13 |
> +----+------------+
> {noformat}
> And submit query below:
> {code:java}
> SELECT *
> FROM (
>     SELECT aid, bid, COUNT(*) AS c
>     FROM (
>         SELECT id AS aid
>         FROM err_tbl1
>         WHERE dt = '2022-11-13'
>     ) a
>         FULL JOIN (
>             SELECT id AS bid
>             FROM err_tbl2
>             WHERE dt = '2022-11-13'
>         ) b
>         ON a.aid = b.bid
>     GROUP BY aid, bid
> ) t1
> WHERE aid = bid;{code}
> The output includes 4 rows:
> {noformat}
> +------+------+---+
> | aid  | bid  | c |
> +------+------+---+
> | 13   | 13   | 1 |
> | 14   | 14   | 1 |
> | NULL | 15   | 1 |
> | 16   | NULL | 1 |
> +------+------+---+
> {noformat}
> Obviously, the condition 'aid=bid' is not being applied: NULL values should 
> be filtered out. But with the condition 'aid!=bid' we get an empty result 
> set, which means '!=' filters out NULL values while '=' does not. What's 
> more, if we create a table as SELECT * FROM the sub-query and then execute 
> 'SELECT * FROM sub_table WHERE aid=bid', the result is correct. If we wrap 
> 'aid=bid' with 'trim()', as in 'trim(cast(aid as string))=trim(cast(bid as 
> string))', the result is also correct. In Spark, the result of the same 
> query does not contain NULL values.
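Under SQL three-valued logic, `aid = bid` evaluates to UNKNOWN when either side is NULL, and a WHERE clause keeps only rows where the predicate is TRUE. A plain-Python sketch (not Impala code) of the result the outer filter should have produced from the FULL JOIN output quoted in the report:

```python
# SQL three-valued logic illustration: '=' over a NULL operand yields
# UNKNOWN (modeled as None), and WHERE drops UNKNOWN rows -- so only the
# two matched rows should survive the outer 'WHERE aid = bid' filter.
rows = [  # (aid, bid, c): the FULL JOIN output from the bug report
    (13, 13, 1),
    (14, 14, 1),
    (None, 15, 1),
    (16, None, 1),
]

def sql_eq(a, b):
    """SQL '=': returns None (UNKNOWN) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

# WHERE keeps a row only when the predicate is strictly TRUE.
filtered = [r for r in rows if sql_eq(r[0], r[1]) is True]
```

The bug is that Impala returned all four rows, i.e. the `aid = bid` predicate was effectively not applied to the aggregated sub-query output.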






[jira] [Work started] (IMPALA-11284) INSERT query with concat operator fails with 'Function not set in thrift node' error

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11284 started by Michael Smith.
--
> INSERT query with concat operator fails with 'Function not set in thrift 
> node' error
> 
>
> Key: IMPALA-11284
> URL: https://issues.apache.org/jira/browse/IMPALA-11284
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.1.0
>Reporter: Abhishek Rawat
>Assignee: Michael Smith
>Priority: Critical
>
> *Steps to Reproduce:*
> {code:java}
> DROP TABLE t2;
> CREATE TABLE t2(c0 BOOLEAN, c1 STRING) STORED AS ICEBERG; 
> INSERT INTO t2(c0, c1) VALUES ( TRUE, ( 'abc' ||('927160245' || 'Q') ) );
> Error: Function not set in thrift node{code}
> Looks like a regression introduced by IMPALA-6590.
> fn_ was previously serialized during rewrite in:
> {code:java}
> treeToThriftHelper:FunctionCallExpr(Expr).treeToThriftHelper(TExpr) line: 866
> FunctionCallExpr(Expr).treeToThrift() line: 844 
> FeSupport.EvalExprWithoutRowBounded(Expr, TQueryCtx, int) line: 188
> LiteralExpr.createBounded(Expr, TQueryCtx, int) line: 210
> FoldConstantsRule.apply(Expr, Analyzer) line: 66
> ExprRewriter.applyRuleBottomUp(Expr, ExprRewriteRule, Analyzer) line: 85
> ExprRewriter.applyRuleRepeatedly(Expr, ExprRewriteRule, Analyzer) line: 71
> ExprRewriter.rewrite(Expr, Analyzer) line: 55   
> SelectList.rewriteExprs(ExprRewriter, Analyzer) line: 100
> SelectStmt.rewriteExprs(ExprRewriter) line: 1189
> ValuesStmt(SetOperationStmt).rewriteExprs(ExprRewriter) line: 467
> InsertStmt.rewriteExprs(ExprRewriter) line: 1119
> AnalysisContext.analyze(StmtMetadataLoader$StmtTableCache, 
> AuthorizationContext) line: 537       {code}






[jira] [Updated] (IMPALA-11542) TestFailpoints::test_failpoints crash in ARM build

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11542:
---
Target Version:   (was: Impala 4.1.2)

> TestFailpoints::test_failpoints crash in ARM build
> --
>
> Key: IMPALA-11542
> URL: https://issues.apache.org/jira/browse/IMPALA-11542
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Quanlong Huang
>Assignee: Michael Smith
>Priority: Critical
>  Labels: arm
>
> Saw the crash in 
> [https://jenkins.impala.io/job/ubuntu-16.04-from-scratch-ARM/13]
> In the ERROR log:
> {noformat}
> Picked up JAVA_TOOL_OPTIONS: 
> -agentlib:jdwp=transport=dt_socket,address=3,server=y,suspend=n  
> impalad: 
> /home/ubuntu/native-toolchain/source/llvm/llvm-5.0.1-asserts.src-p3/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:400:
>  void llvm::RuntimeDyldELF::resolveAArch64Relocation(const 
> llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion 
> `isInt<33>(Result) && "overflow check failed for relocation"' failed.
> Minidump in thread [20013]exec-finstance 
> (finst:1e4a0f56622f2a15:51c297090005) running query 
> 1e4a0f56622f2a15:51c29709, fragment instance 
> 1e4a0f56622f2a15:51c297090005
> Wrote minidump to 
> /home/ubuntu/Impala/logs/ee_tests/minidumps/impalad/de4830d8-009d-47f4-f14bb68a-f0d8cd4c.dmp
>  {noformat}
> In the INFO log:
> {noformat}
> I0830 06:54:49.173234 11329 impala-beeswax-server.cc:516] query: Query {
>   01: query (string) = "SELECT STRAIGHT_JOIN *\n   FROM alltypes t1\n 
>  JOIN /*+broadcast*/ alltypesagg t2 ON t1.id = t2.id\n
>WHERE t2.int_col < 1000",
>   03: configuration (list) = list[10] {
> [0] = "CLIENT_IDENTIFIE[...](273)",
> [1] = "TEST_REPLAN=1",
> [2] = "DISABLE_CODEGEN=False",
> [3] = "BATCH_SIZE=0",
> [4] = "NUM_NODES=0",
> [5] = "DISABLE_CODEGEN_ROWS_THRESHOLD=0",
> [6] = "MT_DOP=4",
> [7] = "ABORT_ON_ERROR=1",
> [8] = 
> "DEBUG_ACTION=4:GETNEXT:MEM_LIMIT_EXCEEDED|COORD_BEFORE_EXEC_RPC:JITTER@100@0.3",
> [9] = "EXEC_SINGLE_NODE_ROWS_THRESHOLD=0",
>   },
>   04: hadoop_user (string) = "ubuntu",
> }
> ...
>   74: client_identifier (string) = 
> "failure/test_failpoints.py::TestFailpoints::()::test_failpoints[protocol:beeswax|table_format:seq/snap/block|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_sing",
> ...
> I0830 06:54:49.173739 11329 Frontend.java:1877] 
> 1e4a0f56622f2a15:51c29709] Analyzing query: SELECT STRAIGHT_JOIN *
>FROM alltypes t1
>   JOIN /*+broadcast*/ alltypesagg t2 ON t1.id = t2.id
>WHERE t2.int_col < 1000 db: functional_seq_snap {noformat}
> The client_identifier shows it's TestFailpoints::test_failpoints.






[jira] [Updated] (IMPALA-4741) ORDER BY behavior with UNION is incorrect

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-4741:
--
Priority: Major  (was: Critical)

> ORDER BY behavior with UNION is incorrect
> -
>
> Key: IMPALA-4741
> URL: https://issues.apache.org/jira/browse/IMPALA-4741
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Greg Rahn
>Priority: Major
>  Labels: correctness, incompatibility, ramp-up, sql-language, 
> tpc-ds
> Attachments: query36a.sql, query49.sql
>
>
> When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY 
> clause must be specified at the end of the statement and the results of the 
> combined queries are sorted.  ORDER BY clauses are not allowed in individual 
> branches unless the branch is enclosed by parentheses.
> There are two bugs currently:
> # An ORDER BY is allowed in a branch of a UNION that is not enclosed in 
> parentheses
> # The final ORDER BY of a UNION is attached to the nearest branch when it 
> should be sorting the combined results of the UNION(s)
> For example, this is not valid syntax but is allowed in Impala:
> {code}
> select * from t1 order by 1
> union all
> select * from t2
> {code}
> And for queries like this, the ORDER BY should order the unioned result, not 
> just the nearest branch which is the current behavior.
> {code}
> select * from t1
> union all
> select * from t2
> order by 1
> {code}
> If one wants ordering within a branch, the query block must be enclosed in 
> parentheses, like so:
> {code}
> (select * from t1 order by 1)
> union all
> (select * from t2 order by 2)
> {code}
> Here is an example where incorrect results are returned.
> Impala
> {code}
> [impalad:21000] > select r_regionkey, r_name from region union all select 
> r_regionkey, r_name from region order by 1 limit 2;
> +-+-+
> | r_regionkey | r_name  |
> +-+-+
> | 0   | AFRICA  |
> | 1   | AMERICA |
> | 2   | ASIA|
> | 3   | EUROPE  |
> | 4   | MIDDLE EAST |
> | 0   | AFRICA  |
> | 1   | AMERICA |
> +-+-+
> Fetched 7 row(s) in 0.12s
> {code}
> PostgreSQL
> {code}
> tpch=# select r_regionkey, r_name from region union all select r_regionkey, 
> r_name from region order by 1 limit 2;
>  r_regionkey |  r_name
> -+---
>0 | AFRICA
>0 | AFRICA
> (2 rows) 
> {code}
> see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5






[jira] [Updated] (IMPALA-4741) ORDER BY behavior with UNION is incorrect

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-4741:
--
Target Version:   (was: Impala 4.3.0)

> ORDER BY behavior with UNION is incorrect
> -
>
> Key: IMPALA-4741
> URL: https://issues.apache.org/jira/browse/IMPALA-4741
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Greg Rahn
>Priority: Critical
>  Labels: correctness, incompatibility, ramp-up, sql-language, 
> tpc-ds
> Attachments: query36a.sql, query49.sql
>
>
> When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY 
> clause must be specified at the end of the statement and the results of the 
> combined queries are sorted.  ORDER BY clauses are not allowed in individual 
> branches unless the branch is enclosed by parentheses.
> There are two bugs currently:
> # An ORDER BY is allowed in a branch of a UNION that is not enclosed in 
> parentheses
> # The final ORDER BY of a UNION is attached to the nearest branch when it 
> should be sorting the combined results of the UNION(s)
> For example, this is not valid syntax but is allowed in Impala:
> {code}
> select * from t1 order by 1
> union all
> select * from t2
> {code}
> And for queries like this, the ORDER BY should order the unioned result, not 
> just the nearest branch which is the current behavior.
> {code}
> select * from t1
> union all
> select * from t2
> order by 1
> {code}
> If one wants ordering within a branch, the query block must be enclosed in 
> parentheses, like so:
> {code}
> (select * from t1 order by 1)
> union all
> (select * from t2 order by 2)
> {code}
> Here is an example where incorrect results are returned.
> Impala
> {code}
> [impalad:21000] > select r_regionkey, r_name from region union all select 
> r_regionkey, r_name from region order by 1 limit 2;
> +-+-+
> | r_regionkey | r_name  |
> +-+-+
> | 0   | AFRICA  |
> | 1   | AMERICA |
> | 2   | ASIA|
> | 3   | EUROPE  |
> | 4   | MIDDLE EAST |
> | 0   | AFRICA  |
> | 1   | AMERICA |
> +-+-+
> Fetched 7 row(s) in 0.12s
> {code}
> PostgreSQL
> {code}
> tpch=# select r_regionkey, r_name from region union all select r_regionkey, 
> r_name from region order by 1 limit 2;
>  r_regionkey |  r_name
> -+---
>0 | AFRICA
>0 | AFRICA
> (2 rows) 
> {code}
> see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5






[jira] [Created] (IMPALA-12432) Keep LdapKerberosImpalaShellTest* compatible with older guava versions

2023-09-07 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-12432:
--

 Summary: Keep LdapKerberosImpalaShellTest* compatible with older 
guava versions
 Key: IMPALA-12432
 URL: https://issues.apache.org/jira/browse/IMPALA-12432
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 4.3.0
Reporter: Joe McDonnell


LdapKerberosImpalaShellTestBase.java and LdapKerberosImpalaShellTest.java use 
the ImmutableMap.of function with 8+ pairs. Older versions of Guava such as 
28.1-jre do not have ImmutableMap.of() overloads for that number of arguments.

Since we often want to use the Guava version that the underlying Hadoop/Hive 
use, it can be useful for compatibility to be able to build against older 
Guava (like 28.1-jre).

Most other code is fine, so if we switch these locations to use 
ImmutableMap.builder(), then the whole codebase can compile 
with the older Guava (while remaining forward compatible as well).
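The arity limitation described above also exists in the JDK's own immutable map factories, which makes for a self-contained illustration: java.util.Map.of() only ships fixed-arity overloads (up to 10 pairs), while the entries-based factory accepts any number of pairs, just as Guava's ImmutableMap.builder() does. A minimal sketch of the pattern (using JDK classes as a stand-in for Guava's, since the exact Guava overloads vary by version):

```java
import java.util.Map;
import static java.util.Map.entry;

public class MapArityDemo {
    public static void main(String[] args) {
        // Fixed-arity factory: only compiles for the overloads the library ships.
        Map<String, String> small = Map.of("k1", "v1", "k2", "v2");

        // Entries/builder-style factory: works for any number of pairs, so it
        // stays source-compatible with every library version that has it at all.
        Map<String, String> large = Map.ofEntries(
            entry("k1", "v1"), entry("k2", "v2"), entry("k3", "v3"),
            entry("k4", "v4"), entry("k5", "v5"), entry("k6", "v6"),
            entry("k7", "v7"), entry("k8", "v8"), entry("k9", "v9"));

        System.out.println(small.size() + " " + large.size()); // prints "2 9"
    }
}
```

The same trade-off motivates the proposed switch: a builder call is slightly more verbose, but it removes the dependency on which fixed-arity overloads a particular Guava release happens to provide.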






[jira] [Commented] (IMPALA-4052) CREATE TABLE LIKE for Kudu tables

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762881#comment-17762881
 ] 

Michael Smith commented on IMPALA-4052:
---

I'd like to see documentation added for this. If it's not addressed by the 
4.3.0 release, I'll file a separate ticket to do that later.

> CREATE TABLE LIKE for Kudu tables
> -
>
> Key: IMPALA-4052
> URL: https://issues.apache.org/jira/browse/IMPALA-4052
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.7.0
>Reporter: Dimitris Tsirogiannis
>Assignee: gaoxiaoqing
>Priority: Major
>  Labels: kudu
> Fix For: Impala 4.3.0
>
>
> The semantics of CREATE TABLE LIKE when Kudu tables are involved, either as a 
> source or as a target table, are not well specified or properly implemented; 
> in some cases a misleading ImpalaRuntimeException is thrown. 
> Actions: 
> # Decide whether CREATE TABLE LIKE will be supported for Kudu tables. 
> # Implement whatever approach is decided
> # Properly document both in Impala and Kudu docs the supported operations






[jira] [Commented] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762880#comment-17762880
 ] 

Michael Smith commented on IMPALA-12402:


For lots of tables, do you mean to increase concurrency or lower it? I'd guess 
lower it because the cost of lock contention exceeds the benefits of 
concurrency?
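For context on this question: Guava's concurrencyLevel works by partitioning the cache into that many independent segments, each guarded by its own lock, so threads only contend when their keys hash to the same segment. A stdlib-only sketch of that lock-striping idea (the StripedCache class here is illustrative, not Impala or Guava code):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative lock striping: more segments means fewer threads share any
// one lock, so concurrent put()/get() calls contend less often.
public class StripedCache<K, V> {
    private final Map<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public StripedCache(int concurrencyLevel) {
        segments = (Map<K, V>[]) new Map[concurrencyLevel];
        for (int i = 0; i < concurrencyLevel; i++) {
            segments[i] = new HashMap<>();
        }
    }

    // Pick a segment from the key's hash, analogous to Guava's segment lookup.
    private Map<K, V> segmentFor(K key) {
        int h = key.hashCode();
        return segments[(h & 0x7fffffff) % segments.length];
    }

    public void put(K key, V value) {
        Map<K, V> seg = segmentFor(key);
        synchronized (seg) { // only threads hashing to this segment block here
            seg.put(key, value);
        }
    }

    public V get(K key) {
        Map<K, V> seg = segmentFor(key);
        synchronized (seg) {
            return seg.get(key);
        }
    }

    public static void main(String[] args) {
        StripedCache<String, Integer> cache = new StripedCache<>(128);
        cache.put("tbl_1", 1);
        cache.put("tbl_2", 2);
        System.out.println(cache.get("tbl_1")); // prints 1
    }
}
```

Under this model, raising the segment count helps a write-heavy load (such as a cold cache being populated at startup) because writers spread across more locks, at the cost of some per-segment memory overhead.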

> Add some configurations for CatalogdMetaProvider's cache_
> -
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than 
> 10 tables), restarting the impalad causes CatalogdMetaProvider's local 
> cache_ to go through a loading process.
> As we know, Google's Guava cache's concurrencyLevel is set to 4 by default.
> With many tables, the loading process takes longer, which increases the 
> probability of lock contention; see 
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
>  
> So we propose adding some configurations here, starting with the cache's 
> concurrency level.






[jira] [Updated] (IMPALA-11400) Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet scans

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11400:
---
Target Version: Impala 4.3.0

> Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet 
> scans
> --
>
> Key: IMPALA-11400
> URL: https://issues.apache.org/jira/browse/IMPALA-11400
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Sameera Wijerathne
>Priority: Major
>  Labels: performance
> Attachments: 0.JPG, 1.JPG, 2-1.jpeg, 2.JPG, 2.jpeg, 3.JPG, 4.JPG, 
> 5.JPG, Impala_1.png, Impala_2.png, Kudu_1.png, Kudu_2.png, WhatsApp Image 
> 2022-06-07 at 10.39.27 PM.jpeg
>
>
> This issue was observed when Impala queries large datasets residing in Kudu. 
> Even when a single ImpalaD scans multiple Kudu tablets, data retrieval is 
> slow even though ImpalaD performs parallel scans. The reason is that ImpalaD 
> uses only a single Kudu client for multiple scans, while 
> KuduScanner::NextBatch runs on a single thread, so its RPC reactor thread 
> utilizes at most a single core and bottlenecks all parallel scans.
> This behaviour means Impala clusters that scan Kudu cannot be vertically 
> scaled to the full performance/cores of a node.
> Please refer to the screenshots from the Kudu Slack channel for more information.
>  
> !2-1.jpeg|width=717,height=961!






[jira] [Resolved] (IMPALA-10859) Impala 4.0 should build on top of kudu 1.15

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-10859.

Resolution: Invalid

No longer relevant. 4.3.0 will use Kudu 1.17.

> Impala 4.0 should build on top of kudu 1.15
> ---
>
> Key: IMPALA-10859
> URL: https://issues.apache.org/jira/browse/IMPALA-10859
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>
> The kudu version (commit id) that Impala 4.0.0 depends on is b5e7362e69, 
> which is between kudu 1.14 and 1.15. To avoid issues like KUDU-3286, we 
> should bump Impala 4.0's kudu version to 1.15.






[jira] [Updated] (IMPALA-11400) Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet scans

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11400:
---
Fix Version/s: (was: Impala 4.3.0)

> Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet 
> scans
> --
>
> Key: IMPALA-11400
> URL: https://issues.apache.org/jira/browse/IMPALA-11400
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Sameera Wijerathne
>Priority: Major
>  Labels: performance
> Attachments: 0.JPG, 1.JPG, 2-1.jpeg, 2.JPG, 2.jpeg, 3.JPG, 4.JPG, 
> 5.JPG, Impala_1.png, Impala_2.png, Kudu_1.png, Kudu_2.png, WhatsApp Image 
> 2022-06-07 at 10.39.27 PM.jpeg
>
>
> This issue was observed when Impala queries large datasets residing in Kudu. 
> Even when a single ImpalaD scans multiple Kudu tablets, data retrieval is 
> slow even though ImpalaD performs parallel scans. The reason is that ImpalaD 
> uses only a single Kudu client for multiple scans, while 
> KuduScanner::NextBatch runs on a single thread, so its RPC reactor thread 
> utilizes at most a single core and bottlenecks all parallel scans.
> This behaviour means Impala clusters that scan Kudu cannot be vertically 
> scaled to the full performance/cores of a node.
> Please refer to the screenshots from the Kudu Slack channel for more information.
>  
> !2-1.jpeg|width=717,height=961!






[jira] [Updated] (IMPALA-11400) Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet scans

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11400:
---
Target Version:   (was: Impala 4.1.0)

> Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet 
> scans
> --
>
> Key: IMPALA-11400
> URL: https://issues.apache.org/jira/browse/IMPALA-11400
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Sameera Wijerathne
>Priority: Major
>  Labels: performance
> Fix For: Impala 4.3.0
>
> Attachments: 0.JPG, 1.JPG, 2-1.jpeg, 2.JPG, 2.jpeg, 3.JPG, 4.JPG, 
> 5.JPG, Impala_1.png, Impala_2.png, Kudu_1.png, Kudu_2.png, WhatsApp Image 
> 2022-06-07 at 10.39.27 PM.jpeg
>
>
> This issue was observed when Impala queries large datasets residing in Kudu. 
> Even when a single ImpalaD scans multiple Kudu tablets, data retrieval is 
> slow even though ImpalaD performs parallel scans. The reason is that ImpalaD 
> uses only a single Kudu client for multiple scans, while 
> KuduScanner::NextBatch runs on a single thread, so its RPC reactor thread 
> utilizes at most a single core and bottlenecks all parallel scans.
> This behaviour means Impala clusters that scan Kudu cannot be vertically 
> scaled to the full performance/cores of a node.
> Please refer to the screenshots from the Kudu Slack channel for more information.
>  
> !2-1.jpeg|width=717,height=961!






[jira] [Updated] (IMPALA-12280) Add storate_handler to Atlas lineage log

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-12280:
---
Fix Version/s: (was: Impala 4.2.0)

> Add storate_handler to Atlas lineage log
> 
>
> Key: IMPALA-12280
> URL: https://issues.apache.org/jira/browse/IMPALA-12280
> Project: IMPALA
>  Issue Type: New Feature
>  Components: fe
>Affects Versions: Impala 4.2.0
>Reporter: Tamas Mate
>Priority: Major
>
> Atlas lineage report should have a {{storage_handler}} property as well.






[jira] [Updated] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-12402:
---
Target Version:   (was: Impala 4.2.0)

> Add some configurations for CatalogdMetaProvider's cache_
> -
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than 
> 10 tables), restarting the impalad causes CatalogdMetaProvider's local 
> cache_ to go through a loading process.
> As we know, Google's Guava cache's concurrencyLevel is set to 4 by default.
> With many tables, the loading process takes longer, which increases the 
> probability of lock contention; see 
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
>  
> So we propose adding some configurations here, starting with the cache's 
> concurrency level.






[jira] (IMPALA-12274) Memory leak because of the local reference created by `NewObject` in class Catalog was not released

2023-09-07 Thread Michael Smith (Jira)


[ https://issues.apache.org/jira/browse/IMPALA-12274 ]


Michael Smith deleted comment on IMPALA-12274:


was (Author: JIRAUSER288956):
Could this have caused IMPALA-12273? Or do you believe that's another issue?

>  Memory leak because of  the local reference created by `NewObject` in class 
> Catalog was not released 
> --
>
> Key: IMPALA-12274
> URL: https://issues.apache.org/jira/browse/IMPALA-12274
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0, Impala 3.1.0, Impala 3.2.0, Impala 4.0.0, 
> Impala 3.3.0, Impala 3.4.0, Impala 4.1.0, Impala 4.2.0
>Reporter: zhangqianqiong
>Assignee: zhangqianqiong
>Priority: Minor
> Fix For: Impala 4.3.0
>
>
> In the constructor of the {{catalog}} class, after converting the 
> {{catalog}} object created by {{NewObject}} to a global reference, the local 
> reference was never released, resulting in a minor memory leak.






[jira] [Commented] (IMPALA-12274) Memory leak because of the local reference created by `NewObject` in class Catalog was not released

2023-09-07 Thread Michael Smith (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762877#comment-17762877
 ] 

Michael Smith commented on IMPALA-12274:


Could this have caused IMPALA-12273? Or do you believe that's another issue?

>  Memory leak because of  the local reference created by `NewObject` in class 
> Catalog was not released 
> --
>
> Key: IMPALA-12274
> URL: https://issues.apache.org/jira/browse/IMPALA-12274
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0, Impala 3.1.0, Impala 3.2.0, Impala 4.0.0, 
> Impala 3.3.0, Impala 3.4.0, Impala 4.1.0, Impala 4.2.0
>Reporter: zhangqianqiong
>Assignee: zhangqianqiong
>Priority: Minor
> Fix For: Impala 4.3.0
>
>
> In the constructor of the {{catalog}} class, after converting the 
> {{catalog}} object created by {{NewObject}} to a global reference, the local 
> reference was never released, resulting in a minor memory leak.






[jira] [Resolved] (IMPALA-12274) Memory leak because of the local reference created by `NewObject` in class Catalog was not released

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-12274.

Resolution: Fixed

>  Memory leak because of  the local reference created by `NewObject` in class 
> Catalog was not released 
> --
>
> Key: IMPALA-12274
> URL: https://issues.apache.org/jira/browse/IMPALA-12274
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.0, Impala 3.1.0, Impala 3.2.0, Impala 4.0.0, 
> Impala 3.3.0, Impala 3.4.0, Impala 4.1.0, Impala 4.2.0
>Reporter: zhangqianqiong
>Assignee: zhangqianqiong
>Priority: Minor
> Fix For: Impala 4.3.0
>
>
> In the constructor of the {{catalog}} class, after converting the 
> {{catalog}} object created by {{NewObject}} to a global reference, the local 
> reference was never released, resulting in a minor memory leak.






[jira] [Resolved] (IMPALA-11996) Iceberg Metadata querying executor change

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-11996.

Resolution: Fixed

> Iceberg Metadata querying executor change
> -
>
> Key: IMPALA-11996
> URL: https://issues.apache.org/jira/browse/IMPALA-11996
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>  Labels: impala-iceberg
> Fix For: Impala 4.3.0
>
>
> After the parser and planner changes are ready the executor should execute 
> the created plan.






[jira] [Work started] (IMPALA-12385) Enable Periodic metrics by default

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12385 started by Michael Smith.
--
> Enable Periodic metrics by default
> --
>
> Key: IMPALA-12385
> URL: https://issues.apache.org/jira/browse/IMPALA-12385
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Kurt Deschler
>Assignee: Michael Smith
>Priority: Major
>
> Periodic metrics currently require the user to set resource_trace_ratio and, 
> in many cases, to restart backends with different 
> periodic_counter_update_period_ms settings. Changing the defaults and tuning 
> the collection mechanisms will make metrics always available, and they can 
> be turned back off if not desired.






[jira] [Assigned] (IMPALA-12385) Enable Periodic metrics by default

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reassigned IMPALA-12385:
--

Assignee: Kurt Deschler

> Enable Periodic metrics by default
> --
>
> Key: IMPALA-12385
> URL: https://issues.apache.org/jira/browse/IMPALA-12385
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Kurt Deschler
>Assignee: Kurt Deschler
>Priority: Major
>
> Periodic metrics currently require the user to set resource_trace_ratio and, 
> in many cases, to restart backends with different 
> periodic_counter_update_period_ms settings. Changing the defaults and tuning 
> the collection mechanisms will make metrics always available, and they can 
> be turned back off if not desired.






[jira] [Updated] (IMPALA-11284) INSERT query with concat operator fails with 'Function not set in thrift node' error

2023-09-07 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11284:
---
Target Version: Impala 4.3.0

> INSERT query with concat operator fails with 'Function not set in thrift 
> node' error
> 
>
> Key: IMPALA-11284
> URL: https://issues.apache.org/jira/browse/IMPALA-11284
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.1.0
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Critical
>
> *Steps to Reproduce:*
> {code:java}
> DROP TABLE t2;
> CREATE TABLE t2(c0 BOOLEAN, c1 STRING) STORED AS ICEBERG; 
> INSERT INTO t2(c0, c1) VALUES ( TRUE, ( 'abc' ||('927160245' || 'Q') ) );
> Error: Function not set in thrift node{code}
> Looks like a regression introduced by IMPALA-6590.
> fn_ was previously serialized during rewrite in:
> {code:java}
> treeToThriftHelper:FunctionCallExpr(Expr).treeToThriftHelper(TExpr) line: 866
> FunctionCallExpr(Expr).treeToThrift() line: 844 
> FeSupport.EvalExprWithoutRowBounded(Expr, TQueryCtx, int) line: 188
> LiteralExpr.createBounded(Expr, TQueryCtx, int) line: 210
> FoldConstantsRule.apply(Expr, Analyzer) line: 66
> ExprRewriter.applyRuleBottomUp(Expr, ExprRewriteRule, Analyzer) line: 85
> ExprRewriter.applyRuleRepeatedly(Expr, ExprRewriteRule, Analyzer) line: 71
> ExprRewriter.rewrite(Expr, Analyzer) line: 55   
> SelectList.rewriteExprs(ExprRewriter, Analyzer) line: 100
> SelectStmt.rewriteExprs(ExprRewriter) line: 1189
> ValuesStmt(SetOperationStmt).rewriteExprs(ExprRewriter) line: 467
> InsertStmt.rewriteExprs(ExprRewriter) line: 1119
> AnalysisContext.analyze(StmtMetadataLoader$StmtTableCache, 
> AuthorizationContext) line: 537       {code}






[jira] [Resolved] (IMPALA-12413) Make Iceberg tables created by Trino compatible with Impala

2023-09-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-12413.

Fix Version/s: Impala 4.3.0
   Resolution: Fixed

> Make Iceberg tables created by Trino compatible with Impala
> ---
>
> Key: IMPALA-12413
> URL: https://issues.apache.org/jira/browse/IMPALA-12413
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-iceberg
> Fix For: Impala 4.3.0
>
>
> Currently, Iceberg tables created by Trino are not compatible with Impala, 
> as Trino doesn't set the storage_handler property or storage descriptors.
> It only denotes the table type via the table property 'table_type', which is 
> set to Iceberg.






[jira] [Commented] (IMPALA-12413) Make Iceberg tables created by Trino compatible with Impala

2023-09-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762786#comment-17762786
 ] 

ASF subversion and git services commented on IMPALA-12413:
--

Commit 0f55e551bc98843c79a9ec82582ddca237aa4fe9 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0f55e551b ]

IMPALA-12413: Make Iceberg tables created by Trino compatible with Impala

Trino creates Iceberg tables without 'engine.hive.enabled'='true'. It
also doesn't provide a way for users to set this property. Therefore
Trino always creates Iceberg tables with non-HiveIceberg storage
descriptors.

Impala uses the Input/Output/SerDe properties to recognize table types.
This change relaxes this a bit for Iceberg tables, i.e. a table is also
considered to be an Iceberg table if the table property
'table_type'='ICEBERG' is set.

During table loading Impala uses an internal HDFS table to load table
metadata. It currently throws an exception when no proper storage
descriptor is set. To work around this, IcebergTable changes
the in-memory HMS table's storage descriptor properties to the
HiveIceberg* properties. Normally, this shouldn't persist to the
HMS database on read operations. Though it shouldn't do any harm
AFAICT, we just want to be on the safe side.

Modifications to the table from Impala go through its Iceberg
library (with 'engine.hive.enabled'='true'), which means we set
the HiveIceberg storage descriptors. Trino is still compatible with
such tables.

Testing
 * Manually tested with Trino
 * IMPALA-12422 will add interop tests once we have Trino in the
   minicluster environment

Change-Id: I18ea3858314d70a6131982a4e4d3ca90a95a311a
Reviewed-on: http://gerrit.cloudera.org:8080/20453
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
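The relaxed detection the commit describes can be sketched as follows. This is a minimal illustration under stated assumptions: the class and method names here are hypothetical and not Impala's actual catalog code; only the 'table_type' property and the Hive storage handler class name come from the source.

```java
import java.util.Map;

/**
 * Hypothetical sketch of the relaxed Iceberg table detection. The names
 * IcebergTableDetector/isIcebergTable are illustrative, not Impala's API.
 */
public class IcebergTableDetector {
    static final String STORAGE_HANDLER_PROP = "storage_handler";
    static final String ICEBERG_STORAGE_HANDLER =
        "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler";
    static final String TABLE_TYPE_PROP = "table_type";

    /**
     * A table is treated as Iceberg if the Hive storage handler is set
     * (Impala/Hive-created tables) or, after the relaxation, if the
     * table property 'table_type'='ICEBERG' is set (Trino-created tables
     * that lack storage_handler and proper storage descriptors).
     */
    public static boolean isIcebergTable(Map<String, String> tblProps) {
        if (ICEBERG_STORAGE_HANDLER.equals(tblProps.get(STORAGE_HANDLER_PROP))) {
            return true;
        }
        return "ICEBERG".equalsIgnoreCase(tblProps.get(TABLE_TYPE_PROP));
    }
}
```

A Trino-created table would pass only the second check, since Trino sets 'table_type' but not the storage handler.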


> Make Iceberg tables created by Trino compatible with Impala
> ---
>
> Key: IMPALA-12413
> URL: https://issues.apache.org/jira/browse/IMPALA-12413
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-iceberg
>
> Currently Iceberg tables created by Trino are not compatible with Impala, as 
> Trino doesn't set the storage_handler property or storage descriptors.
> It only denotes the table type via the table property 'table_type', which is 
> set to ICEBERG.






[jira] [Commented] (IMPALA-12422) Add interop tests with Trino

2023-09-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762787#comment-17762787
 ] 

ASF subversion and git services commented on IMPALA-12422:
--

Commit 0f55e551bc98843c79a9ec82582ddca237aa4fe9 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0f55e551b ]

IMPALA-12413: Make Iceberg tables created by Trino compatible with Impala

Trino creates Iceberg tables without 'engine.hive.enabled'='true'. It
also doesn't provide a way for users to set this property. Therefore
Trino always creates Iceberg tables with non-HiveIceberg storage
descriptors.

Impala uses the Input/Output/SerDe properties to recognize table types.
This change relaxes this a bit for Iceberg tables, i.e. a table is also
considered to be an Iceberg table if the table property
'table_type'='ICEBERG' is set.

During table loading Impala uses an internal HDFS table to load table
metadata. It currently throws an exception when no proper storage
descriptor is set. To work around this, IcebergTable changes
the in-memory HMS table's storage descriptor properties to the
HiveIceberg* properties. Normally, this shouldn't persist to the
HMS database on read operations. Though it shouldn't do any harm
AFAICT, we just want to be on the safe side.

Modifications to the table from Impala go through its Iceberg
library (with 'engine.hive.enabled'='true'), which means we set
the HiveIceberg storage descriptors. Trino is still compatible with
such tables.

Testing
 * Manually tested with Trino
 * IMPALA-12422 will add interop tests once we have Trino in the
   minicluster environment

Change-Id: I18ea3858314d70a6131982a4e4d3ca90a95a311a
Reviewed-on: http://gerrit.cloudera.org:8080/20453
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add interop tests with Trino
> 
>
> Key: IMPALA-12422
> URL: https://issues.apache.org/jira/browse/IMPALA-12422
> Project: IMPALA
>  Issue Type: Test
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>  Labels: impala-iceberg
>
> IMPALA-12413 makes Impala able to deal with Iceberg tables created by Trino, 
> but doesn't add tests for it because Trino is not yet part of the minicluster.
> We need thorough interop testing between Impala and Trino.






[jira] [Work started] (IMPALA-12431) Support reading compressed JSON file

2023-09-07 Thread Ye Zihao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-12431 started by Ye Zihao.
-
> Support reading compressed JSON file
> 
>
> Key: IMPALA-12431
> URL: https://issues.apache.org/jira/browse/IMPALA-12431
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: be
>Affects Versions: Impala 4.3.0
>Reporter: Ye Zihao
>Assignee: Ye Zihao
>Priority: Major
>
> We already support reading uncompressed JSON files, but since JSON files 
> have low storage efficiency and are generally stored compressed to save 
> space, it is important to support reading compressed JSON files as well.






[jira] [Created] (IMPALA-12431) Support reading compressed JSON file

2023-09-07 Thread Ye Zihao (Jira)
Ye Zihao created IMPALA-12431:
-

 Summary: Support reading compressed JSON file
 Key: IMPALA-12431
 URL: https://issues.apache.org/jira/browse/IMPALA-12431
 Project: IMPALA
  Issue Type: Sub-task
  Components: be
Affects Versions: Impala 4.3.0
Reporter: Ye Zihao
Assignee: Ye Zihao


We already support reading uncompressed JSON files, but since JSON 
files have low storage efficiency and are generally stored compressed to save 
space, it is important to support reading compressed JSON files as well.






[jira] [Created] (IMPALA-12430) Optimize sending rows within the same process

2023-09-07 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created IMPALA-12430:


 Summary: Optimize sending rows within the same process
 Key: IMPALA-12430
 URL: https://issues.apache.org/jira/browse/IMPALA-12430
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Csaba Ringhofer


Currently sending row batches to exchange nodes always goes through KRPC even 
if the sender and receiver are within the same process.

This means that the following work is done even though none of it is necessary:

sender:
1. serialize RowBatch to a single buffer
2. compress the buffer with LZ4
3. send the buffer as a sidecar in KRPC
receiver:
4. fetch buffer from KRPC
5. decompress the buffer
6. convert the buffer to RowBatch

Ideally a single deep copy from the sender's RowBatch to the destination's 
RowBatch is enough (the copy is still needed so that the memory referenced by 
the original RowBatch can be cleaned up during the send).

The most expensive part is step 2, the LZ4 compression (decompression is much 
faster), and it can be avoided with minimal changes.
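The proposed fast path can be sketched as below. This is only an illustration of the idea: Impala's exchange code is C++, and the class names (LocalExchange, the RowBatch stand-in) are hypothetical, not the actual backend API.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: when sender and receiver are in the same process,
 * one deep copy replaces serialize/compress/KRPC/decompress/deserialize.
 */
public class LocalExchange {
    /** Minimal stand-in for Impala's RowBatch. */
    public static final class RowBatch {
        final List<String> rows;
        RowBatch(List<String> rows) { this.rows = rows; }
    }

    /**
     * Deep copy so the receiver owns its own memory and the sender can
     * clean up the memory referenced by the original batch after sending.
     */
    static RowBatch deepCopy(RowBatch src) {
        return new RowBatch(new ArrayList<>(src.rows));
    }

    /** Sender-side dispatch. */
    public static RowBatch send(RowBatch batch, boolean sameProcess) {
        if (sameProcess) {
            // Fast path: steps 1-6 (serialize, LZ4 compress, KRPC send,
            // fetch, decompress, deserialize) are skipped entirely.
            return deepCopy(batch);
        }
        // Remote path placeholder: serialize + LZ4 + KRPC sidecar would
        // go here; not sketched in this illustration.
        throw new UnsupportedOperationException("remote path not sketched");
    }
}
```

The intermediate variant the description mentions, keeping serialization but skipping only the LZ4 compression for in-process destinations, would fit the same dispatch point.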







[jira] [Resolved] (IMPALA-11572) TestHdfsScannerSkew.test_mt_dop_skew_lpt is flaky

2023-09-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-11572.

Fix Version/s: Impala 4.3.0
   Resolution: Fixed

> TestHdfsScannerSkew.test_mt_dop_skew_lpt is flaky
> -
>
> Key: IMPALA-11572
> URL: https://issues.apache.org/jira/browse/IMPALA-11572
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.3.0
>
> Attachments: profile-stdout.txt, profile.txt
>
>
> The test can fail with:
> {noformat}
> query_test/test_scanners.py:428: in test_mt_dop_skew_lpt
> assert cnt_fail < 3
> E   assert 3 < 3
> {noformat}
> We need to fine-tune the test to deflake it.






[jira] [Resolved] (IMPALA-11488) Add virtual column FILE__POSITION for ORC tables

2023-09-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-11488.

Resolution: Duplicate

> Add virtual column FILE__POSITION for ORC tables
> 
>
> Key: IMPALA-11488
> URL: https://issues.apache.org/jira/browse/IMPALA-11488
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>
> Implement virtual column FILE__POSITION for ORC tables.
> See IMPALA-11350.


