[jira] [Closed] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread Maxwell Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxwell Guo closed IMPALA-12402.

Resolution: Fixed

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.
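>
> For illustration, a minimal sketch of the proposed knob, assuming a
> hypothetical flag value (the real flag name and wiring are defined by the
> patch, not here):
> {code:java}
> import com.google.common.cache.Cache;
> import com.google.common.cache.CacheBuilder;
>
> // concurrencyLevel defaults to 4; a higher value shards the cache's internal
> // locks more finely, which reduces contention while metadata loads in bulk.
> int concurrencyLevel = 16;  // hypothetical value read from a new startup flag
> Cache<String, Object> cache = CacheBuilder.newBuilder()
>     .concurrencyLevel(concurrencyLevel)
>     .build();
> {code}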



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778936#comment-17778936
 ] 

Maxwell Guo commented on IMPALA-12402:
--

Thank you so much, [~stigahuang]!

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778935#comment-17778935
 ] 

Quanlong Huang commented on IMPALA-12402:
-

The failure is unrelated and is tracked in IMPALA-12499. I just merged the
patch. We can resolve this now.
[~maxwellguo] Thanks for your contribution; looking forward to more from you!

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778933#comment-17778933
 ] 

ASF subversion and git services commented on IMPALA-12402:
--

Commit c244aadcf367360e52807a84e7fba8b6237651fd in impala's branch 
refs/heads/master from maxwellguo
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c244aadcf ]

IMPALA-12402: Make CatalogdMetaProvider's cache concurrency level configurable

The local cache_ of CatalogdMetaProvider goes through a loading
process with the default cache concurrency level of 4. When the
number of tables is very large, the loading process takes a long
time, and in that case a restart takes a long time too, so we make
the cache concurrency level configurable.

Change-Id: I8e3c10660e371498c2edc1eb8d235b7b8ca170c9
Reviewed-on: http://gerrit.cloudera.org:8080/20443
Reviewed-by: Impala Public Jenkins 
Tested-by: Quanlong Huang 


> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-12499) TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling fails intermittently

2023-10-23 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reopened IMPALA-12499:
-

Reopening this since it occurred again:
https://jenkins.impala.io/job/ubuntu-20.04-dockerised-tests/677/testReport/junit/query_test.test_mem_usage_scaling/TestScanMemLimit/test_hdfs_scanner_thread_non_reserved_bytes_protocol__beeswax___exec_optiontest_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__avro_snap_block_/

> TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling fails intermittently
> --
>
> Key: IMPALA-12499
> URL: https://issues.apache.org/jira/browse/IMPALA-12499
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Riza Suminto
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 4.4.0
>
>
> An ASAN test job ran into a failure on the new test case for 
> TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling:
> {noformat}
> query_test/test_mem_usage_scaling.py:376: in 
> test_hdfs_scanner_thread_mem_scaling
> self.run_test_case('QueryTest/hdfs-scanner-thread-mem-scaling', vector)
> common/impala_test_suite.py:776: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:682: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over NumScannerThreadsStarted did not 
> match expected results.
> E   EXPECTED VALUE:
> E   3
> E   
> E   
> E   ACTUAL VALUE:
> E   1{noformat}
> That must correspond to this test case: 
> [https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test#L36-L51]
> This was added recently with the fix for IMPALA-11068.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12444) PROCESSING_COST_MIN_THREADS can get ignored by scan fragment.

2023-10-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778931#comment-17778931
 ] 

ASF subversion and git services commented on IMPALA-12444:
--

Commit 1388be8eb8011eaad45327e2da8671a1ff10844e in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1388be8eb ]

IMPALA-12510: Floor PlanFragment.maxParallelism_ at 1

IMPALA-12444 introduced a bug where PlanFragment.maxParallelism_ can be
set to 0. This can happen in a scan fragment when the table is empty: the
number of scan ranges will be 0, which then propagates to
ScanNode.maxScannerThreads_ and PlanFragment.maxParallelism_.

This patch fixes it by flooring ScanNode.maxScannerThreads_ and
PlanFragment.maxParallelism_ at 1.

Testing:
- Add a select-star-over-an-empty-table test case to
  PlannerTest.testProcessingCost.

Change-Id: Ibfa50abfdb9cdb994c5c3d7904b377a25f5b8b97
Reviewed-on: http://gerrit.cloudera.org:8080/20606
Reviewed-by: Impala Public Jenkins 
Tested-by: Riza Suminto 
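
A minimal sketch of the flooring pattern the commit describes (field names are
taken from the message above; the surrounding planner code is assumed):

{code:java}
// Empty tables produce 0 scan ranges; floor both values at 1 so the scan
// fragment still gets at least one instance.
maxScannerThreads_ = Math.max(1, maxScannerThreads_);
maxParallelism_ = Math.max(1, maxParallelism_);
{code}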


> PROCESSING_COST_MIN_THREADS can get ignored by scan fragment.
> -
>
> Key: IMPALA-12444
> URL: https://issues.apache.org/jira/browse/IMPALA-12444
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.2.0
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> There is a bug in PlanFragment.java where a scan fragment might not honor
> PROCESSING_COST_MIN_THREADS set by the user, even if the total number of scan
> ranges would allow it.
> The frontend planner also needs to sanity-check that
> PROCESSING_COST_MIN_THREADS <= MAX_FRAGMENT_INSTANCES_PER_NODE.
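>
> A minimal sketch of the suggested sanity check (variable names and the
> exception type are assumptions, not the actual patch):
> {code:java}
> // Reject option combinations the planner cannot satisfy.
> if (processingCostMinThreads > maxFragmentInstancesPerNode) {
>   throw new AnalysisException(
>       "PROCESSING_COST_MIN_THREADS must be <= MAX_FRAGMENT_INSTANCES_PER_NODE");
> }
> {code}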



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-5081) Expose IR optimization level via query option

2023-10-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778929#comment-17778929
 ] 

ASF subversion and git services commented on IMPALA-5081:
-

Commit 7230c57f6784cdb99080f3d4c8b679c33b54da21 in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7230c57f6 ]

IMPALA-5081: Add codegen_opt_level query option

Adds the 'codegen_opt_level' query option to select the LLVM optimization
level for generated code. Retains the prior behavior (O2) as the default.

If the optimization level is changed for an entry already in the cache, the
cached entry will be used unless the new optimization level is higher
than the cached level.

Adds additional counters, NumOptimizedFunctions and
NumOptimizedInstructions, which allow observing some impacts of
codegen optimization. These additional counters, and tracking the opt level
for cached entries, increase the size of each cached entry.

Adds unit tests for all optimization levels checking:
- that small functions are inlined at higher levels (as a way to verify
  that the optimization level has an effect)
- that codegen cache entries are updated when optimizing the same fragment at
  a higher level, and not updated the rest of the time

Change-Id: I371f8758b6552263e91a1fbfd9a6e1c28e1fa2bd
Reviewed-on: http://gerrit.cloudera.org:8080/20399
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
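
A minimal sketch of the cache-replacement rule described above (names and
wiring are assumptions, not the actual code):

{code:java}
// Reuse the cached entry unless the query asks for a strictly higher opt level.
if (newOptLevel > cachedEntry.optLevel) {
  cachedEntry = compileAndOptimize(fragment, newOptLevel);  // hypothetical helper
  codegenCache.put(cacheKey, cachedEntry);
}
{code}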


> Expose IR optimization level via query option
> -
>
> Key: IMPALA-5081
> URL: https://issues.apache.org/jira/browse/IMPALA-5081
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Michael Ho
>Assignee: Michael Smith
>Priority: Minor
>  Labels: codegen
> Fix For: Impala 4.4.0
>
>
> Certain queries may spend a lot of time in IR optimization. Currently,
> there is a start-up option to disable optimization in LLVM. However, it is
> inconvenient for users to have to restart the entire Impala cluster just to
> use that option. This JIRA aims at exposing a query option that lets
> users choose the optimization level for a given query (e.g. we could have a
> level with only a dead code elimination pass, or no optimization at
> all).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12510) ProcessingCost.getNumInstanceMax() should not return 0.

2023-10-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778930#comment-17778930
 ] 

ASF subversion and git services commented on IMPALA-12510:
--

Commit 1388be8eb8011eaad45327e2da8671a1ff10844e in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1388be8eb ]

IMPALA-12510: Floor PlanFragment.maxParallelism_ at 1

IMPALA-12444 introduced a bug where PlanFragment.maxParallelism_ can be
set to 0. This can happen in a scan fragment when the table is empty: the
number of scan ranges will be 0, which then propagates to
ScanNode.maxScannerThreads_ and PlanFragment.maxParallelism_.

This patch fixes it by flooring ScanNode.maxScannerThreads_ and
PlanFragment.maxParallelism_ at 1.

Testing:
- Add a select-star-over-an-empty-table test case to
  PlannerTest.testProcessingCost.

Change-Id: Ibfa50abfdb9cdb994c5c3d7904b377a25f5b8b97
Reviewed-on: http://gerrit.cloudera.org:8080/20606
Reviewed-by: Impala Public Jenkins 
Tested-by: Riza Suminto 


> ProcessingCost.getNumInstanceMax() should not return 0.
> ---
>
> Key: IMPALA-12510
> URL: https://issues.apache.org/jira/browse/IMPALA-12510
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>
> ProcessingCost.getNumInstanceMax() is used for calculating the cost-based max
> parallelism. It should return 1 at minimum, not 0.
> [https://github.com/apache/impala/blob/b15d6dc2e7df05392a1daa4bc1b3da9ca31a583b/fe/src/main/java/org/apache/impala/planner/ProcessingCost.java#L207-L211]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778927#comment-17778927
 ] 

Maxwell Guo commented on IMPALA-12402:
--

Hi [~MikaelSmith] [~stigahuang], can you help take a look at this build? It
seems some tests failed again. After looking at it, I don't have any clues
about how to solve this error. :(

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10130) Catalog restart doesn't invalidate authPolicy cache in local catalog

2023-10-23 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-10130:
---

Assignee: (was: Quanlong Huang)

> Catalog restart doesn't invalidate authPolicy cache in local catalog
> 
>
> Key: IMPALA-10130
> URL: https://issues.apache.org/jira/browse/IMPALA-10130
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.0, Impala 3.0, Impala 4.0.0
>Reporter: Abhishek Rawat
>Priority: Major
>
> When the catalog service is restarted, LocalCatalog detects it in the topic 
> update due to the change in its service id and invalidates its cache 
> contents. However, it looks like it doesn't invalidate its existing 
> authPolicy cache contents.
> {code:java}
>   private void witnessCatalogServiceId(TUniqueId serviceId) {
>     synchronized (catalogServiceIdLock_) {
>       if (!catalogServiceId_.equals(serviceId)) {
>         if (!catalogServiceId_.equals(Catalog.INITIAL_CATALOG_SERVICE_ID)) {
>           LOG.warn("Detected catalog service restart: service ID changed from " +
>               "{} to {}. Invalidating all cached metadata on this coordinator.",
>               catalogServiceId_, serviceId);
>         }
>         catalogServiceId_ = serviceId;
>         cache_.invalidateAll();
>         // Clear cached items from the previous catalogd instance. Otherwise, we'll
>         // ignore new updates from the new catalogd instance since they have lower
>         // versions.
>         hdfsCachePools_.clear();
>         // TODO(todd): we probably need to invalidate the auth policy too.
>         // we are probably better off detecting this at a higher level and
>         // reinstantiating the metaprovider entirely, similar to how ImpaladCatalog
>         // handles this.
>         // TODO(todd): slight race here: a concurrent request from the old catalog
>         // could theoretically be just about to write something back into the cache
>         // after we do the above invalidate. Maybe we would be better off replacing
>         // the whole cache object, or doing a soft barrier here to wait for any
>         // concurrent cache accessors to cycle out. Another option is to associate
>         // the catalog service ID as part of all of the cache keys.
>         //
>         // This is quite unlikely to be an issue in practice, so deferring it to
>         // later clean-up.
>       }
>     }
>   }
> {code}
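>
> A minimal sketch of the missing invalidation that the TODO above hints at (an
> illustrative assumption, not a committed fix):
> {code:java}
> // In witnessCatalogServiceId(), alongside cache_.invalidateAll(): drop stale
> // principals and privileges so updates from the new catalogd instance are not
> // ignored for having lower catalog versions.
> authPolicy_.clear();  // hypothetical method on the cached authorization policy
> {code}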
> If the older authPolicy is not cleared above, it is possible that when the
> principal is added into the cache it is ignored because the existing entry's
> catalog version is higher, as seen below:
> {code:java}
>   public synchronized void addPrincipal(Principal principal) {
>     Principal existingPrincipal = getPrincipal(principal.getName(),
>         principal.getPrincipalType());
>     // There is already a newer version of this principal in the catalog, ignore
>     // just return.
>     if (existingPrincipal != null &&
>         existingPrincipal.getCatalogVersion() >= principal.getCatalogVersion()) return;
> {code}
> When the update tries to add the privilege associated with the principal
> above, it looks up the principal by ID instead of by name, and if there is an
> ID mismatch it throws an error.
> {code:java}
>   /**
>    * Adds a new privilege to the policy, mapping it to the principal specified by
>    * the principal ID in the privilege. Throws a CatalogException if no principal
>    * with a corresponding ID exists in the catalog.
>    */
>   public synchronized void addPrivilege(PrincipalPrivilege privilege)
>       throws CatalogException {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Adding privilege: " + privilege.getName() + " " +
>           Principal.toString(privilege.getPrincipalType()).toLowerCase() +
>           " ID: " + privilege.getPrincipalId());
>     }
>     Principal principal = getPrincipal(privilege.getPrincipalId(),
>         privilege.getPrincipalType());
>     if (principal == null) {
>       throw new CatalogException(String.format("Error adding privilege: %s. %s ID " +
>           "'%d' does not exist.", privilege.getName(),
>           Principal.toString(privilege.getPrincipalType()), privilege.getPrincipalId()));
>     }
> {code}
>  
> The legacy catalog mode doesn't have this issue because the whole 
> ImpaladCatalog instance is re-created when detecting catalogd restarts:
> {code:java}
> @Override
> TUpdateCatalogCacheResponse updateCatalogCache(TUpdateCatalogCacheRequest req)
>     throws CatalogException, TException {
>   ImpaladCatalog catalog = catalog_.get();
>   if (req.is_delta) return catalog.updateCatalog(req);
>   // If this is not a delta, this update should 

[jira] [Updated] (IMPALA-12187) TestEventProcessing.test_event_based_replication flaky for truncate table

2023-10-23 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-12187:

Epic Link: IMPALA-11533

> TestEventProcessing.test_event_based_replication flaky for truncate table
> -
>
> Key: IMPALA-12187
> URL: https://issues.apache.org/jira/browse/IMPALA-12187
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
>
> A couple of Jenkins jobs have seen a failure in
> TestEventProcessing.test_event_based_replication() where the test expects
> the truncated table to have zero rows, but the table instead has
> 100 rows:
> {noformat}
> metadata/test_event_processing.py:180: in test_event_based_replication
> self.__run_event_based_replication_tests()
> metadata/test_event_processing.py:329: in __run_event_based_replication_tests
> assert rows_in_part_tbl_target == 0
> E   assert 100 == 0{noformat}
> More logs:
> {noformat}
> truncate table repl_source_tsmyd.part_tbl;
> -- 2023-06-02 06:44:19,049 INFO MainThread: Started query 
> 50469ac62856f797:53e74fb4
> -- 2023-06-02 06:44:41,638 INFO MainThread: Waiting until events 
> processor syncs to event id:32187
> -- 2023-06-02 06:44:42,596 DEBUGMainThread: Metric last-synced-event-id 
> has reached the desired value: 32187
> -- 2023-06-02 06:44:42,632 DEBUGMainThread: Found 3 impalad/1 
> statestored/1 catalogd process(es)
> -- 2023-06-02 06:44:42,648 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:42,651 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:43,653 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:43,669 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:44,670 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:44,674 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:45,676 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:45,679 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:46,680 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:46,683 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:47,685 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:47,688 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- 2023-06-02 06:44:47,688 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25001
> -- 2023-06-02 06:44:47,691 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- 2023-06-02 06:44:47,691 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25002
> -- 2023-06-02 06:44:47,694 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- executing against localhost:21000
> select count(*) from repl_target_hhkuw.unpart_tbl;
> -- 2023-06-02 06:44:47,697 INFO MainThread: Started query 
> 6c40644e00cdf143:3be5e75a
> -- executing against localhost:21000
> select count(*) from repl_target_hhkuw.part_tbl;{noformat}
> This was seen in a debug core job and a debug erasure coding job. Only for 
> the partitioned table and not the unpartitioned table.
> This seems like a symptom that doesn't match the existing flakiness for 
> TestEventProcessing.test_event_based_replication().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12500) TestObservability.test_global_exchange_counters is flaky

2023-10-23 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778839#comment-17778839
 ] 

Fang-Yu Rao commented on IMPALA-12500:
--

Hi [~csringhofer], I assigned this JIRA to you since you recently revised the
test in
[IMPALA-12430|https://github.com/apache/impala/commit/fb2d2b27641a95f51b6789639fab73b60abd7bc5#diff-a317a4067b5728a2d0af9839c1dce94710e7bd50825ceffc0a3c88aca3e27de3R553]
and thus may be more familiar with it. Please feel free to reassign the JIRA
as you see fit. Thanks!

> TestObservability.test_global_exchange_counters is flaky
> 
>
> Key: IMPALA-12500
> URL: https://issues.apache.org/jira/browse/IMPALA-12500
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky
>
> There have been intermittent failures on this test with the following symptom:
> {noformat}
> query_test/test_observability.py:564: in test_global_exchange_counters
> assert "ExchangeScanRatio: 4.63" in profile
> E   assert 'ExchangeScanRatio: 4.63' in 'Query 
> (id=c04b974db37e7046:b5fe4dea):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil...: 0.000ns\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> -- executing against localhost:21000
> select count(*), sleep(50) from tpch_parquet.orders o
> inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey
> group by o.o_clerk limit 10;
> -- 2023-10-05 19:47:29,817 INFO MainThread: Started query 
> c04b974db37e7046:b5fe4dea{noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-12500) TestObservability.test_global_exchange_counters is flaky

2023-10-23 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao reassigned IMPALA-12500:


Assignee: Fang-Yu Rao

> TestObservability.test_global_exchange_counters is flaky
> 
>
> Key: IMPALA-12500
> URL: https://issues.apache.org/jira/browse/IMPALA-12500
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Fang-Yu Rao
>Priority: Critical
>  Labels: broken-build, flaky
>
> There have been intermittent failures on this test with the following symptom:
> {noformat}
> query_test/test_observability.py:564: in test_global_exchange_counters
> assert "ExchangeScanRatio: 4.63" in profile
> E   assert 'ExchangeScanRatio: 4.63' in 'Query 
> (id=c04b974db37e7046:b5fe4dea):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil...: 0.000ns\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> -- executing against localhost:21000
> select count(*), sleep(50) from tpch_parquet.orders o
> inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey
> group by o.o_clerk limit 10;
> -- 2023-10-05 19:47:29,817 INFO MainThread: Started query 
> c04b974db37e7046:b5fe4dea{noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-12500) TestObservability.test_global_exchange_counters is flaky

2023-10-23 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao reassigned IMPALA-12500:


Assignee: Csaba Ringhofer  (was: Fang-Yu Rao)

> TestObservability.test_global_exchange_counters is flaky
> 
>
> Key: IMPALA-12500
> URL: https://issues.apache.org/jira/browse/IMPALA-12500
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky
>
> There have been intermittent failures on this test with the following symptom:
> {noformat}
> query_test/test_observability.py:564: in test_global_exchange_counters
> assert "ExchangeScanRatio: 4.63" in profile
> E   assert 'ExchangeScanRatio: 4.63' in 'Query 
> (id=c04b974db37e7046:b5fe4dea):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil...: 0.000ns\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> -- executing against localhost:21000
> select count(*), sleep(50) from tpch_parquet.orders o
> inner join tpch_parquet.lineitem l on o.o_orderkey = l.l_orderkey
> group by o.o_clerk limit 10;
> -- 2023-10-05 19:47:29,817 INFO MainThread: Started query 
> c04b974db37e7046:b5fe4dea{noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12187) TestEventProcessing.test_event_based_replication flaky for truncate table

2023-10-23 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778836#comment-17778836
 ] 

Riza Suminto commented on IMPALA-12187:
---

Saw this again recently 
[https://jenkins.impala.io/job/ubuntu-20.04-dockerised-tests/674]

> TestEventProcessing.test_event_based_replication flaky for truncate table
> -
>
> Key: IMPALA-12187
> URL: https://issues.apache.org/jira/browse/IMPALA-12187
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
>
> A couple of Jenkins jobs have seen a failure in
> TestEventProcessing.test_event_based_replication() where the test expects
> the truncated table to have zero rows, but the table instead has
> 100 rows:
> {noformat}
> metadata/test_event_processing.py:180: in test_event_based_replication
> self.__run_event_based_replication_tests()
> metadata/test_event_processing.py:329: in __run_event_based_replication_tests
> assert rows_in_part_tbl_target == 0
> E   assert 100 == 0{noformat}
> More logs:
> {noformat}
> truncate table repl_source_tsmyd.part_tbl;
> -- 2023-06-02 06:44:19,049 INFO MainThread: Started query 
> 50469ac62856f797:53e74fb4
> -- 2023-06-02 06:44:41,638 INFO MainThread: Waiting until events 
> processor syncs to event id:32187
> -- 2023-06-02 06:44:42,596 DEBUGMainThread: Metric last-synced-event-id 
> has reached the desired value: 32187
> -- 2023-06-02 06:44:42,632 DEBUGMainThread: Found 3 impalad/1 
> statestored/1 catalogd process(es)
> -- 2023-06-02 06:44:42,648 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:42,651 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:43,653 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:43,669 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:44,670 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:44,674 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:45,676 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:45,679 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:46,680 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:46,683 INFO MainThread: Sleeping 1s before next retry.
> -- 2023-06-02 06:44:47,685 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25000
> -- 2023-06-02 06:44:47,688 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- 2023-06-02 06:44:47,688 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25001
> -- 2023-06-02 06:44:47,691 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- 2023-06-02 06:44:47,691 INFO MainThread: Getting metric: 
> catalog.curr-version from hostname:25002
> -- 2023-06-02 06:44:47,694 INFO MainThread: Metric 'catalog.curr-version' 
> has reached desired value: 9771
> -- executing against localhost:21000
> select count(*) from repl_target_hhkuw.unpart_tbl;
> -- 2023-06-02 06:44:47,697 INFO MainThread: Started query 
> 6c40644e00cdf143:3be5e75a
> -- executing against localhost:21000
> select count(*) from repl_target_hhkuw.part_tbl;{noformat}
> This was seen in a debug core job and a debug erasure coding job. Only for 
> the partitioned table and not the unpartitioned table.
> This seems like a symptom that doesn't match the existing flakiness for 
> TestEventProcessing.test_event_based_replication().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-5081) Expose IR optimization level via query option

2023-10-23 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-5081.
---
Fix Version/s: Impala 4.4.0
   Resolution: Fixed

> Expose IR optimization level via query option
> -
>
> Key: IMPALA-5081
> URL: https://issues.apache.org/jira/browse/IMPALA-5081
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Michael Ho
>Assignee: Michael Smith
>Priority: Minor
>  Labels: codegen
> Fix For: Impala 4.4.0
>
>
> Certain queries may spend a lot of time in the IR optimization. Currently, 
> there is a start-up option to disable optimization in LLVM. However, it may 
> be of inconvenience to users to have to restart the entire Impala cluster to 
> just use that option. This JIRA aims at exploring exposing a query option for 
> users to choose the optimization level for a given query (e.g. we can have a 
> level which just only have a dead code elimination pass or no optimization at 
> all).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-12512) impala-asf-master-exhaustive Multiple tests failed to get memory reservation

2023-10-23 Thread Kurt Deschler (Jira)
Kurt Deschler created IMPALA-12512:
--

 Summary: impala-asf-master-exhaustive Multiple tests failed to get 
memory reservation
 Key: IMPALA-12512
 URL: https://issues.apache.org/jira/browse/IMPALA-12512
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Kurt Deschler


The following tests failed to get a memory reservation (all in build
https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-exhaustive/329/):

- query_test.test_iceberg.TestIcebergTable.test_partitioned_insert[protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (1 min 4 sec)
  https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-exhaustive/329/testReport/query_test.test_iceberg/TestIcebergTable/test_partitioned_insert_protocol__beeswax___exec_optiontest_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___True___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
  E Query(b647ab266187152b:4bb55321) could not allocate 12.00 MB without exceeding limit.
  E Error occurred on backend impala-ec2-centos79-m6i-4xlarge-ondemand-0806.vpc.cloudera.com:27000
  E Memory left in process limit: 1.19 GB

- query_test.test_insert.TestInsertQueries.test_insert[compression_codec: gzip | protocol: beeswax | exec_option: {'sync_ddl': 0, 'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none-unique_database0] (46 sec)
  https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-exhaustive/329/testReport/query_test.test_insert/TestInsertQueries/test_insert_compression_codec__gzip___protocol__beeswax___exec_optionsync_ddl___0___test_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_unique_database0_/
  E Query(5443ac13ca407586:46348162) could not allocate 12.01 MB without exceeding limit.
  E Error occurred on backend impala-ec2-centos79-m6i-4xlarge-ondemand-0806.vpc.cloudera.com:27000
  E Memory left in process limit: 1.19 GB

- query_test.test_join_queries.TestSemiJoinQueries.test_semi_joins[batch_size: 0 | protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (16 sec)
  https://master-03.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-exhaustive/329/testReport/query_test.test_join_queries/TestSemiJoinQueries/test_semi_joins_batch_size__0___protocol__beeswax___exec_optiontest_replan___1___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
  E Query(0b421d488655a74e:87207d39) could not allocate 73.95 MB without exceeding limit.
  E Error occurred on backend impala-ec2-centos79-m6i-4xlarge-ondemand-0806.vpc.cloudera.com:27000
  E Memory left in process limit: 1.18 GB

- query_test.test_parquet_stats.TestParquetStats.test_page_index[mt_dop: 0 | protocol: beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 

[jira] [Commented] (IMPALA-12402) Make CatalogdMetaProvider's cache concurrency level configurable

2023-10-23 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778596#comment-17778596
 ] 

Maxwell Guo commented on IMPALA-12402:
--

Updated again.

> Make CatalogdMetaProvider's cache concurrency level configurable
> 
>
> Key: IMPALA-12402
> URL: https://issues.apache.org/jira/browse/IMPALA-12402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>  Labels: pull-request-available
>
> When the cluster contains many databases and tables (for example, more than
> 10 tables) and we restart the impalad, the local cache_ of
> CatalogdMetaProvider needs to go through a loading process.
> Google's Guava cache sets its concurrencyLevel to 4 by default,
> but with many tables the loading process takes more time and the
> probability of lock contention increases; see
> [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437].
> So we propose to add some configuration options here, the first being the
> concurrency level of the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-12509) Optimize the backend startup and planner time of large Iceberg table query

2023-10-23 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778576#comment-17778576
 ] 

Quanlong Huang commented on IMPALA-12509:
-

Could you share how you measured the serialization time of TQueryCtx? Some
screenshots or logs might help.

> Optimize the backend startup and planner time of large Iceberg table query
> --
>
> Key: IMPALA-12509
> URL: https://issues.apache.org/jira/browse/IMPALA-12509
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Fu Lili
>Assignee: Fu Lili
>Priority: Major
>
> We found that when querying an Iceberg table with a large number of files
> (>=20), query planning and backend startup took an abnormally long time
> (>= 2s). The reason was that unnecessary objects were serialized when
> building TQueryCtx. The main function involved is IcebergTable::toThriftDescriptor.
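>
> One way such serialization time could be measured, as a hypothetical sketch
> (Thrift's TSerializer is assumed; this is not the reporter's actual
> measurement):
> {code:java}
> import org.apache.thrift.TSerializer;
> import org.apache.thrift.protocol.TBinaryProtocol;
>
> // Time a single binary serialization of the query context.
> long startNs = System.nanoTime();
> byte[] bytes = new TSerializer(new TBinaryProtocol.Factory()).serialize(queryCtx);
> LOG.info("TQueryCtx: {} bytes serialized in {} ms",
>     bytes.length, (System.nanoTime() - startNs) / 1_000_000);
> {code}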



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org