[jira] [Commented] (IMPALA-13016) Fix ambiguous row_regex that check for no-existence

2024-04-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839194#comment-17839194
 ] 

ASF subversion and git services commented on IMPALA-13016:
--

Commit 9a41dfbdc7b59728a16c28c9bfed483a6ee9d3ae in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=9a41dfbdc ]

IMPALA-13016: Fix ambiguous row_regex that check for no-existence

There are a few row_regex patterns used in EE test files that are
ambiguous on whether a pattern does not exist in all parts of the
results/runtime profile or at least one row does not have that pattern.
These were caught by grepping the following pattern:

$ git grep -n "row_regex: (?\!"

This patch replaces them with either !row_regex or a VERIFY_IS_NOT_IN
comment.

Testing:
- Run and pass modified tests.

Change-Id: Ic81de34bf997dfaf1c199b1fe1b05346b55ff4da
Reviewed-on: http://gerrit.cloudera.org:8080/21333
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Fix ambiguous row_regex that check for no-existence
> ---
>
> Key: IMPALA-13016
> URL: https://issues.apache.org/jira/browse/IMPALA-13016
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Minor
>
> There are a few row_regex patterns used in EE test files that are ambiguous on 
> whether a pattern does not exist in all parts of the results/runtime profile or 
> at least one row does not have that pattern:
> {code:java}
> $ git grep -n "row_regex: (?\!"
> testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test:34:row_regex:
>  (?!.*COLUMN_STATS_ACCURATE)
> testdata/workloads/functional-query/queries/QueryTest/acid-truncate.test:47:row_regex:
>  (?!.*COLUMN_STATS_ACCURATE)
> testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test:28:row_regex:
>  (?!.*COLUMN_STATS_ACCURATE)
> testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-directed-mode.test:14:row_regex:
>  (?!.*F03:JOIN BUILD.*) {code}
> They should be replaced with either !row_regex or a VERIFY_IS_NOT_IN comment.
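To see why the original pattern is ambiguous, here is a minimal Python sketch (illustrative only, not the logic in tests/common/test_result_verifier.py):

{code:python}
import re

banned_line = "COLUMN_STATS_ACCURATE: true"

# Ambiguous form: a bare negative lookahead is zero-width, and re.search tries
# every offset, so it still finds a match inside the banned line itself
# (e.g. at offset 1, where ".*COLUMN_STATS_ACCURATE" can no longer match).
print(bool(re.search(r"(?!.*COLUMN_STATS_ACCURATE)", banned_line)))  # True

# The unambiguous intent ("this text appears nowhere"), which !row_regex /
# VERIFY_IS_NOT_IN express: no line may contain the banned pattern.
profile = ["Analyzed query: ...", banned_line, "HDFS partitions=1/1"]
print(not any("COLUMN_STATS_ACCURATE" in line for line in profile))  # False
{code}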






[jira] [Commented] (IMPALA-12988) Calculate an unbounded version of CpuAsk

2024-04-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839196#comment-17839196
 ] 

ASF subversion and git services commented on IMPALA-12988:
--

Commit d437334e5304823836b9ceb5ffda9945dd7cb183 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d437334e5 ]

IMPALA-12988: Calculate an unbounded version of CpuAsk

Planner calculates CpuAsk through a recursive call beginning at
Planner.computeBlockingAwareCores(), which is called after
Planner.computeEffectiveParallelism(). It does blocking operator
analysis over the selected degree of parallelism that was decided during
computeEffectiveParallelism() traversal. That selected degree of
parallelism, however, is already bounded by the min and max parallelism
config, derived from the PROCESSING_COST_MIN_THREADS and
MAX_FRAGMENT_INSTANCES_PER_NODE options respectively.

This patch calculates an unbounded version of CpuAsk that is not bounded
by min and max parallelism config. It is purely based on the fragment's
ProcessingCost and query plan relationship constraint (for example, the
number of JOIN BUILDER fragments should equal the number of destination
JOIN fragments for partitioned join).

Frontend will receive both bounded and unbounded CpuAsk values from
TQueryExecRequest on each executor group set selection round. The
unbounded CpuAsk is then scaled down once using an nth-root-based
sublinear function, controlled by the total cpu count of the smallest
executor group set and the bounded CpuAsk number. Another linear scaling
is then applied on both bounded and unbounded CpuAsk using the
QUERY_CPU_COUNT_DIVISOR option. Frontend then compares the unbounded
CpuAsk after scaling against CpuMax to avoid assigning a query to a
small executor group set too soon. The last executor group set stays as
the "catch-all" executor group set.

After this patch, setting COMPUTE_PROCESSING_COST=True will show the
following changes in the query profile:
- The "max-parallelism" fields in the query plan will all be set to
  maximum parallelism based on ProcessingCost.
- The CpuAsk counter is changed to show the unbounded CpuAsk after
  scaling.
- A new counter CpuAskBounded shows the bounded CpuAsk after scaling. If
  QUERY_CPU_COUNT_DIVISOR=1 and PLANNER_CPU_ASK slot counting strategy
  is selected, this CpuAskBounded is also the minimum total admission
  slots given to the query.
- A new counter MaxParallelism shows the unbounded CpuAsk before
  scaling.
- The EffectiveParallelism counter remains unchanged,
  showing bounded CpuAsk before scaling.

Testing:
- Update and pass FE test TpcdsCpuCostPlannerTest and
  PlannerTest#testProcessingCost.
- Pass EE test tests/query_test/test_tpcds_queries.py
- Pass custom cluster test tests/custom_cluster/test_executor_groups.py

Change-Id: I5441e31088f90761062af35862be4ce09d116923
Reviewed-on: http://gerrit.cloudera.org:8080/21277
Reviewed-by: Kurt Deschler 
Reviewed-by: Abhishek Rawat 
Tested-by: Impala Public Jenkins 


> Calculate an unbounded version of CpuAsk
> 
>
> Key: IMPALA-12988
> URL: https://issues.apache.org/jira/browse/IMPALA-12988
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>
> CpuAsk is calculated through a recursive call beginning at 
> Planner.computeBlockingAwareCores(), which is called after 
> Planner.computeEffectiveParallelism(). It does blocking operator analysis 
> over the selected degree of parallelism that was decided during the 
> computeEffectiveParallelism() traversal. That selected degree of parallelism, 
> however, is already bounded by the min and max parallelism config, derived from 
> the PROCESSING_COST_MIN_THREADS and MAX_FRAGMENT_INSTANCES_PER_NODE options 
> respectively.
> It is beneficial to have another version of CpuAsk that is not bounded by the 
> min and max parallelism config. It should be purely based on the fragment's 
> ProcessingCost and query plan relationship constraints (i.e., the number of 
> JOIN BUILDER fragments should equal the number of JOIN fragments for a 
> partitioned join). During executor group set selection, the Frontend should use 
> the unbounded CpuAsk number to avoid assigning a query to a small executor 
> group set prematurely.






[jira] [Commented] (IMPALA-12938) test_no_inaccessible_objects failed in JDK11 build

2024-04-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839195#comment-17839195
 ] 

ASF subversion and git services commented on IMPALA-12938:
--

Commit 5e7d720257ba86c2d020d483c09673650a3f02d9 in impala's branch 
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5e7d72025 ]

IMPALA-12938: add-opens for platform.cgroupv1

Adds '--add-opens=jdk.internal.platform.cgroupv1' for Java 11 with
ehcache, covering Impala daemons and frontend tests. Fixes
InaccessibleObjectException detected by test_banned_log_messages.py.

Change-Id: I312ae987c17c6f06e1ffe15e943b1865feef6b82
Reviewed-on: http://gerrit.cloudera.org:8080/21334
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> test_no_inaccessible_objects failed in JDK11 build
> --
>
> Key: IMPALA-12938
> URL: https://issues.apache.org/jira/browse/IMPALA-12938
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Michael Smith
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.4.0
>
>
> h3. Error Message
> {noformat}
> AssertionError: 
> /data/jenkins/workspace/impala-asf-master-core-jdk11/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-197f.vpc.cloudera.com.jenkins.log.INFO.20240323-184351.16035
>  contains 'InaccessibleObjectException' assert 0 == 1{noformat}
> h3. Stacktrace
> {noformat}
> verifiers/test_banned_log_messages.py:40: in test_no_inaccessible_objects
> self.assert_message_absent('InaccessibleObjectException')
> verifiers/test_banned_log_messages.py:36: in assert_message_absent
> assert returncode == 1, "%s contains '%s'" % (log_file_path, message)
> E   AssertionError: 
> /data/jenkins/workspace/impala-asf-master-core-jdk11/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-197f.vpc.cloudera.com.jenkins.log.INFO.20240323-184351.16035
>  contains 'InaccessibleObjectException'
> E   assert 0 == 1{noformat}
> h3. Standard Output
> {noformat}
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1MemorySubSystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.memory accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpu accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpuacct accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpuset accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.blkio accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.pids accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> {noformat}
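For context on the assertion above, here is a minimal sketch of the kind of check assert_message_absent performs (assuming it shells out to grep, where exit code 0 means the pattern was found and 1 means it was absent; the real verifier may differ):

{code:python}
import subprocess

def assert_message_absent(log_file_path, message):
    # grep exits with 0 when the pattern is found and 1 when it is absent, so a
    # return code of 0 here means the banned message leaked into the log --
    # which is exactly the "assert 0 == 1" failure shown in the stacktrace.
    returncode = subprocess.call(["grep", "-q", message, log_file_path])
    assert returncode == 1, "%s contains '%s'" % (log_file_path, message)

# e.g. assert_message_absent("/path/to/impalad.INFO", "InaccessibleObjectException")
{code}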






[jira] [Commented] (IMPALA-13023) support webserver ldap filter when using spnego

2024-04-19 Thread YUBI LEE (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839193#comment-17839193
 ] 

YUBI LEE commented on IMPALA-13023:
---

[https://gerrit.cloudera.org/#/c/21339/]

> support webserver ldap filter when using spnego
> ---
>
> Key: IMPALA-13023
> URL: https://issues.apache.org/jira/browse/IMPALA-13023
> Project: IMPALA
>  Issue Type: Improvement
>  Components: fe
>Affects Versions: Impala 4.3.0
>Reporter: YUBI LEE
>Priority: Major
>
> Since IMPALA-11726, it has been possible to use an LDAP filter when Kerberos is 
> enabled. In the same way, the webserver should be able to use an LDAP filter 
> when configured with SPNEGO.






[jira] [Created] (IMPALA-13023) support webserver ldap filter when using spnego

2024-04-19 Thread YUBI LEE (Jira)
YUBI LEE created IMPALA-13023:
-

 Summary: support webserver ldap filter when using spnego
 Key: IMPALA-13023
 URL: https://issues.apache.org/jira/browse/IMPALA-13023
 Project: IMPALA
  Issue Type: Improvement
  Components: fe
Affects Versions: Impala 4.3.0
Reporter: YUBI LEE


Since IMPALA-11726, it has been possible to use an LDAP filter when Kerberos is enabled.

In the same way, the webserver should be able to use an LDAP filter when 
configured with SPNEGO.






[jira] [Resolved] (IMPALA-12350) Daemon fails to initialize large catalog

2024-04-19 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-12350.
-
Resolution: Workaround

Thanks [~saulius.vl]! I'll resolve this as a workaround has been provided.

We will track the fix for sending a >2GB initial catalog update in IMPALA-13020. 
Feel free to file new JIRAs for other issues.

BTW, the slower invalidate/refresh operations might be fixed by IMPALA-11501.

> Daemon fails to initialize large catalog
> 
>
> Key: IMPALA-12350
> URL: https://issues.apache.org/jira/browse/IMPALA-12350
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.2.0
>Reporter: Saulius Valatka
>Priority: Major
>
> When the statestored catalog topic is large enough (>2GB), daemons fail to 
> restart and get stuck in a loop:
> {{I0808 13:07:17.702653 3633556 Frontend.java:1618] Waiting for local catalog 
> to be initialized, attempt: 2068}}
>  
> The statestored reports errors as follows:
> {{I0808 13:07:05.587296 2134270 thrift-util.cc:196] TSocket::write_partial() 
> send() : Broken pipe}}
> {{I0808 13:07:05.587356 2134270 client-cache.h:362] RPC Error: Client for 
> gs1-hdp-data70:23000 hit an unexpected exception: write() send(): Broken 
> pipe, type: N6apache6thrift9transport19TTransportExceptionE, rpc: 
> N6impala20TUpdateStateResponseE, send: not done}}
> {{I0808 13:07:05.587365 2134270 client-cache.cc:174] Broken Connection, 
> destroy client for gs1-hdp-data70:23000}}
>  
> If this happens we are forced to restart the statestore and thus the whole 
> cluster, meaning that we can't tolerate failure of even a single daemon.
> Interestingly, the catalog topic increased significantly after upgrading from 
> 3.4.0 to 4.2.0 - from ~800MB to ~3.4GB. Invalidate/refresh operations also 
> became significantly slower (~10ms -> ~5s).
> Probably related to thrift_rpc_max_message_size? But I see the maximum value 
> is 2GB.
>  






[jira] [Updated] (IMPALA-13020) catalog-topic updates >2GB do not work due to Thrift's max message size

2024-04-19 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-13020:

Affects Version/s: Impala 4.3.0
   Impala 4.2.0
   (was: Impala 4.4.0)

> catalog-topic updates >2GB do not work due to Thrift's max message size
> ---
>
> Key: IMPALA-13020
> URL: https://issues.apache.org/jira/browse/IMPALA-13020
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.2.0, Impala 4.3.0
>Reporter: Joe McDonnell
>Priority: Critical
>
> Thrift 0.16.0 added a max message size to protect against malicious packets 
> that can consume a large amount of memory on the receiver side. This max 
> message size is a signed 32-bit integer, so it maxes out at 2GB (which we set 
> via thrift_rpc_max_message_size).
> In catalog v1, the catalog-update statestore topic can become larger than 2GB 
> when there are a large number of tables / partitions / files. If this happens 
> and an Impala coordinator needs to start up (or needs a full topic update for 
> any other reason), it is expecting the statestore to send it the full topic 
> update, but the coordinator actually can't process the message. The 
> deserialization of the message hits the 2GB max message size limit and fails.
> On the statestore side, it shows this message:
> {noformat}
> I0418 16:54:51.727290 3844140 statestore.cc:507] Preparing initial 
> catalog-update topic update for 
> impa...@mcdonnellthrift.vpc.cloudera.com:27000. Size = 2.27 GB
> I0418 16:54:53.889446 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() : Broken pipe
> I0418 16:54:53.889488 3844140 client-cache.cc:82] ReopenClient(): re-creating 
> client for mcdonnellthrift.vpc.cloudera.com:23000
> I0418 16:54:53.889493 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() : Broken pipe
> I0418 16:54:53.889503 3844140 thrift-client.cc:116] Error closing connection 
> to: mcdonnellthrift.vpc.cloudera.com:23000, ignoring (write() send(): Broken 
> pipe)
> I0418 16:54:56.052882 3844140 thrift-util.cc:198] TSocket::write_partial() 
> send() : Broken pipe
> I0418 16:54:56.052932 3844140 client-cache.h:363] RPC Error: Client for 
> mcdonnellthrift.vpc.cloudera.com:23000 hit an unexpected exception: write() 
> send(): Broken pipe, type: N6apache6thrift9transport19TTransportExceptionE, 
> rpc: N6impala20TUpdateStateResponseE, send: not done
> I0418 16:54:56.052937 3844140 client-cache.cc:174] Broken Connection, destroy 
> client for mcdonnellthrift.vpc.cloudera.com:23000{noformat}
> On the Impala side, it doesn't give a good error, but we see this:
> {noformat}
> I0418 16:54:53.889683 3214537 TAcceptQueueServer.cpp:355] New connection to 
> server StatestoreSubscriber from client 
> I0418 16:54:54.080694 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 110
> I0418 16:54:56.080920 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 111
> I0418 16:54:58.081131 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 112
> I0418 16:55:00.081358 3214136 Frontend.java:1837] Waiting for local catalog 
> to be initialized, attempt: 113{noformat}
> With a patched Thrift that allows an int64_t max message size, and with that 
> set to a larger value, Impala was able to start up (even without restarting 
> the statestored).
> Some clusters that upgrade to a newer version may hit this, as Thrift did not 
> previously enforce this limit, so this is something we should fix to avoid 
> upgrade issues.
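As a quick sanity check of the arithmetic (assuming the "2.27 GB" in the statestore log above is 1024-based):

{code:python}
# Why the 2.27 GB topic update cannot be deserialized: the Thrift max message
# size is a signed 32-bit integer, so even the largest configurable
# thrift_rpc_max_message_size is capped at ~2 GiB.
INT32_MAX = 2**31 - 1                     # 2,147,483,647 bytes
topic_update_bytes = int(2.27 * 1024**3)  # ~2.44e9 bytes

print(topic_update_bytes > INT32_MAX)     # True -> exceeds the cap, update fails
{code}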






[jira] [Resolved] (IMPALA-12938) test_no_inaccessible_objects failed in JDK11 build

2024-04-19 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-12938.

Fix Version/s: Impala 4.4.0
   Resolution: Fixed

> test_no_inaccessible_objects failed in JDK11 build
> --
>
> Key: IMPALA-12938
> URL: https://issues.apache.org/jira/browse/IMPALA-12938
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Michael Smith
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.4.0
>
>
> h3. Error Message
> {noformat}
> AssertionError: 
> /data/jenkins/workspace/impala-asf-master-core-jdk11/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-197f.vpc.cloudera.com.jenkins.log.INFO.20240323-184351.16035
>  contains 'InaccessibleObjectException' assert 0 == 1{noformat}
> h3. Stacktrace
> {noformat}
> verifiers/test_banned_log_messages.py:40: in test_no_inaccessible_objects
> self.assert_message_absent('InaccessibleObjectException')
> verifiers/test_banned_log_messages.py:36: in assert_message_absent
> assert returncode == 1, "%s contains '%s'" % (log_file_path, message)
> E   AssertionError: 
> /data/jenkins/workspace/impala-asf-master-core-jdk11/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-centos79-m6i-4xlarge-xldisk-197f.vpc.cloudera.com.jenkins.log.INFO.20240323-184351.16035
>  contains 'InaccessibleObjectException'
> E   assert 0 == 1{noformat}
> h3. Standard Output
> {noformat}
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1MemorySubSystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.memory accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpu accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpuacct accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.cpuset accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.blkio accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> java.lang.reflect.InaccessibleObjectException: Unable to make field private 
> jdk.internal.platform.cgroupv1.CgroupV1SubsystemController 
> jdk.internal.platform.cgroupv1.CgroupV1Subsystem.pids accessible: module 
> java.base does not "opens jdk.internal.platform.cgroupv1" to unnamed module 
> @1a2e2935
> {noformat}






[jira] [Resolved] (IMPALA-11972) Factor in row width during ProcessingCost calculation.

2024-04-19 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-11972.
---
 Fix Version/s: Impala 4.4.0
Target Version: Impala 4.4.0
Resolution: Duplicate

This will be resolved in a more general way by 
[IMPALA-12657|http://issues.apache.org/jira/browse/IMPALA-12657].

> Factor in row width during ProcessingCost calculation.
> --
>
> Key: IMPALA-11972
> URL: https://issues.apache.org/jira/browse/IMPALA-11972
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
> Fix For: Impala 4.4.0
>
>
> IMPALA-11604 added the ProcessingCost (PC) concept to measure the cost for a 
> distinct PlanNode / DataSink / PlanFragment to process its input rows 
> globally across all of its instances.
> We should investigate whether the row width should be considered in computing 
> PC for more operators, and whether that will make the PC model more accurate. 
> The code in IMPALA-11604 has a materialization cost parameter to accommodate 
> PCs where row width should factor in. Currently, the PCs of ScanNode, 
> ExchangeNode, and DataStreamSink have row width factored in through that 
> materialization parameter.
> For VARCHAR, we can use some kind of average width statistic, if available. For 
> fixed-width columns, we just use the width. In both cases, the unit should be 
> bytes. The idea of including width in costing is to make the outcome as 
> precise and as error-free as possible.
>  
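A hypothetical illustration of factoring row width into a materialization-style cost term (not Impala's actual costing code; the function and weight are made up):

{code:python}
def materialization_cost(num_rows, row_width_bytes, weight_per_byte=1.0):
    # Illustration only: charge each row in proportion to its width in bytes,
    # so wide rows cost more to materialize than narrow ones.
    return num_rows * row_width_bytes * weight_per_byte

# Fixed-width columns contribute their declared width; for VARCHAR an average
# width statistic could be used when available.
narrow = materialization_cost(num_rows=1_000_000, row_width_bytes=16)
wide = materialization_cost(num_rows=1_000_000, row_width_bytes=256)
print(wide / narrow)  # 16.0 -> same row count, 16x the materialization cost
{code}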






[jira] [Resolved] (IMPALA-12499) TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling fails intermittently

2024-04-19 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-12499.
---
Target Version: Impala 4.4.0
Resolution: Fixed

This has not been seen again since its last occurrence. Resolving this JIRA.

> TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling fails intermittently
> --
>
> Key: IMPALA-12499
> URL: https://issues.apache.org/jira/browse/IMPALA-12499
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Riza Suminto
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 4.4.0
>
>
> An ASAN test job ran into a failure on the new test case for 
> TestScanMemLimit.test_hdfs_scanner_thread_mem_scaling:
> {noformat}
> query_test/test_mem_usage_scaling.py:376: in 
> test_hdfs_scanner_thread_mem_scaling
> self.run_test_case('QueryTest/hdfs-scanner-thread-mem-scaling', vector)
> common/impala_test_suite.py:776: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:682: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over NumScannerThreadsStarted did not 
> match expected results.
> E   EXPECTED VALUE:
> E   3
> E   
> E   
> E   ACTUAL VALUE:
> E   1{noformat}
> That must correspond to this test case: 
> [https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test#L36-L51]
> This was added recently with the fix for IMPALA-11068.






[jira] [Resolved] (IMPALA-12864) Deflake test_query_log_size_in_bytes

2024-04-19 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-12864.
---
 Fix Version/s: Impala 4.4.0
Target Version: Impala 4.4.0
Resolution: Fixed

> Deflake test_query_log_size_in_bytes
> 
>
> Key: IMPALA-12864
> URL: https://issues.apache.org/jira/browse/IMPALA-12864
> Project: IMPALA
>  Issue Type: Test
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>  Labels: broken-build
> Fix For: Impala 4.4.0
>
>
> test_query_log_size_in_bytes is flaky in the exhaustive build. The expected 
> QueryStateRecord size is off by around 100KB per query, indicating that the 
> test query might be too complex, leading to variability in the final query 
> profile that is being archived.
> The test query and assertion need to be simplified.






[jira] [Created] (IMPALA-13021) Failed test: test_iceberg_deletes_and_updates_and_optimize

2024-04-19 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created IMPALA-13021:


 Summary: Failed test: test_iceberg_deletes_and_updates_and_optimize
 Key: IMPALA-13021
 URL: https://issues.apache.org/jira/browse/IMPALA-13021
 Project: IMPALA
  Issue Type: Bug
Reporter: Csaba Ringhofer


{code}
test_iceberg_deletes_and_updates_and_optimize
run_tasks([deleter, updater, optimizer, checker])
stress/stress_util.py:46: in run_tasks
pool.map_async(Task.run, tasks).get(timeout_seconds)
Impala-Toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568:
 in get
raise TimeoutError
E   TimeoutError
{code}
This happened in an exhaustive test run with data cache.






[jira] [Commented] (IMPALA-8998) Admission control accounting for mt_dop

2024-04-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838947#comment-17838947
 ] 

ASF subversion and git services commented on IMPALA-8998:
-

Commit 6abfdbc56c3d0ec3ac201dd4b8c2c35656d24eaf in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6abfdbc56 ]

IMPALA-12980: Translate CpuAsk into admission control slots

Impala has a concept of "admission control slots" - the amount of
parallelism that should be allowed on an Impala daemon. This defaults to
the number of processors per executor and can be overridden with the
--admission_control_slots flag.

Admission control slot accounting is described in IMPALA-8998. It
computes 'slots_to_use' for each backend based on the maximum number of
instances of any fragment on that backend. This can lead to slot
underestimation and query overadmission. For example, assume an executor
node with 48 CPU cores and configured with --admission_control_slots=48.
It is assigned 4 non-blocking query fragments, each with 12 instances
scheduled on this executor. The IMPALA-8998 algorithm will request the
max-instance (12) slots rather than the sum of all non-blocking fragment
instances (48). With the 36 remaining slots free, the executor can still
admit another fragment from a different query but will potentially have
CPU contention with the one that is currently running.
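To recap the example above in code (an illustration of the two accounting strategies, not the scheduler's implementation):

{code:python}
import math

# Per-backend instance counts from the example: four non-blocking fragments,
# each with 12 instances scheduled on one 48-core executor.
instances_per_fragment = [12, 12, 12, 12]

# LARGEST_FRAGMENT (IMPALA-8998): slots_to_use = max instance count of any fragment.
slots_largest_fragment = max(instances_per_fragment)   # 12

# PLANNER_CPU_ASK: follow the fragment trace behind CpuAsk, i.e. sum the
# instances of the fragments that contribute to it (here, all four).
slots_planner_cpu_ask = sum(instances_per_fragment)    # 48

# Planner-side average reported in the profile, per the formula given later in
# this message: AvgAdmissionSlotsPerExecutor = ceil(CpuAsk / num_executors).
avg_admission_slots = math.ceil(48 / 1)                # 48 on a single executor

print(slots_largest_fragment, slots_planner_cpu_ask, avg_admission_slots)
{code}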

When COMPUTE_PROCESSING_COST is enabled, Planner will generate a CpuAsk
number that represents the cpu requirement of that query over a
particular executor group set. This number is an estimation of the
largest number of query fragment instances that can run in parallel
without waiting, given by the blocking operator analysis. Therefore, the
fragment trace that sums into that CpuAsk number can be translated into
'slots_to_use' as well, which will be a closer resemblance of maximum
parallel execution of fragment instances.

This patch adds a new query option called SLOT_COUNT_STRATEGY to control
which admission control slot accounting to use. There are two possible
values:
- LARGEST_FRAGMENT, which is the original algorithm from IMPALA-8998.
  This is still the default value for the SLOT_COUNT_STRATEGY option.
- PLANNER_CPU_ASK, which will follow the fragment trace that contributes
  towards the CpuAsk number. This strategy will schedule at least as many
  admission control slots as the LARGEST_FRAGMENT strategy.

To do the PLANNER_CPU_ASK strategy, the Planner will mark fragments that
contribute to CpuAsk as dominant fragments. It also passes
max_slot_per_executor information that it knows about the executor group
set to the scheduler.

AvgAdmissionSlotsPerExecutor counter is added to describe what Planner
thinks the average 'slots_to_use' per backend will be, which follows
this formula:

  AvgAdmissionSlotsPerExecutor = ceil(CpuAsk / num_executors)

Actual 'slots_to_use' in each backend may differ from
AvgAdmissionSlotsPerExecutor, depending on what is scheduled on that
backend. 'slots_to_use' will be shown as 'AdmissionSlots' counter under
each executor profile node.

Testing:
- Update test_executors.py with AvgAdmissionSlotsPerExecutor assertion.
- Pass test_tpcds_queries.py::TestTpcdsQueryWithProcessingCost.
- Add EE test test_processing_cost.py.
- Add FE test PlannerTest#testProcessingCostPlanAdmissionSlots.

Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Reviewed-on: http://gerrit.cloudera.org:8080/21257
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Admission control accounting for mt_dop
> ---
>
> Key: IMPALA-8998
> URL: https://issues.apache.org/jira/browse/IMPALA-8998
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> We should account for the degree of parallelism that the query runs with on a 
> backend to avoid overadmitting too many parallel queries. 
> We could probably simply count the effective degree of parallelism (max # 
> instances of a fragment on that backend) toward the number of slots in 
> admission control (although slots are not enabled for the default group yet - 
> see IMPALA-8757).






[jira] [Commented] (IMPALA-12980) Translate CpuAsk into admission control slot to use

2024-04-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838946#comment-17838946
 ] 

ASF subversion and git services commented on IMPALA-12980:
--

Commit 6abfdbc56c3d0ec3ac201dd4b8c2c35656d24eaf in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6abfdbc56 ]

IMPALA-12980: Translate CpuAsk into admission control slots

Impala has a concept of "admission control slots" - the amount of
parallelism that should be allowed on an Impala daemon. This defaults to
the number of processors per executor and can be overridden with the
--admission_control_slots flag.

Admission control slot accounting is described in IMPALA-8998. It
computes 'slots_to_use' for each backend based on the maximum number of
instances of any fragment on that backend. This can lead to slot
underestimation and query overadmission. For example, assume an executor
node with 48 CPU cores and configured with --admission_control_slots=48.
It is assigned 4 non-blocking query fragments, each with 12 instances
scheduled on this executor. The IMPALA-8998 algorithm will request the
max-instance (12) slots rather than the sum of all non-blocking fragment
instances (48). With the 36 remaining slots free, the executor can still
admit another fragment from a different query but will potentially have
CPU contention with the one that is currently running.

When COMPUTE_PROCESSING_COST is enabled, Planner will generate a CpuAsk
number that represents the cpu requirement of that query over a
particular executor group set. This number is an estimation of the
largest number of query fragment instances that can run in parallel
without waiting, given by the blocking operator analysis. Therefore, the
fragment trace that sums into that CpuAsk number can be translated into
'slots_to_use' as well, which will be a closer resemblance of maximum
parallel execution of fragment instances.

This patch adds a new query option called SLOT_COUNT_STRATEGY to control
which admission control slot accounting to use. There are two possible
values:
- LARGEST_FRAGMENT, which is the original algorithm from IMPALA-8998.
  This is still the default value for the SLOT_COUNT_STRATEGY option.
- PLANNER_CPU_ASK, which will follow the fragment trace that contributes
  towards the CpuAsk number. This strategy will schedule at least as many
  admission control slots as the LARGEST_FRAGMENT strategy.

To do the PLANNER_CPU_ASK strategy, the Planner will mark fragments that
contribute to CpuAsk as dominant fragments. It also passes
max_slot_per_executor information that it knows about the executor group
set to the scheduler.

AvgAdmissionSlotsPerExecutor counter is added to describe what Planner
thinks the average 'slots_to_use' per backend will be, which follows
this formula:

  AvgAdmissionSlotsPerExecutor = ceil(CpuAsk / num_executors)

Actual 'slots_to_use' in each backend may differ from
AvgAdmissionSlotsPerExecutor, depending on what is scheduled on that
backend. 'slots_to_use' will be shown as 'AdmissionSlots' counter under
each executor profile node.

Testing:
- Update test_executors.py with AvgAdmissionSlotsPerExecutor assertion.
- Pass test_tpcds_queries.py::TestTpcdsQueryWithProcessingCost.
- Add EE test test_processing_cost.py.
- Add FE test PlannerTest#testProcessingCostPlanAdmissionSlots.

Change-Id: I338ca96555bfe8d07afce0320b3688a0861663f2
Reviewed-on: http://gerrit.cloudera.org:8080/21257
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Translate CpuAsk into admission control slot to use
> ---
>
> Key: IMPALA-12980
> URL: https://issues.apache.org/jira/browse/IMPALA-12980
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec, Frontend
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>
> Admission control slot accounting is described in IMPALA-8998. It computes 
> 'slots_to_use' for each backend based on the max number of instances of any 
> fragment on that backend. This is simplistic, because multiple fragments with 
> the same instance count, say 4 non-blocking fragments each with 12 instances, 
> only request the max-instance (12) admission slots rather than their sum (48).
> When COMPUTE_PROCESSING_COST is enabled, the Planner will generate a CpuAsk 
> number that represents the cpu requirement of that query over a particular 
> executor group set. This number is an estimation of the largest number of 
> query fragments that can run in parallel given the blocking operator 
> analysis. Therefore, the fragment trace that sums into that CpuAsk number can 
> be translated into 'slots_to_use' as well, which more closely resembles the 
> maximum parallel execution of fragment instances.