[
https://issues.apache.org/jira/browse/IMPALA-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892996#comment-17892996
]
ASF subversion and git services commented on IMPALA-13469:
----------------------------------------------------------
Commit b07bd6ddeb6b1970f65b85ff927726ba3003f8aa in impala's branch
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b07bd6dde ]
IMPALA-13469: Deflake test_query_cpu_count_on_insert
A new test case from IMPALA-13445 reveals a pre-existing bug where
cost-based planning may raise expectedNumInputInstance above
inputFragment.getNumInstances(), which leads to a precondition violation.
All of the following conditions held when the precondition was hit:
1. The environment is either Erasure Coded HDFS or Ozone.
2. The source table has neither stats nor a numRows table property.
3. There is only one fragment consisting of a ScanNode in the plan tree
before the addition of the DML fragment.
4. Byte-based cardinality estimation logic kicks in.
5. Byte-based cardinality causes high scan cost, which leads to
maxScanThread exceeding inputFragment.getPlanRoot().
6. expectedNumInputInstance is assigned equal to maxScanThread.
7. Precondition expectedNumInputInstance < inputFragment.getPlanRoot()
is violated.
This scenario triggers a special condition that attempts to lower
expectedNumInputInstance. But instead of lowering
expectedNumInputInstance, the special logic increases it due to higher
byte-based cardinality estimation.
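The failure mode described above can be sketched roughly as follows. This is a
simplified illustration, not the Impala planner code: the function names and
the use of min() as a clamp are assumptions made for the example, and the
check is modeled as "must not exceed the instance count" based on the summary
at the top of this message. Only the names expectedNumInputInstance and
maxScanThread come from the commit message.

```python
def expected_instances_unclamped(num_instances: int, max_scan_thread: int) -> int:
    """Hypothetical model of the buggy path.

    The scan-thread estimate is taken as-is, so an inflated byte-based
    cardinality estimate can push it past the number of instances the
    input fragment actually has.
    """
    return max_scan_thread


def expected_instances_clamped(num_instances: int, max_scan_thread: int) -> int:
    """Hypothetical fix: never expect more instances than the fragment has."""
    return max(1, min(num_instances, max_scan_thread))


def check_precondition(expected: int, num_instances: int) -> None:
    """Stand-in for the planner precondition (assumed to be 'not greater than')."""
    if expected > num_instances:
        raise RuntimeError("IllegalStateException: null")


# Example: the input fragment has 3 instances, but the byte-based estimate
# suggests 12 scan threads.
num_instances, max_scan_thread = 3, 12
safe = expected_instances_clamped(num_instances, max_scan_thread)
check_precondition(safe, num_instances)  # passes with the clamped value
```

With the unclamped value (12 in this example), check_precondition raises,
which corresponds to the IllegalStateException seen in the test failure.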
There is also a new bug where DistributedPlanner.java mistakenly passes
root.getInputCardinality() instead of root.getCardinality().
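One way to see why the two accessors are not interchangeable is the toy model
below. It assumes the usual distinction that input cardinality is the number
of rows a node reads and cardinality is the number of rows it produces; the
class, its fields, and the numbers are hypothetical and are not taken from
the Impala source.

```python
from dataclasses import dataclass


@dataclass
class ToyScanNode:
    """Hypothetical stand-in for a plan node; not an Impala class."""
    rows_scanned: int      # rows read from storage before filtering
    selectivity_pct: int   # assumed percentage of rows that pass the predicates

    def get_input_cardinality(self) -> int:
        # Rows consumed by the node.
        return self.rows_scanned

    def get_cardinality(self) -> int:
        # Rows produced by the node after predicates are applied.
        return self.rows_scanned * self.selectivity_pct // 100


root = ToyScanNode(rows_scanned=1_000_000, selectivity_pct=1)
input_rows = root.get_input_cardinality()
output_rows = root.get_cardinality()
```

In this toy case the two values differ by a factor of 100, so any costing
that consumes one in place of the other would be skewed accordingly.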
This patch fixes both issues and does minor refactoring to rename
variables to camel case. It also relaxes validation of the last test
case of test_query_cpu_count_on_insert so that it passes in Erasure Coded
HDFS and Ozone setups.
Testing:
- Make several assertions in test_executor_groups.py more verbose.
- Pass test_executor_groups.py in Erasure Coded HDFS and Ozone setup.
- Added new Planner tests with unknown cardinality estimation.
- Pass core tests in regular setup.
Change-Id: I834eb6bf896752521e733cd6b77a03f746e6a447
Reviewed-on: http://gerrit.cloudera.org:8080/21966
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> test_query_cpu_count_on_insert seems to be flaky
> ------------------------------------------------
>
> Key: IMPALA-13469
> URL: https://issues.apache.org/jira/browse/IMPALA-13469
> Project: IMPALA
> Issue Type: Bug
> Reporter: Fang-Yu Rao
> Assignee: Riza Suminto
> Priority: Major
> Labels: broken-build
> Fix For: Impala 4.5.0
>
>
> We found that the test test_query_cpu_count_on_insert(), which was recently
> added in IMPALA-13445, seems to be flaky and can fail with the following error.
> +*Error Message*+
> {code:java}
> ImpalaBeeswaxException: Query 554a332e9f9b499a:da216f5900000000 failed:
> IllegalStateException: null
> {code}
> +*Stacktrace*+
> {code:java}
> custom_cluster/test_executor_groups.py:1375: in test_query_cpu_count_on_insert
> "Verdict: Match", "CpuAsk: 9", "CpuAskBounded: 9", "|
> partitions=unavailable"])
> custom_cluster/test_executor_groups.py:946: in _run_query_and_verify_profile
> result = self.execute_query_expect_success(self.client, query)
> common/impala_test_suite.py:891: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:901: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:1045: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:216: in execute
> fetch_profile_after_close=fetch_profile_after_close)
> beeswax/impala_beeswax.py:190: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:381: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:375: in execute_query_async
> handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:553: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E ImpalaBeeswaxException: Query 554a332e9f9b499a:da216f5900000000 failed:
> E IllegalStateException: null
> {code}
>
> The stack trace from the coordinator is as follows.
> {code}
> I1021 09:42:04.707075 18064 jni-util.cc:321] 554a332e9f9b499a:da216f5900000000] java.lang.IllegalStateException
> at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
> at org.apache.impala.planner.DistributedPlanner.createDmlFragment(DistributedPlanner.java:308)
> at org.apache.impala.planner.Planner.createPlanFragments(Planner.java:173)
> at org.apache.impala.planner.Planner.createPlans(Planner.java:310)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1969)
> at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:2968)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2730)
> at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2269)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2030)
> at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:175)
> I1021 09:42:04.707089 18064 status.cc:129] 554a332e9f9b499a:da216f5900000000] IllegalStateException: null
> @ 0x10c3dc7 impala::Status::Status()
> @ 0x19b3668 impala::JniUtil::GetJniExceptionMsg()
> @ 0x16b39ee impala::JniCall::Call<>()
> @ 0x1684d0f impala::Frontend::GetExecRequest()
> @ 0x23acec3 impala::QueryDriver::DoFrontendPlanning()
> @ 0x23ad0b3 impala::QueryDriver::RunFrontendPlanner()
> @ 0x17124cb impala::ImpalaServer::ExecuteInternal()
> @ 0x17131ba impala::ImpalaServer::Execute()
> @ 0x1885fd1 impala::ImpalaServer::query()
> @ 0x175c4bc beeswax::BeeswaxServiceProcessorT<>::process_query()
> @ 0x17e0545 beeswax::BeeswaxServiceProcessorT<>::dispatchCall()
> @ 0x17e0aea impala::ImpalaServiceProcessorT<>::dispatchCall()
> @ 0xf6c5d3 apache::thrift::TDispatchProcessor::process()
> @ 0x13ea8b6 apache::thrift::server::TAcceptQueueServer::Task::run()
> @ 0x13d727b impala::ThriftThread::RunRunnable()
> @ 0x13d8eab boost::detail::function::void_function_obj_invoker0<>::invoke()
> @ 0x1a8c2d8 impala::Thread::SuperviseThread()
> @ 0x1a8d0e1 boost::detail::thread_data<>::run()
> @ 0x256bee7 thread_proxy
> @ 0x7f4361e24ea5 start_thread
> @ 0x7f435ed1fb0d __clone
> {code}