[ 
https://issues.apache.org/jira/browse/IMPALA-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892996#comment-17892996
 ] 

ASF subversion and git services commented on IMPALA-13469:
----------------------------------------------------------

Commit b07bd6ddeb6b1970f65b85ff927726ba3003f8aa in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b07bd6dde ]

IMPALA-13469: Deflake test_query_cpu_count_on_insert

A new test case from IMPALA-13445 reveals a pre-existing bug where
cost-based planning may increase expectedNumInputInstance beyond
inputFragment.getNumInstances(), which leads to a precondition violation.
All of the following held when the precondition was hit:

1. The environment is either Erasure Coded HDFS or Ozone.
2. The source table does not have stats nor numRows table property.
3. There is only one fragment consisting of a ScanNode in the plan tree
   before the addition of DML fragment.
4. Byte-based cardinality estimation logic kicks in.
5. Byte-based cardinality causes high scan cost, which leads to
   maxScanThread exceeding inputFragment.getNumInstances().
6. expectedNumInputInstance is assigned equal to maxScanThread.
7. Precondition expectedNumInputInstance < inputFragment.getNumInstances()
   is violated.
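
The steps above can be modeled with a small Python sketch. This is not
Impala's planner code: the function names, the cost-per-thread divisor, and
the numbers are hypothetical and only mirror the described relation between
scan cost, maxScanThread, and the input fragment's instance count.

```python
import math

def expected_num_input_instance(scan_cost, cost_per_thread):
    """Hypothetical model of steps 5-6: derive a scan thread count from the
    scan cost and use it directly as the expected input instance count."""
    max_scan_thread = max(1, math.ceil(scan_cost / cost_per_thread))
    # Behavior described above: the value is taken as-is, so nothing keeps
    # it from exceeding the instance count of the input fragment.
    return max_scan_thread

def check_precondition(expected, num_instances):
    # Stand-in for the planner's precondition: fail when the expectation is
    # greater than the number of instances the input fragment has.
    if expected > num_instances:
        raise RuntimeError("IllegalStateException: null")

num_instances = 3
# With a modest scan cost, the expectation stays within bounds.
check_precondition(expected_num_input_instance(2_000, 1_000), num_instances)
# A byte-based estimate inflates the scan cost and the derived thread count.
inflated = expected_num_input_instance(9_000, 1_000)
```

Calling check_precondition(inflated, num_instances) raises in this model,
because the inflated value is larger than the three available instances.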

This scenario triggers a special condition that attempts to lower
expectedNumInputInstance. But instead of lowering
expectedNumInputInstance, the special logic increases it due to higher
byte-based cardinality estimation.
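
One way to make such an adjustment safe is to clamp it so that it can only
lower the value. The sketch below is an assumption about the general shape
of such a guard, not the patch itself; adjust_expected_instances and its
parameters are hypothetical names.

```python
def adjust_expected_instances(current, cost_based_estimate, num_instances):
    """Return an adjusted instance count that never exceeds the current
    expectation or the fragment's real instance count."""
    # min() guarantees the adjustment can only lower the value, even when a
    # byte-based cardinality estimate makes cost_based_estimate very large.
    # max(1, ...) keeps at least one instance.
    return max(1, min(current, cost_based_estimate, num_instances))

# A large estimate no longer raises the expectation above the fragment size.
clamped = adjust_expected_instances(3, 9, 3)
# A small estimate still lowers it, which is the intended effect.
lowered = adjust_expected_instances(3, 2, 3)
```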

There is also a new bug where DistributedPlanner.java mistakenly passes
root.getInputCardinality() instead of root.getCardinality().
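
To illustrate why the two values are not interchangeable, here is a
minimal sketch. It assumes, based on the method names only, that the input
cardinality counts rows read by the scans while the cardinality counts
rows the node outputs. PlanNodeModel and rows_for_dml_sizing are
hypothetical and are not Impala classes or methods.

```python
from dataclasses import dataclass

@dataclass
class PlanNodeModel:
    """Hypothetical stand-in for a plan root; not Impala's PlanNode."""
    input_cardinality: int  # assumed: rows read before filtering
    cardinality: int        # assumed: rows the node is estimated to output

def rows_for_dml_sizing(root):
    # Assumed intent: size the DML fragment by what the root emits, not by
    # what the scans underneath it read.
    return root.cardinality

# A selective query reads many rows but emits few of them.
root = PlanNodeModel(input_cardinality=1_000_000, cardinality=10)
rows = rows_for_dml_sizing(root)
```

Under these assumptions, passing input_cardinality instead would
overstate the row count by five orders of magnitude in this example.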

This patch fixes both issues and does minor refactoring to rename
variables to camel case. It also relaxes validation of the last test
case of test_query_cpu_count_on_insert so that it passes in Erasure
Coded HDFS and Ozone setups.
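
A relaxed check of this kind might look like the following sketch. The
helper name, the relaxed flag, and the sample profile text are
hypothetical; only the expected strings are taken from the failing
assertion in the issue report.

```python
def missing_from_profile(profile, always_expected, strict_expected, relaxed):
    """Return the expected substrings that are absent from a query profile.

    always_expected is enforced in every environment; strict_expected is
    only enforced when relaxed is False.
    """
    expected = list(always_expected)
    if not relaxed:
        expected += strict_expected
    return [s for s in expected if s not in profile]

# Hypothetical profile from an environment where the CPU ask differs.
profile = "Verdict: Match\nCpuAsk: 12\nCpuAskBounded: 12"
always = ["Verdict: Match"]
strict = ["CpuAsk: 9", "CpuAskBounded: 9"]

relaxed_missing = missing_from_profile(profile, always, strict, relaxed=True)
strict_missing = missing_from_profile(profile, always, strict, relaxed=False)
```

In this sketch the relaxed run reports nothing missing, while the strict
run reports both CPU ask strings.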

Testing:
- Made several assertions in test_executor_groups.py more verbose.
- Passed test_executor_groups.py in Erasure Coded HDFS and Ozone setups.
- Added new planner tests with unknown cardinality estimation.
- Passed core tests in regular setup.

Change-Id: I834eb6bf896752521e733cd6b77a03f746e6a447
Reviewed-on: http://gerrit.cloudera.org:8080/21966
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> test_query_cpu_count_on_insert seems to be flaky
> ------------------------------------------------
>
>                 Key: IMPALA-13469
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13469
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Fang-Yu Rao
>            Assignee: Riza Suminto
>            Priority: Major
>              Labels: broken-build
>             Fix For: Impala 4.5.0
>
>
> We found that the test test_query_cpu_count_on_insert(), which was recently
> added in IMPALA-13445, seems to be flaky and could fail with the following error.
> +*Error Message*+
> {code:java}
> ImpalaBeeswaxException: Query 554a332e9f9b499a:da216f5900000000 failed: 
> IllegalStateException: null
> {code}
> +*Stacktrace*+
> {code:java}
> custom_cluster/test_executor_groups.py:1375: in test_query_cpu_count_on_insert
>     "Verdict: Match", "CpuAsk: 9", "CpuAskBounded: 9", "|  partitions=unavailable"])
> custom_cluster/test_executor_groups.py:946: in _run_query_and_verify_profile
>     result = self.execute_query_expect_success(self.client, query)
> common/impala_test_suite.py:891: in wrapper
>     return function(*args, **kwargs)
> common/impala_test_suite.py:901: in execute_query_expect_success
>     result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:1045: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:216: in execute
>     fetch_profile_after_close=fetch_profile_after_close)
> beeswax/impala_beeswax.py:190: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:381: in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:375: in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:553: in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: Query 554a332e9f9b499a:da216f5900000000 failed:
> E   IllegalStateException: null
> {code}
>  
> The stack trace from the coordinator is as follows.
> {code}
> I1021 09:42:04.707075 18064 jni-util.cc:321] 554a332e9f9b499a:da216f5900000000] java.lang.IllegalStateException
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
>         at org.apache.impala.planner.DistributedPlanner.createDmlFragment(DistributedPlanner.java:308)
>         at org.apache.impala.planner.Planner.createPlanFragments(Planner.java:173)
>         at org.apache.impala.planner.Planner.createPlans(Planner.java:310)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1969)
>         at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:2968)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2730)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2269)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2030)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:175)
> I1021 09:42:04.707089 18064 status.cc:129] 554a332e9f9b499a:da216f5900000000] IllegalStateException: null
>     @          0x10c3dc7  impala::Status::Status()
>     @          0x19b3668  impala::JniUtil::GetJniExceptionMsg()
>     @          0x16b39ee  impala::JniCall::Call<>()
>     @          0x1684d0f  impala::Frontend::GetExecRequest()
>     @          0x23acec3  impala::QueryDriver::DoFrontendPlanning()
>     @          0x23ad0b3  impala::QueryDriver::RunFrontendPlanner()
>     @          0x17124cb  impala::ImpalaServer::ExecuteInternal()
>     @          0x17131ba  impala::ImpalaServer::Execute()
>     @          0x1885fd1  impala::ImpalaServer::query()
>     @          0x175c4bc  beeswax::BeeswaxServiceProcessorT<>::process_query()
>     @          0x17e0545  beeswax::BeeswaxServiceProcessorT<>::dispatchCall()
>     @          0x17e0aea  impala::ImpalaServiceProcessorT<>::dispatchCall()
>     @           0xf6c5d3  apache::thrift::TDispatchProcessor::process()
>     @          0x13ea8b6  apache::thrift::server::TAcceptQueueServer::Task::run()
>     @          0x13d727b  impala::ThriftThread::RunRunnable()
>     @          0x13d8eab  boost::detail::function::void_function_obj_invoker0<>::invoke()
>     @          0x1a8c2d8  impala::Thread::SuperviseThread()
>     @          0x1a8d0e1  boost::detail::thread_data<>::run()
>     @          0x256bee7  thread_proxy
>     @     0x7f4361e24ea5  start_thread
>     @     0x7f435ed1fb0d  __clone
> {code}


