Hi Valencia,

Apologies for the multiple replies. I'm at a conference and hadn't previously had time to look into your problem in detail.
It's OK that the tests you cited previously are being skipped, and I think for your purposes you can ignore those 3 skips. That said, the run shouldn't be failing simply due to skips; if your run is still failing, there's a test failure, error, or bug somewhere else.

More: your tests are being skipped due to confusion about what it means "to run the exhaustive tests" vs. the proliferation of so-called "workloads" bound to the test classes, and the implications that has for "how to run the exhaustive tests". I filed https://issues.cloudera.org/browse/IMPALA-3947 to address this confusion.
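To make the failure mode concrete: as I understand it, the exploration strategy isn't applied globally; it ends up being resolved per "workload" bound to the test class, and a workload with no explicit exhaustive setting falls back to a default of 'core'. Roughly like this (a simplified, hypothetical sketch for illustration only, not the actual impala_test_suite.py code):

    # Hypothetical sketch of a per-workload strategy lookup (illustrative only).
    # Imagine the runner was asked for "exhaustive" for one workload and left
    # everything else to the default.
    PER_WORKLOAD_STRATEGY = {'functional-query': 'exhaustive'}
    DEFAULT_STRATEGY = 'core'  # the fallback the skipped tests are hitting

    class FakeSpillingTestSuite(object):
        # Each test class is bound to a workload name; if that workload has no
        # explicit override above, the class quietly resolves to the default.
        WORKLOAD = 'some-other-workload'

        @classmethod
        def exploration_strategy(cls):
            return PER_WORKLOAD_STRATEGY.get(cls.WORKLOAD, DEFAULT_STRATEGY)

    # Prints 'core' even though "exhaustive" was requested on the command line.
    print(FakeSpillingTestSuite.exploration_strategy())

That is consistent with the "EXPLORATION STRATEGY= core" lines in your log below, and it's the kind of confusion IMPALA-3947 is meant to clean up.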
On Wed, Aug 3, 2016 at 10:21 AM, Michael Brown <[email protected]> wrote:
> Valencia,
>
> Also ensure you've incorporated the fix for https://issues.cloudera.org/browse/IMPALA-3630 ("exhaustive custom cluster tests never run").
>
> Thanks.
>
> On Wed, Aug 3, 2016 at 8:15 AM, Tim Armstrong <[email protected]> wrote:
>>
>> Hi Valencia,
>>
>> I'm not sure exactly what's happening, but I have a couple of ideas:
>>
>> You may be running into some known bugs in pytest that cause custom cluster tests to be skipped: see https://issues.cloudera.org/browse/IMPALA-3614. We don't see that on Impala master, but maybe you added a skip marker to one of the custom cluster tests.
>>
>> You could add a "set -x" command to the top of the script to check the exact arguments run-tests is being invoked with.
>>
>> On Tue, Aug 2, 2016 at 9:25 PM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> To trace the exploration strategy variable at each step leading to the "test_spilling" test, I put in a few print statements for it. I observed that once the process flow reaches impala_test_suite.py, the strategy is changed to 'core'. The logs printed as follows (the relevant lines were marked pink):
>>>
>>> Waiting for HiveServer2 at localhost:11050...
>>> Could not connect to localhost:11050
>>> HiveServer2 service is up at localhost:11050
>>> --> Starting the Sentry Policy Server
>>> in buildall.sh before run-all-tests.sh:::::::Exploration strategy= exhaustive
>>> 1st in run-all-tests.sh :::::::Exploration strategy= exhaustive
>>> Split and assign HBase regions (logging to split-hbase.log)... OK
>>> Starting Impala cluster (logging to start-impala-cluster.log)... OK
>>> Run test run-workload (logging to test-run-workload.log)... OK
>>> Starting CC tests:::::::Exploration strategy= exhaustive
>>> ============================= test session starts ==============================
>>> platform linux2 -- Python 2.7.10 -- py-1.4.30 -- pytest-2.7.2
>>> rootdir: /home/test/ProjectImpala, inifile:
>>> plugins: xdist, random
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> in conftest.py:::::Exploration Strategy = core
>>> in test_vector:::::Exploration Strategy = core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> in conftest.py:::::Exploration Strategy = core
>>> in test_vector.py:::::Exploration Strategy = core
>>> collected 4 items
>>>
>>> custom_cluster/test_spilling.py sss.
>>>
>>> generated xml file: /home/test/ProjectImpala/ImpalaPPC/tests/custom_cluster/results/TEST-impala-custom-cluster.xml
>>> ============== 1 passed, 3 skipped, 1 warnings in 110.42 seconds ===============
>>>
>>> Could you please guide me on this issue?
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
>>> Date: 08/02/2016 10:49 AM
>>> Subject: Fw: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 08/02/2016 10:48 AM -----
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Date: 08/02/2016 09:05 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> I am executing the tests in exhaustive mode, but I still see that the "test_spilling" test is skipped with the message "runs only in exhaustive." Following are the various ways I tried to run the tests:
>>>
>>> 1. ${IMPALA_HOME}/buildall.sh -noclean -testexhaustive
>>> 2. Explicitly set EXPLORATION_STRATEGY in run-all-tests.sh and buildall.sh to exhaustive.
>>>
>>> I think it is getting reset somewhere to some other strategy. Could you please help me correctly set the environment to run the custom cluster tests with the exhaustive exploration strategy?
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Date: 07/25/2016 05:56 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> Thanks for the detailed response.
>>>
>>> Also, the BE "benchmark-test" issue is resolved. It now passes together with the complete BE suite in Release mode.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/23/2016 12:15 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> 2a.
>>> Exhaustive is a superset of core. We run the core tests pre-commit on CentOS 6 + HDFS and the full exhaustive tests post-commit on a wider range of configurations. We don't release Impala unless all exhaustive tests pass on all configurations we test (if there's a valid reason why something doesn't work on a given platform, we skip the test).
>>>
>>> 2b.
>>> Exhaustive is a superset of core, so if exhaustive passes then core should too. The exhaustive build takes much longer than core, so it makes sense to run it less frequently (e.g. we run it nightly for some configurations and weekly for others).
>>>
>>> 2c.
>>> Confusingly, the core/exhaustive data load doesn't map to core/exhaustive tests. We actually use the same data load for all test configurations. See testdata/bin/create-load-data.sh for how the core/exhaustive data load is invoked. E.g. we load the functional data with exhaustive (i.e. all supported file formats) and the larger tpc-h/tpc-ds data sets for only a subset of file formats.
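>>>
>>> (Purely to illustrate what "superset" means here -- this is not the real test framework code, just a sketch -- the exploration strategy only controls how many combinations of the test dimensions get generated:)
>>>
>>> from itertools import product
>>>
>>> FILE_FORMATS = ['text', 'parquet', 'avro', 'seq', 'rc']
>>> COMPRESSION = ['none', 'snappy', 'gzip']
>>>
>>> def test_vectors(strategy):
>>>     # 'exhaustive' keeps the full cross product of dimensions;
>>>     # 'core' keeps only a small representative subset of it.
>>>     all_vectors = list(product(FILE_FORMATS, COMPRESSION))
>>>     if strategy == 'exhaustive':
>>>         return all_vectors
>>>     return [v for v in all_vectors if v in {('text', 'none'), ('parquet', 'snappy')}]
>>>
>>> assert set(test_vectors('core')) <= set(test_vectors('exhaustive'))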
>>>
>>> On Wed, Jul 20, 2016 at 9:39 PM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> Thank you for the insight on the issues.
>>>
>>> 1. BE test issue: benchmark-test hangs
>>> As you suggested, I increased the "batch_size" value to up to 125000000; however, sw.ElapsedTime() does not increase inside the while loop and it again gets caught in an infinite loop. The optimization level seems to cause this behavior. I am still working on this.
>>>
>>> 2. Custom cluster tests: skipping some tests in test_spilling
>>> I found in the logs that the "test_spilling" test was skipped because the exploration strategy was set to "core" on our Impala setup.
>>>
>>> Some questions here:
>>> a. From an Impala release perspective, how significant are these strategies (core, exhaustive, etc.)?
>>> b. Do we have to test with all combinations (core | Release-mode build and exhaustive | Release-mode build)?
>>> c. Does the exploration strategy selection also affect the test data loaded? (Is the data loaded different for each exploration strategy?)
>>>
>>> Please let me know your comments.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/19/2016 09:11 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> With 2, it's a little strange that test_spilling is being skipped - I think that one should be run.
>>>
>>> On Tue, Jul 19, 2016 at 8:39 AM, Tim Armstrong <[email protected]> wrote:
>>>
>>> It looks like the benchmark-test issue is something to do with the granularity of the clock. It can get stuck in an infinite loop if the function call below always takes less than the smallest measurable unit of time (i.e. Start() and Stop() are called in the same time quantum).
>>>
>>> while (sw.ElapsedTime() < target_cycles) {
>>>   sw.Start();
>>>   function(batch_size, args);
>>>   sw.Stop();
>>>   iters += batch_size;
>>> }
>>>
>>> We use Intel's rdtsc instruction for a timer here, so I guess whatever PPC alternative you used may work a little differently. This is probably ok, but it's possible that it could affect timers elsewhere in Impala.
>>>
>>> One solution would be to increase the default batch size.
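>>>
>>> As a rough illustration of that fix (a small Python model of the same loop, not the actual benchmark.cc code): if a whole batch finishes within a single tick of the clock, the measured time never advances, so grow the batch until it does:
>>>
>>> import time
>>>
>>> def run_benchmark(function, batch_size, target_seconds=0.1):
>>>     # Same shape as the loop above: accumulate measured time until the
>>>     # target is reached. If one whole batch runs faster than the clock's
>>>     # resolution, 'elapsed' never grows and the loop never terminates.
>>>     elapsed, iters = 0.0, 0
>>>     while elapsed < target_seconds:
>>>         start = time.monotonic()
>>>         for _ in range(batch_size):
>>>             function()
>>>         delta = time.monotonic() - start
>>>         elapsed += delta
>>>         iters += batch_size
>>>         if delta == 0.0:
>>>             # Nothing measurable happened: double the batch instead of
>>>             # spinning forever (the "increase the batch size" idea).
>>>             batch_size *= 2
>>>     return iters, batch_size
>>>
>>> print(run_benchmark(lambda: None, batch_size=1000))
>>>
>>> The equivalent guard, or simply a larger default batch_size, in the C++ loop would stop ElapsedTime() from staying at zero.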
>>>
>>> On Tue, Jul 19, 2016 at 5:29 AM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> Following are some observations:
>>>
>>> 1. BE test issue: benchmark-test hangs
>>> Putting trace logs like below in benchmark.cc:
>>>
>>> while (sw.ElapsedTime() < target_cycles) {
>>>   LOG(INFO) << " in while(sw.ElapsedTime() < target_cycles)";
>>>   sw.Start();
>>>   function(batch_size, args);
>>>   sw.Stop();
>>>   iters += batch_size;
>>>   LOG(INFO) << " In while:::::::: sw.ElapsedTime() " << sw.ElapsedTime();
>>>   LOG(INFO) << " In while:::::::: iters = " << iters;
>>> }
>>>
>>> In Release mode, I observed that sw.ElapsedTime() remains constant and does not increase; therefore, it is caught in an infinite loop and the benchmark-test hangs. In Debug mode, sw.ElapsedTime() keeps increasing, so it comes out of the while loop and benchmark-test doesn't hang.
>>>
>>> I'm working on this issue; however, if you could give any pointers about it, that would be really great.
>>>
>>> 2. Custom cluster tests: I have included the code changes in my branch and many of the earlier 36 skipped tests have now executed and they pass, but with the following exception (when compared to the output in https://issues.cloudera.org/browse/IMPALA-3614):
>>> custom_cluster/test_spilling.py sss.
>>>
>>> Current CC test stats: 34 passed, 7 skipped, 3 warnings.
>>>
>>> 3. End-to-End tests: I couldn't dive into the EE tests. I will surely let you know more about them as soon as I'm done with them.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/19/2016 10:26 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> Thank you for the information.
>>>
>>> I am working on the pointers you have given and also on the fix for the custom cluster (skipped) tests. I will inform you of the findings.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: [email protected]
>>> Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
>>> Date: 07/18/2016 09:19 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Valencia,
>>>
>>> 1. We run tests in release mode nightly and it doesn't look like we've seen this hang. I'd suggest you attach a debugger to the benchmark-test process and see what it's doing. It could either be an actual hang, or an infinite/very long loop. That test is only testing our benchmarking utilities, not Impala itself, but IMO it's always good to understand why something like that is happening in case there's a more general problem.
>>> 2. Sounds like https://issues.cloudera.org/browse/IMPALA-3614. Have you got the fix for that in your branch?
>>> 3. Look forward to hearing more.
>>>
>>> Cheers,
>>> Tim
>>>
>>> On Mon, Jul 18, 2016 at 2:49 AM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi All,
>>>
>>> I have built Impala in Release mode. I executed the tests; following are some observations:
>>>
>>> 1. BE test: The test execution hangs at "benchmark-test". There are no errors shown and it hangs at this test. Earlier, when running the BE tests in Debug mode, this issue did not occur.
>>> 2. Custom Cluster test: 5 tests passed and 36 tests skipped. All of the skipped cases give the message: "INSERT not implemented for S3".
>>> 3. EE tests: I've also seen some failures here (yet to check the details).
>>>
>>> As for the FE and JDBC tests, everything works fine; the Release-mode test output is the same as the Debug-mode test output.
>>>
>>> Is the "benchmark-test" test known to fail in Release mode, or am I missing out on any configuration? Also, I want to understand the significance of this test, in case we could ignore it and move ahead.
>>>
>>> Regards,
>>> Valencia
