Hi Valencia,

Apologies for the multiple replies. I'm at a conference and hadn't previously had time to look into your problem in detail.
It's OK that the tests you cited previously are being skipped, and I think for your purposes you can ignore those 3 skips. That said, the run shouldn't be failing simply due to skips; if your run is still failing, there's a test failure, error, or bug somewhere else.

More: your tests are being skipped due to confusion about what it means "to run the exhaustive tests" vs. the proliferation of so-called "workloads" bound to the test classes, and the implications that has for "how to run the exhaustive tests". I filed https://issues.cloudera.org/browse/IMPALA-3947 to address this confusion.
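To make the failure mode concrete: as I understand it, the exploration strategy isn't applied globally; it ends up being resolved per "workload" bound to the test class, and a workload with no explicit exhaustive setting falls back to a default of 'core'. Roughly like this (a simplified, hypothetical sketch for illustration only, not the actual impala_test_suite.py code):

    # Hypothetical sketch of a per-workload strategy lookup (illustrative only).
    # Imagine the runner was asked for "exhaustive" for one workload and left
    # everything else to the default.
    PER_WORKLOAD_STRATEGY = {'functional-query': 'exhaustive'}
    DEFAULT_STRATEGY = 'core'  # the fallback the skipped tests are hitting

    class FakeSpillingTestSuite(object):
        # Each test class is bound to a workload name; if that workload has no
        # explicit override above, the class quietly resolves to the default.
        WORKLOAD = 'some-other-workload'

        @classmethod
        def exploration_strategy(cls):
            return PER_WORKLOAD_STRATEGY.get(cls.WORKLOAD, DEFAULT_STRATEGY)

    # Prints 'core' even though "exhaustive" was requested on the command line.
    print(FakeSpillingTestSuite.exploration_strategy())

That is consistent with the "EXPLORATION STRATEGY= core" lines in your log below, and it's the kind of confusion IMPALA-3947 is meant to clean up.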
On Wed, Aug 3, 2016 at 10:21 AM, Michael Brown <[email protected]> wrote:
> Valencia,
>
> Also ensure you've incorporated the fix for https://issues.cloudera.org/browse/IMPALA-3630 ("exhaustive custom cluster tests never run").
>
> Thanks.
>
> On Wed, Aug 3, 2016 at 8:15 AM, Tim Armstrong <[email protected]> wrote:
>>
>> Hi Valencia,
>>
>> I'm not sure exactly what's happening, but I have a couple of ideas:
>>
>> You may be running into some known bugs in pytest that cause custom cluster tests to be skipped: see https://issues.cloudera.org/browse/IMPALA-3614. We don't see that on Impala master, but maybe you added a skip marker to one of the custom cluster tests.
>>
>> You could add a "set -x" command to the top of the script to check the exact arguments run-tests is being invoked with.
>>
>> On Tue, Aug 2, 2016 at 9:25 PM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> To trace the exploration strategy variable at each step leading to the "test_spilling" test, I put in a few print statements for it. I observed that once the process flow reaches impala_test_suite.py, the strategy is changed to 'core'. The logs printed as follows (the relevant lines were marked pink):
>>>
>>> Waiting for HiveServer2 at localhost:11050...
>>> Could not connect to localhost:11050
>>> HiveServer2 service is up at localhost:11050
>>> --> Starting the Sentry Policy Server
>>> in buildall.sh before run-all-tests.sh:::::::Exploration strategy= exhaustive
>>> 1st in run-all-tests.sh :::::::Exploration strategy= exhaustive
>>> Split and assign HBase regions (logging to split-hbase.log)... OK
>>> Starting Impala cluster (logging to start-impala-cluster.log)... OK
>>> Run test run-workload (logging to test-run-workload.log)... OK
>>> Starting CC tests:::::::Exploration strategy= exhaustive
>>> ============================= test session starts ==============================
>>> platform linux2 -- Python 2.7.10 -- py-1.4.30 -- pytest-2.7.2
>>> rootdir: /home/test/ProjectImpala, inifile:
>>> plugins: xdist, random
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> in conftest.py:::::Exploration Strategy = core
>>> in test_vector:::::Exploration Strategy = core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
>>> in conftest.py:::::Exploration Strategy = core
>>> in test_vector.py:::::Exploration Strategy = core
>>> collected 4 items
>>>
>>> custom_cluster/test_spilling.py sss.
>>>
>>> generated xml file: /home/test/ProjectImpala/ImpalaPPC/tests/custom_cluster/results/TEST-impala-custom-cluster.xml
>>> ============== 1 passed, 3 skipped, 1 warnings in 110.42 seconds ===============
>>>
>>> Could you please guide me on this issue?
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
>>> Date: 08/02/2016 10:49 AM
>>> Subject: Fw: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 08/02/2016 10:48 AM -----
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Date: 08/02/2016 09:05 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> I am executing the tests in exhaustive mode, but I still see that the "test_spilling" test is skipped with the message "runs only in exhaustive." Following are the various ways I tried to run the tests:
>>>
>>> 1. ${IMPALA_HOME}/buildall.sh -noclean -testexhaustive
>>> 2. Explicitly set EXPLORATION_STRATEGY in run-all-tests.sh and buildall.sh to exhaustive.
>>>
>>> I think it is getting reset somewhere to some other strategy. Could you please help me correctly set the environment to run the custom cluster tests with the exhaustive exploration strategy?
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Date: 07/25/2016 05:56 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> Thanks for the detailed response.
>>>
>>> Also, the BE "benchmark-test" issue is resolved. It now passes together with the complete BE suite in Release mode.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/23/2016 12:15 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> 2a.
>>> Exhaustive is a superset of core. We run the core tests pre-commit on CentOS 6 + HDFS and the full exhaustive tests post-commit on a wider range of configurations. We don't release Impala unless all exhaustive tests pass on all configurations we test (if there's a valid reason why something doesn't work on a given platform, we skip the test).
>>>
>>> 2b.
>>> Exhaustive is a superset of core, so if exhaustive passes then core should too. The exhaustive build takes much longer than core, so it makes sense to run it less frequently (e.g. we run it nightly for some configurations and weekly for others).
>>>
>>> 2c.
>>> Confusingly, the core/exhaustive data load doesn't map to core/exhaustive tests. We actually use the same data load for all test configurations. See testdata/bin/create-load-data.sh for how the core/exhaustive data load is invoked. E.g. we load the functional data with exhaustive (i.e. all supported file formats) and the larger tpc-h/tpc-ds data sets for only a subset of file formats.
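>>>
>>> (Purely to illustrate what "superset" means here -- this is not the real test framework code, just a sketch -- the exploration strategy only controls how many combinations of the test dimensions get generated:)
>>>
>>> from itertools import product
>>>
>>> FILE_FORMATS = ['text', 'parquet', 'avro', 'seq', 'rc']
>>> COMPRESSION = ['none', 'snappy', 'gzip']
>>>
>>> def test_vectors(strategy):
>>>     # 'exhaustive' keeps the full cross product of dimensions;
>>>     # 'core' keeps only a small representative subset of it.
>>>     all_vectors = list(product(FILE_FORMATS, COMPRESSION))
>>>     if strategy == 'exhaustive':
>>>         return all_vectors
>>>     return [v for v in all_vectors if v in {('text', 'none'), ('parquet', 'snappy')}]
>>>
>>> assert set(test_vectors('core')) <= set(test_vectors('exhaustive'))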
>>>
>>> On Wed, Jul 20, 2016 at 9:39 PM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> Thank you for the insight on the issues.
>>>
>>> 1. BE test issue: benchmark-test hangs
>>> As you suggested, I increased the "batch_size" value to up to 125000000; however, sw.ElapsedTime() does not increase inside the while loop and it again gets caught in an infinite loop. The optimization level seems to cause this behavior. I am still working on this.
>>>
>>> 2. Custom cluster tests: skipping some tests in test_spilling
>>> I found in the logs that the "test_spilling" test was skipped because the exploration strategy was set to "core" on our Impala setup.
>>>
>>> Some questions here:
>>> a. From an Impala release perspective, how significant are these strategies (core, exhaustive, etc.)?
>>> b. Do we have to test with all combinations (core | Release-mode build and exhaustive | Release-mode build)?
>>> c. Does the exploration strategy selection also affect the test data loaded? (Is the data loaded different for each exploration strategy?)
>>>
>>> Please let me know your comments.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: Valencia Serrao/Austin/Contr/IBM@IBMUS
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/19/2016 09:11 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> With 2, it's a little strange that test_spilling is being skipped - I think that one should be run.
>>>
>>> On Tue, Jul 19, 2016 at 8:39 AM, Tim Armstrong <[email protected]> wrote:
>>>
>>> It looks like the benchmark-test issue is something to do with the granularity of the clock. It can get stuck in an infinite loop if the function call below always takes less than the smallest measurable unit of time (i.e. Start() and Stop() are called in the same time quantum).
>>>
>>> while (sw.ElapsedTime() < target_cycles) {
>>>   sw.Start();
>>>   function(batch_size, args);
>>>   sw.Stop();
>>>   iters += batch_size;
>>> }
>>>
>>> We use Intel's rdtsc instruction for a timer here, so I guess whatever PPC alternative you used may work a little differently. This is probably ok, but it's possible that it could affect timers elsewhere in Impala.
>>>
>>> One solution would be to increase the default batch size.
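>>>
>>> As a rough illustration of that fix (a small Python model of the same loop, not the actual benchmark.cc code): if a whole batch finishes within a single tick of the clock, the measured time never advances, so grow the batch until it does:
>>>
>>> import time
>>>
>>> def run_benchmark(function, batch_size, target_seconds=0.1):
>>>     # Same shape as the loop above: accumulate measured time until the
>>>     # target is reached. If one whole batch runs faster than the clock's
>>>     # resolution, 'elapsed' never grows and the loop never terminates.
>>>     elapsed, iters = 0.0, 0
>>>     while elapsed < target_seconds:
>>>         start = time.monotonic()
>>>         for _ in range(batch_size):
>>>             function()
>>>         delta = time.monotonic() - start
>>>         elapsed += delta
>>>         iters += batch_size
>>>         if delta == 0.0:
>>>             # Nothing measurable happened: double the batch instead of
>>>             # spinning forever (the "increase the batch size" idea).
>>>             batch_size *= 2
>>>     return iters, batch_size
>>>
>>> print(run_benchmark(lambda: None, batch_size=1000))
>>>
>>> The equivalent guard, or simply a larger default batch_size, in the C++ loop would stop ElapsedTime() from staying at zero.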
>>>
>>> On Tue, Jul 19, 2016 at 5:29 AM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi Tim,
>>>
>>> Following are some observations:
>>>
>>> 1. BE test issue: benchmark-test hangs
>>> Putting trace logs like below in benchmark.cc:
>>>
>>> while (sw.ElapsedTime() < target_cycles) {
>>>   LOG(INFO) << " in while(sw.ElapsedTime() < target_cycles)";
>>>   sw.Start();
>>>   function(batch_size, args);
>>>   sw.Stop();
>>>   iters += batch_size;
>>>   LOG(INFO) << " In while:::::::: sw.ElapsedTime() " << sw.ElapsedTime();
>>>   LOG(INFO) << " In while:::::::: iters = " << iters;
>>> }
>>>
>>> In Release mode, I observed that sw.ElapsedTime() remains constant and does not increase; therefore, it is caught in an infinite loop and the benchmark-test hangs. In Debug mode, sw.ElapsedTime() keeps increasing, so it comes out of the while loop and benchmark-test doesn't hang.
>>>
>>> I'm working on this issue; however, if you could give any pointers about it, that would be really great.
>>>
>>> 2. Custom cluster tests: I have included the code changes in my branch and many of the earlier 36 skipped tests have now executed and they pass, but with the following exception (when compared to the output in https://issues.cloudera.org/browse/IMPALA-3614):
>>> custom_cluster/test_spilling.py sss.
>>>
>>> Current CC test stats: 34 passed, 7 skipped, 3 warnings.
>>>
>>> 3. End-to-End tests: I couldn't dive into the EE tests. I will surely let you know more about them as soon as I'm done with them.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Valencia Serrao/Austin/Contr/IBM
>>> To: Tim Armstrong <[email protected]>
>>> Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS
>>> Date: 07/19/2016 10:26 AM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Tim,
>>>
>>> Thank you for the information.
>>>
>>> I am working on the pointers you have given and also on the fix for the custom cluster (skipped) tests. I will inform you of the findings.
>>>
>>> Regards,
>>> Valencia
>>>
>>> From: Tim Armstrong <[email protected]>
>>> To: [email protected]
>>> Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
>>> Date: 07/18/2016 09:19 PM
>>> Subject: Re: Issues with tests in Release-mode Impala build
>>>
>>> ________________________________
>>>
>>> Hi Valencia,
>>>
>>> 1. We run tests in release mode nightly and it doesn't look like we've seen this hang. I'd suggest you attach a debugger to the benchmark-test process and see what it's doing. It could either be an actual hang, or an infinite/very long loop. That test is only testing our benchmarking utilities, not Impala itself, but IMO it's always good to understand why something like that is happening in case there's a more general problem.
>>> 2. Sounds like https://issues.cloudera.org/browse/IMPALA-3614. Have you got the fix for that in your branch?
>>> 3. Look forward to hearing more.
>>>
>>> Cheers,
>>> Tim
>>>
>>> On Mon, Jul 18, 2016 at 2:49 AM, Valencia Serrao <[email protected]> wrote:
>>>
>>> Hi All,
>>>
>>> I have built Impala in Release mode. I executed the tests; following are some observations:
>>>
>>> 1. BE test: The test execution hangs at "benchmark-test". There are no errors shown and it hangs at this test. Earlier, when running the BE tests in Debug mode, this issue did not occur.
>>> 2. Custom Cluster test: 5 tests passed and 36 tests skipped. All of the skipped cases give the message: "INSERT not implemented for S3".
>>> 3. EE tests: I've also seen some failures here (yet to check the details).
>>>
>>> As for the FE and JDBC tests, everything works fine; the Release-mode test output is the same as the Debug-mode test output.
>>>
>>> Is the "benchmark-test" test known to fail in Release mode, or am I missing out on any configuration? Also, I want to understand the significance of this test, in case we could ignore it and move ahead.
>>>
>>> Regards,
>>> Valencia
