Hi Tim/Michael
Thanks for the pointers. The 'set -x' was useful.
Yes, I did use the fix at IMPALA-3614. The new code in test_spilling.py
did have a skip condition in the setup_class method. Removing that skip
condition allowed the test_spilling.py test to execute. Additionally,
after increasing the "read_size" from 262144 to 270000, the test also
passes now. The change made to test_spilling.py is as follows.
Original code:

    @classmethod
    def setup_class(cls):
      super(TestSpillStress, cls).setup_class()
      if cls.exploration_strategy() != 'exhaustive':
        pytest.skip('runs only in exhaustive')
      # Since test_spill_stress below runs TEST_IDS * NUM_ITERATIONS times, but we only
      # need Impala to start once, it's inefficient to use
      # @CustomClusterTestSuite.with_args() to restart Impala every time. Instead, start
      # Impala here, once.
      #
      # Start with 256KB buffers to reduce data size required to force spilling.
      #262144
      cls._start_impala_cluster(['--impalad_args=--"read_size=262144"',
          'catalogd_args="--load_catalog_in_background=false"'])

Modified code:

    @classmethod
    def setup_class(cls):
      super(TestSpillStress, cls).setup_class()
      cls._start_impala_cluster(['--impalad_args=--"read_size=270000"',
          'catalogd_args="--load_catalog_in_background=false"'])
Also, I have been able to run the tests with the 'exhaustive' strategy using the
run-all-tests.sh script (see https://issues.cloudera.org/browse/IMPALA-3630).
I will inform you of the results.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS
Date: 08/03/2016 08:46 PM
Subject: Re: Fw: Issues with tests in Release-mode Impala build
Hi Valencia,
I'm not sure exactly what's happening but I have a couple of ideas:
1. You may be running into some known bugs in pytest that cause custom
cluster tests to be skipped: see
https://issues.cloudera.org/browse/IMPALA-3614 . We don't see that on
Impala master, but maybe you added a skip marker to one of the custom
cluster tests.
2. You could add a "set -x" command to the top of the script to check
the exact arguments run-tests is being invoked with.
On Tue, Aug 2, 2016 at 9:25 PM, Valencia Serrao <[email protected]> wrote:
Hi Tim,
To trace the exploration strategy variable at each step leading to the
"test_spilling" test, I added a few print statements for it. I observed
that once the process flow reaches impala_test_suite.py, the strategy
changes to 'core'. The logs printed as follows (the exploration-strategy
lines were marked in pink):
Waiting for HiveServer2 at localhost:11050...
Could not connect to localhost:11050
HiveServer2 service is up at localhost:11050
--> Starting the Sentry Policy Server
in buildall.sh before run-all-tests.sh:::::::Exploration strategy=
exhaustive
1st in run-all-tests.sh :::::::Exploration strategy= exhaustive
Split and assign HBase regions (logging to split-hbase.log)... OK
Starting Impala cluster (logging to start-impala-cluster.log)... OK
Run test run-workload (logging to test-run-workload.log)... OK
Starting CC tests:::::::Exploration strategy= exhaustive
============================= test session starts
==============================
platform linux2 -- Python 2.7.10 -- py-1.4.30 -- pytest-2.7.2
rootdir: /home/test/ProjectImpala, inifile:
plugins: xdist, random
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
in conftest.py:::::Exploration Strategy = core
in test_vector:::::Exploration Strategy = core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
in conftest.py:::::Exploration Strategy = core
in test_vector.py:::::Exploration Strategy = core
collected 4 items
custom_cluster/test_spilling.py sss.
generated xml
file:
/home/test/ProjectImpala/ImpalaPPC/tests/custom_cluster/results/TEST-impala-custom-cluster.xml
============== 1 passed, 3 skipped, 1 warnings in 110.42 seconds
===============
Could you please guide me on this issue?
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
Date: 08/02/2016 10:49 AM
Subject: Fw: Issues with tests in Release-mode Impala build
----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 08/02/2016 10:48
AM -----
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Date: 08/02/2016 09:05 AM
Subject: Re: Issues with tests in Release-mode Impala build
Hi Tim,
I am executing the tests in exhaustive mode, but I still see that the
"test_spilling" test is skipped with the message "runs only in
exhaustive." Following are the various ways I tried to run the tests:
1. ${IMPALA_HOME}/buildall.sh -noclean -testexhaustive
2. Explicitly setting EXPLORATION_STRATEGY to exhaustive in run-all-tests.sh and
buildall.sh.
I think it is getting reset somewhere to some other strategy. Could you
please help me set the environment correctly so the custom cluster tests run with the
exhaustive exploration strategy?
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS,
Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 07/25/2016 05:56 PM
Subject: Re: Issues with tests in Release-mode Impala build
Hi Tim,
Thanks for the detailed response.
Also, the BE "benchmark-test" issue is resolved. It now passes together
with the complete BE suite in Release mode.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS,
Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/23/2016 12:15 AM
Subject: Re: Issues with tests in Release-mode Impala build
2a.
Exhaustive is a superset of core. We run the core tests pre-commit on
CentOS 6 + HDFS and the full exhaustive tests post-commit on a wider
range of configurations. We don't release Impala unless all exhaustive
tests pass on all configurations we test (if there's a valid reason why
something doesn't work on a given platform, we skip the test).
2b.
Exhaustive is a superset of core, so if exhaustive passes then core
should too. The exhaustive build takes much longer than core, so it makes
sense to run it less frequently (e.g. we run it nightly for some
configurations and weekly for others).
2c.
Confusingly, the core/exhaustive data load doesn't map to core/exhaustive
tests. We actually use the same data load for all test configurations.
See testdata/bin/create-load-data.sh for how the core/exhaustive data
load is invoked. E.g. we load the functional data with exhaustive (i.e.
all supported file formats) and the larger TPC-H/TPC-DS data sets for
only a subset of file formats.
On Wed, Jul 20, 2016 at 9:39 PM, Valencia Serrao <[email protected]>
wrote:
Hi Tim,
Thank you for the insight on the issues.
1. BE test issue: benchmark-test hangs
As you suggested, I increased the "batch_size" value, up to
125000000; however, sw.ElapsedTime() does not increase inside
the while loop, so it again gets caught in an infinite loop. The
optimization level seems to cause this behavior. I am still working
on this.
2. Custom cluster tests: skipping some tests in test_spilling
I found in the logs that the "test_spilling" test was skipped because
the exploration strategy was set to "core" on our Impala setup.
Some questions here:
a. From an Impala release perspective, how significant are these
strategies (core, exhaustive, etc.)?
b. Do we have to test with all combinations (core with a release-mode
build and exhaustive with a release-mode build)?
c. Does the exploration strategy selection also affect the test
data loaded? (Is the data loaded different for each exploration
strategy?)
Please let me know your comments.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/19/2016 09:11 PM
Subject: Re: Issues with tests in Release-mode Impala build
With 2, it's a little strange that test_spilling is being skipped -
I think that one should be run.
On Tue, Jul 19, 2016 at 8:39 AM, Tim Armstrong <
[email protected]> wrote:
It looks like the benchmark-test issue is something to
do with the granularity of the clock. It can get stuck
in an infinite loop if the function call below always
takes less than the smallest measurable unit of time
(i.e. Start() and Stop() are called in the same time
quantum).
while (sw.ElapsedTime() < target_cycles) {
  sw.Start();
  function(batch_size, args);
  sw.Stop();
  iters += batch_size;
}
We use Intel's rdtsc instruction for a timer here, so I
guess whatever PPC alternative you used may work a
little differently. This is probably ok, but it's
possible that it could affect timers elsewhere in
Impala.
One solution would be to increase the default batch
size.
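For illustration only, here is a minimal standalone C++ sketch of the effect (this is not Impala's actual StopWatch or benchmark code; CoarseNowNs, DoWork and MeasureOnce are made-up names, and the 1 ms quantum is an exaggerated assumption). It shows how a coarse clock makes a single cheap call measure as zero time while a larger batch measures as nonzero, which is why increasing the batch size lets the loop make progress:

#include <chrono>
#include <cstdint>
#include <iostream>

// Hypothetical coarse clock: readings are rounded down to a 1 ms quantum,
// mimicking a cycle counter whose resolution is worse than the call under test.
static uint64_t CoarseNowNs() {
  auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
      std::chrono::steady_clock::now().time_since_epoch()).count();
  const uint64_t quantum = 1000000;  // 1 ms expressed in ns
  return (static_cast<uint64_t>(ns) / quantum) * quantum;
}

// Stand-in for the benchmarked function; batch_size controls how much work
// happens between the two clock readings.
static void DoWork(long batch_size) {
  volatile long sink = 0;
  for (long i = 0; i < batch_size; ++i) sink = sink + i;
}

// One Start()/Stop()-style measurement, the way the benchmark loop takes it.
static uint64_t MeasureOnce(long batch_size) {
  uint64_t start = CoarseNowNs();
  DoWork(batch_size);
  return CoarseNowNs() - start;
}

int main() {
  // A tiny batch usually finishes inside one quantum, so the measured time is
  // 0 and a loop waiting for the accumulated total to reach a target would
  // never make progress. A large batch spans several quanta, so every
  // iteration adds nonzero elapsed time.
  std::cout << "batch_size=1        measured ns = " << MeasureOnce(1) << "\n";
  std::cout << "batch_size=50000000 measured ns = " << MeasureOnce(50000000) << "\n";
  return 0;
}

Built with an optimizing compile (e.g. g++ -O2), the first line typically prints 0 while the second prints a multiple of the assumed 1 ms quantum.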
On Tue, Jul 19, 2016 at 5:29 AM, Valencia Serrao <
[email protected]> wrote:
Hi Tim,
Following are some observations:
1. BE test issue: benchmark-test hangs
I put trace logs like the ones below in benchmark.cc:

while (sw.ElapsedTime() < target_cycles) {
  LOG(INFO) << " in while (sw.ElapsedTime() < target_cycles)";
  sw.Start();
  function(batch_size, args);
  sw.Stop();
  iters += batch_size;
  LOG(INFO) << " In while:::::::: sw.ElapsedTime() " << sw.ElapsedTime();
  LOG(INFO) << " In while:::::::: iters = " << iters;
}
In Release mode, I observed that sw.ElapsedTime() remains constant and
does not increase; therefore, the code is caught in an infinite loop and
benchmark-test hangs. In Debug mode, sw.ElapsedTime() keeps increasing,
so the code exits the while loop and benchmark-test doesn't hang.
I'm working on this issue; however, if you could give any pointers
about it, that would be really great.
2. Custom cluster tests: I have included the code
changes in my branch, and many of the earlier 36 skipped
tests have now executed and pass, but with the
following exception (when compared to the output at
https://issues.cloudera.org/browse/IMPALA-3614 ):
custom_cluster/test_spilling.py sss.
Current CC test stats: 34 passed, 7 skipped, 3
warnings.
3. End-to-End tests: I haven't been able to dive into
the EE tests yet. I will surely let you know more about
them as soon as I'm done with them.
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/19/2016 10:26 AM
Subject: Re: Issues with tests in Release-mode Impala
build
Hi Tim,
Thank you for the information.
I am working on the pointers you have given and also on
the fix for Custom cluster (skipped) tests. I will
inform you on the findings.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: [email protected]
Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Manish
Patil/Austin/Contr/IBM@IBMUS
Date: 07/18/2016 09:19 PM
Subject: Re: Issues with tests in Release-mode Impala
build
Hi Valencia,
1. We run tests in release mode nightly and it doesn't
look like we've seen this hang. I'd suggest you attach
a debugger to the benchmark-test process and see what
it's doing. It could either be an actual hang, or an
infinite/very long loop. That test is only testing our
benchmarking utilities, not Impala itself, but IMO it's
always good to understand why something like that is
happening in case there's a more general problem.
2. Sounds like
https://issues.cloudera.org/browse/IMPALA-3614 . Have
you got the fix for that in your branch?
3. Look forward to hearing more.
Cheers,
Tim
On Mon, Jul 18, 2016 at 2:49 AM, Valencia Serrao <
[email protected]> wrote:
Hi All,
I have built Impala in Release mode. I executed the tests;
following are some observations:
1. BE test: The test execution hangs at the "benchmark-test".
There are no errors shown, and it hangs at this test. This issue
did not occur earlier when running the BE tests in debug mode.
2. Custom Cluster test: 5 tests passed and 36 tests skipped.
All of the skipped cases give the message:
"INSERT not implemented for S3"
3. EE tests: I've also seen some failures here (yet to
check the details)
As for the FE and JDBC tests, everything works fine; the release-mode
test output is the same as the debug-mode test output.
Is the "benchmark-test" test known to fail in Release mode,
or am I missing some configuration? Also, I want to understand
the significance of this test, in case we could ignore it and
move ahead.
Regards,
Valencia