Hi Tim/Michael
Thanks for the pointers. The 'set -x' was useful.
Yes, I did use the fix at IMPALA-3614. The new code in test_spilling.py
did have a skip condition in the setup_class method. Removing that skip
condition allowed the test_spilling.py test to execute. Additionally,
after increasing the "read_size" from 262144 to 270000, the test also
passes now. The change made to test_spilling.py is as follows.
Original code:

    @classmethod
    def setup_class(cls):
      super(TestSpillStress, cls).setup_class()
      if cls.exploration_strategy() != 'exhaustive':
        pytest.skip('runs only in exhaustive')
      # Since test_spill_stress below runs TEST_IDS * NUM_ITERATIONS times, but we only
      # need Impala to start once, it's inefficient to use
      # @CustomClusterTestSuite.with_args() to restart Impala every time. Instead, start
      # Impala here, once.
      #
      # Start with 256KB buffers to reduce data size required to force spilling.
      #262144
      cls._start_impala_cluster(['--impalad_args=--"read_size=262144"',
          'catalogd_args="--load_catalog_in_background=false"'])

Modified code:

    @classmethod
    def setup_class(cls):
      super(TestSpillStress, cls).setup_class()
      cls._start_impala_cluster(['--impalad_args=--"read_size=270000"',
          'catalogd_args="--load_catalog_in_background=false"'])
Also, I have been able to run the tests with the 'exhaustive' strategy using the
run-all-tests.sh script (see https://issues.cloudera.org/browse/IMPALA-3630).
I will inform you of the results.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS
Date: 08/03/2016 08:46 PM
Subject: Re: Fw: Issues with tests in Release-mode Impala build
Hi Valencia,
I'm not sure exactly what's happening but I have a couple of ideas:
1. You may be running into some known bugs in pytest that cause custom
cluster tests to be skipped: see
https://issues.cloudera.org/browse/IMPALA-3614 . We don't see that on
Impala master, but maybe you added a skip marker to one of the custom
cluster tests.
2. You could add a "set -x" command to the top of the script to check
the exact arguments run-tests is being invoked with.
On Tue, Aug 2, 2016 at 9:25 PM, Valencia Serrao <[email protected]> wrote:
Hi Tim,
To trace the exploration strategy variable at each step leading to the
"test_spilling" test, I added a few print statements for it. I observed
that once the process flow reaches impala_test_suite.py, the strategy
changes to 'core'. The logs printed as follows (the exploration-strategy
lines were marked in pink):
Waiting for HiveServer2 at localhost:11050...
Could not connect to localhost:11050
HiveServer2 service is up at localhost:11050
--> Starting the Sentry Policy Server
in buildall.sh before run-all-tests.sh:::::::Exploration strategy=
exhaustive
1st in run-all-tests.sh :::::::Exploration strategy= exhaustive
Split and assign HBase regions (logging to split-hbase.log)... OK
Starting Impala cluster (logging to start-impala-cluster.log)... OK
Run test run-workload (logging to test-run-workload.log)... OK
Starting CC tests:::::::Exploration strategy= exhaustive
============================= test session starts
==============================
platform linux2 -- Python 2.7.10 -- py-1.4.30 -- pytest-2.7.2
rootdir: /home/test/ProjectImpala, inifile:
plugins: xdist, random
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
in conftest.py:::::Exploration Strategy = core
in test_vector:::::Exploration Strategy = core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
default_strategy in impala_test_suite.py::::EXPLORATION STRATEGY= core
in conftest.py:::::Exploration Strategy = core
in test_vector.py:::::Exploration Strategy = core
collected 4 items
custom_cluster/test_spilling.py sss.
generated xml
file:
/home/test/ProjectImpala/ImpalaPPC/tests/custom_cluster/results/TEST-impala-custom-cluster.xml
============== 1 passed, 3 skipped, 1 warnings in 110.42 seconds
===============
Could you please guide me on this issue?
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Manish Patil/Austin/Contr/IBM@IBMUS
Date: 08/02/2016 10:49 AM
Subject: Fw: Issues with tests in Release-mode Impala build
----- Forwarded by Valencia Serrao/Austin/Contr/IBM on 08/02/2016 10:48
AM -----
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Date: 08/02/2016 09:05 AM
Subject: Re: Issues with tests in Release-mode Impala build
Hi Tim,
I am executing the tests in exhaustive mode, but I still see that the
"test_spilling" test is skipped with the message "runs only in
exhaustive." Following are the various ways I tried to run the tests:
1. ${IMPALA_HOME}/buildall.sh -noclean -testexhaustive
2. Explicitly setting EXPLORATION_STRATEGY to exhaustive in run-all-tests.sh and
buildall.sh.
I think it is getting reset somewhere to some other strategy. Could you
please help me set the environment correctly so the custom cluster tests run with the
exhaustive exploration strategy?
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS,
Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
Date: 07/25/2016 05:56 PM
Subject: Re: Issues with tests in Release-mode Impala build
Hi Tim,
Thanks for the detailed response.
Also, the BE "benchmark-test" issue is resolved. It now passes together
with the complete BE suite in Release mode.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish Patil/Austin/Contr/IBM@IBMUS,
Nishidha Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/23/2016 12:15 AM
Subject: Re: Issues with tests in Release-mode Impala build
2a.
Exhaustive is a superset of core. We run the core tests pre-commit on
CentOS 6 + HDFS and the full exhaustive tests post-commit on a wider
range of configurations. We don't release Impala unless all exhaustive
tests pass on all configurations we test (if there's a valid reason why
something doesn't work on a given platform, we skip the test).
2b.
Exhaustive is a superset of core, so if exhaustive passes then core
should too. The exhaustive build takes much longer than core, so it makes
sense to run it less frequently (e.g. we run it nightly for some
configurations and weekly for others).
2c.
Confusingly, the core/exhaustive data load doesn't map to core/exhaustive
tests. We actually use the same data load for all test configurations.
See testdata/bin/create-load-data.sh for how the core/exhaustive data
load is invoked. E.g. we load the functional data with exhaustive (i.e.
all supported file formats) and the larger TPC-H/TPC-DS data sets for
only a subset of file formats.
On Wed, Jul 20, 2016 at 9:39 PM, Valencia Serrao <[email protected]>
wrote:
Hi Tim,
Thank you for the insight on the issues.
1. BE test issue: benchmark-test hangs
As you suggested, I increased the "batch_size" value, up to
125000000; however, sw.ElapsedTime() does not increase inside
the while loop, so it again gets caught in an infinite loop. The
optimization level seems to cause this behavior. I am still working
on this.
2. Custom cluster tests: skipping some tests in test_spilling
I found in the logs that the "test_spilling" test was skipped because
the exploration strategy was set to "core" on our Impala setup.
Some questions here:
a. From an Impala release perspective, how significant are these
strategies (core, exhaustive, etc.)?
b. Do we have to test with all combinations (core with a release-mode
build and exhaustive with a release-mode build)?
c. Does the exploration strategy selection also affect the test
data loaded? (Is the data loaded different for each exploration
strategy?)
Please let me know your comments.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: Valencia Serrao/Austin/Contr/IBM@IBMUS
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/19/2016 09:11 PM
Subject: Re: Issues with tests in Release-mode Impala build
With 2, it's a little strange that test_spilling is being skipped -
I think that one should be run.
On Tue, Jul 19, 2016 at 8:39 AM, Tim Armstrong <
[email protected]> wrote:
It looks like the benchmark-test issue is something to
do with the granularity of the clock. It can get stuck
in an infinite loop if the function call below always
takes less than the smallest measurable unit of time
(i.e. Start() and Stop() are called in the same time
quantum).
while (sw.ElapsedTime() < target_cycles) {
  sw.Start();
  function(batch_size, args);
  sw.Stop();
  iters += batch_size;
}
We use Intel's rdtsc instruction for a timer here, so I
guess whatever PPC alternative you used may work a
little differently. This is probably ok, but it's
possible that it could affect timers elsewhere in
Impala.
One solution would be to increase the default batch
size.
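For illustration only, here is a minimal standalone C++ sketch of the effect (this is not Impala's actual StopWatch or benchmark code; CoarseNowNs, DoWork and MeasureOnce are made-up names, and the 1 ms quantum is an exaggerated assumption). It shows how a coarse clock makes a single cheap call measure as zero time while a larger batch measures as nonzero, which is why increasing the batch size lets the loop make progress:

#include <chrono>
#include <cstdint>
#include <iostream>

// Hypothetical coarse clock: readings are rounded down to a 1 ms quantum,
// mimicking a cycle counter whose resolution is worse than the call under test.
static uint64_t CoarseNowNs() {
  auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
      std::chrono::steady_clock::now().time_since_epoch()).count();
  const uint64_t quantum = 1000000;  // 1 ms expressed in ns
  return (static_cast<uint64_t>(ns) / quantum) * quantum;
}

// Stand-in for the benchmarked function; batch_size controls how much work
// happens between the two clock readings.
static void DoWork(long batch_size) {
  volatile long sink = 0;
  for (long i = 0; i < batch_size; ++i) sink = sink + i;
}

// One Start()/Stop()-style measurement, the way the benchmark loop takes it.
static uint64_t MeasureOnce(long batch_size) {
  uint64_t start = CoarseNowNs();
  DoWork(batch_size);
  return CoarseNowNs() - start;
}

int main() {
  // A tiny batch usually finishes inside one quantum, so the measured time is
  // 0 and a loop waiting for the accumulated total to reach a target would
  // never make progress. A large batch spans several quanta, so every
  // iteration adds nonzero elapsed time.
  std::cout << "batch_size=1        measured ns = " << MeasureOnce(1) << "\n";
  std::cout << "batch_size=50000000 measured ns = " << MeasureOnce(50000000) << "\n";
  return 0;
}

Built with an optimizing compile (e.g. g++ -O2), the first line typically prints 0 while the second prints a multiple of the assumed 1 ms quantum.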
On Tue, Jul 19, 2016 at 5:29 AM, Valencia Serrao <
[email protected]> wrote:
Hi Tim,
Following are some observations:
1. BE test issue: benchmark-test hangs
I put trace logs like the ones below in benchmark.cc:

while (sw.ElapsedTime() < target_cycles) {
  LOG(INFO) << " in while (sw.ElapsedTime() < target_cycles)";
  sw.Start();
  function(batch_size, args);
  sw.Stop();
  iters += batch_size;
  LOG(INFO) << " In while:::::::: sw.ElapsedTime() " << sw.ElapsedTime();
  LOG(INFO) << " In while:::::::: iters = " << iters;
}
In Release mode, I observed that sw.ElapsedTime() remains constant and
does not increase; therefore, the code is caught in an infinite loop and
benchmark-test hangs. In Debug mode, sw.ElapsedTime() keeps increasing,
so the code exits the while loop and benchmark-test doesn't hang.
I'm working on this issue; however, if you could give any pointers
about it, that would be really great.
2. Custom cluster tests: I have included the code
changes in my branch, and many of the earlier 36 skipped
tests have now executed and pass, but with the
following exception (when compared to the output at
https://issues.cloudera.org/browse/IMPALA-3614 ):
custom_cluster/test_spilling.py sss.
Current CC test stats: 34 passed, 7 skipped, 3
warnings.
3. End-to-End tests: I haven't been able to dive into
the EE tests yet. I will surely let you know more about
them as soon as I'm done with them.
Regards,
Valencia
From: Valencia Serrao/Austin/Contr/IBM
To: Tim Armstrong <[email protected]>
Cc: [email protected], Manish
Patil/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS
Date: 07/19/2016 10:26 AM
Subject: Re: Issues with tests in Release-mode Impala
build
Hi Tim,
Thank you for the information.
I am working on the pointers you have given and also on
the fix for Custom cluster (skipped) tests. I will
inform you on the findings.
Regards,
Valencia
From: Tim Armstrong <[email protected]>
To: [email protected]
Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha
Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
Jagadale/Austin/Contr/IBM@IBMUS, Manish
Patil/Austin/Contr/IBM@IBMUS
Date: 07/18/2016 09:19 PM
Subject: Re: Issues with tests in Release-mode Impala
build
Hi Valencia,
1. We run tests in release mode nightly and it doesn't
look like we've seen this hang. I'd suggest you attach
a debugger to the benchmark-test process and see what
it's doing. It could either be an actual hang, or an
infinite/very long loop. That test is only testing our
benchmarking utilities, not Impala itself, but IMO it's
always good to understand why something like that is
happening in case there's a more general problem.
2. Sounds like
https://issues.cloudera.org/browse/IMPALA-3614 . Have
you got the fix for that in your branch?
3. Look forward to hearing more.
Cheers,
Tim
On Mon, Jul 18, 2016 at 2:49 AM, Valencia Serrao <
[email protected]> wrote:
Hi All,
I have built Impala in Release mode. I executed the tests;
following are some observations:
1. BE test: The test execution hangs at the "benchmark-test".
There are no errors shown, and it hangs at this test. This issue
did not occur earlier when running the BE tests in debug mode.
2. Custom Cluster test: 5 tests passed and 36 tests skipped.
All of the skipped cases give the message:
"INSERT not implemented for S3"
3. EE tests: I've also seen some failures here (yet to
check the details)
As for the FE and JDBC tests, everything works fine; the release-mode
test output is the same as the debug-mode test output.
Is the "benchmark-test" test known to fail in Release mode,
or am I missing some configuration? Also, I want to understand
the significance of this test, in case we could ignore it and
move ahead.
Regards,
Valencia