Re: Fw: Impala tests and estimate

Alex Behm Thu, 12 May 2016 23:35:43 -0700

On Thu, May 12, 2016 at 11:09 PM, Nishidha Panpaliya <[email protected]>
wrote:


>
>
> Thanks Jim for this information.
>
> I have a few more queries:
>    What is the system configuration you are using on which the estimates
>    you gave hold true? RAM, HDD, CPU or any other requirement?
>

Times were reported on an m2.4xlarge EC2 instance. See specs here:
http://www.ec2instances.info/

>    We also wanted to know the prerequisites for running each of these tests
>    so that we can start preparing for them up front. For example, backend
>    tests do not need any test data, whereas frontend tests do need test data
>    to be generated and loaded.
>

All tests except the backend tests require the test data to be loaded.


>    Are there any detailed documents listing steps to prepare and execute
>    all these tests?
>

Probably not detailed enough for you. Tracing through buildall.sh and
run-all-tests.sh should give you a good idea.


>    Test data generation is being done by default using buildall.sh with
>    -testdata argument. Can we customize this step to generate different
>    data or some scaled (small scale) data? Do we even need to do so to
>    ensure Impala works with different data sets?
>

The tests require the test data to be set up exactly the way it is today. I
highly recommend running the functional tests for validation.

You can certainly customize, but it's well... custom. So we cannot really
help you much there. You'll need to change the scripts/flow to your liking.


>    Also, does time for each of these tests as you mentioned take test data
>    generation and loading time into consideration or is it purely test
>    execution duration?
>

Purely test execution.
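
For planning purposes, the execution-only figures quoted further down in this thread can be added up. The following is a rough sketch, not an official estimate: the durations are the "core" mode numbers reported for an m2.4xlarge instance, the 20-30 minute snapshot load is the figure given later in this reply, and the names CORE_SECONDS, SNAPSHOT_LOAD_SECONDS, format_hms and core_total are made up for illustration.

```python
# Execution-only durations for "core" mode, in seconds, as reported in this
# thread for an m2.4xlarge EC2 instance. Other hardware will differ.
CORE_SECONDS = {
    "backend": 12 * 60,
    "frontend": 10,
    "jdbc": 2 * 60,
    "custom cluster": 35 * 60,
    "end-to-end": 3 * 60 * 60,
}

# Snapshot-based data load takes roughly 20-30 minutes per this thread.
SNAPSHOT_LOAD_SECONDS = (20 * 60, 30 * 60)


def format_hms(seconds):
    """Render a duration in seconds as 'Hh Mm Ss'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"


def core_total():
    """Sum of the reported core-mode execution times, in seconds."""
    return sum(CORE_SECONDS.values())


if __name__ == "__main__":
    total = core_total()
    low, high = (total + load for load in SNAPSHOT_LOAD_SECONDS)
    print("core tests only:   ", format_hms(total))
    print("plus snapshot load:", format_hms(low), "to", format_hms(high))
```

On the reported numbers this comes to a little under four hours of pure test execution per core run, before any build time or data loading.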


>    We also observed that test data loading takes more than 5 hours at our
>    end, on both x86 and Power. How much time does it take for you? Also,
>    when do we really need to generate test data from scratch (-format
>    argument to buildall.sh)? I hope it is not needed every time.
>

The test data does not need to be loaded from scratch every time. We have
the following workflow in place that you could replicate:

1. Generate test data snapshots:
- run buildall.sh with -testdata to generate the test data
- zip the HDFS test warehouse directory into a "data snapshot"
- dump the Hive metastore database into a "metastore snapshot"
- these two snapshots allow for a fast snapshot-based data load in
  subsequent test runs

2. Use test data snapshots in a test run:
- run buildall.sh with the -snapshot_file and -metastore_snapshot_file
  arguments that point to the snapshots mentioned above
- data loading from these snapshots takes roughly 20-30 minutes

Of course, when you make changes to the test data, then you probably need
to regenerate these snapshots.
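
The two buildall.sh invocations in this workflow can be sketched as follows. This is only an illustration under stated assumptions: the snapshot paths are hypothetical, the script is assumed to be run from the Impala root directory, and the commands are printed rather than executed. The steps that archive the HDFS warehouse directory and dump the metastore database are left out because they depend on your setup.

```python
import shlex

# Hypothetical snapshot locations; adjust to your environment.
DATA_SNAPSHOT = "/snapshots/test-warehouse-snapshot.tar.gz"
METASTORE_SNAPSHOT = "/snapshots/hive-metastore-snapshot.sql"


def generate_cmd():
    """One-time (slow) step: generate the test data with -testdata."""
    return ["./buildall.sh", "-testdata"]


def snapshot_load_cmd(data_snapshot, metastore_snapshot):
    """Per-run (fast) step: load test data from previously saved snapshots."""
    return [
        "./buildall.sh",
        "-snapshot_file", data_snapshot,
        "-metastore_snapshot_file", metastore_snapshot,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    print(shlex.join(generate_cmd()))
    print(shlex.join(snapshot_load_cmd(DATA_SNAPSHOT, METASTORE_SNAPSHOT)))
```

Keeping the two steps as separate functions mirrors the split above: the first runs only when the test data changes, the second on every test run.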

I will privately send you a script that can hopefully get you started with
this workflow, assuming you want to follow it.



>    Should we consider testing of release build and debug build separately?
>    Do you expect any differences in behavior? Also, what all dependencies
>    will need to be rebuilt in release mode?
>

Testing release and debug is certainly recommended.

I recommend you take a look at the CMakeLists.txt in the Impala root
directory to see what happens in a release build.
You can also look at bin/make_release.sh to learn more.


>
> We are also open to a call if any developer/tester is interested in
> discussing these points. We need this test plan fairly urgently, as a
> couple of our customers are waiting for a timeline.
>

I'm open to getting on a call next week.

Best regards,

Alex


>
> Thanks,
> Nishidha
>
>
> ----- Forwarded by Nishidha Panpaliya/Austin/Contr/IBM on 05/13/2016 11:11
> AM -----
>
> From:   Sudarshan Jagadale/Austin/Contr/IBM
> To:     Nishidha Panpaliya/Austin/Contr/IBM@IBMUS
> Date:   05/13/2016 10:54 AM
> Subject:        Fw: Impala tests and estimate
>
>
> FYI
> Thanks and Regards
> Sudarshan Jagadale
> Power Open Source Solutions
> ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM on 05/13/2016 10:53
> AM -----
>
> From:   Jim Apple <[email protected]>
> To:     [email protected]
> Cc:     Manish Patil/Austin/Contr/IBM@IBMUS, Sudarshan
>             Jagadale/Austin/Contr/IBM@IBMUS, Anup
>             Halarnkar/Austin/Contr/IBM@IBMUS, Valencia
>             Serrao/Austin/Contr/IBM@IBMUS
> Date:   05/12/2016 11:56 PM
> Subject:        Re: Fw: Impala tests and estimate
>
>
>
> The backend tests take 12 minutes. The frontend tests take 10 seconds. The
> JDBC tests take 2 minutes. The custom cluster tests take 35 minutes. The
> end-to-end tests take 3 hours.
>
> That's in "core" mode. "exhaustive" mode quadruples the total time, IIRC,
> and I'd guess that's all in the end-to-end tests, but I'm not sure.
>
> On Thu, May 12, 2016 at 5:40 AM, Nishidha Panpaliya <[email protected]>
> wrote:
>   Hi All,
>
>   Could you please let me know the scope of Impala unit testing? I mean
>   what all tests should be executed and ensured. I saw BE, FE, EE, JDBC,
>   Cluster tests in run-all-tests.sh.
>   And a guess estimate of how much time each of these take to execute?
>
>   Thanks,
>   Nishidha
>   ----- Forwarded by Nishidha Panpaliya/Austin/Contr/IBM on 05/12/2016
>   06:07 PM -----
>
>   From: Nishidha Panpaliya/Austin/Contr/IBM
>   To: [email protected]
>   Cc: "Jim Apple" <[email protected]>, Manish
>   Patil/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS,
>   "Tim Armstrong" <[email protected]>, Valencia
>   Serrao/Austin/Contr/IBM@IBMUS
>   Date: 03/29/2016 06:59 PM
>   Subject: Re: Impala tests and estimate
>
>
>   Just one more request.
>
>   We'll be thankful if we could also get to know the count of each of these
>   tests (for e.g. there are 71 backend tests).
>
>   Thanks,
>   Nishidha
>
>   Nishidha Panpaliya---03/29/2016 10:05:29 AM---Hi All, I again need your
>   help in understanding Impala tests to be run and ensured and their
>   estimat
>
>   From: Nishidha Panpaliya/Austin/Contr/IBM
>   To: [email protected], "Tim Armstrong" <
>   [email protected]>, "Jim Apple" <[email protected]>
>   Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Manish
>   Patil/Austin/Contr/IBM@IBMUS, Valencia Serrao/Austin/Contr/IBM@IBMUS
>   Date: 03/29/2016 10:05 AM
>   Subject: Impala tests and estimate
>
>
>   Hi All,
>
>   I again need your help in understanding Impala tests to be run and
>   ensured and their estimates.
>
>   Last time, I know you had given way to run only backend tests and it was
>   helpful to us. I've also gone through run-all-tests.sh which triggers
>   backend test, frontend test, end-to-end tests, etc. Could you provide me
>   individual commands to run each of them and if any setup steps are
>   required? Also, I would like to know if there are any specific system
>   requirements that I must have up-front to run all these tests.
>
>   Along with these commands/scripts, I'm also interested in knowing how
>   much time each of these tests take to run, if we do not run into any
>   issues. This is required to know the guess estimate of how long will this
>   activity be taking from now.
>
>   Thanks in advance,
>   Nishidha
>
>
>
>
>
>
