[
https://issues.apache.org/jira/browse/HIVE-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055864#comment-17055864
]
Zoltan Haindrich commented on HIVE-22942:
-----------------------------------------
Hey All,
I think the best option would be to replace ptest with something else that is
not maintained by the Hive community. Moving to junit5 would be cool, but it
might be challenging to do: parallel execution of tests on the same machine
tends to uncover further issues, because we don't expect 2 tests of the same
kind to be executed at the same time. I also don't think a single machine can
execute all of them in one place. Running batches in isolated environments on
1 thread might be more robust and reliable, so that we will actually be able
to repro issues.
I've opened a PR with a working prototype; it isn't complete - but it's able to
do the following:
* builds upon some jenkins plugins; and the job itself is defined as a
Jenkinsfile
* uses docker images executed on a kubernetes cluster to provide
reproducibility - so anyone will be more likely to be able to repro runs of the
tests by using docker
* to make the parallel test executor plugin "happy" - I needed to find a way
to reduce the max testclass execution time below ~30 minutes
** as a first approach I went on and analyzed test execution times based on the
actual testcase times....it's possible; but defining the ranges and maintaining
them long term might be interesting at least
** then I checked how "well" a naive approach would compare...and I concluded
that with twice as many splits the result is acceptable....so I went this
way; it's a cleaner way to do it..
** I wanted to not disrupt existing usages of testing so I came up with the
following way to declare further qtest classes that run over 30 minutes; let's
go with TestCliDriver for now:
*** in case a special flag is enabled (qsplits) the TestCliDriver is split into
a number of parts; the "split" classes differ only in the package name; so
a "-Dtest=TestCliDriver" will still work to run the testcase
*** there is some shell script / java reflection stuff which actually does the
splitting of the test parameter list into smaller pieces
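To illustrate the naive splitting idea above, here is a minimal sketch. It is
not the shell script / reflection code from the PR; the function names, q-file
names and runtimes are all made up for illustration. It deals a sorted list of
names into a fixed number of batches without looking at execution times, and
reports the slowest batch so that different split counts can be compared:

```python
from typing import Dict, List


def split_round_robin(names: List[str], parts: int) -> List[List[str]]:
    """Deal the sorted names into `parts` batches, ignoring execution times."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    ordered = sorted(names)  # stable order, so every run produces the same split
    return [ordered[i::parts] for i in range(parts)]


def slowest_batch(batches: List[List[str]], minutes: Dict[str, float]) -> float:
    """Return the total runtime (in minutes) of the slowest batch."""
    return max(sum(minutes[name] for name in batch) for batch in batches)


# Hypothetical q-file names and runtimes in minutes (not real Hive data).
minutes = {"a.q": 12.0, "b.q": 3.0, "c.q": 9.0, "d.q": 1.0, "e.q": 7.0, "f.q": 4.0}

for parts in (2, 4):
    batches = split_round_robin(list(minutes), parts)
    print(parts, batches, slowest_batch(batches, minutes))
```

Raising the number of parts is the only knob here; the naive split never needs
per-test timing ranges to be maintained, which is the trade-off described above.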
currently I think the replacement layout will be:
* a kubernetes cluster somewhere (gce/gke)
* a jenkins running inside the kubernetes cluster
* a local artifact caching instance is added to reduce outside comm
* it would be easier to tie the job into github PRs and live with that instead
of retaining the run-a-patch approach
* as for running multiple ptest runs; it will be easily possible as the limit
will be the number of pods that jenkins may launch;
things that still need investigation/etc:
* there are a bunch of failing tests ... I guess most of them have some env
issue in the background
* there should be a timeout on executing a set of tests; the ptest env uses a
"timeout" on the maven command - I can just throw in the timeout plugin; but
timeouts should be fixed....they are a sign of big problems like deadlocks/etc
* no support for "isolated" tests - this should be rethought
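As a rough illustration of the timeout point above, here is a minimal sketch of
putting a hard time limit around a test command. It is not how ptest or the
Jenkins timeout plugin works; the helper name and the stand-in command are
assumptions. A timeout is reported separately from a normal exit code, so it
can be flagged as a problem to fix instead of being hidden:

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], seconds: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded the time limit."""
    try:
        # subprocess.run kills the child process when the timeout expires
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


# Stand-in for a real test command such as a maven invocation.
result = run_with_timeout([sys.executable, "-c", "print('tests done')"], 60)
print("timed out" if result is None else f"exit code {result}")
```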
> Replace PTest with an alternative
> ---------------------------------
>
> Key: HIVE-22942
> URL: https://issues.apache.org/jira/browse/HIVE-22942
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I never opened a jira about this...but it might actually help collect ideas
> and actually start going somewhere sooner rather than later :D
> Right now we maintain the ptest2 project inside Hive to be able to run Hive
> tests in a distributed fashion...the drawback of this solution is that we are
> putting much effort into maintaining a distributed test execution framework...
> I think it would be better if we could find an off-the-shelf solution for the
> task and migrate to that instead of putting more effort into the ptest
> framework