Ramana,

You are right. We are trying to address multiple issues here, but not with a
single solution. I am summarizing them:
1. Tests should be visible to everyone (implicit goal).
2. Before applying a patch we should run tests in a clustered environment.
Parth had a suggestion (#4) in his original email.
3. Developers should be able to debug the majority of the tests in their local
environment. I made a few suggestions above in this regard.

- Rahul

On Fri, Jul 24, 2015 at 10:40 AM, Ramana I N <[email protected]> wrote:

> One important thing which we need to be clear on here is what are we trying
> to address?
>
> I feel there are two separate issues here and I do not think one solution
> will fit both the issues.
>
> 1. Allowing developers to run tests on their local box so they know the
> changes they have are not completely wrong.
> 2. Allowing transparency in the integration tests process which is
> currently a black box.
>
> 1 is needed for developers to make changes and have an idea that their
> changes are not going to fail tests en masse in the integration suite. 2 is
> needed because it's a prerequisite for changes to be committed.
>
>
> Regards
> Ramana
>
>
> On Fri, Jul 24, 2015 at 10:28 AM, rahul challapalli <[email protected]> wrote:
>
> > Ramana,
> >
> > Let me fill in more details.
> >
> > 1. Before we accept a patch we want to make sure the tests run in a cluster
> > environment. No exceptions here.
> > 2. We want the contributors to be able to debug the failing tests on their
> > laptops in as many cases as possible. This requires:
> >    1. Tests should run on top of a local file system. (Tests can
> >    launch an embedded drillbit or they can connect to a running drillbit
> >    through zookeeper)
> >    2. Running suites which require additional setup (hive, hbase etc)
> >    should be made optional and sufficient documentation should be provided
> >    for enabling and disabling these tests.
> > 3. In my opinion making these new tests part of drill would make it easier
> > for the developers to debug and run tests instead of having a different
> > repository.
> > But as you said, it might bloat the drill project.
> >
> > - Rahul
> >
> > On Fri, Jul 24, 2015 at 9:42 AM, Ted Dunning <[email protected]> wrote:
> >
> > > The Hadoop family of projects has some software that integrates a
> > > continuous integration system so that every time a JIRA is marked as
> > > patch-available, the associated patch attached to the bug will have
> > > integration tests run against it. I believe that there has been some
> > > process to use git hashes instead of patches. The CI results are put
> > > back on the JIRA.
> > >
> > > This is done using a fairly simple set of scripts. Apache Yetus is just
> > > forming as a direct-to-top-level spinoff from Hadoop.
> > >
> > > Proposal is here (don't be fooled by the fact that it looks like an
> > > incubation proposal):
> > >
> > > http://wiki.apache.org/incubator/YetusProposal
> > >
> > > Early code can be found here (don't guess that this is very real yet).
> > > More links can be found in the proposal.
> > >
> > > https://github.com/sekikn/pre-yetus/tree/master/precommit/docs
> > >
> > > The project has not yet been formed and there are no mailing lists or
> > > git repo yet.
> > >
> > >
> > >
> > > On Fri, Jul 24, 2015 at 9:25 AM, Ramana I N <[email protected]> wrote:
> > >
> > > > As someone who worked on this for a while, including it as part of
> > > > drill may bloat drill a bit too much. Also not a big fan of running
> > > > against an embedded drillbit. Does not replicate an actual production
> > > > use case.
> > > >
> > > > Additionally, setting up hive, hbase and other components may be
> > > > painful and unnecessary for most ppl. It would deter people from ever
> > > > contributing to drill. We could spin up in-memory hive and hbase but
> > > > that's similar to an embedded drillbit. Does not replicate a
> > > > production scenario.
> > > >
> > > > Would prefer the hive way with a central Jenkins server hosted on aws
> > > > and accessible to everyone.
> > > > Users should be able to submit a git url and that
> > > > should be able to deploy and fire off tests. Should then have a way to
> > > > easily communicate failures to contributors and, on success, notify
> > > > the committers to commit the change.
> > > >
> > > > Ps: if hive's way is open source maybe we can look into reuse rather
> > > > than doing it from scratch. Esp the Jenkins and configuration stuff.
> > > >
> > > > Regards
> > > > Ramana
> > > >
> > > >
> > > > On Thursday, July 23, 2015, Parth Chandra <[email protected]> wrote:
> > > >
> > > > > Drill devs use a set of tests that are not available as part of the
> > > > > Apache distribution. These tests are a pre-requisite for all commits,
> > > > > but are not available to any contributors outside the current devs.
> > > > >
> > > > > This thread is to discuss various options to make these tests
> > > > > available.
> > > > >
> > > > > Assumptions and requirements -
> > > > > 1) A functional test (as opposed to a unit test) needs to be closer
> > > > > to the end user environment than a development environment. As such,
> > > > > we should be running functional tests in a cluster environment,
> > > > > connect using zookeeper etc.
> > > > > 2) Functional tests will keep increasing in number, get more complex
> > > > > and take a longer and longer time to execute as we go along.
> > > > > 3) Some requirements are:
> > > > >     a) We want to be strict in enforcing the pre-commit requirements,
> > > > >     but not penalize the contributor who has a minor fix.
> > > > >     b) All parts of the product (especially various 'certified'
> > > > >     storage plugins like Hive and HBase) should get tested.
> > > > >     c) It should be easy to debug issues when a test fails. Tests
> > > > >     should fail deterministically. If a test fails, it should always
> > > > >     fail and always fail in the same way (easier said than done).
> > > > >
> > > > > Some suggestions -
> > > > > 1) Tests should be a top-level maven module within the drill project
> > > > >     a) We want the integration tests to run as part of drill's
> > > > >     maven build process
> > > > >     b) The build step for the integration-tests module would launch
> > > > >     an embedded drillbit and run tests against it
> > > > >     c) The tests will be a separate target so they need not be run
> > > > >     all the time
> > > > > 2) Tests should be divided into multiple suites that are based on
> > > > > components. For example a test suite for testing datatypes will
> > > > > contain the tests for various datatypes including complex types. A
> > > > > contributor or developer can then run these tests more frequently as
> > > > > an issue is being addressed and run the entire suite only once before
> > > > > commit.
> > > > > 3) Provide the tests as a hosted service
> > > > > 4) Set up a bot to fire the test on an AWS cluster and post the
> > > > > results to the JIRA (Hive does this). Or some variant of this idea.
> > > > >
> > > > >
> > > > > Some questions -
> > > > > 1) What do some other projects do?
> > > > > 2) Are there any technologies we can leverage that will make this
> > > > > easier?
> > > > > 3) How do we make it easier to debug failing tests?
> > > > >
> > > > >
> > > > > Please feel free to question the assumptions and requirements. Be
> > > > > creative with your suggestions.
> > > > >
> > > > > Parth
