one more: it seems like python/run-tests should at least have an option to not bail at the first failure: https://github.com/apache/spark/blob/master/python/run-tests.py#L113-L132
this is particularly annoying with flaky tests -- since the rest of the tests aren't run, you don't know whether you *only* had a failure in that flaky test, or if there was some other real failure as well.

On Wed, Sep 5, 2018 at 1:31 PM Imran Rashid <iras...@cloudera.com> wrote:
> Hi all,
>
> More pyspark noob questions from me. I find it really hard to figure out
> what versions of python I should be testing and what is tested upstream.
> While I'd like to just know the answers to those questions, more
> importantly I'd like to make sure that info is visible somewhere so all
> devs can figure it out themselves. I think we should have:
>
> 1. All of the output in target/test-reports & python/unit-tests.log should
> be included in the jenkins archived artifacts.
>
> 2. That test output needs to be separated by python executable. It seems
> to me that right now if you run python/run-tests with multiple python
> executables, you get separate test output (because each output file
> includes a timestamp), but you can't tell which python version was used.
>
> 3. The test output should be incorporated into the jenkins test output, so
> it's easier to see which test is failing, which tests are run, test trends,
> etc. Along with the above, that means the tests should be prefixed (or
> something) with the python executable in the reports so you can track test
> results for each executable. (It seems this was done at one point by
> SPARK-11295, but for whatever reason, doesn't seem to work anymore.)
>
> If we had these features as part of the regular testing infrastructure, I
> think it would make it easier for everyone to understand what was happening
> in the current pyspark tests and to compare their own local tests with them.
>
> Thoughts? Is this covered somewhere that I don't know about?
>
> Thanks,
> Imran
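For what it's worth, the "don't bail at the first failure" behavior could look something like the sketch below. This is purely illustrative (it is not the actual run-tests.py code, and `run_all`, the module names, and the executable list are all made up for the example): run every (executable, module) pair, record failures instead of exiting, and report them all at the end so a flaky failure can't mask a later real one.

```python
import subprocess
import sys

def run_all(test_modules, python_execs):
    """Hypothetical runner: collect all failures rather than stopping
    at the first one."""
    failures = []
    for exe in python_execs:
        for mod in test_modules:
            # Each (executable, module) pair gets its own result, so a
            # flaky failure in one module doesn't hide later failures.
            result = subprocess.run([exe, "-m", mod], capture_output=True)
            if result.returncode != 0:
                failures.append((exe, mod, result.returncode))
    return failures

if __name__ == "__main__":
    # Demo with stdlib modules: "platform" succeeds, the bogus name fails,
    # and both are still run regardless of the failure.
    failed = run_all(["platform", "no_such_module"], [sys.executable])
    for exe, mod, code in failed:
        print("FAILED: %s %s (exit %d)" % (exe, mod, code))
```

A real fix in run-tests.py would presumably also keep the per-module log output that the current script writes, and tag each result with the python executable (which would help with point 2 in the quoted mail as well).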