Re: python test infrastructure

Hyukjin Kwon Wed, 05 Sep 2018 22:00:02 -0700

> 1. all of the output in target/test-reports & python/unit-tests.log
should be included in the jenkins archived artifacts.


Hmm, I thought they were already archived (
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95734/artifact/target/unit-tests.log
).
FWIW, unit-tests.log is pretty messy, and it is currently shown when specific
tests are broken.


> 2. That test output needs to be separated by python executable.  It seems
to me that right now if you run python/run-tests with multiple
python-executables, you get separate test output (because each output file
includes a timestamp), but you can't tell which python version was used.

It wouldn't be difficult. I can make the changes if they are necessary;
however, I still think it's rather minor since logs are shown when some
tests are broken.
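
A minimal sketch of one way to do item 2, assuming the log name is built from
the executable's base name plus the timestamp that is already used. The helper
`log_name_for` and the naming scheme are hypothetical, not what
python/run-tests does today:

```python
import os
import re
import time


def log_name_for(python_exec, test_name, timestamp=None):
    """Build a log file name that records which interpreter ran the test.

    Hypothetical helper for illustration; the naming scheme is an assumption.
    """
    # Keep only the base name, e.g. "/usr/bin/python3.6" -> "python3.6",
    # and replace anything that is awkward in a file name.
    exec_tag = re.sub(r"[^A-Za-z0-9._-]", "_", os.path.basename(python_exec))
    test_tag = re.sub(r"[^A-Za-z0-9._-]", "_", test_name)
    if timestamp is None:
        timestamp = int(time.time())
    return "{}__{}__{}.log".format(exec_tag, test_tag, timestamp)
```

With this, two runs of the same test module under different interpreters
produce file names that differ in their first component, so the interpreter
can be read off the archived artifact directly.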


> 3. the test output should be incorporated into jenkins test output, so
it's easier to see which test is failing, which tests are run, test trends,
etc.  Along with the above, that means the tests should be prefixed (or
something) with the python executable in the reports so you can track test
results for each executable.  (it seems this was done at one point by
SPARK-11295, but for whatever reason, doesn't seem to work anymore.)

Yeah, I have looked into organising the logs before (for instance
https://github.com/apache/spark/pull/21107), but not into this idea itself. I
agree with this idea in general.
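
As a rough illustration of the prefixing in item 3, here is a sketch that
rewrites a JUnit-style XML report so each test case's classname starts with
the python executable. It assumes the reports under target/test-reports are
JUnit-style XML with `testcase` elements carrying a `classname` attribute;
`prefix_testcases` is a hypothetical helper, and this is not necessarily how
SPARK-11295 did it:

```python
import xml.etree.ElementTree as ET


def prefix_testcases(xml_text, python_exec):
    """Return JUnit-style XML with every testcase classname prefixed.

    Hypothetical helper for illustration only.
    """
    # Dots are replaced so that a report viewer grouping by dotted names
    # does not split "python3.6" into two levels (an assumption about the viewer).
    prefix = python_exec.replace(".", "_")
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so <testsuite> and <testsuites> roots both work.
    for case in root.iter("testcase"):
        classname = case.get("classname", "")
        case.set("classname", prefix + "." + classname if classname else prefix)
    return ET.tostring(root, encoding="unicode")
```

Running it once per executable before the reports are collected would keep
the results for each interpreter apart in the aggregated test output.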

On Thu, Sep 6, 2018 at 5:41 AM, Imran Rashid <iras...@cloudera.com.invalid> wrote:

> one more: seems like python/run-tests should have an option at least to
> not bail at the first failure:
> https://github.com/apache/spark/blob/master/python/run-tests.py#L113-L132
>
> this is particularly annoying with flaky tests -- since the rest of the
> tests aren't run, you don't know whether you *only* had a failure in that
> flaky test, or if there was some other real failure as well.
>
> On Wed, Sep 5, 2018 at 1:31 PM Imran Rashid <iras...@cloudera.com> wrote:
>
>> Hi all,
>>
>> More pyspark noob questions from me.  I find it really hard to figure out
>> what versions of python I should be testing and what is tested upstream.
>> While I'd like to just know the answers to those questions, more
>> importantly I'd like to make sure that info is visible somewhere so all
>> devs can figure it out themselves.  I think we should have:
>>
>> 1. all of the output in target/test-reports & python/unit-tests.log
>> should be included in the jenkins archived artifacts.
>>
>> 2. That test output needs to be separated by python executable.  It seems
>> to me that right now if you run python/run-tests with multiple
>> python-executables, you get separate test output (because each output file
>> includes a timestamp), but you can't tell which python version was used.
>>
>> 3. the test output should be incorporated into jenkins test output, so
>> it's easier to see which test is failing, which tests are run, test trends,
>> etc.  Along with the above, that means the tests should be prefixed (or
>> something) with the python executable in the reports so you can track test
>> results for each executable.  (it seems this was done at one point by
>> SPARK-11295, but for whatever reason, doesn't seem to work anymore.)
>>
>> if we had these features as part of the regular testing infrastructure, I
>> think it would make it easier for everyone to understand what was happening
>> in the current pyspark tests and to compare their own local tests with them.
>>
>> thoughts?  is this covered somewhere that I don't know about?
>>
>> thanks,
>> Imran
>>
>
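
On the quoted point about not bailing at the first failure, a minimal sketch
of a "keep going" mode. It assumes each test is launched as a separate
command; `run_all` and its `keep_going` flag are hypothetical and are not
existing options of python/run-tests:

```python
import subprocess
import sys


def run_all(commands, keep_going=True):
    """Run each command and return the ones that exited non-zero.

    Hypothetical helper; with keep_going=False it stops at the first failure.
    """
    failed = []
    for cmd in commands:
        returncode = subprocess.call(cmd)
        if returncode != 0:
            print("FAILED ({}): {}".format(returncode, " ".join(cmd)), file=sys.stderr)
            failed.append(cmd)
            if not keep_going:
                break  # stop immediately, skipping the remaining commands
    return failed
```

Collecting the failures and reporting them at the end makes it possible to
tell a single flaky failure apart from a run with several real failures.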
