Hi,

I'd like to propose that we move away from such heavy reliance on doctests
in Python and toward more traditional unit tests.  The main reason is that
it's hard to share test code in doctests.  For example, I was just
looking at
https://github.com/apache/spark/commit/82c18c240a6913a917df3b55cc5e22649561c4dd
and wondering if we had any tests for some of the pyspark changes.
SparkSession.createDataFrame has doctests, but those are run with only one
standard Spark configuration, which does not enable Arrow.  It's hard to
reuse that test with another Spark context that has a different conf.
Similarly, I've wondered about reusing test cases with local-cluster
instead of local mode.  I feel like doctests also discourage writing tests
that try to get more exhaustive coverage of corner cases.
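
To make the reuse point concrete, here is a rough sketch of the kind of
sharing that plain unittest allows: the test cases live in a mixin, and each
subclass only picks a configuration.  Everything named here is a
hypothetical stand-in (FakeSession, create_rows, the "arrow.enabled" key);
it is not pyspark code, and a real version would build a SparkSession from
the conf instead.

```python
import unittest


class FakeSession:
    """Hypothetical stand-in for a SparkSession; it only records its conf."""

    def __init__(self, conf):
        self.conf = dict(conf)

    def create_rows(self, data):
        # Placeholder for something like createDataFrame(data).collect().
        return list(data)


class CreateRowsTestsMixin:
    """Shared test cases; each subclass supplies the configuration."""

    conf = {}

    def setUp(self):
        # A real suite would start a session with self.conf here.
        self.session = FakeSession(self.conf)

    def test_round_trip(self):
        data = [(1, "a"), (2, "b")]
        self.assertEqual(self.session.create_rows(data), data)

    def test_empty_input(self):
        self.assertEqual(self.session.create_rows([]), [])


class DefaultConfTests(CreateRowsTestsMixin, unittest.TestCase):
    conf = {}


class ArrowConfTests(CreateRowsTestsMixin, unittest.TestCase):
    # Placeholder key for illustration, not a real Spark setting name.
    conf = {"arrow.enabled": "true"}


def run_shared_suite():
    """Run the same test cases once per configuration."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for case in (DefaultConfTests, ArrowConfTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a local-cluster variant would then be one more small subclass, with
no test bodies copied.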

I'm not saying we should stop using doctests -- I see why they're nice.  I
just think they should really only be used when you want that code snippet
in the doc anyway, so you might as well test it.

Admittedly, I'm not really a Python developer, so I could be totally wrong
about the right way to author doctests -- pushback welcome!

Thoughts?

thanks,
Imran
