Hi Li, yes that makes perfect sense. That more-or-less is the same as my view, though I framed it differently. I guess in that case, I'm really asking:
Can pyspark changes please be accompanied by more unit tests, and not assume we're getting coverage from doctests? Imran On Wed, Aug 29, 2018 at 2:02 PM Li Jin <ice.xell...@gmail.com> wrote: > Hi Imran, > > My understanding is that doctests and unittests are orthogonal - doctests > are used to make sure docstring examples are correct and are not meant to > replace unittests. > Functionalities are covered by unit tests to ensure correctness and > doctests are used to test the docstring, not the functionalities itself. > > There are issues with doctests, for example, we cannot test arrow related > functions in doctest because of pyarrow is optional dependency, but I think > that's a separate issue. > > Does this make sense? > > Li > > On Wed, Aug 29, 2018 at 6:35 PM Imran Rashid <iras...@cloudera.com.invalid> > wrote: > >> Hi, >> >> I'd like to propose that we move away from such heavy reliance on >> doctests in python, and move towards more traditional unit tests. The main >> reason is that its hard to share test code in doc tests. For example, I >> was just looking at >> >> https://github.com/apache/spark/commit/82c18c240a6913a917df3b55cc5e22649561c4dd >> and wondering if we had any tests for some of the pyspark changes. >> SparkSession.createDataFrame has doctests, but those are just run with one >> standard spark configuration, which does not enable arrow. Its hard to >> easily reuse that test, just with another spark context with a different >> conf. Similarly I've wondered about reusing test cases but with >> local-cluster instead of local mode. I feel like they also discourage >> writing a test which tries to get more exhaustive coverage on corner cases. >> >> I'm not saying we should stop using doctests -- I see why they're nice. >> I just think they should really only be when you want that code snippet in >> the doc anyway, so you might as well test it. >> >> Admittedly, I'm not really a python-developer, so I could be totally >> wrong about the right way to author doctests -- pushback welcome! >> >> Thoughts? >> >> thanks, >> Imran >> >