GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/20963
[SPARK-23849][SQL] Tests for the samplingRatio option of JSON datasource
## What changes were proposed in this pull request?
The proposed tests check that only a subset of the input dataset is touched
during schema inference.
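As a rough illustration of the property under test, here is a minimal plain-Python sketch. It is not Spark's implementation and does not use the Spark API. The names `infer_schema`, `merge_types`, and the `touched` counter are hypothetical, and the per-record Bernoulli sampling is an assumption about how a sampling ratio might be applied. It only shows that with a ratio below 1.0 the inference step parses a subset of the input lines.

```python
import json
import random


def infer_type(value):
    # bool is a subclass of int in Python, so check it first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


def merge_types(current, new):
    # Widen long to double; fall back to string for any other mismatch.
    if current is None or current == new:
        return new
    if {current, new} == {"long", "double"}:
        return "double"
    return "string"


def infer_schema(lines, sampling_ratio=1.0, seed=0):
    """Infer a flat field -> type mapping from JSON lines.

    Returns the schema and the number of lines actually parsed.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    schema = {}
    touched = 0
    for line in lines:
        # Assumed sampling scheme: keep each line with probability sampling_ratio.
        if sampling_ratio < 1.0 and rng.random() >= sampling_ratio:
            continue
        touched += 1
        for field, value in json.loads(line).items():
            schema[field] = merge_types(schema.get(field), infer_type(value))
    return schema, touched


# Hypothetical input: mostly integers, with a fractional value every 100th line.
lines = [
    json.dumps({"f1": i + 0.5 if i % 100 == 0 else i})
    for i in range(1000)
]

full_schema, full_touched = infer_schema(lines, sampling_ratio=1.0)
sampled_schema, sampled_touched = infer_schema(lines, sampling_ratio=0.1)
```

With the full ratio every line is parsed, so the fractional values widen `f1` to `double`. With the lower ratio only some lines are parsed, and the inferred type depends on which lines the sample happens to include.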
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 json-sampling-tests
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20963.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20963
----
commit 76e38a8670303afe20b530fc2a837a1363a01974
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-04-02T13:28:32Z
Adding samplingRation tests for json
commit 1acc3ece4121ffc3209fb65a20526dec574820e5
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-04-02T13:34:06Z
Removing debug code
commit 0d5fcfb59e112cc989753404813d32432257a0f6
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-04-02T18:45:18Z
Adding the ticket ref to test titles
commit a664465d8a3042f06cf7a866283c34a674b24939
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-04-02T18:48:56Z
Making Scala style checker happy
----