[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 LGTM Merged to master.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89967/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89967 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89967/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89967/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d4

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 retest this please

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89964/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89964/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89964/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d4

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 retest this please

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89879/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89879 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89879/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89879/testReport)** for PR 20959 at commit [`d4d9d65`](https://github.com/apache/spark/commit/d4

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89796/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89796 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89796/testReport)** for PR 20959 at commit [`b2c552c`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89796 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89796/testReport)** for PR 20959 at commit [`b2c552c`](https://github.com/apache/spark/commit/b2

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89677/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89677/testReport)** for PR 20959 at commit [`0737bf7`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89677/testReport)** for PR 20959 at commit [`0737bf7`](https://github.com/apache/spark/commit/07

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89664/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89664/testReport)** for PR 20959 at commit [`257b363`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89664/testReport)** for PR 20959 at commit [`257b363`](https://github.com/apache/spark/commit/25

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-18 Thread MaxGekk
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/20959 @gatorsmile @HyukjinKwon @sujithjay May I ask you to look at the PR again?

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89239/ Test PASSed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test PASSed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89239/testReport)** for PR 20959 at commit [`d12c2e2`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89241/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89241/testReport)** for PR 20959 at commit [`a37bf3b`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89241/testReport)** for PR 20959 at commit [`a37bf3b`](https://github.com/apache/spark/commit/a3

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89239/testReport)** for PR 20959 at commit [`d12c2e2`](https://github.com/apache/spark/commit/d1

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89220/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89220/testReport)** for PR 20959 at commit [`1427f73`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89220/testReport)** for PR 20959 at commit [`1427f73`](https://github.com/apache/spark/commit/14

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89217/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89217/testReport)** for PR 20959 at commit [`9f26bb7`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89217/testReport)** for PR 20959 at commit [`9f26bb7`](https://github.com/apache/spark/commit/9f

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89208/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89208/testReport)** for PR 20959 at commit [`3bceb3a`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89208/testReport)** for PR 20959 at commit [`3bceb3a`](https://github.com/apache/spark/commit/3b

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89154/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89154/testReport)** for PR 20959 at commit [`d4e815e`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89155/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89155 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89155/testReport)** for PR 20959 at commit [`d584cfe`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89155/testReport)** for PR 20959 at commit [`d584cfe`](https://github.com/apache/spark/commit/d5

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #89154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89154/testReport)** for PR 20959 at commit [`d4e815e`](https://github.com/apache/spark/commit/d4

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 Let's also address @gatorsmile's comments and mine at https://github.com/apache/spark/pull/20963. Seems fine otherwise.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20959 @MaxGekk Thanks for working on this!

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 sure, will review this and go ahead.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/20959 I'm good with having this option given the data @MaxGekk posted. (I haven't reviewed the code - somebody else should do that before merging). `val sampledSchema = spark.read.option("inferSchema

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 For usability, the workaround I suggested above has more flexibility. For example, we can apply a different operation (e.g., filter) on the schema inference path. It is only a few lines. Schem
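
For readers skimming this thread, the option being debated can be sketched as follows. This is only an illustration, not code from the PR: it assumes a Spark build that already includes the change from #20959 (CSV support for `samplingRatio`), a local session, and a hypothetical temporary CSV file created just for the example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Local session for illustration only; the app name is arbitrary.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("csv-sampling-ratio-sketch")
  .getOrCreate()

// Hypothetical input: a tiny CSV file written to a temp location for this sketch.
val csvPath = Files.createTempFile("sampling-ratio-sketch", ".csv")
Files.write(csvPath, "id,name\n1,a\n2,b\n3,c\n".getBytes("UTF-8"))

// Ask Spark to infer column types from roughly 10% of the rows
// instead of scanning the whole input. Column names still come
// from the header line; only type inference uses the sample.
val sampled = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .option("samplingRatio", 0.1)
  .csv(csvPath.toString)

sampled.printSchema()
```

On a file this small the sample may contain no rows at all, so the inferred types are not meaningful here; the point is only where the option goes. Call `spark.stop()` when finished.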

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-03 Thread MaxGekk
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/20959 @rxin I made an experiment on json files but numbers for csv are almost the same. For example, inferring schema for 50GB json: ``` scala> spark.read.option("samplingRatio", 0.1).json(

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20959 _if I remember this correctly_, there was a discussion about it a long time ago and @rxin was not sure how much it improves the perf and if it's worth it, which I ended up agreeing with. @rxin, d

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88827/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88827/testReport)** for PR 20959 at commit [`b6a7cc8`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88827/testReport)** for PR 20959 at commit [`b6a7cc8`](https://github.com/apache/spark/commit/b6

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88818/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88818/testReport)** for PR 20959 at commit [`91c57cf`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88818/testReport)** for PR 20959 at commit [`91c57cf`](https://github.com/apache/spark/commit/91

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88817 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88817/testReport)** for PR 20959 at commit [`ba12fca`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Merged build finished. Test FAILed.

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88817/ Test FAILed. ---

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20959 **[Test build #88817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88817/testReport)** for PR 20959 at commit [`ba12fca`](https://github.com/apache/spark/commit/ba

[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20959 Can one of the admins verify this patch?