[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50085474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -107,3 +117,28 @@ private[csv] object ParseModes {

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172785578 **[Test build #49679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49679/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-17234 Although `CSVCompressionCodecs` might be shared with the JSON datasource, I will handle that sharing in a separate PR for JSON. --- If your project is set up for

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50085424 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -107,3 +117,28 @@ private[csv] object ParseModes {

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50085643 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -73,6 +76,14 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50085687 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -99,6 +100,15 @@ private[csv] class CSVRelation(

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172780089 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172780088 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172780590 **[Test build #49676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49676/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172787690 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172787688 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172789851 **[Test build #49671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49671/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172790523 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172790525 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172813154 **[Test build #49676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49676/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172813600 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172813603 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172816508 **[Test build #49679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49679/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172816804 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172816803 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50081941 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -73,6 +82,12 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50081902 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -73,6 +82,12 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173016289 Oh yes it does. Actually I am reading compressed files in the test I added

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172944028 Oh one thing: this doesn't support reading with compression yet, does it?

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173022112 I see. I will anyway try to figure this out though. I somehow think this might be a bit too much, as almost all files would have proper extensions and I think the

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50206066 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -107,3 +114,28 @@ private[csv] object

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50206079 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -107,3 +114,28 @@ private[csv] object

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173017720 Yea I'm thinking we should also support specifying options, and it is "auto" by default which decides based on extensions.
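
The "auto" default described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from the PR: the object `CodecSelection`, the method `select`, and the extension table are hypothetical names. The `.gz` and `.bz2` extensions are the conventional ones for gzip and bzip2 files.

```scala
import java.util.Locale

// Hypothetical helper, not part of Spark: picks a codec short name for a file.
object CodecSelection {
  // Assumed extension table; a real implementation might delegate to Hadoop instead.
  private val byExtension: Map[String, String] = Map(
    ".bz2" -> "bzip2",
    ".gz"  -> "gzip"
  )

  /** Returns the codec to use, or None when "auto" finds no known extension. */
  def select(option: String, path: String): Option[String] =
    if (option.equalsIgnoreCase("auto")) {
      val lowerPath = path.toLowerCase(Locale.ROOT)
      // Decide from the file extension, as the comment proposes for the default.
      byExtension.collectFirst { case (ext, codec) if lowerPath.endsWith(ext) => codec }
    } else {
      // An explicit option overrides whatever the extension suggests.
      Some(option)
    }
}
```

With this sketch, `CodecSelection.select("auto", "data/part-0000.csv.gz")` yields `Some("gzip")`, while an explicit option such as `"bzip2"` is returned unchanged regardless of the path.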

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173068930 LGTM

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173065443 **[Test build #49750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49750/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173080338 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173080339 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173080189 **[Test build #49750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49750/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-173085873 I've merged this in master. Thanks.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10805

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172496706 **[Test build #49592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49592/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/10805 [SPARK-12871][SQL] Support to specify the option for compression codec. https://issues.apache.org/jira/browse/SPARK-12871 This PR adds an option to specify the compression codec.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172517601 **[Test build #49592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49592/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172517812 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172517811 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172709347 **[Test build #49643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49643/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172709719 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172691164 **[Test build #49643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49643/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172709720 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172747397 I will resolve conflicts and update this soon.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50074385 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -71,6 +71,8 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172745372 Yup we are dropping Hadoop 1.x support, so it is OK to have it only for Hadoop 2.x.

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50074461 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -71,6 +71,8 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50081670 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -44,6 +46,13 @@ private[sql] case class

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172768286 **[Test build #49671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49671/consoleFull)** for PR 10805 at commit

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172766510 Supported short names for compression codecs are below (case insensitive): `bzip2` -> `org.apache.hadoop.io.compress.BZip2Codec` `gzip` ->
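
A case-insensitive short-name lookup like the one listed in this comment might be sketched as follows. This is a minimal illustration, not the PR's actual code: the object `CompressionCodecNames` and the method `resolve` are hypothetical. The `gzip` target is cut off in the archived comment, so the example assumes Hadoop's `org.apache.hadoop.io.compress.GzipCodec`. Passing unknown names through as class names is also an assumption.

```scala
import java.util.Locale

// Hypothetical helper, not part of Spark: maps short codec names to Hadoop class names.
object CompressionCodecNames {
  private val shortNames: Map[String, String] = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    // Assumption: the archived comment truncates this entry; GzipCodec is Hadoop's gzip codec.
    "gzip"  -> "org.apache.hadoop.io.compress.GzipCodec"
  )

  /** Looks the name up case-insensitively; names not in the table are returned unchanged. */
  def resolve(name: String): String =
    shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)
}
```

Lower-casing with `Locale.ROOT` keeps the lookup independent of the JVM's default locale, so `resolve("BZIP2")` and `resolve("bzip2")` return the same class name.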

[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...

2016-01-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50053254 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -71,6 +71,8 @@ private[sql] case class