[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-08 Thread koertkuipers
Github user koertkuipers commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193921325 if it did then it was not always in the apis i think? i remember the apis having paths: Seq[String] instead of files: Seq[FileStatus]. by explicitly

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-08 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193902037 @koertkuipers improving the efficiency of working with large files was certainly a goal in this refactoring and this API is definitely not done yet. That said, I'm

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-08 Thread koertkuipers
Github user koertkuipers commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193877543 i believe the need to pass all files along (e.g. inputFiles: Array[FileStatus]) instead of just the input paths came from the need to cache it so that stuff

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-08 Thread tedyu
Github user tedyu commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55372864 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala --- @@ -1,265 +0,0 @@ -/* - * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread tedyu
Github user tedyu commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55318504 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -465,214 +379,165 @@ abstract class OutputWriter { }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193511261 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193511260 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193510810 **[Test build #52582 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52582/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11509

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193498704 Going to merge this in master. We should rename HiveFileCatalog to MetastoreFileCatalog. cc @andrewor14

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193493845 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193493406 **[Test build #52590 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52590/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193493854 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193431891 **[Test build #52590 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52590/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55260105 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -17,32 +17,153 @@ package

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-193415690 **[Test build #52582 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52582/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55257029 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -226,16 +226,17 @@ private[sql] object PhysicalRDD {

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55256446 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala --- @@ -1,265 +0,0 @@ -/* - * Licensed to the Apache

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55254873 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala --- @@ -147,6 +147,13 @@ case class CreateMetastoreDataSource(

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55254593 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -366,13 +366,6 @@ final class DataFrameWriter private[sql](df:

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55254454 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala --- @@ -88,7 +88,8 @@ class LibSVMRelationSuite extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55253851 --- Diff: mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala --- @@ -167,22 +117,63 @@ class DefaultSource extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55253572 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -348,7 +348,7 @@ class OrcQuerySuite extends QueryTest with

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55252933 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala --- @@ -278,26 +298,61 @@ object ResolvedDataSource

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55252685 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -101,45 +111,28 @@ private[sql] case

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55216537 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -226,16 +226,17 @@ private[sql] object PhysicalRDD {

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55213849 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala --- @@ -88,7 +88,8 @@ class LibSVMRelationSuite extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55212994 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala --- @@ -1,265 +0,0 @@ -/* - * Licensed to the Apache

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55211124 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala --- @@ -147,6 +147,13 @@ case class CreateMetastoreDataSource(

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55210119 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -465,214 +379,168 @@ abstract class OutputWriter { }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55206478 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -366,13 +366,6 @@ final class DataFrameWriter private[sql](df:

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55205874 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala --- @@ -88,7 +88,8 @@ class LibSVMRelationSuite extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55204271 --- Diff: mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala --- @@ -167,22 +117,63 @@ class DefaultSource extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192604670 Did one pass on this, looks great! All the comments are minor; they can be addressed later.

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-05 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117589 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -348,7 +348,7 @@ class OrcQuerySuite extends QueryTest with

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117518 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -465,214 +379,168 @@ abstract class OutputWriter { }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117452 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala --- @@ -17,32 +17,153 @@ package

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117341 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala --- @@ -351,8 +354,8 @@ private[sql] class

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117300 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala --- @@ -278,26 +298,61 @@ object ResolvedDataSource

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117253 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala --- @@ -92,19 +96,61 @@ object ResolvedDataSource

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55117222 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -101,45 +111,28 @@ private[sql] case

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192558309 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192558302 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192557781 **[Test build #52498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52498/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192543318 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192543317 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192543057 **[Test build #52493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52493/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192531809 **[Test build #52498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52498/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192514145 **[Test build #52493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52493/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55096068 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -58,18 +57,29 @@ import

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55086715 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -58,18 +57,29 @@ import

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55082206 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -58,18 +57,29 @@ import

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55082136 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -246,8 +116,10 @@ object CSVRelation extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55081339 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -464,215 +378,140 @@ abstract class OutputWriter { } }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192441294 @davies agree, we should have a default internalScan that delegates to the external version while doing the `Row` => `InternalRow` conversion. We can then make that method

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread nongli
Github user nongli commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55077018 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -464,215 +378,140 @@ abstract class OutputWriter { } }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55076629 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -173,16 +173,17 @@ private[sql] object PhysicalRDD {

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread nongli
Github user nongli commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55076660 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -246,8 +116,10 @@ object CSVRelation extends

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55076507 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala --- @@ -103,7 +103,7 @@ object DataType { /** Given the string

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55075899 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala --- @@ -92,19 +96,61 @@ object ResolvedDataSource

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55075595 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/CommitFailureTestRelationSuite.scala --- @@ -1,104 +0,0 @@ -/* - * Licensed to the

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55072046 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -464,215 +378,140 @@ abstract class OutputWriter { } }

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192406483 @marmbrus InternalRow is not a public API, so will we have buildScan() return an RDD of Row for external libraries?

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55071018 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala --- @@ -58,18 +57,29 @@ import

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55070549 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -173,16 +173,17 @@ private[sql] object PhysicalRDD {

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55069011 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala --- @@ -103,7 +103,7 @@ object DataType { /** Given the string

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55067756 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala --- @@ -92,19 +96,61 @@ object ResolvedDataSource

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/11509#discussion_r55067502 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/CommitFailureTestRelationSuite.scala --- @@ -1,104 +0,0 @@ -/* - * Licensed to the

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192081129 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192081106 **[Test build #52439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52439/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192081128 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192076770 **[Test build #52439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52439/consoleFull)** for PR 11509 at commit

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/11509#issuecomment-192076357 @rxin @nongli @cloud-fan @liancheng @yhuai

[GitHub] spark pull request: [SPARK-13665][SQL] Separate the concerns of Ha...

2016-03-03 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/11509 [SPARK-13665][SQL] Separate the concerns of HadoopFsRelation `HadoopFsRelation` is used for reading most files into Spark SQL. However, today this class mixes the concerns of file management,