[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 I got it. I'll close this approach. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user rxin commented on the issue: https://github.com/apache/spark/pull/22134 I think it's premature to introduce this. The extra layer of abstraction actually makes it more difficult to reason about what's going on. We don't have that many data sources that require flexibility here, and we can always add the flexibility if needed later.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94902/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22134 **[Test build #94902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94902/testReport)** for PR 22134 at commit [`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 @gengliangwang. As of now, it's not possible to use that full class name for a Hive table saved as `com.databricks.spark.avro`. So we are trying to find a solution either this way (this PR) or that way (maybe your PR). > Also, besides the Databricks avro/csv repos, users can just provide the full class name when specifying the file format, or the short name if it does not conflict with the built-in ones. I don't think they need such control.
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22134 > The purpose of this PR is RESET to override the built-in backwardCompatibilityMap. Before this PR, we don't have any control over the built-in mapping. We need to give users full control. I don't think such a RESET is user friendly. Also, besides the Databricks avro/csv repos, users can just provide the full class name when specifying the file format, or the short name if it does not conflict with the built-in ones. I don't think they need such control.
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 @gengliangwang and @gatorsmile. First of all, the internal third-party mapping should be removed in Apache Spark 3.0. Please consider that. Also, with this PR, we can remove `com.databricks.spark.avro` in Spark 2.4, too. If you want, I can remove that in this PR, but I just want to keep this PR as simple as possible. And we don't need to `unset` here. The purpose of this PR is *RESET*: to override the built-in `backwardCompatibilityMap`. Before this PR, we don't have any control over the built-in mapping. We need to give users full control over it.
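To make the "override the built-in map" idea concrete, here is a minimal standalone sketch in plain Scala (no Spark dependency). It is not the code from this PR. The `spark.sql.datasource.map.` prefix comes from the conf keys quoted in this thread; the object name `DataSourceNameMapping`, the helper names, the entries in `builtIn`, and the class `com.example.MyAvroSource` are assumptions made only for illustration.

```scala
object DataSourceNameMapping {
  // Prefix taken from the conf keys quoted in this thread.
  val ConfPrefix = "spark.sql.datasource.map."

  // Illustrative stand-in for the built-in backwardCompatibilityMap.
  // These entries are assumptions, not a copy of Spark's actual table.
  val builtIn: Map[String, String] = Map(
    "com.databricks.spark.avro" -> "org.apache.spark.sql.avro.AvroFileFormat",
    "com.databricks.spark.csv" ->
      "org.apache.spark.sql.execution.datasources.csv.CSVFileFormat"
  )

  // Collect user overrides from a flat key/value configuration,
  // keeping only keys that start with the mapping prefix.
  def userMappings(conf: Map[String, String]): Map[String, String] =
    conf.collect {
      case (key, target) if key.startsWith(ConfPrefix) =>
        key.stripPrefix(ConfPrefix) -> target
    }

  // User entries win over built-in ones; unmapped names pass through unchanged.
  def resolve(provider: String, conf: Map[String, String]): String =
    (builtIn ++ userMappings(conf)).getOrElse(provider, provider)
}

// Hypothetical user configuration that redirects the legacy Avro name.
val conf = Map(
  "spark.sql.datasource.map.com.databricks.spark.avro" -> "com.example.MyAvroSource"
)
val resolvedAvro = DataSourceNameMapping.resolve("com.databricks.spark.avro", conf)
val resolvedCsv = DataSourceNameMapping.resolve("com.databricks.spark.csv", conf)
```

In this sketch the user-supplied entry replaces the built-in target for `com.databricks.spark.avro`, while `com.databricks.spark.csv` still falls back to the built-in entry. Nothing in it checks that the configured class exists, so a mistyped class name would only surface later, when the class is loaded.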
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22134 Do we need it at the current stage? Regarding UX, it looks complex to end users. I am unable to remember the names. It is very easy to provide a wrong class name.
```
"spark.sql.datasource.map.org.apache.spark.sql.avro" -> testDataSource.getCanonicalName,
"spark.sql.datasource.map.com.databricks.spark.csv" -> testDataSource.getCanonicalName,
"spark.sql.datasource.map.com.databricks.spark.avro" -> testDataSource.getCanonicalName)
```
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22134 **[Test build #94902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94902/testReport)** for PR 22134 at commit [`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2285/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Merged build finished. Test PASSed.
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 Retest this please.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94889/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22134 **[Test build #94889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94889/testReport)** for PR 22134 at commit [`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 Could you review this, @tgravescs, @cloud-fan, @gatorsmile, @HyukjinKwon?
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22134 **[Test build #94889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94889/testReport)** for PR 22134 at commit [`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2273/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22134 Merged build finished. Test PASSed.
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22134 Ping, @gengliangwang. I took a different approach from #22133. This one is more general.