[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
I got it. I'll close this approach.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/22134
  
I think it's premature to introduce this. The extra layer of abstraction 
actually makes it more difficult to reason about what's going on. We don't have 
that many data sources that require flexibility here, and we can always add the 
flexibility if needed later.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94902/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22134
  
**[Test build #94902 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94902/testReport)**
 for PR 22134 at commit 
[`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
@gengliangwang . As of now, it's not possible to use that full class name 
for the Hive table saved as `com.databricks.spark.avro`. So, we are trying to 
find this way (this PR) or that way (maybe your PR).

> Also, besides the Databricks avro/csv repo, users can just provide their 
full class name in specifying the file format, or short name if it doesn't not 
conflict with built-in ones. I don't think they need such control.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22134
  
> The purpose of this PR is RESET to override the built-in 
backwardCompatibilityMap. Before this PR, we don't have any control over the 
built-in mapping. We need to allow the users can take a full controlability.

I don't think it is user friendly for such RESET. 
Also, besides the Databricks avro/csv repo, users can just provide their 
full class name in specifying the file format, or short name if it doesn't not 
conflict with built-in ones. I don't think they need such control.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
@gengliangwang and @gatorsmile .

First of all, the internal 3rd party mapping should be removed in Apache 
Spark 3.0. Please consider that. Also, with this PR, we can remove 
`com.databricks.spark.avro` in Spark 2.4, too. If you want, I can remove that 
in this PR, but I just want to keep this PR simplest.

And, we don't need to `unset` here. The purpose of this PR is *RESET* to 
override the built-in `backwardCompatibilityMap`. Before this PR, we don't have 
any control over the built-in mapping. We need to allow the users can take a 
full controlability.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22134
  
Do we need it in the current stage?  Regarding UX, it looks complex to end 
users. I am unable to remember the names. It is very easy to provide a wrong 
class name. 
```
"spark.sql.datasource.map.org.apache.spark.sql.avro" -> 
testDataSource.getCanonicalName,
"spark.sql.datasource.map.com.databricks.spark.csv" -> 
testDataSource.getCanonicalName,
"spark.sql.datasource.map.com.databricks.spark.avro" -> 
testDataSource.getCanonicalName) 
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22134
  
**[Test build #94902 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94902/testReport)**
 for PR 22134 at commit 
[`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2285/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
Retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94889/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22134
  
**[Test build #94889 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94889/testReport)**
 for PR 22134 at commit 
[`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
Could you review this, @tgravescs , @cloud-fan , @gatorsmile , @HyukjinKwon 
?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22134
  
**[Test build #94889 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94889/testReport)**
 for PR 22134 at commit 
[`e477a8e`](https://github.com/apache/spark/commit/e477a8ed6d2ea331be357a6fbbb3d55c504971b1).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2273/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22134
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22134
  
Ping, @gengliangwang . I made a different approach from #22133 . This will 
be more general.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org