[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4896


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77405690
  
  [Test build #28298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28298/consoleFull) for PR 4896 at commit [`45e023e`](https://github.com/apache/spark/commit/45e023e4951b3c075c306cf7741b4c58716d5e38).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77388603
  
A user may use jsonFile to get a DF and then register it as a temp table. 
He/she may then try to insert new data into this temp table. Since jsonFile is 
not backed by the JSON data source right now, he/she will see an exception 
saying that Spark SQL cannot find a physical plan for the given logical plan.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77419193
  
  [Test build #28298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28298/consoleFull) for PR 4896 at commit [`45e023e`](https://github.com/apache/spark/commit/45e023e4951b3c075c306cf7741b4c58716d5e38).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77419208
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28298/
Test PASSed.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4896#discussion_r25858313
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/json/JsonSuite.scala ---
@@ -551,6 +551,32 @@ class JsonSuite extends QueryTest {
     jsonDF.registerTempTable("jsonTable")
   }
 
+  test("jsonFile should be based on JSONRelation") {
+    val file = getTempFilePath("json")
+    val path = file.toString
+    sparkContext.parallelize(1 to 100).map(i => s"""{"a": 1, "b": "str$i"}""").saveAsTextFile(path)
+    val jsonDF = jsonFile(path, 0.49)
+
+    val analyzed = jsonDF.queryExecution.analyzed
+    assert(
+      analyzed.isInstanceOf[LogicalRelation],
+      "The DataFrame returned by jsonFile should be based on JSONRelation.")
+    val relation = analyzed.asInstanceOf[LogicalRelation].relation
+    assert(
+      relation.isInstanceOf[JSONRelation],
+      "The DataFrame returned by jsonFile should be based on JSONRelation.")
+    assert(relation.asInstanceOf[JSONRelation].path === path)
+    assert(Math.round(relation.asInstanceOf[JSONRelation].samplingRatio) === 0L)
--- End diff --

```scala
import org.scalatest.MustMatchers._
relation.asInstanceOf[JSONRelation].samplingRatio must be (0.49 +- 0.01)
```
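
For readers following the review, here is a minimal plain-Scala sketch of why a tolerance check like the one suggested above is stricter than the `Math.round(...) === 0L` assertion in the diff. It is not part of the PR and does not use ScalaTest; the object `SamplingRatioCheck` and its helper names are hypothetical, and the sample ratios are chosen only for illustration.

```scala
// Plain-Scala sketch; SamplingRatioCheck and its helpers are hypothetical names.
object SamplingRatioCheck {
  // Mirrors the assertion in the diff: true whenever the ratio rounds to 0.
  def roundsToZero(ratio: Double): Boolean = Math.round(ratio) == 0L

  // Tolerance-based comparison, analogous to `must be (expected +- tolerance)`.
  def withinTolerance(actual: Double, expected: Double, tolerance: Double): Boolean =
    math.abs(actual - expected) <= tolerance

  def main(args: Array[String]): Unit = {
    val expected = 0.49
    // Compare both checks on the intended ratio and on two unintended ones.
    for (ratio <- Seq(0.49, 0.30, 0.01)) {
      println(s"ratio=$ratio roundsToZero=${roundsToZero(ratio)} " +
        s"withinTolerance=${withinTolerance(ratio, expected, 0.01)}")
    }
  }
}
```

The rounding check accepts any ratio that rounds to zero, so it would not notice if a different sampling ratio were passed through; the tolerance check accepts only values close to 0.49.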





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-05 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77353953
  
@yhuai What do you mean by `Otherwise, users cannot insert data into the DF 
returned by jsonFile.`?





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77280366
  
  [Test build #28272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28272/consoleFull) for PR 4896 at commit [`2e8734e`](https://github.com/apache/spark/commit/2e8734ee5082907b4815233283a0ea7388d60cc2).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77289567
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28272/
Test PASSed.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-04 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77289559
  
  [Test build #28272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28272/consoleFull) for PR 4896 at commit [`2e8734e`](https://github.com/apache/spark/commit/2e8734ee5082907b4815233283a0ea7388d60cc2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...

2015-03-04 Thread yhuai
GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/4896

[SPARK-6163][SQL] jsonFile should be backed by the data source API

jira: https://issues.apache.org/jira/browse/SPARK-6163

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark SPARK-6163

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4896.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4896


commit 92a4a338ea7709887ca3fd4e478e8ab454cf9380
Author: Yin Huai yh...@databricks.com
Date:   2015-03-04T23:25:54Z

Test.

commit 2e8734ee5082907b4815233283a0ea7388d60cc2
Author: Yin Huai yh...@databricks.com
Date:   2015-03-05T00:00:06Z

Use JSON data source for jsonFile.



