[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4896

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77405690

[Test build #28298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28298/consoleFull) for PR 4896 at commit [`45e023e`](https://github.com/apache/spark/commit/45e023e4951b3c075c306cf7741b4c58716d5e38).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77388603

A user may use jsonFile to get a DF and then register it as a temp table. He/she may then try to insert new data into this temp table. Since jsonFile is not backed by the JSON data source right now, he/she will see an exception saying that Spark SQL cannot find a physical plan for the given logical plan.
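The failure mode described above can be illustrated with a toy model in plain Scala. This is only a sketch: `ToyPlanner`, `InsertableSource`, `PlainRows`, and `InsertInto` are hypothetical names invented for illustration and are not Spark's classes. The assumption is simply that a planner can execute an insert only when the target is backed by something that accepts inserts, and otherwise reports that no physical plan exists.

```scala
object ToyPlanner {
  // Hypothetical stand-ins for illustration; these are NOT Spark's classes.
  sealed trait Relation
  // A table backed by a data source that knows how to accept inserts.
  case class InsertableSource(path: String) extends Relation
  // A table backed only by already-computed rows, with no insert support.
  case class PlainRows(rows: Seq[String]) extends Relation

  // A logical "insert these rows into that target" request.
  case class InsertInto(target: Relation, rows: Seq[String])

  // The planner only has a strategy for insertable targets; anything else
  // falls through to the "no physical plan" error case.
  def plan(insert: InsertInto): Either[String, String] = insert.target match {
    case InsertableSource(path) => Right(s"write ${insert.rows.size} rows to $path")
    case other                  => Left(s"no physical plan found for insert into $other")
  }
}
```

In this model, a table built from `PlainRows` plays the role of the DF returned by jsonFile before the change, and `InsertableSource` plays the role of a DF backed by the JSON data source.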
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77419193

[Test build #28298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28298/consoleFull) for PR 4896 at commit [`45e023e`](https://github.com/apache/spark/commit/45e023e4951b3c075c306cf7741b4c58716d5e38).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77419208

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28298/
Test PASSed.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4896#discussion_r25858313

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/json/JsonSuite.scala ---
@@ -551,6 +551,32 @@ class JsonSuite extends QueryTest {
     jsonDF.registerTempTable("jsonTable")
   }

+  test("jsonFile should be based on JSONRelation") {
+    val file = getTempFilePath("json")
+    val path = file.toString
+    sparkContext.parallelize(1 to 100).map(i => s"""{"a": 1, "b": "str$i"}""").saveAsTextFile(path)
+    val jsonDF = jsonFile(path, 0.49)
+
+    val analyzed = jsonDF.queryExecution.analyzed
+    assert(
+      analyzed.isInstanceOf[LogicalRelation],
+      "The DataFrame returned by jsonFile should be based on JSONRelation.")
+    val relation = analyzed.asInstanceOf[LogicalRelation].relation
+    assert(
+      relation.isInstanceOf[JSONRelation],
+      "The DataFrame returned by jsonFile should be based on JSONRelation.")
+    assert(relation.asInstanceOf[JSONRelation].path === path)
+    assert(Math.round(relation.asInstanceOf[JSONRelation].samplingRatio) === 0L)
--- End diff --

```scala
import org.scalatest.MustMatchers._
relation.asInstanceOf[JSONRelation].samplingRatio must be (0.49 +- 0.01)
```
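To make the difference between the two assertions concrete, here is a minimal sketch in plain Scala with no Spark or ScalaTest dependency. `SamplingRatioCheck`, `roundsToZero`, and `withinTolerance` are hypothetical helper names introduced only for this comparison; the tolerance check is a hand-written approximation of what `0.49 +- 0.01` expresses, not ScalaTest's implementation.

```scala
object SamplingRatioCheck {
  // Rounding-based check, in the style of the assertion in the diff:
  // it holds for any ratio that rounds to 0, not only for 0.49.
  def roundsToZero(ratio: Double): Boolean = Math.round(ratio) == 0L

  // Tolerance-based check, in the spirit of `0.49 +- 0.01`:
  // it holds only when the value is close to the expected one.
  def withinTolerance(actual: Double, expected: Double, tolerance: Double): Boolean =
    math.abs(actual - expected) <= tolerance

  def main(args: Array[String]): Unit = {
    // 0.49 is the ratio passed to jsonFile in the test; 0.1 and 1.0 are
    // arbitrary wrong values chosen for comparison.
    Seq(0.49, 0.1, 1.0).foreach { r =>
      println(s"ratio=$r roundsToZero=${roundsToZero(r)} " +
        s"withinTolerance=${withinTolerance(r, 0.49, 0.01)}")
    }
  }
}
```

A wrong ratio such as 0.1 still satisfies the rounding check but fails the tolerance check, which is why the tolerance form is the stricter test of whether the sampling ratio was passed through.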
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77353953

@yhuai What do you mean by `Otherwise, users cannot insert data into the DF returned by jsonFile.`?
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77280366

[Test build #28272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28272/consoleFull) for PR 4896 at commit [`2e8734e`](https://github.com/apache/spark/commit/2e8734ee5082907b4815233283a0ea7388d60cc2).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77289567

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28272/
Test PASSed.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4896#issuecomment-77289559

[Test build #28272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28272/consoleFull) for PR 4896 at commit [`2e8734e`](https://github.com/apache/spark/commit/2e8734ee5082907b4815233283a0ea7388d60cc2).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6163][SQL] jsonFile should be backed by...
GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/4896

[SPARK-6163][SQL] jsonFile should be backed by the data source API

jira: https://issues.apache.org/jira/browse/SPARK-6163

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark SPARK-6163

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4896.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4896

commit 92a4a338ea7709887ca3fd4e478e8ab454cf9380
Author: Yin Huai yh...@databricks.com
Date: 2015-03-04T23:25:54Z

    Test.

commit 2e8734ee5082907b4815233283a0ea7388d60cc2
Author: Yin Huai yh...@databricks.com
Date: 2015-03-05T00:00:06Z

    Use JSON data source for jsonFile.