[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71896/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Merged build finished. Test PASSed.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16545 **[Test build #71896 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71896/testReport)** for PR 16545 at commit [`15f6759`](https://github.com/apache/spark/commit/15f6759f4d9c111a4876928714da052652056c4c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16545 **[Test build #71896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71896/testReport)** for PR 16545 at commit [`15f6759`](https://github.com/apache/spark/commit/15f6759f4d9c111a4876928714da052652056c4c).
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Build finished. Test PASSed.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71413/ Test PASSed.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16545 **[Test build #71413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71413/testReport)** for PR 16545 at commit [`6d1defb`](https://github.com/apache/spark/commit/6d1defb57407a12c6bf6020ed18cb2249328e435). * This patch passes all tests. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16545 **[Test build #71413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71413/testReport)** for PR 16545 at commit [`6d1defb`](https://github.com/apache/spark/commit/6d1defb57407a12c6bf6020ed18cb2249328e435).
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16545
1. Yes, my issue is only with the internal method/doc name; it is just a minor improvement. After all, the partitioned and non-partitioned cases are not handled differently.
2. A file that merely matches the same prefix being deleted: this doesn't happen, so that is OK. It seems `deleteMatchingPrefix` could also cause some confusion; maybe `deleteMatchingPaths` is more appropriate.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16545 So you're saying it works fine, you just take issue with the internal method/doc name? OK, but in the non-partitioned case, deleting everything is also a matter of deleting `/a*`. Implicitly there is one partition in the sense used here. I just don't see a big deal here. But the example in the comment you added seems to refer to a different case, about a file matching the same prefix being deleted?
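As a rough illustration of the point in this comment (an empty partition spec behaves like a single implicit partition covering the whole table path), here is a minimal sketch in plain Scala. It is not Spark's implementation: `PrefixDeleteSketch`, `staticPrefix`, and `pathsToDelete` are hypothetical names invented for this example, and paths are treated as plain strings rather than Hadoop `Path` objects.

```scala
// Hypothetical sketch only; none of these names exist in Spark.
object PrefixDeleteSketch {

  /** Append one "key=value" directory per static partition column to the table root. */
  def staticPrefix(root: String, spec: Seq[(String, String)]): String =
    spec.foldLeft(root) { case (path, (key, value)) => s"$path/$key=$value" }

  /**
   * Select the existing paths an overwrite would remove: the prefix directory
   * itself and anything beneath it. Matching on prefix + "/" keeps sibling
   * directories such as "/path/a=10" or "/path/t_1" out of the result.
   */
  def pathsToDelete(
      root: String,
      spec: Seq[(String, String)],
      existing: Seq[String]): Seq[String] = {
    val prefix = staticPrefix(root, spec)
    existing.filter(p => p == prefix || p.startsWith(prefix + "/"))
  }
}
```

Under these assumptions, passing an empty spec makes the prefix equal to the table root, so the non-partitioned case goes through the same code path as the partitioned one and simply selects everything under the root.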
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16545 Sorry, that was not the point; the earlier example was confusing, so I have replaced it with another one:

    val df = spark.read.json("/path/a")
    val df1 = spark.read.json("/path/b")
    df.createOrReplaceTempView("x")
    df1.createOrReplaceTempView("y")
    spark.sql("insert overwrite table x select * from y")

Executing this SQL hits the function `deleteMatchingPartitions` here: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L104

The point is that `/path/a` is not a partition path (a partition path would look like `/path/a=1`), yet we still delete it through `deleteMatchingPartitions`. From the name `MatchingPartitions`, one may think the method applies only to partition paths (e.g. `/path/a=1`). If it also applies to non-partition paths, the name is confusing. Wouldn't it be clearer to rename `deleteMatchingPartitions` to `deleteMatchingPrefix`?
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16545 Are you saying the problem arises when the path to one dataset/table is a prefix of another?
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16545 Thanks!

    val df = spark.read.json("/path/jsonfile")
    val df1 = spark.read.json("/path/jsonfile_1")
    df.createOrReplaceTempView("t")
    df1.createOrReplaceTempView("t1")
    spark.sql("insert overwrite table t select * from t1")

Executing this SQL hits the function `deleteMatchingPartitions` here: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala#L104

Since `/path/jsonfile` is not a partition path, isn't the name `deleteMatchingPartitions` inappropriate here?
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16545 Why would a file exist there? if it doesn't exist in any normal operation then I don't see a good motive for changing this, as it works as designed already.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16545 Thanks! Even a non-partition file will also be deleted, so I think this change makes things clearer.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16545 I don't think this kind of thing is worth changing; the docs look correct. When would you have other files as the peer of a partition directory?
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71190/ Test FAILed.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16545 Merged build finished. Test FAILed.
[GitHub] spark issue #16545: [SPARK-19166][SQL]rename from InsertIntoHadoopFsRelation...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16545 **[Test build #71190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71190/testReport)** for PR 16545 at commit [`0622c04`](https://github.com/apache/spark/commit/0622c04c6129ef699ca0d8f6907d8bbc6d025387).