[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136610247 Adding a PartitionValues.empty does not cover all problems. I will close this PR and investigate other approaches. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user zhzhan closed the pull request at: https://github.com/apache/spark/pull/8547
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/8547#discussion_r38391122

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala ---
@@ -436,7 +436,8 @@ abstract class HadoopFsRelation private[sql](maybePartitionSpec: Option[Partitio
           Try(fs.listStatus(qualified)).getOrElse(Array.empty)
         }.filterNot { status =>
           val name = status.getPath.getName
-          name.toLowerCase == "_temporary" || name.startsWith(".")
+          // Is it safe to replace "_temporary" to "_"?
--- End diff --

Thanks for the comments. It seems there are a lot of corner cases to cover in the test cases. For example, the first list of paths below is valid, but the second is not:

1st:
"hdfs://host:9000/path/_temporary",
"hdfs://host:9000/path/a=10/b=20",
"hdfs://host:9000/path/_temporary/path",

2nd:
"hdfs://host:9000/path/_temporary",
"hdfs://host:9000/path/a=10/b=20",
"hdfs://host:9000/path/path1",

Adding a PartitionValues.empty does not solve the problem. I will close this PR and investigate other approaches.
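To make the two path lists in the comment above concrete, here is a minimal sketch of an "all partitioned or none partitioned" check. It is not Spark's partition discovery code: `TableDirCheck`, `isPartitionSegment`, `relativeSegments`, and `isValid` are hypothetical names, and the sketch assumes that anything under `_temporary` is ignored before the check runs.

```scala
object TableDirCheck {
  // Hypothetical helper: a path segment counts as a partition directory
  // when it has the form "column=value" with both sides non-empty.
  def isPartitionSegment(segment: String): Boolean = {
    val idx = segment.indexOf('=')
    idx > 0 && idx < segment.length - 1
  }

  // Segments of `path` below `base`, e.g. ".../a=10/b=20" -> Seq("a=10", "b=20").
  def relativeSegments(base: String, path: String): Seq[String] =
    path.stripPrefix(base).split("/").filter(_.nonEmpty).toSeq

  // Assumption: paths under "_temporary" are dropped first. The remaining
  // paths must be either all partitioned or all unpartitioned.
  def isValid(base: String, paths: Seq[String]): Boolean = {
    val visible = paths
      .map(relativeSegments(base, _))
      .filterNot(_.headOption.exists(_.toLowerCase == "_temporary"))
    val partitioned = visible.map(_.exists(isPartitionSegment))
    partitioned.forall(identity) || !partitioned.exists(identity)
  }

  def main(args: Array[String]): Unit = {
    val base = "hdfs://host:9000/path"
    val first = Seq(s"$base/_temporary", s"$base/a=10/b=20", s"$base/_temporary/path")
    val second = Seq(s"$base/_temporary", s"$base/a=10/b=20", s"$base/path1")
    println(isValid(base, first))
    println(isValid(base, second))
  }
}
```

Under these assumptions the first list passes, because only the partitioned path remains once `_temporary` is dropped, while the second fails because `path1` is unpartitioned and sits next to `a=10/b=20`.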
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/8547#discussion_r38389890

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala ---
@@ -436,7 +436,8 @@ abstract class HadoopFsRelation private[sql](maybePartitionSpec: Option[Partitio
           Try(fs.listStatus(qualified)).getOrElse(Array.empty)
         }.filterNot { status =>
           val name = status.getPath.getName
-          name.toLowerCase == "_temporary" || name.startsWith(".")
+          // Is it safe to replace "_temporary" to "_"?
--- End diff --

No. We need to preserve files like Parquet summary files (`_metadata` and `_common_metadata`).
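The difference between the two filters can be illustrated with a small standalone sketch. `isIgnored` copies the predicate from the removed line of the diff; `HiddenPathFilter` and `isIgnoredBroad` are hypothetical names, with `isIgnoredBroad` standing in for the broader "skip everything starting with `_`" rule that the code comment asks about.

```scala
object HiddenPathFilter {
  // Same predicate as the removed line in the diff: skip the "_temporary"
  // directory and dot-prefixed hidden files, keep everything else.
  def isIgnored(name: String): Boolean =
    name.toLowerCase == "_temporary" || name.startsWith(".")

  // Hypothetical broader rule from the question in the diff:
  // skip every name that starts with "_" (or ".").
  def isIgnoredBroad(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  def main(args: Array[String]): Unit = {
    val names = Seq("_temporary", ".hidden", "_metadata", "_common_metadata", "a=10")
    println(names.filterNot(isIgnored))
    println(names.filterNot(isIgnoredBroad))
  }
}
```

With the existing predicate, `_metadata`, `_common_metadata`, and `a=10` survive the filter. With the broader rule only `a=10` survives, so the Parquet summary files would be dropped.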
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136591421 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41855/ Test FAILed.
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136591420 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136591392

[Test build #41855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41855/console) for PR 8547 at commit [`be5522d`](https://github.com/apache/spark/commit/be5522d24a39da20fc8d08b13e15def8e9139cc3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136589274 [Test build #41855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41855/consoleFull) for PR 8547 at commit [`be5522d`](https://github.com/apache/spark/commit/be5522d24a39da20fc8d08b13e15def8e9139cc3).
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136588350 Merged build started.
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8547#issuecomment-136588337 Merged build triggered.
[GitHub] spark pull request: [SPARK-10304][SQL]: throw error when the table...
GitHub user zhzhan opened a pull request: https://github.com/apache/spark/pull/8547

[SPARK-10304][SQL]: throw error when the table directory is invalid

Throw an error if the directory of a table is invalid. A directory is considered valid only if either all files in it are partitioned or none of them are.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhzhan/spark SPARK-10304

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8547.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #8547

commit be5522d24a39da20fc8d08b13e15def8e9139cc3
Author: Zhan Zhang
Date: 2015-09-01T05:01:18Z

SPARK-10304: throw error when the table directory is invalid