[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118462722 @sarutak Thanks for reminding, closing it :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng closed the pull request at: https://github.com/apache/spark/pull/7200
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118382044 This PR has already been merged, right? It's funny that this PR is still open. @liancheng Mind manually closing?
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7199
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118227685 Merging to master. This PR is backported to branch-1.4 by #7200.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118227664 Merging to branch-1.4.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118224506
[Test build #36458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36458/consoleFull) for PR 7200 at commit [`725e9e3`](https://github.com/apache/spark/commit/725e9e31edb072969d0cdf1fcc6c3c750ffc19bd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118224694 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/7200#discussion_r33839073

    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcSourceSuite.scala ---
    @@ -44,7 +44,7 @@ abstract class OrcSuite extends QueryTest with BeforeAndAfterAll {
         import org.apache.spark.sql.hive.test.TestHive.implicits._

         sparkContext
    -      .makeRDD(1 to 100)
    +      .makeRDD(1 to 10)
    --- End diff --

The numbers were increased to 100 to work around SPARK-8501. Now that it's fixed, revert them.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118221445 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118221412
[Test build #36459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36459/console) for PR 7199 at commit [`bb8cd95`](https://github.com/apache/spark/commit/bb8cd95a530e7442a01156f4882f07c09e198c30).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118216196
[Test build #36456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36456/console) for PR 7199 at commit [`ad5b0ae`](https://github.com/apache/spark/commit/ad5b0aeebc4ca907da17b7404ae505416c95fd5c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118216210 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118215013 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118214999
[Test build #36455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36455/consoleFull) for PR 7200 at commit [`9538bff`](https://github.com/apache/spark/commit/9538bff6bf7baea020dae2e57e7c03d680e475bb).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118210027 [Test build #36459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36459/consoleFull) for PR 7199 at commit [`bb8cd95`](https://github.com/apache/spark/commit/bb8cd95a530e7442a01156f4882f07c09e198c30).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118209983 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118209963 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118208853 [Test build #36458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36458/consoleFull) for PR 7200 at commit [`725e9e3`](https://github.com/apache/spark/commit/725e9e31edb072969d0cdf1fcc6c3c750ffc19bd).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118208707 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118208725 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118208254 [Test build #36456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36456/consoleFull) for PR 7199 at commit [`ad5b0ae`](https://github.com/apache/spark/commit/ad5b0aeebc4ca907da17b7404ae505416c95fd5c).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118208239 [Test build #36455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36455/consoleFull) for PR 7200 at commit [`9538bff`](https://github.com/apache/spark/commit/9538bff6bf7baea020dae2e57e7c03d680e475bb).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118207969 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118207955 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118208036 @zhzhan Thanks for the review! Updated. Actually, while writing the Javadoc of `OrcFileOperator.getFileReader`, I feel the semantics of this method are kinda weird and need some refactoring. But let's leave this to a follow-up PR, since 1.4.1 RC2 is being cut soon and I'd like to include this one in 1.4.1.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118207963 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118207976 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118195319 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118195239
[Test build #36437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36437/console) for PR 7199 at commit [`a290221`](https://github.com/apache/spark/commit/a290221cb9bef1f58795e94c29a25fc4bc699628).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118190051 Some minor comments. Overall, LGTM.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user zhzhan commented on a diff in the pull request: https://github.com/apache/spark/pull/7200#discussion_r33831074

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileOperator.scala ---
    @@ -24,30 +24,58 @@ import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector

     import org.apache.spark.Logging
     import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.AnalysisException
     import org.apache.spark.sql.hive.HiveMetastoreTypes
     import org.apache.spark.sql.types.StructType

    -private[orc] object OrcFileOperator extends Logging{
    -  def getFileReader(pathStr: String, config: Option[Configuration] = None ): Reader = {
    +private[orc] object OrcFileOperator extends Logging {
    +  // TODO Needs to consider all files when schema evolution is taken into account.
    +  def getFileReader(basePath: String, config: Option[Configuration] = None): Option[Reader] = {
    +    def isWithNonEmptySchema(path: Path, reader: Reader): Boolean = {
    +      reader.getObjectInspector match {
    +        case oi: StructObjectInspector if oi.getAllStructFieldRefs.size() > 0 =>
    +          true
    +        case oi: StructObjectInspector if oi.getAllStructFieldRefs.size() == 0 =>
    +          logInfo(
    +            s"ORC file $path has empty schema, it probably contains no rows. " +
    +              "Trying to read another ORC file to figure out the schema.")
    +          false
    +        case _ => false
    --- End diff --

In what situation will the third case happen? If it never happens, can we collapse the 2nd and 3rd cases?
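For readers following the review question above, here is a minimal sketch of what collapsing the second and third cases might look like. It is not the code from the PR: `Inspector`, `StructInspector`, `PrimitiveInspector`, `FakeReader`, and `SchemaProbe` are hypothetical stand-ins for Hive's object inspectors and the ORC `Reader`, so the snippet runs without Hive or Spark on the classpath. Note that a collapsed branch would also log for non-struct inspectors, which the three-case version does not.

```scala
// Hypothetical stand-ins for Hive's ObjectInspector hierarchy (assumption:
// only "struct with N fields" versus "anything else" matters here).
sealed trait Inspector
final case class StructInspector(fieldNames: Seq[String]) extends Inspector
case object PrimitiveInspector extends Inspector

// Hypothetical stand-in for an ORC reader: a file path plus its inspector.
final case class FakeReader(path: String, inspector: Inspector)

object SchemaProbe {
  // True only for a struct inspector with at least one field. The
  // empty-struct case and the non-struct case share one fallback branch.
  def hasNonEmptySchema(reader: FakeReader): Boolean = reader.inspector match {
    case StructInspector(fields) if fields.nonEmpty => true
    case _ =>
      println(s"ORC file ${reader.path} has no usable schema, trying another file.")
      false
  }

  // Returns the first reader whose schema is non-empty, if there is one.
  def firstUsableReader(readers: Seq[FakeReader]): Option[FakeReader] =
    readers.find(hasNonEmptySchema)
}
```

Returning an `Option` here mirrors the new `Option[Reader]` return type in the diff: a directory that contains only empty files yields `None` rather than a reader with an empty schema.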
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118187344 @liancheng Because in Spark, we will not create the ORC file if there are no records. This only happens with ORC files created by Hive, right?
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118185793 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118185781
[Test build #36439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36439/consoleFull) for PR 7200 at commit [`0fa25af`](https://github.com/apache/spark/commit/0fa25af89cd1823760f25d7e2b1d302ae9d57ae0).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118184403 [Test build #36439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36439/consoleFull) for PR 7200 at commit [`0fa25af`](https://github.com/apache/spark/commit/0fa25af89cd1823760f25d7e2b1d302ae9d57ae0).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118183923 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118183931 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7200#issuecomment-118183822 cc @yhuai @zhzhan
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/7200

    [SPARK-8501] [SQL] Avoids reading schema from empty ORC files (backport to 1.4)

    This PR backports #7199 to branch-1.4

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-8501-for-1.4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7200.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7200

commit 0fa25af89cd1823760f25d7e2b1d302ae9d57ae0
Author: Cheng Lian
Date: 2015-07-02T21:57:38Z

    Avoids reading schema from empty ORC files
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118183221 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118181161 [Test build #36437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36437/consoleFull) for PR 7199 at commit [`a290221`](https://github.com/apache/spark/commit/a290221cb9bef1f58795e94c29a25fc4bc699628).
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118180967 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118180975 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118180385 cc @yhuai @zhzhan
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118180044 Merged build started.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7199#issuecomment-118180026 Merged build triggered.
[GitHub] spark pull request: [SPARK-8501] [SQL] Avoids reading schema from ...
GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/7199

    [SPARK-8501] [SQL] Avoids reading schema from empty ORC files

    ORC writes empty schema (`struct<>`) to ORC files containing zero rows. This is OK for Hive since the table schema is managed by the metastore. But it causes trouble when reading raw ORC files via Spark SQL since we have to discover the schema from the files.

    Notice that the ORC data source always avoids writing empty ORC files, but it's still problematic when reading Hive tables which contain empty part-files.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-8501

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7199.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7199

commit c3a4623700ee1c34f43d1716164354589a0493f4
Author: Cheng Lian
Date: 2015-07-02T21:57:38Z

    Avoids reading schema from empty ORC files
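
For readers skimming the thread, the idea in the PR description above (skip part-files whose schema is the empty `struct<>` when discovering a schema) can be illustrated with a small sketch. This is not the code from the PR and does not use Spark or an ORC reader: schemas are modeled as plain type strings, and the names `discover_schema` and `EMPTY_STRUCT` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Optional

# Schema string that, per the PR description, ORC writes for zero-row files.
EMPTY_STRUCT = "struct<>"


def discover_schema(part_file_schemas: Iterable[str]) -> Optional[str]:
    """Return the first non-empty schema string, or None if none exists.

    Hypothetical helper: each element stands in for the schema that would be
    read from the footer of one part-file.
    """
    for schema in part_file_schemas:
        # Skip files that carry only the empty struct; they say nothing
        # about the real columns of the table.
        if schema.strip() != EMPTY_STRUCT:
            return schema
    return None


if __name__ == "__main__":
    schemas = ["struct<>", "struct<a:int,b:string>", "struct<>"]
    print(discover_schema(schemas))
```

Returning `None` when every part-file is empty leaves the caller to decide what to do in that case, for example falling back to a schema managed elsewhere, as the description notes Hive does via the metastore.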