[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13189 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...
Github user davies commented on the issue: https://github.com/apache/spark/pull/13189 LGTM, merging this into master and 2.0, thanks!
[GitHub] spark issue #13546: [SPARK-15808] [SQL] File Format Checking When Appending ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13546 cc @cloud-fan
[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12836 **[Test build #60223 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60223/consoleFull)** for PR 12836 at commit [`afa385d`](https://github.com/apache/spark/commit/afa385dde3dbbebb2e7f605566e8827187fb434e).
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60219/ Test PASSed.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Merged build finished. Test PASSed.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60219/consoleFull)** for PR 13518 at commit [`bc28f41`](https://github.com/apache/spark/commit/bc28f4112ca9eca6a9f1602a891dd0388fa3185c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13531: [SPARK-15654] [SQL] fix non-splitable files for t...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/13531#discussion_r66383788
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala ---
@@ -298,6 +309,28 @@ trait FileFormat {
 }
 /**
+ * The base class file format that is based on text file.
+ */
+abstract class TextBasedFileFormat extends FileFormat {
+  private var codecFactory: CompressionCodecFactory = null
+  override def isSplitable(
+      sparkSession: SparkSession,
+      options: Map[String, String],
+      path: Path): Boolean = {
+    if (codecFactory == null) {
+      synchronized {
+        if (codecFactory == null) {
+          codecFactory = new CompressionCodecFactory(
--- End diff --
Sorry, I do not understand your question or suggestion.
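For readers following the diff above, the nested `if (codecFactory == null)` checks around `synchronized` are the double-checked locking idiom for lazy initialization. The sketch below illustrates that idiom in isolation and is not the code from the pull request: `ExpensiveFactory`, `FactoryStats`, and `LazyFactoryHolder` are hypothetical names, and the `@volatile` annotation is an addition made here (it does not appear in the quoted excerpt) because the idiom is generally considered safe on the JVM only when the field is volatile.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical counter, used only to observe how many instances get built.
object FactoryStats {
  val created = new AtomicInteger(0)
}

// Hypothetical stand-in for an object that is costly to construct.
final class ExpensiveFactory {
  FactoryStats.created.incrementAndGet()
}

class LazyFactoryHolder {
  // Assumption of this sketch: @volatile, so other threads never observe
  // a partially constructed instance through the unlocked first check.
  @volatile private var factory: ExpensiveFactory = null

  def get(): ExpensiveFactory = {
    if (factory == null) {        // first check, without taking the lock
      synchronized {
        if (factory == null) {    // second check, while holding the lock
          factory = new ExpensiveFactory
        }
      }
    }
    factory
  }
}
```

In Scala a `lazy val` would usually give the same once-only initialization with less code; the explicit form is shown only because it mirrors the shape of the diff.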
[GitHub] spark issue #7841: [SPARK-8992] [SQL] Add pivot to dataframe api
Github user rxin commented on the issue: https://github.com/apache/spark/pull/7841 @aray this pull request was highlighted in http://www.slideshare.net/databricks/deep-dive-into-catalyst-apache-spark-20s-optimizer
[GitHub] spark issue #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked sbt-pom...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13564 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60216/ Test PASSed.
[GitHub] spark issue #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked sbt-pom...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13564 Merged build finished. Test PASSed.
[GitHub] spark issue #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked sbt-pom...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13564 **[Test build #60216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60216/consoleFull)** for PR 13564 at commit [`57199a8`](https://github.com/apache/spark/commit/57199a84a97ad8d324a350e5e71f3888da5a2772). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13567: [Minor][Doc] In Dataset Docs, Remove self link to Datase...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13567 LGTM pending tests. can you update the title to say "[Minor][Doc] In Dataset docs, remove self link to Dataset and add link to Column"
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13555 LGTM, pending jenkins
[GitHub] spark pull request #13563: [SPARK-15826] [CORE] PipedRDD to strictly use UTF...
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/13563#discussion_r66383126
--- Diff: core/src/main/scala/org/apache/spark/rdd/PipedRDD.scala ---
@@ -129,7 +130,7 @@ private[spark] class PipedRDD[T: ClassTag](
       override def run(): Unit = {
         val err = proc.getErrorStream
         try {
-          for (line <- Source.fromInputStream(err).getLines) {
+          for (line <- Source.fromInputStream(err)(StandardCharsets.UTF_8).getLines) {
--- End diff --
@zsxwing : I don't agree that using system default encoding would be a good idea. One can see different result on different systems which would be annoying. A user can run a binary in pipe() operator and let its stderr stream emit data with a character encoding that he/she wishes to.
@srowen : I agree that UTF-8 is a safer default. Making this configurable would be the ideal thing to do.
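As a small standalone illustration of the point discussed above (decoding a stream with an explicit charset rather than the platform default), the sketch below reads lines from an `InputStream` as UTF-8. It is not the PipedRDD code: `Utf8Lines` and `readLines` are hypothetical names, and the stream is an in-memory byte array standing in for a process's stderr.

```scala
import java.io.ByteArrayInputStream
import scala.io.{Codec, Source}

object Utf8Lines {
  // Decodes the bytes as UTF-8 regardless of the JVM's default charset,
  // so the result does not vary from one system to another.
  def readLines(bytes: Array[Byte]): List[String] = {
    val in = new ByteArrayInputStream(bytes)
    try {
      // The codec is passed explicitly instead of relying on the implicit default.
      Source.fromInputStream(in)(Codec.UTF8).getLines().toList
    } finally {
      in.close()
    }
  }
}
```

Without the explicit codec, `Source.fromInputStream` falls back to an implicit `Codec` derived from the platform default, which is the system-dependent behaviour the comment objects to.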
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13555 **[Test build #60222 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60222/consoleFull)** for PR 13555 at commit [`229ca27`](https://github.com/apache/spark/commit/229ca27bb376452127c40345210c99105f6ceee3).
[GitHub] spark pull request #13555: [SPARK-15804][SQL]Include metadata in the toStruc...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/13555#discussion_r66383008
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala ---
@@ -625,6 +625,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext
       }
     }
   }
+
+  test("SPARK-15804: write out the metadata to parquet file") {
+    val df = Seq((1, "abc"), (2, "hello")).toDF("a", "b")
+    val md = new MetadataBuilder().putString("key", "value").build()
+    val dfWithmeta = df.select('a, 'b.as("b", md))
+
+    withTempPath { dir =>
+      val path = s"${dir.getCanonicalPath}"
--- End diff --
Done, Thanks very much.
[GitHub] spark issue #13567: [Minor][Doc] In Dataset Docs, Remove self link to Datase...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13567 **[Test build #60221 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60221/consoleFull)** for PR 13567 at commit [`f219de0`](https://github.com/apache/spark/commit/f219de0bf040f153c9ff56e28f25a5c571d68d76).
[GitHub] spark issue #13567: [Minor][Doc] In Dataset Docs, Remove self link to Datase...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13567 @rxin Done
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60215/ Test PASSed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Merged build finished. Test PASSed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60215/consoleFull)** for PR 13065 at commit [`2732b06`](https://github.com/apache/spark/commit/2732b06ee9843c7492d479c29fff6ce0e9025a89). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13563: [SPARK-15826] [CORE] PipedRDD to strictly use UTF-8 and ...
Github user sadikovi commented on the issue: https://github.com/apache/spark/pull/13563 @tejasapatil Can you also hard-code UTF-8 for `stderr`?
[GitHub] spark issue #13567: [Minor][Doc] Dataset.reduce Scaladoc Link to Dataset
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13567 @rxin Sure
[GitHub] spark issue #13567: [Minor][Doc] Dataset.reduce Scaladoc Link to Dataset
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13567 Alright do you want to update the pr to remove them instead?
[GitHub] spark pull request #13555: [SPARK-15804][SQL]Include metadata in the toStruc...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13555#discussion_r66381875
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala ---
@@ -625,6 +625,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext
       }
     }
   }
+
+  test("SPARK-15804: write out the metadata to parquet file") {
+    val df = Seq((1, "abc"), (2, "hello")).toDF("a", "b")
+    val md = new MetadataBuilder().putString("key", "value").build()
+    val dfWithmeta = df.select('a, 'b.as("b", md))
+
+    withTempPath { dir =>
+      val path = s"${dir.getCanonicalPath}"
--- End diff --
nit: just `dir.getCanonicalPath`, no need to wrap it inside `""`
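The nit above can be checked in isolation: interpolating a single `String` expression with `s"${...}"` produces the same value as the expression itself. This is a minimal sketch with hypothetical names (`PathNit`, `wrapped`, `direct`), not code from the test suite.

```scala
import java.io.File

object PathNit {
  // Form used in the diff: the value is wrapped in a string interpolator.
  def wrapped(dir: File): String = s"${dir.getCanonicalPath}"

  // Form suggested in the review: the String is used directly.
  def direct(dir: File): String = dir.getCanonicalPath
}
```

Since `getCanonicalPath` already returns a `String`, the interpolator adds nothing but an extra string construction.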
[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13558 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60220/ Test FAILed.
[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13558 Merged build finished. Test FAILed.
[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13558 **[Test build #60220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60220/consoleFull)** for PR 13558 at commit [`eedb961`](https://github.com/apache/spark/commit/eedb961ebb141f2bafd1c9798bb09d5de939ea5c). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked sbt-pom...
Github user ScrapCodes commented on the issue: https://github.com/apache/spark/pull/13564 Thanks for doing this. It will definitely save time, and some CPU cycles(energy) for all the first time builds.
[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13558 **[Test build #60220 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60220/consoleFull)** for PR 13558 at commit [`eedb961`](https://github.com/apache/spark/commit/eedb961ebb141f2bafd1c9798bb09d5de939ea5c).
[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/13558 Jenkins, test this please.
[GitHub] spark issue #12913: [SPARK-928][CORE] Add support for Unsafe-based serialize...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/12913 ping @JoshRosen
[GitHub] spark issue #13567: [Minor][Doc] Dataset.reduce Scaladoc Link to Dataset
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/13567 @rxin I'm also biased on not doing self-links, but there are already so many self-links on datasets docs. So we should either remove those or add these to make everything consistent.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60219/consoleFull)** for PR 13518 at commit [`bc28f41`](https://github.com/apache/spark/commit/bc28f4112ca9eca6a9f1602a891dd0388fa3185c).
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/13518 Jenkins retest this please
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Merged build finished. Test PASSed.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60218/ Test PASSed.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60218/consoleFull)** for PR 13518 at commit [`345fdba`](https://github.com/apache/spark/commit/345fdbac36f66c90955e1ee954bcabcf73be). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Merged build finished. Test FAILed.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60218/consoleFull)** for PR 13518 at commit [`345fdba`](https://github.com/apache/spark/commit/345fdbac36f66c90955e1ee954bcabcf73be).
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60217/consoleFull)** for PR 13518 at commit [`fc0fb51`](https://github.com/apache/spark/commit/fc0fb513bc839a53dec59f60b728017ab080d288). * This patch **fails executing the `dev/run-tests` script**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60217/ Test FAILed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60214/ Test FAILed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Merged build finished. Test FAILed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60214/consoleFull)** for PR 13065 at commit [`c9b3eda`](https://github.com/apache/spark/commit/c9b3eda19b9d4f1071d33760bd96d8875566fc47). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked sbt-pom...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13564 **[Test build #60216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60216/consoleFull)** for PR 13564 at commit [`57199a8`](https://github.com/apache/spark/commit/57199a84a97ad8d324a350e5e71f3888da5a2772).
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13518 **[Test build #60217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60217/consoleFull)** for PR 13518 at commit [`fc0fb51`](https://github.com/apache/spark/commit/fc0fb513bc839a53dec59f60b728017ab080d288).
[GitHub] spark issue #13518: [WIP][SPARK-15472][SQL] Add support for writing in `csv`...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/13518 Jenkins retest this please
[GitHub] spark pull request #13518: [WIP][SPARK-15472][SQL] Add support for writing i...
GitHub user lw-lin reopened a pull request:

    https://github.com/apache/spark/pull/13518

    [WIP][SPARK-15472][SQL] Add support for writing in `csv`, `json`, `text` formats in Structured Streaming

## What changes were proposed in this pull request?

This patch adds support for writing in `csv`, `json`, `text` formats in Structured Streaming:

**1. At a high level, this patch forms the following hierarchy** (`text` as an example):

```
                  ↑
         TextOutputWriterBase
          ↗               ↖
BatchTextOutputWriter   StreamingTextOutputWriter
```

```
          ↗                          ↖
BatchTextOutputWriterFactory   StreamingOutputWriterFactory
                                            ↑
                               StreamingTextOutputWriterFactory
```

The `StreamingTextOutputWriter` and the other 'streaming' output writers write data **without** using an `OutputCommitter`. This is the same approach taken by [SPARK-14716](https://github.com/apache/spark/pull/12409).

**2. To support compression, this patch attaches an extension to the path assigned by `FileStreamSink`**, which is slightly different from [SPARK-14716](https://github.com/apache/spark/pull/12409). For example, if we are writing with `gzip` compression and `FileStreamSink` assigns the path `${uuid}` to a text writer, then the file written out will be `${uuid}.txt.gz`, so that when we read the file back we correctly interpret it as `gzip` compressed.

## How was this patch tested?

`FileStreamSinkSuite` is expanded to cover the added `csv`, `json`, `text` formats:

```scala
test(" csv - unpartitioned data - codecs: none/gzip")
test("json - unpartitioned data - codecs: none/gzip")
test("text - unpartitioned data - codecs: none/gzip")
test(" csv - partitioned data - codecs: none/gzip")
test("json - partitioned data - codecs: none/gzip")
test("text - partitioned data - codecs: none/gzip")
test(" csv - unpartitioned writing and batch reading - codecs: none/gzip")
test("json - unpartitioned writing and batch reading - codecs: none/gzip")
test("text - unpartitioned writing and batch reading - codecs: none/gzip")
test(" csv - partitioned writing and batch reading - codecs: none/gzip")
test("json - partitioned writing and batch reading - codecs: none/gzip")
test("text - partitioned writing and batch reading - codecs: none/gzip")
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lw-lin/spark add-csv-json-text-for-ss

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13518.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #13518

commit 97034f9aeb092b10e1606e60a8e6b4878ebd54cf
Author: Liwei Lin
Date: 2016-06-05T09:03:04Z

    Add csv, json, text

commit 2035b597b44aa519d8da3b155036446f88b3050e
Author: Liwei Lin
Date: 2016-06-05T09:03:15Z

    Fix parquet extension

commit 4737361489fd680405b291ec498ab91374685ffe
Author: Liwei Lin
Date: 2016-06-05T11:52:14Z

    Fix style

commit 90d02c4a10c14af83bbed985e36ef99a1edaa48b
Author: Liwei Lin
Date: 2016-06-06T08:02:32Z

    Fix tests

commit daec480bd16ed52137d32f332debe3806953f4d2
Author: Liwei Lin
Date: 2016-06-06T09:03:08Z

    Revert "Fix tests"

    This reverts commit 90d02c4a10c14af83bbed985e36ef99a1edaa48b.

commit 43b68d426e9b64061095eca7a1db0e762843adef
Author: Liwei Lin
Date: 2016-06-06T09:09:10Z

    Fix tests

commit 56dbb9b4f0f7e2bf76935e0d1d2fc6c6cdf141ff
Author: Liwei Lin
Date: 2016-06-06T12:34:07Z

    Investigate test

commit 91e51aed5caf663d5068057e5cf28f21eb768310
Author: Liwei Lin
Date: 2016-06-06T12:43:39Z

    Investigate test

commit eb2090ce9fa04efbd23370f7d3e6cb98fd0b4c74
Author: Liwei Lin
Date: 2016-06-06T13:00:26Z

    Update run-tests

commit 1e9af1cef892706de8b07728c192dd8ca5e5851e
Author: Liwei Lin
Date: 2016-06-07T04:10:34Z

    Investigate test

commit f76b1d9fe5e4970ad66dfb6d4b2fed939b68728c
Author: Liwei Lin
Date: 2016-06-07T04:49:12Z

    Fix tests

commit 7fca579d9fec2589f13403b0eb0f2d5f5e6bd52a
Author: Liwei Lin
Date: 2016-06-07T04:50:24Z

    Fix tests

commit 2ead307d01d8f908951fbc059b8e49bbc77947b1
Author: Liwei Lin
Date: 2016-06-07T05:02:55Z

    Fix tests

commit b64afc64d3121479eb5c3f8c8b5663b6e05349b7
Author: Liwei Lin
Date: 2016-06-07T05:35:34Z
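The extension scheme described in point 2 of the pull request above can be illustrated with a small sketch. This is not the patch's code: the class `SinkFileNames`, the method `withExtension`, and the codec-to-suffix mapping are all hypothetical names chosen for illustration, and only the `none` and `gzip` codecs named in the test list are handled.

```java
/** Hypothetical helper (not part of Spark): derives the final file name for a sink-assigned path. */
public final class SinkFileNames {
    private SinkFileNames() {}

    /**
     * Appends a format extension and, if the data is compressed, a codec suffix.
     * The suffix mapping below is an assumption made for this sketch.
     */
    public static String withExtension(String assignedPath, String formatExtension, String codec) {
        String codecSuffix;
        if ("none".equals(codec)) {
            codecSuffix = "";
        } else if ("gzip".equals(codec)) {
            codecSuffix = ".gz";
        } else {
            throw new IllegalArgumentException("Unsupported codec in this sketch: " + codec);
        }
        return assignedPath + "." + formatExtension + codecSuffix;
    }

    public static void main(String[] args) {
        // A text writer given the path "abc" with gzip compression ends up writing "abc.txt.gz".
        System.out.println(withExtension("abc", "txt", "gzip"));
        System.out.println(withExtension("abc", "json", "none"));
    }
}
```

Carrying the codec in the file name is what lets a later batch read pick the right decompressor from the suffix alone, which appears to be the motivation stated in the description.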
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60213/ Test PASSed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13065 Merged build finished. Test PASSed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60213/consoleFull)** for PR 13065 at commit [`1d2d595`](https://github.com/apache/spark/commit/1d2d595c0f5ac52fc896c966c3258f67ec07aabc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60215/consoleFull)** for PR 13065 at commit [`2732b06`](https://github.com/apache/spark/commit/2732b06ee9843c7492d479c29fff6ce0e9025a89).
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #3072 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3072/consoleFull)** for PR 13565 at commit [`174d070`](https://github.com/apache/spark/commit/174d0704eb1bf01df6834ca5f437518a2131d45a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13566: [SPARK-15678] Add support to REFRESH data source paths
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13566 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60212/ Test PASSed.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/13065

Here are the code dumps for all the generator paths.

# Setup

```scala
val df = spark.range(1 << 20).selectExpr(
  "id as key",
  "array(rand(), rand(), rand(), rand(), rand()) as values",
  "map('a', rand(), 'b', rand(), 'c', rand(), 'd', rand(), 'e', rand()) pairs",
  "concat('{\"key\": ', id, ', \"value\": \"v_', id, '\"}') json")
df.createTempView("df")
```

# explode(array)

```java
> println(sql("explain codegen select key, explode(values) as value from df").collect()(0))
/* ... */
protected void processNext() throws java.io.IOException {
  // initialize Range
  if (!range_initRange) {
    range_initRange = true;
    initRange(partitionIndex);
  }

  while (!range_overflow && range_number < range_partitionEnd) {
    long range_value = range_number;
    range_number += 1L;
    if (range_number < range_value ^ 1L < 0) {
      range_overflow = true;
    }

    final boolean project_isNull1 = false;
    this.project_values = new Object[5];
    final double project_value2 = project_rng.nextDouble();
    if (false) {
      project_values[0] = null;
    } else {
      project_values[0] = project_value2;
    }

    final double project_value3 = project_rng1.nextDouble();
    if (false) {
      project_values[1] = null;
    } else {
      project_values[1] = project_value3;
    }

    final double project_value4 = project_rng2.nextDouble();
    if (false) {
      project_values[2] = null;
    } else {
      project_values[2] = project_value4;
    }

    final double project_value5 = project_rng3.nextDouble();
    if (false) {
      project_values[3] = null;
    } else {
      project_values[3] = project_value5;
    }

    final double project_value6 = project_rng4.nextDouble();
    if (false) {
      project_values[4] = null;
    } else {
      project_values[4] = project_value6;
    }

    final ArrayData project_value1 = new org.apache.spark.sql.catalyst.util.GenericArrayData(project_values);
    this.project_values = null;

    int generate_numElements = project_isNull1 ? 0 : project_value1.numElements();
    for (int generate_index = 0; generate_index < generate_numElements; generate_index++) {
      generate_numOutputRows.add(1);

      double generate_col = project_value1.getDouble(generate_index);
      project_rowWriter1.write(0, range_value);

      project_rowWriter1.write(1, generate_col);
      append(project_result1.copy());

    }

    if (shouldStop()) return;
  }
}
}
```

# explode(map)

```java
> println(sql("explain codegen select key, explode(pairs) as (k, v) from df").collect()(0))
protected void processNext() throws java.io.IOException {
  // initialize Range
  if (!range_initRange) {
    range_initRange = true;
    initRange(partitionIndex);
  }

  while (!range_overflow && range_number < range_partitionEnd) {
    long range_value = range_number;
    range_number += 1L;
    if (range_number < range_value ^ 1L < 0) {
      range_overflow = true;
    }

    final boolean project_isNull1 = false;
    project_keyArray = new Object[5];
    project_valueArray = new Object[5];

    Object project_obj = ((Expression) references[1]).eval(null);
    UTF8String project_value2 = (UTF8String) project_obj;
    if (false) {
      throw new RuntimeException("Cannot use null as map key!");
    } else {
      project_keyArray[0] = project_value2;
```
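Stripped of Spark's row writers and metrics, the `explode(array)` loop in the dump above reduces to "emit one output row per array element, repeating the key". The following is a minimal plain-Java sketch of that behaviour; `ExplodeSketch`, `Row`, and `explode` are hypothetical names for illustration, not Spark classes, and a null array is treated as empty to mirror the `isNull ? 0 : numElements()` guard in the generated code.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these types are not part of Spark. */
public final class ExplodeSketch {
    /** One output row: the input key paired with a single array element. */
    public static final class Row {
        public final long key;
        public final double value;

        Row(long key, double value) {
            this.key = key;
            this.value = value;
        }
    }

    private ExplodeSketch() {}

    /** Returns one (key, element) row per array element; a null array yields no rows. */
    public static List<Row> explode(long key, double[] values) {
        List<Row> out = new ArrayList<>();
        int numElements = (values == null) ? 0 : values.length;
        for (int index = 0; index < numElements; index++) {
            out.add(new Row(key, values[index]));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Row row : explode(7L, new double[] {0.1, 0.2, 0.3})) {
            System.out.println(row.key + " -> " + row.value);
        }
    }
}
```

The generated code does the same work inline inside the `Range` loop instead of building a list, so each exploded row is appended to the output buffer as soon as it is produced.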
[GitHub] spark issue #13566: [SPARK-15678] Add support to REFRESH data source paths
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13566 Merged build finished. Test PASSed.
[GitHub] spark issue #13566: [SPARK-15678] Add support to REFRESH data source paths
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13566 **[Test build #60212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60212/consoleFull)** for PR 13566 at commit [`6acd0c0`](https://github.com/apache/spark/commit/6acd0c044759721719d0fe62e7214fca8c95d7b0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #3071 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3071/consoleFull)** for PR 13565 at commit [`174d070`](https://github.com/apache/spark/commit/174d0704eb1bf01df6834ca5f437518a2131d45a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13555 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60211/ Test PASSed.
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13555 Merged build finished. Test PASSed.
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13555 **[Test build #60211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60211/consoleFull)** for PR 13555 at commit [`a47dad4`](https://github.com/apache/spark/commit/a47dad40bb89bff637c9f77ff9569a648ece11ca). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13563: [SPARK-15826] [CORE] PipedRDD to strictly use UTF-8 and ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13563 Merged build finished. Test PASSed.
[GitHub] spark issue #13563: [SPARK-15826] [CORE] PipedRDD to strictly use UTF-8 and ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13563 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60209/ Test PASSed.
[GitHub] spark issue #13563: [SPARK-15826] [CORE] PipedRDD to strictly use UTF-8 and ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13563 **[Test build #60209 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60209/consoleFull)** for PR 13563 at commit [`f97cb5c`](https://github.com/apache/spark/commit/f97cb5c377072eebd49e36601143323aa3ff9f52). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked ...
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13564#discussion_r66371769

    --- Diff: project/plugins.sbt ---
    @@ -21,3 +21,14 @@ libraryDependencies += "org.ow2.asm" % "asm" % "5.0.3"
     libraryDependencies += "org.ow2.asm" % "asm-commons" % "5.0.3"

     addSbtPlugin("com.simplytyped" % "sbt-antlr4" % "0.7.11")
    +
    +// Spark uses a custom fork of the sbt-pom-reader plugin which contains a patch to fix issues
    +// related to test-jar dependencies (https://github.com/sbt/sbt-pom-reader/pull/14). The source for
    +// this fork is published at https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark
    +// and corresponds to commit b160317fcb0b9d1009635a7c5aa05d0f3be61936 in that repository.
    +// In the long run, we should try to merge our patch upstream and switch to an upstream version of
    +// the plugin; this is tracked at SPARK-14401.
    +
    +resolvers += "Spark fork of sbt-pom-reader" at "https://oss.sonatype.org/content/repositories/orgspark-project-1124"
    --- End diff --

    Done. Let's wait for it to hit Central.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60214 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60214/consoleFull)** for PR 13065 at commit [`c9b3eda`](https://github.com/apache/spark/commit/c9b3eda19b9d4f1071d33760bd96d8875566fc47).
[GitHub] spark issue #13424: [SPARK-15489][SQL] Dataset kryo encoder won't load custo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60208/ Test PASSed.
[GitHub] spark issue #13424: [SPARK-15489][SQL] Dataset kryo encoder won't load custo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13424 Merged build finished. Test PASSed.
[GitHub] spark issue #13424: [SPARK-15489][SQL] Dataset kryo encoder won't load custo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13424 **[Test build #60208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60208/consoleFull)** for PR 13424 at commit [`0a53f8e`](https://github.com/apache/spark/commit/0a53f8ef9962ce49f78cfd9bea34a0cbb497d1f7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked ...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13564#discussion_r66369823 --- Diff: project/plugins.sbt --- @@ -21,3 +21,14 @@ libraryDependencies += "org.ow2.asm" % "asm" % "5.0.3" libraryDependencies += "org.ow2.asm" % "asm-commons" % "5.0.3" addSbtPlugin("com.simplytyped" % "sbt-antlr4" % "0.7.11") + +// Spark uses a custom fork of the sbt-pom-reader plugin which contains a patch to fix issues +// related to test-jar dependencies (https://github.com/sbt/sbt-pom-reader/pull/14). The source for +// this fork is published at https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark +// and corresponds to commit b160317fcb0b9d1009635a7c5aa05d0f3be61936 in that repository. +// In the long run, we should try to merge our patch upstream and switch to an upstream version of +// the plugin; this is tracked at SPARK-14401. + +resolvers += "Spark fork of sbt-pom-reader" at "https://oss.sonatype.org/content/repositories/orgspark-project-1124" --- End diff -- please go ahead!
[GitHub] spark pull request #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked ...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/13564#discussion_r66369515 --- Diff: project/plugins.sbt --- @@ -21,3 +21,14 @@ libraryDependencies += "org.ow2.asm" % "asm" % "5.0.3" libraryDependencies += "org.ow2.asm" % "asm-commons" % "5.0.3" addSbtPlugin("com.simplytyped" % "sbt-antlr4" % "0.7.11") + +// Spark uses a custom fork of the sbt-pom-reader plugin which contains a patch to fix issues +// related to test-jar dependencies (https://github.com/sbt/sbt-pom-reader/pull/14). The source for +// this fork is published at https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark +// and corresponds to commit b160317fcb0b9d1009635a7c5aa05d0f3be61936 in that repository. +// In the long run, we should try to merge our patch upstream and switch to an upstream version of +// the plugin; this is tracked at SPARK-14401. + +resolvers += "Spark fork of sbt-pom-reader" at "https://oss.sonatype.org/content/repositories/orgspark-project-1124" --- End diff -- No, if we decide that the bits that I published here look okay then I'll push to central and remove this resolver.
[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #60213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60213/consoleFull)** for PR 13065 at commit [`1d2d595`](https://github.com/apache/spark/commit/1d2d595c0f5ac52fc896c966c3258f67ec07aabc).
[GitHub] spark pull request #13564: [SPARK-15827][BUILD][WIP] Publish Spark's forked ...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13564#discussion_r66369245 --- Diff: project/plugins.sbt --- @@ -21,3 +21,14 @@ libraryDependencies += "org.ow2.asm" % "asm" % "5.0.3" libraryDependencies += "org.ow2.asm" % "asm-commons" % "5.0.3" addSbtPlugin("com.simplytyped" % "sbt-antlr4" % "0.7.11") + +// Spark uses a custom fork of the sbt-pom-reader plugin which contains a patch to fix issues +// related to test-jar dependencies (https://github.com/sbt/sbt-pom-reader/pull/14). The source for +// this fork is published at https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark +// and corresponds to commit b160317fcb0b9d1009635a7c5aa05d0f3be61936 in that repository. +// In the long run, we should try to merge our patch upstream and switch to an upstream version of +// the plugin; this is tracked at SPARK-14401. + +resolvers += "Spark fork of sbt-pom-reader" at "https://oss.sonatype.org/content/repositories/orgspark-project-1124" --- End diff -- is this permanent?
[GitHub] spark issue #13569: [SPARK-15791] Fix NPE in ScalarSubquery
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13569 Can you try to create a unit test? Maybe it's related to ConvertToLocalRelation
[GitHub] spark issue #13570: [SPARK-15832][SQL] Embedded IN/EXISTS predicate subquery...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13570 **[Test build #3070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3070/consoleFull)** for PR 13570 at commit [`eea703a`](https://github.com/apache/spark/commit/eea703aa673aab5f56d6a97ad86860422cd563a3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13569: [SPARK-15791] Fix NPE in ScalarSubquery
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13569 It's possible this was going through the local execution short-cut?
[GitHub] spark issue #13147: [SPARK-6320][SQL] Move planLater method into GenericStra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13147 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60207/ Test PASSed.
[GitHub] spark issue #13147: [SPARK-6320][SQL] Move planLater method into GenericStra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13147 Merged build finished. Test PASSed.
[GitHub] spark issue #13147: [SPARK-6320][SQL] Move planLater method into GenericStra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13147 **[Test build #60207 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60207/consoleFull)** for PR 13147 at commit [`254381d`](https://github.com/apache/spark/commit/254381d245cabf3cbad57f7ab06eec155ae79d96). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13496: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13496 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60206/ Test PASSed.
[GitHub] spark issue #13496: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13496 Merged build finished. Test PASSed.
[GitHub] spark issue #13496: [SPARK-15753][SQL] Move Analyzer stuff to Analyzer from ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13496 **[Test build #60206 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60206/consoleFull)** for PR 13496 at commit [`40a7f31`](https://github.com/apache/spark/commit/40a7f315297d13e16aea4c370963867545c3408f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13566: [SPARK-15678] Add support to REFRESH data source paths
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13566 **[Test build #60212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60212/consoleFull)** for PR 13566 at commit [`6acd0c0`](https://github.com/apache/spark/commit/6acd0c044759721719d0fe62e7214fca8c95d7b0).
[GitHub] spark issue #13566: [SPARK-15678] Add support to REFRESH data source paths
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/13566 Thanks, I pulled it out in a separate function.
[GitHub] spark issue #13555: [SPARK-15804][SQL]Include metadata in the toStructType
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13555 **[Test build #60211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60211/consoleFull)** for PR 13555 at commit [`a47dad4`](https://github.com/apache/spark/commit/a47dad40bb89bff637c9f77ff9569a648ece11ca).
[GitHub] spark pull request #13555: [SPARK-15804][SQL]Include metadata in the toStruc...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/13555#discussion_r66366855 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -625,6 +625,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext } } } + + test("SPARK-15804: write out the metadata to parquet file") { +val df = Seq((1, "abc"), (2, "hello")).toDF("a", "b") +val md = new MetadataBuilder().putString("key", "value").build() +val dfWithmeta = df.select('a, 'b.as("b", md)) + +withTempPath { dir => + val path = s"${dir.getCanonicalPath}/data" + dfWithmeta.write.parquet(path) + + readParquetFile(path) { df => +assert(df.schema.json.contains("\"key\":\"value\"")) --- End diff -- @cloud-fan Thanks for your comments, I have changed the test case.
[GitHub] spark pull request #13555: [SPARK-15804][SQL]Include metadata in the toStruc...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/13555#discussion_r66366487 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -625,6 +625,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext } } } + + test("SPARK-15804: write out the metadata to parquet file") { +val df = Seq((1, "abc"), (2, "hello")).toDF("a", "b") +val md = new MetadataBuilder().putString("key", "value").build() +val dfWithmeta = df.select('a, 'b.as("b", md)) + +withTempPath { dir => + val path = s"${dir.getCanonicalPath}/data" --- End diff -- ok, I will make the change
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #3072 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3072/consoleFull)** for PR 13565 at commit [`174d070`](https://github.com/apache/spark/commit/174d0704eb1bf01df6834ca5f437518a2131d45a).
[GitHub] spark pull request #13555: [SPARK-15804][SQL]Include metadata in the toStruc...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/13555#discussion_r66366470 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetQuerySuite.scala --- @@ -625,6 +625,21 @@ class ParquetQuerySuite extends QueryTest with ParquetTest with SharedSQLContext } } } + + test("SPARK-15804: write out the metadata to parquet file") { +val df = Seq((1, "abc"), (2, "hello")).toDF("a", "b") +val md = new MetadataBuilder().putString("key", "value").build() +val dfWithmeta = df.select('a, 'b.as("b", md)) + +withTempPath { dir => + val path = s"${dir.getCanonicalPath}/data" + dfWithmeta.write.parquet(path) + + readParquetFile(path) { df => +assert(df.schema.json.contains("\"key\":\"value\"")) --- End diff -- sure, I will do that
[GitHub] spark issue #13565: [SPARK-15783][CORE] Fix Flakiness in BlacklistIntegratio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13565 **[Test build #3071 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3071/consoleFull)** for PR 13565 at commit [`174d070`](https://github.com/apache/spark/commit/174d0704eb1bf01df6834ca5f437518a2131d45a).
[GitHub] spark issue #13569: [SPARK-15791] Fix NPE in ScalarSubquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13569 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60205/ Test PASSed.
[GitHub] spark issue #13569: [SPARK-15791] Fix NPE in ScalarSubquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13569 Merged build finished. Test PASSed.
[GitHub] spark issue #13569: [SPARK-15791] Fix NPE in ScalarSubquery
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13569 **[Test build #60205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60205/consoleFull)** for PR 13569 at commit [`7e2eb6f`](https://github.com/apache/spark/commit/7e2eb6f6f014f5c8ea15343deab205bed59496a1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13571: [SPARK-15369][WIP][RFC][PySpark][SQL] Expose potential t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13571 Merged build finished. Test FAILed.
[GitHub] spark issue #13571: [SPARK-15369][WIP][RFC][PySpark][SQL] Expose potential t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13571 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60210/ Test FAILed.
[GitHub] spark issue #13571: [SPARK-15369][WIP][RFC][PySpark][SQL] Expose potential t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13571 **[Test build #60210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60210/consoleFull)** for PR 13571 at commit [`e84e52a`](https://github.com/apache/spark/commit/e84e52aaea627f96a3df396f57a9dc77cddb4c35). * This patch **fails build dependency tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class CreateTempViewUsing(`
[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60204/ Test PASSed.
[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13189 Merged build finished. Test PASSed.