[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529772939

   Thanks, @cloud-fan, @srowen, @dongjoon-hyun, @viirya, and @HyukjinKwon

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (580c626 -> 962e330)
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 580c626  [SPARK-28939][SQL][FOLLOWUP] Fix JDK11 compilation due to ambiguous reference
     add 962e330  [SPARK-26598][SQL] Fix HiveThriftServer2 cannot be modified hiveconf/hivevar variables

No new revisions were added by this update.

Summary of changes:
 .../sql/hive/thriftserver/SparkSQLSessionManager.scala      | 11 +++
 .../hive/thriftserver/server/SparkSQLOperationManager.scala | 12
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala     | 13 ++---
 3 files changed, 21 insertions(+), 15 deletions(-)
[GitHub] [spark-website] kiszk closed pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk closed pull request #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221
[GitHub] [spark-website] kiszk edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529761830

   Got it at https://dist.apache.org/repos/dist/release/spark/KEYS!
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529761830

   Got it!
[spark] branch master updated (c2d8ee9 -> 580c626)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c2d8ee9  [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
     add 580c626  [SPARK-28939][SQL][FOLLOWUP] Fix JDK11 compilation due to ambiguous reference

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [spark-website] cloud-fan commented on issue #221: Add Apache Spark 2.3.4 release news and update links
cloud-fan commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529753165

   I've synced the keys
svn commit: r35721 - /release/spark/KEYS
Author: wenchen
Date: Tue Sep 10 03:27:38 2019
New Revision: 35721

Log:
Update KEYS

Modified:
    release/spark/KEYS

Modified: release/spark/KEYS
==============================================================================
--- release/spark/KEYS (original)
+++ release/spark/KEYS Tue Sep 10 03:27:38 2019
@@ -991,4 +991,62 @@ QRMaCSG2MOvUAI8Zzk6i1Gi5InRlP5v8sQdrMYvS
 meyB5uExVklZg9yaoH2zAFXLkjG1pftpkCb57UIyC+Tk5KAMZXyS2vHNGxsnI3FG
 ZTFPNYvCMMHM8A==
 =PEdD
------END PGP PUBLIC KEY BLOCK-----
\ No newline at end of file
+-----END PGP PUBLIC KEY BLOCK-----
+
+pub   4096R/7F0FEF75 2019-08-19
+uid                  Kazuaki Ishizaki (CODE SIGNING KEY)
+sub   4096R/7C3AEC68 2019-08-19
+
+-----BEGIN PGP PUBLIC KEY BLOCK-----
+Version: GnuPG v1
+
+mQINBF1a3YcBEAC7I6f1jWpY9WlJBkbwvLneYBjnD2BRwG1eKjkz49aUXVKkx4Du
+XB7b+agbhWL7EIPjQHVJf0RVGochOujKfcPxOz5bZwAV078EbsJpiAYIAeVEimQF
+Pv/uqaf9DbIjZAnJtZhKlyXJaXLpuZbqEwBimpfbgvF5ib4ii7a9kY7BO/YsSXXc
+ksLBIHKwNAeKSMIGmCQaxz/tNmRm1tAagFknCEoQ0CMsA8FesjXyS+U6nfJWdK3K
++678joAIhZvdn5k3f/bR94ifeDCh0QsY/zuG95er4Gp0rdr8EmRQbfJAUAwfkn8a
+viQD1FkTs+aJn4MSClb+FDXu7hNrPPdayA5CI6PSMdir//+Z7Haox92mvhQT5pBJ
+X21R4BDqF6bmL2d/RL3e2Zb1rmztDbTq43OL3Jm+x9R3OPg9UVwFJgHUy/xEirve
+Nah5Y6GzV3po/VSJbRIdM/p8OENv6YahFbLr5rT5O9iZns/PXHUpXYXLQDfdFJD2
+oCNFxlQmjfbxIL3PIcdS2gY2o1FmEbYuaLi6Bb9FDTm/J78vHYtR3wLvwufLh3PX
+5en9e6+g7o5w3jN/3J1skwXUUSOHK88mWBGt2B9ZwYS+7TQ0zWcgrXjwHQoi92nA
+JEADyvQSxTB/zd5usCVel8038FSKhawkhrmLBk2UoJR4prhnPC364MnjgQARAQAB
+tDZLYXp1YWtpIElzaGl6YWtpIChDT0RFIFNJR05JTkcgS0VZKSA8a2lzemtAYXBh
+Y2hlLm9yZz6JAjgEEwECACIFAl1a3YcCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4B
+AheAAAoJEOSaBGx/D+91w5AQALB6gff1BuaDyMSiSYaAGBGrBAxs1+ixQxlLX+ld
+KG9y/u41S3s8pBn0GXp1jthdURnPm+raLqJk1lVPUZ4JqNYot0FL/nGBIZjRRG6J
+TfmlWTza1AfgvzcROaO+7jVPMskBx/HZn8XxEOlMcnBv4P/v3m/QUW9/tH8j+6Bc
+JwfiqD3LIaWZTicAMxWE9r7MREDcgkrFROJDDJPMFxoVKomIcc3vzXJeI7BfVtkG
+5NHWYDVn4QTQygv+qes4ke9fcik7T5c9NcOjXgks6eF0z7Z/Rj6DUrIyVKleUwJZ
+AWpBJcbNc8crg623DRaXpGhXsGvnD5PxcPvVjJ9Jud7o884OhVr2abxQ++rIv/+m
+K5K99jbp2E/6Q6tR4ODEoPTGN6fSijziWfhuad26K/grN3878hayGmey57vPH3tx
+LsBkUfc9bz46HjcdhfaU1dS82YOMmrFLLmgBEL1PViK628gk0TR7C6N4kHKGWd1f
+tQz/bTFzoyXOTpS6bvceE88fZ2FSeepP0AgvZPZsUXxrHXo78oECZ9CAoO/q1P1J
+OrKr5oG5om9pB+4SI3FhD2PKxt/+ayMCyA6PVBlw8HDI2XLBmBi9YkiP2ws7gJcF
+A958J3CWc6Q7PstrU7LCmL0Apbl8T2Iqph7jB2Qiko2sOyxe5Vwkwh9vHYnhy1ox
+YZ2quQINBF1a3YcBEADfvUJtKQKQEHl6ug/c2PxDL5pfEhCXQfBIkfPScUgiQCO9
+aiSigMUReiYa/7cau2jmGUcBktjgLwlAGywX6YTGt/ZIWCkGRdK8K3mVRNssGwXs
++oWcNinRbzIV1cvZu9zndzM7lzIMFriIP/Shsi9QPg6SibK1XhgkYr2pTN8i1zmQ
+sd/FGnhEeGZxXDwW7wG6tPXvzQiAZgJEsUh90i9AbQzI/MWG2RqqjKGO423BcpQ8
+nHgUlj7JbgRI2knBjpnxAyKroDGw9dKXNBqYrGjQtbXcCkBTk6vDyOkXUWOz63Bc
+AtVfXwL5+RILvYjzn8bZne5jt8fkNK3z29XTv7N3Ee8HRwPnGp6Ny7jGR/f740gP
+3b8y4A6QI9YlyvOlp2SHIRPHEYKUQCLaTT1/b4DYN5SGtWwXA4GafCLBVBwD3fr+
+jIhCbInX0+MWOZwuTYuwpoE6nnsnWpsAd6ZOMJInULRyW1f7/zXoq2XvtFH8+IQN
+DYtF1lr2C8lm7WUKqSg2bmVy6+gV6KvYqj6oihLQBxlnmrKBQFhkBeOyNYxRW8rf
+c+nZZza/5QMZLD7mYL+BGmgHB2eycSuz7UkZ8H5DD0u7Wz74mmmHOg9EyJuJSa3z
+UXgg1VNtZCW/m7ha5jedQTiXSYX1R7HjjoX6vWm85mRLAFbyW7DaKnfbYlJvjwAR
+AQABiQIfBBgBAgAJBQJdWt2HAhsMAAoJEOSaBGx/D+91YNwQAIY41adyEUHRtwnP
+sT90VjheUdz9++rAet8jstwGK8M3wrnhDet18E7wTxt52Knkw7vMS2wqjm3jxeFs
+/pI/eA6Tq+AWLEySODegM9TGFxAtcP9TAR0bXGspw5LUWUKO+MJ17pyVs0M/0gb0
+GEjbVCjDn/h0Ozr3n81eokVDhvBZ8n2dUGoetmuZ77Wz1liPoV9G0paISKyLsj9d
+iQkE3ExZlGkvX6OiNbJMoo1pHMA4knAo9ch62THofPaoLX5mCKwhNgQDECYd4k89
+ww176ndkrllV8t1v/UDHXPwmDWGK+mLeLk4e+fDJ+bOQrZ543AYk6MB1gRyb94G7
+bQniuoc2YvB+Cn6qOB83ARhDz0zPUGVj/85P8xwmcsZJxlLGpiPAXEQJX2Zk6zFR
+1HLxy831IsHaEktglF9tBH+OxJqBg45fbRhuYclWfo724enVdm/rLtR1n93ybaJS
+eNmw1Lomks7IsX6qdBR36zVB2WgmIcsnxjtMee+YqfFiAbzbm27lV6A7aTDyIPzQ
+R2fSta747XADEy7rzYawV5zuCupmUHp/ZgfQK9xYDnZ+lJHHaipDgmIe4Mfe/3Je
+au2shXGZFmo4V56uCJ5HqZTJJZaMceQx7u8uqZbhtHG+lLhbvHXVylaxxEYpqf2O
+XJ5Dp1pqv9DC6cl9vLSHctRrM2kG
+=mQLW
+-----END PGP PUBLIC KEY BLOCK-----
+
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529751569

   @dongjoon-hyun thanks. Let me wait for an additional hour. When KEY is synced, I will merge this.
[GitHub] [spark-website] HyukjinKwon commented on issue #222: Add Weichen Xu to committer list
HyukjinKwon commented on issue #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222#issuecomment-529749156

   @WeichenXu123, can you try to push it by yourself?
[spark] branch master updated: [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c2d8ee9  [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
c2d8ee9 is described below

commit c2d8ee9c54adf4a425ce41d8743e24dd8be864c3
Author: Wenchen Fan
AuthorDate: Tue Sep 10 11:01:57 2019 +0800

    [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan

    ### What changes were proposed in this pull request?

    Remove the project node if the streaming scan is columnar

    ### Why are the changes needed?

    This is a followup of https://github.com/apache/spark/pull/25586. Batch and streaming share the
    same DS v2 read API so both can support columnar reads. We should apply #25586 to streaming scan
    as well.

    ### Does this PR introduce any user-facing change?

    no

    ### How was this patch tested?

    existing tests

    Closes #25727 from cloud-fan/follow.

    Authored-by: Wenchen Fan
    Signed-off-by: Wenchen Fan
---
 .../datasources/v2/DataSourceV2Strategy.scala | 29 --
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
index 7cad305..f629f36 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
@@ -155,17 +155,30 @@ object DataSourceV2Strategy extends Strategy with PredicateHelper {
     case r: StreamingDataSourceV2Relation if r.startOffset.isDefined && r.endOffset.isDefined =>
       val microBatchStream = r.stream.asInstanceOf[MicroBatchStream]
-      // ensure there is a projection, which will produce unsafe rows required by some operators
-      ProjectExec(r.output,
-        MicroBatchScanExec(
-          r.output, r.scan, microBatchStream, r.startOffset.get, r.endOffset.get)) :: Nil
+      val scanExec = MicroBatchScanExec(
+        r.output, r.scan, microBatchStream, r.startOffset.get, r.endOffset.get)
+
+      val withProjection = if (scanExec.supportsColumnar) {
+        scanExec
+      } else {
+        // Add a Project here to make sure we produce unsafe rows.
+        ProjectExec(r.output, scanExec)
+      }
+
+      withProjection :: Nil

     case r: StreamingDataSourceV2Relation if r.startOffset.isDefined && r.endOffset.isEmpty =>
       val continuousStream = r.stream.asInstanceOf[ContinuousStream]
-      // ensure there is a projection, which will produce unsafe rows required by some operators
-      ProjectExec(r.output,
-        ContinuousScanExec(
-          r.output, r.scan, continuousStream, r.startOffset.get)) :: Nil
+      val scanExec = ContinuousScanExec(r.output, r.scan, continuousStream, r.startOffset.get)
+
+      val withProjection = if (scanExec.supportsColumnar) {
+        scanExec
+      } else {
+        // Add a Project here to make sure we produce unsafe rows.
+        ProjectExec(r.output, scanExec)
+      }
+
+      withProjection :: Nil

     case WriteToDataSourceV2(writer, query) =>
       WriteToDataSourceV2Exec(writer, planLater(query)) :: Nil
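The heart of this commit is a conditional projection: the planner now returns the scan node unchanged when it reports columnar support, and only wraps it in a Project (which produces the unsafe rows some operators need) otherwise. A minimal Python sketch of that planning pattern, using hypothetical `Scan`/`Project` classes rather than Spark's API:

```python
# Hypothetical mini-planner (not Spark code): wrap a scan in a projection
# only when the scan is row-based, mirroring the pattern in this commit.

class Scan:
    def __init__(self, name, supports_columnar):
        self.name = name
        self.supports_columnar = supports_columnar

class Project:
    """Marker node standing in for the row-format conversion step."""
    def __init__(self, child):
        self.child = child

def plan_scan(scan):
    # Columnar scans are returned as-is; row-based scans get the extra Project.
    return scan if scan.supports_columnar else Project(scan)

assert isinstance(plan_scan(Scan("micro_batch", supports_columnar=True)), Scan)
assert isinstance(plan_scan(Scan("continuous", supports_columnar=False)), Project)
```

The design point is that the projection is pure overhead for a columnar scan, since columnar-to-row conversion is handled by a separate code path.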
[spark] branch branch-2.4 updated: [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 75b902f  [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns
75b902f is described below

commit 75b902f547dfd392f17013210a03dc671f94fcdc
Author: hemanth meka
AuthorDate: Tue Sep 10 10:52:03 2019 +0800

    [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns

    **What changes were proposed in this pull request?**

    Backports pull request [25570](https://github.com/apache/spark/pull/25570) to branch-2.4.

    Moves the call to checkColumnNameDuplication out of generateViewProperties. This way we can
    choose whether checkColumnNameDuplication is performed on the analyzed or the aliased plan
    without having to pass an additional argument (aliasedPlan) to generateViewProperties.

    Before this PR, the duplicate-name check was performed on the query output of the SQL below
    (c1, c1); this PR makes it perform the check on the user-provided schema of the view
    definition (c1, c2) instead.

    **Why are the changes needed?**

    The changes fix the SPARK-23519 bug. The queries below would cause an exception. This PR fixes
    them and also adds a test case.

    `CREATE TABLE t23519 AS SELECT 1 AS c1`
    `CREATE VIEW v23519 (c1, c2) AS SELECT c1, c1 FROM t23519`

    **Does this PR introduce any user-facing change?**

    No

    **How was this patch tested?**

    New unit test added in SQLViewSuite.

    Closes #25733 from hem1891/SPARK-23519-backport-to-2.4.

    Lead-authored-by: hemanth meka
    Co-authored-by: hem1891
    Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/sql/execution/command/views.scala | 18 +++---
 .../org/apache/spark/sql/execution/SQLViewSuite.scala  | 10 ++
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
index 5172f32..abc8515 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
@@ -26,7 +26,7 @@ import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable
 import org.apache.spark.sql.catalyst.expressions.{Alias, SubqueryExpression}
 import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, View}
-import org.apache.spark.sql.types.MetadataBuilder
+import org.apache.spark.sql.types.{MetadataBuilder, StructType}
 import org.apache.spark.sql.util.SchemaUtils
@@ -233,14 +233,15 @@ case class CreateViewCommand(
       throw new AnalysisException(
         "It is not allowed to create a persisted view from the Dataset API")
     }
-
-    val newProperties = generateViewProperties(properties, session, analyzedPlan)
+    val aliasedSchema = aliasPlan(session, analyzedPlan).schema
+    val newProperties = generateViewProperties(
+      properties, session, analyzedPlan, aliasedSchema.fieldNames)

     CatalogTable(
       identifier = name,
       tableType = CatalogTableType.VIEW,
       storage = CatalogStorageFormat.empty,
-      schema = aliasPlan(session, analyzedPlan).schema,
+      schema = aliasedSchema,
       properties = newProperties,
       viewText = originalText,
       comment = comment
@@ -294,7 +295,8 @@ case class AlterViewAsCommand(
     val viewIdent = viewMeta.identifier
     checkCyclicViewReference(analyzedPlan, Seq(viewIdent), viewIdent)

-    val newProperties = generateViewProperties(viewMeta.properties, session, analyzedPlan)
+    val newProperties = generateViewProperties(
+      viewMeta.properties, session, analyzedPlan, analyzedPlan.schema.fieldNames)

     val updatedViewMeta = viewMeta.copy(
       schema = analyzedPlan.schema,
@@ -355,13 +357,15 @@ object ViewHelper {
   def generateViewProperties(
       properties: Map[String, String],
       session: SparkSession,
-      analyzedPlan: LogicalPlan): Map[String, String] = {
+      analyzedPlan: LogicalPlan,
+      fieldNames: Array[String]): Map[String, String] = {
+    // for createViewCommand queryOutput may be different from fieldNames
     val queryOutput = analyzedPlan.schema.fieldNames

     // Generate the query column names, throw an AnalysisException if there exists duplicate column
     // names.
     SchemaUtils.checkColumnNameDuplication(
-      queryOutput, "in the view definition", session.sessionState.conf.resolver)
+      fieldNames, "in the view definition", session.sessionState.conf.resolver)

     // Generate the view default database name.
     val viewDefaultDatabase = session.sessionState.catalog.getCurrentDatabase
diff --git
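The fix turns on which column list the duplicate-name check sees: the view's user-provided schema (c1, c2) rather than the raw query output (c1, c1). A plain-Python sketch of that check, as an illustration of the idea only (not Spark's SchemaUtils):

```python
# Illustration (not Spark code): a duplicate-column-name check like the one
# SchemaUtils.checkColumnNameDuplication performs, applied to the two column
# lists from the example in the commit message.

def check_column_name_duplication(names):
    seen = set()
    for n in names:
        key = n.lower()  # Spark's default resolver is case-insensitive
        if key in seen:
            raise ValueError(f"duplicate column name: {n}")
        seen.add(key)

query_output = ["c1", "c1"]   # output of: SELECT c1, c1 FROM t23519
view_columns = ["c1", "c2"]   # schema of: CREATE VIEW v23519 (c1, c2) AS ...

check_column_name_duplication(view_columns)  # passes: this is what the fix checks
try:
    check_column_name_duplication(query_output)  # what the old code effectively checked
    raised = False
except ValueError:
    raised = True
assert raised
```

Checking the aliased schema keeps the real protection (a view cannot have two columns with the same name) while no longer rejecting queries whose raw output merely repeats a column.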
[GitHub] [spark-website] HyukjinKwon commented on a change in pull request #222: Add Weichen Xu to committer list
HyukjinKwon commented on a change in pull request #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222#discussion_r322528559

 ##########
 File path: committers.md
 ##########

 @@ -78,6 +78,7 @@ navigation:
 |Patrick Wendell|Databricks|
 |Andrew Xia|Alibaba|
 |Reynold Xin|Databricks|
+|Weichen Xu|Databricks|

 Review comment:
   @WeichenXu123, HTML also has to be generated by `jekyll build`.
[GitHub] [spark-website] WeichenXu123 opened a new pull request #222: Add Weichen Xu to committer list
WeichenXu123 opened a new pull request #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222

   *Make sure that you generate site HTML with `jekyll build`, and include the changes to the HTML in your pull request also. See README.md for more information. Please remove this message.*
[spark] branch master updated (aa805ec -> 86fc890)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from aa805ec  [SPARK-23265][ML] Update multi-column error handling logic in QuantileDiscretizer
     add 86fc890  [SPARK-28988][SQL][TESTS] Fix invalid tests in CliSuite

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
[spark] branch master updated (aafce7e -> aa805ec)
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from aafce7e  [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
     add aa805ec  [SPARK-23265][ML] Update multi-column error handling logic in QuantileDiscretizer

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/feature/QuantileDiscretizer.scala | 43 +++-
 .../ml/feature/QuantileDiscretizerSuite.scala  | 76 +++---
 2 files changed, 96 insertions(+), 23 deletions(-)
[spark] branch branch-2.4 updated: Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 92e5216  Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"
92e5216 is described below

commit 92e5216ea0763e67d369547291e28b61ff5065fd
Author: Sean Owen
AuthorDate: Mon Sep 9 18:44:38 2019 -0500

    Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"

    This reverts commit df55f3cb120a5fd57aeec9ca3d67434e756e4b1c.
---
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index e7d1613..8f86b47 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2946,8 +2946,7 @@ private[spark] class CallerContext(
     if (CallerContext.callerContextSupported) {
       try {
         val callerContext = Utils.classForName("org.apache.hadoop.ipc.CallerContext")
-        val builder: Class[AnyRef] =
-          Utils.classForName("org.apache.hadoop.ipc.CallerContext$Builder")
+        val builder = Utils.classForName("org.apache.hadoop.ipc.CallerContext$Builder")
         val builderInst = builder.getConstructor(classOf[String]).newInstance(context)
         val hdfsContext = builder.getMethod("build").invoke(builderInst)
         callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
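For context, the surrounding `CallerContext` code in `Utils.scala` loads the Hadoop classes by name at runtime, so Spark compiles and runs even against Hadoop versions where the optional `CallerContext` API is absent. A rough Python analogue of that reflective-loading pattern; the module and class names here are hypothetical, used only to show the shape of the technique:

```python
# Sketch of reflective/optional-dependency loading (hypothetical names, not a
# real library): resolve a class by name at runtime and degrade gracefully
# when the optional dependency is missing, like Utils.classForName in Spark.
import importlib

def set_caller_context(context, module="hadoop.ipc", cls="CallerContext"):
    try:
        mod = importlib.import_module(module)   # analogous to Utils.classForName(...)
    except ImportError:
        return False                            # optional dependency not on the path
    # Build and install the context via dynamically resolved attributes,
    # analogous to getConstructor/getMethod/invoke in the Scala code.
    builder = getattr(mod, cls + "Builder")(context)
    getattr(mod, cls).set_current(builder.build())
    return True

# With no such module installed, the call is a no-op that reports failure.
assert set_caller_context("spark_appId_123") is False
```

The benefit, in both languages, is that the happy path uses the dependency when present while the fallback path never even references its symbols.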
[spark] branch master updated: [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new aafce7e  [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
aafce7e is described below

commit aafce7ebffe1acd8f6022f208beaa9ec6c9f7592
Author: gengjiaan
AuthorDate: Tue Sep 10 08:16:18 2019 +0900

    [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array

    ## What changes were proposed in this pull request?

    This is ANSI SQL feature `T312`:
    ```
    OVERLAY(<string> PLACING <replacement> FROM <start> [ FOR <length> ])
    ```
    This PR is related to https://github.com/apache/spark/pull/24918 and adds support for byte arrays.
    ref: https://www.postgresql.org/docs/11/functions-binarystring.html

    ## How was this patch tested?

    New UT. Some examples of the PR running in my production environment:
    ```
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
    Spark_SQL
    Time taken: 0.285 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
    Spark CORE
    Time taken: 0.202 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
    Spark ANSI SQL
    Time taken: 0.165 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
    Structured SQL
    Time taken: 0.141 s
    ```

    Closes #25172 from beliefer/ansi-overlay-byte-array.

    Lead-authored-by: gengjiaan
    Co-authored-by: Jiaan Geng
    Signed-off-by: Takeshi Yamamuro
---
 .../catalyst/expressions/stringExpressions.scala | 60 +++---
 .../expressions/StringExpressionsSuite.scala     | 72 +-
 .../scala/org/apache/spark/sql/functions.scala   | 16 ++---
 .../apache/spark/sql/StringFunctionsSuite.scala  | 33 +++---
 4 files changed, 157 insertions(+), 24 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index d7a5fb2..e4847e9 100755
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -472,6 +472,19 @@ object Overlay {
     builder.append(input.substringSQL(pos + length, Int.MaxValue))
     builder.build()
   }
+
+  def calculate(input: Array[Byte], replace: Array[Byte], pos: Int, len: Int): Array[Byte] = {
+    // If you specify length, it must be a positive whole number or zero.
+    // Otherwise it will be ignored.
+    // The default value for length is the length of replace.
+    val length = if (len >= 0) {
+      len
+    } else {
+      replace.length
+    }
+    ByteArray.concat(ByteArray.subStringSQL(input, 1, pos - 1),
+      replace, ByteArray.subStringSQL(input, pos + length, Int.MaxValue))
+  }
 }

 // scalastyle:off line.size.limit
@@ -487,6 +500,14 @@ object Overlay {
       Spark ANSI SQL
      > SELECT _FUNC_('Spark SQL' PLACING 'tructured' FROM 2 FOR 4);
       Structured SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
+       Spark_SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
+       Spark CORE
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
+       Spark ANSI SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
+       Structured SQL
   """)
 // scalastyle:on line.size.limit
 case class Overlay(input: Expression, replace: Expression, pos: Expression, len: Expression)
@@ -496,19 +517,42 @@ case class Overlay(input: Expression, replace: Expression, pos: Expression, len:
     this(str, replace, pos, Literal.create(-1, IntegerType))
   }

-  override def dataType: DataType = StringType
+  override def dataType: DataType = input.dataType

-  override def inputTypes: Seq[AbstractDataType] =
-    Seq(StringType, StringType, IntegerType, IntegerType)
+  override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection(StringType, BinaryType),
+    TypeCollection(StringType, BinaryType), IntegerType, IntegerType)

   override def children: Seq[Expression] = input :: replace :: pos :: len :: Nil

+  override def checkInputDataTypes(): TypeCheckResult = {
+    val inputTypeCheck = super.checkInputDataTypes()
+    if (inputTypeCheck.isSuccess) {
+      TypeUtils.checkForSameTypeInputExpr(
+        input.dataType :: replace.dataType :: Nil, s"function $prettyName")
+    } else {
+      inputTypeCheck
+    }
+  }
+
+  private lazy val
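The new `calculate` overload above implements OVERLAY on byte arrays using SQL's 1-based positions, with a negative length meaning "default to the replacement's length". A Python sketch of the same semantics (an illustration only, not Spark's code), checked against the examples from the commit message:

```python
# OVERLAY semantics on bytes, with SQL-style 1-based positions (illustration,
# not Spark code). A negative length is ignored and defaults to len(replace),
# matching the comment in the Scala diff above.

def overlay(input_bytes: bytes, replace: bytes, pos: int, length: int = -1) -> bytes:
    if length < 0:
        length = len(replace)
    before = input_bytes[:pos - 1]           # bytes before the overlay point
    after = input_bytes[pos - 1 + length:]   # skip the `length` bytes being replaced
    return before + replace + after

assert overlay(b"Spark SQL", b"_", 6) == b"Spark_SQL"
assert overlay(b"Spark SQL", b"CORE", 7) == b"Spark CORE"
assert overlay(b"Spark SQL", b"ANSI ", 7, 0) == b"Spark ANSI SQL"      # FOR 0 inserts
assert overlay(b"Spark SQL", b"tructured", 2, 4) == b"Structured SQL"  # FOR 4 replaces 4 bytes
```

Note how `FOR 0` turns OVERLAY into a pure insertion, while an explicit `FOR n` can replace a span shorter or longer than the replacement itself.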
[spark] branch branch-2.4 updated (9ef48f7 -> df55f3c)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9ef48f7  [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final
     add df55f3c  [SPARK-28657][CORE] Fix currentContext Instance failed sometimes

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
[spark] branch master updated (e516f7e -> bdc1598)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from e516f7e  [SPARK-28928][SS] Use Kafka delegation token protocol on sources/sinks
     add bdc1598  [SPARK-28657][CORE] Fix currentContext Instance failed sometimes

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
[spark] branch master updated (8018ded -> e516f7e)
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8018ded  [SPARK-28214][STREAMING][TESTS] CheckpointSuite: wait for batch to be fully processed before accessing DStreamCheckpointData
     add e516f7e  [SPARK-28928][SS] Use Kafka delegation token protocol on sources/sinks

No new revisions were added by this update.

Summary of changes:
 docs/structured-streaming-kafka-integration.md     | 4 +++-
 .../sql/kafka010/KafkaDelegationTokenSuite.scala   | 2 --
 .../apache/spark/kafka010/KafkaConfigUpdater.scala | 1 +
 .../spark/kafka010/KafkaTokenSparkConf.scala       | 4 +++-
 .../spark/kafka010/KafkaConfigUpdaterSuite.scala   | 26 +-
 5 files changed, 32 insertions(+), 5 deletions(-)
[spark] branch master updated (125af78d -> 8018ded)
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 125af78d  [SPARK-28831][DOC][SQL] Document CLEAR CACHE statement in SQL Reference
     add 8018ded   [SPARK-28214][STREAMING][TESTS] CheckpointSuite: wait for batch to be fully processed before accessing DStreamCheckpointData

No new revisions were added by this update.

Summary of changes:
 .../spark/streaming/scheduler/JobGenerator.scala | 2 +-
 .../spark/streaming/scheduler/JobScheduler.scala | 3 +-
 .../apache/spark/streaming/CheckpointSuite.scala | 32 --
 3 files changed, 26 insertions(+), 11 deletions(-)
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249

   BTW, @cloud-fan and @kiszk. We need to sync `KEYS`, too~
   - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`)
   - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`)

   Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later.
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249 BTW, @cloud-fan and @kiszk . We need to sync `KEYS`, too. - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`) - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`) Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249 BTW, @cloud-fan and @kiszk . We need to sync `KEYS`, too. - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`) - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`) Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later.
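The `KEYS` inconsistency described above (a signer present in the dev repository but missing from the release repository) can be caught mechanically. Below is a minimal illustrative sketch; the file contents are made up, and in practice the two `KEYS` files would first be fetched from dist.apache.org (e.g. via `svn cat` or HTTPS):

```python
def key_present(keys_text: str, owner: str) -> bool:
    """Return True if a key block mentioning `owner` exists in a KEYS file."""
    return owner in keys_text

def missing_in_release(dev_keys: str, release_keys: str, owner: str) -> bool:
    """True when the signer is listed in the dev KEYS file but not in the
    release KEYS file -- the inconsistency the Apache checker would flag."""
    return key_present(dev_keys, owner) and not key_present(release_keys, owner)

if __name__ == "__main__":
    # Hypothetical file contents; the real files live on dist.apache.org.
    dev = "pub rsa4096\nuid Kazuaki Ishizaki (CODE SIGNING KEY)\n"
    rel = "pub rsa4096\nuid Some Other Signer (CODE SIGNING KEY)\n"
    print(missing_in_release(dev, rel, "Kazuaki Ishizaki (CODE SIGNING KEY)"))  # True
```

A plain substring check is enough here because `KEYS` files list each signer's `uid` line verbatim next to the ASCII-armored key block.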
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529676525 Thank you for updating all here. Later, we can update `README.md` so that `v4.0.0` is always used. `Jekyll v4.0.0` was released 20 days ago. According to the git log, the `asf-site` branch does not appear to have been generated with `v4.0.0`.
[spark] branch master updated (c839d09 -> 125af78d)
This is an automated email from the ASF dual-hosted git repository. lixiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c839d09 [SPARK-28773][DOC][SQL] Handling of NULL data in Spark SQL add 125af78d [SPARK-28831][DOC][SQL] Document CLEAR CACHE statement in SQL Reference No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-aux-cache-clear-cache.md | 18 +- docs/sql-ref-syntax-aux-cache-uncache-table.md | 4 ++-- docs/sql-ref-syntax-aux-cache.md | 15 +++ 3 files changed, 26 insertions(+), 11 deletions(-)
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529670463 ~Ur, @cloud-fan . It seems that we still don't have 2.3.4 in the Archive directory.~ Never mind. It seems to be due to a delay. - https://archive.apache.org/dist/spark/
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529670463 Ur, @cloud-fan . It seems that we still don't have 2.3.4 in the Archive directory. - https://archive.apache.org/dist/spark/
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322439479 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Oh, thanks! This seems to be due to my mistake during `jekyll watch`.
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322447723 ## File path: site/examples.html ## @@ -230,36 +230,36 @@ Word Count -text_file = sc.textFile(hdfs://...) -counts = text_file.flatMap(lambda line: line.split( )) \ - .map(lambda word: (word, 1)) \ +text_file = sc.textFile("hdfs://...") +counts = text_file.flatMap(lambda line: line.split(" ")) \ + .map(lambda word: (word, 1)) \ .reduceByKey(lambda a, b: a + b) -counts.saveAsTextFile(hdfs://...) +counts.saveAsTextFile("hdfs://...") Review comment: Oh, so now Jekyll supports `"hdfs://..."` instead of `hdfs://...`?
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322446452 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Yes. Mine was 3.8.5 or 3.8.6, not v4.0.0 (IIRC, the version on my laptop at home). - https://github.com/apache/spark-website/commit/4f850d15a68650384e4c1dd8b74c585ffedc875a (Fix jekyll build before updating) - https://github.com/apache/spark-website/commit/950c65da1e91419b4e4830a0a23dbc6cb732ddaf (Release v2.4.4)
[spark] branch master updated (6378d4b -> c839d09)
This is an automated email from the ASF dual-hosted git repository. lixiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6378d4b [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 add c839d09 [SPARK-28773][DOC][SQL] Handling of NULL data in Spark SQL No new revisions were added by this update. Summary of changes: docs/_data/menu-sql.yaml | 2 + docs/sql-ref-null-semantics.md | 703 + 2 files changed, 705 insertions(+) create mode 100644 docs/sql-ref-null-semantics.md
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322438237 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Interesting. Maybe the last change wasn't done with jekyll v4.0.0.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322435187 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Sure, committed the change with jekyll v4.0.0
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322432836 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Yeah, I realized it. It was introduced unintentionally in the last one.
[spark] branch branch-2.4 updated: [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 9ef48f7 [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final 9ef48f7 is described below commit 9ef48f72a9a0bf19da5aef38d921fc7890ae0a77 Author: Nicholas Marion AuthorDate: Mon Sep 9 15:13:37 2019 -0500 [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final ### What changes were proposed in this pull request? Upgrade netty-all to latest in the 4.1.x line which is 4.1.39-Final. ### Why are the changes needed? Currency of dependencies. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing unit-tests against 2.4 branch. Closes #25732 from n-marion/branch-2.4. Authored-by: Nicholas Marion Signed-off-by: Sean Owen --- dev/deps/spark-deps-hadoop-2.6 | 2 +- dev/deps/spark-deps-hadoop-2.7 | 2 +- dev/deps/spark-deps-hadoop-3.1 | 2 +- pom.xml| 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/dev/deps/spark-deps-hadoop-2.6 b/dev/deps/spark-deps-hadoop-2.6 index 2974f2b..e53cde9 100644 --- a/dev/deps/spark-deps-hadoop-2.6 +++ b/dev/deps/spark-deps-hadoop-2.6 @@ -149,7 +149,7 @@ metrics-json-3.1.5.jar metrics-jvm-3.1.5.jar minlog-1.3.0.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar objenesis-2.5.1.jar okhttp-3.8.1.jar okio-1.13.0.jar diff --git a/dev/deps/spark-deps-hadoop-2.7 b/dev/deps/spark-deps-hadoop-2.7 index c25648d..c2e6d75 100644 --- a/dev/deps/spark-deps-hadoop-2.7 +++ b/dev/deps/spark-deps-hadoop-2.7 @@ -150,7 +150,7 @@ metrics-json-3.1.5.jar metrics-jvm-3.1.5.jar minlog-1.3.0.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar objenesis-2.5.1.jar okhttp-3.8.1.jar okio-1.13.0.jar diff --git a/dev/deps/spark-deps-hadoop-3.1 b/dev/deps/spark-deps-hadoop-3.1 index 
6ce8287..6ba49fd 100644 --- a/dev/deps/spark-deps-hadoop-3.1 +++ b/dev/deps/spark-deps-hadoop-3.1 @@ -167,7 +167,7 @@ metrics-jvm-3.1.5.jar minlog-1.3.0.jar mssql-jdbc-6.2.1.jre7.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar nimbus-jose-jwt-4.41.1.jar objenesis-2.5.1.jar okhttp-2.7.5.jar diff --git a/pom.xml b/pom.xml index c75d2ca..2ef88af 100644 --- a/pom.xml +++ b/pom.xml @@ -589,7 +589,7 @@ io.netty netty-all -4.1.17.Final +4.1.39.Final io.netty
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322430711 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: If you're using 4.0.0, this should be fine. The difference looks minor.
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322430323 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Oops, did the sitemap get published last time with the localhost:4000 URLs? That's an error. Good change! This one is easy to miss, and GitHub collapses this big diff.
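Leaked `localhost:4000` URLs in a sitemap (the artifact of a `jekyll serve` run being committed) are easy to screen for before publishing. A minimal sketch, assuming a plain `sitemap.xml` with `<loc>` entries; the sample content below is made up, not Spark's actual sitemap:

```python
import re

def find_bad_urls(sitemap_xml: str) -> list:
    """Return all <loc> entries that point at a local dev server
    instead of the production site."""
    locs = re.findall(r"<loc>(.*?)</loc>", sitemap_xml)
    return [u for u in locs if "localhost" in u or "127.0.0.1" in u]

if __name__ == "__main__":
    sample = """
    <urlset>
      <url><loc>https://spark.apache.org/news/index.html</loc></url>
      <url><loc>http://localhost:4000/releases/spark-release-2-3-4.html</loc></url>
    </urlset>
    """
    print(find_bad_urls(sample))  # ['http://localhost:4000/releases/spark-release-2-3-4.html']
```

A check like this could run in a pre-commit hook so a dev-server sitemap never reaches the `asf-site` branch.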
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322429291 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I use v4.0.0. However, the diff looks larger than the current commit.
```
@@ -230,11 +230,11 @@ In this page, we will show examples using RDD API as well as examples using high
-text_file = sc.textFile(hdfs://...)
-counts = text_file.flatMap(lambda line: line.split( )) \
-         .map(lambda word: (word, 1)) \
+text_file = sc.textFile("hdfs://...")
+counts = text_file.flatMap(lambda line: line.split(" ")) \
+         .map(lambda word: (word, 1)) \
          .reduceByKey(lambda a, b: a + b)
-counts.saveAsTextFile(hdfs://...)
+counts.saveAsTextFile("hdfs://...")
```
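For context, the PySpark word count being re-rendered in this diff can be mimicked in plain Python with no Spark installation: the sketch below mirrors the flatMap/map/reduceByKey pipeline over an in-memory list instead of an HDFS file, which is handy for checking what the example is supposed to compute:

```python
from collections import Counter

def word_count(lines):
    """flatMap: split each line into words; map + reduceByKey: count them."""
    words = (word for line in lines for word in line.split(" ") if word)
    return dict(Counter(words))

if __name__ == "__main__":
    lines = ["to be or not", "to be"]
    print(word_count(lines))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

The real example differs only in I/O: `sc.textFile("hdfs://...")` reads the lines and `saveAsTextFile` writes the counts back out.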
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322412975 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Use the latest jekyll, 4.0.0. I know Hyukjin just updated all the rendering to be consistent with 4.0.0.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322404583 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I used v3.8.6.
```
/usr/local/bin/jekyll -v
jekyll 3.8.6
```
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322402727 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Not sure. @dongjoon-hyun Did you see such a thing when you made the similar change?
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322397252 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I think so. This is automatically done by `jekyll`. Should we revert this?
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322379190 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Does this remove a redundant span?
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322371987 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: @dongjoon-hyun Thank you. Both `downloads.js` files have been updated.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322368525 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: Thank you. I will update the date soon.
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322363675 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: Thank you, @kiszk . This should be the VOTE pass day, which is different from the news announcement day.
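The semantics being settled here (the date recorded per release is the day the release VOTE passed, not the day the news was announced) can be illustrated with a small registry sketch. This is a hypothetical Python analogue of the JavaScript `addRelease(...)` call in `js/downloads.js`, not the site's actual code:

```python
from datetime import date

releases = []

def add_release(version: str, vote_pass_day: date, packages, stable: bool = True):
    """Record a release. The date is the day the release VOTE passed,
    which may precede the day the release news is announced."""
    releases.append({"version": version, "date": vote_pass_day,
                     "packages": packages, "stable": stable})

if __name__ == "__main__":
    packages_v8 = ["hadoop2.7", "hadoop2.6", "hadoop-free", "sources"]
    add_release("2.4.4", date(2019, 8, 30), packages_v8)
    add_release("2.3.4", date(2019, 9, 9), packages_v8)
    latest = max(releases, key=lambda r: r["date"])
    print(latest["version"])  # 2.3.4
```

Keeping the VOTE date in the registry means sorting by date reflects when each release actually became official.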
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529583439 Hi, @srowen , @felixcheung , @dongjoon-hyun, @viirya, @gatorsmile , @cloud-fan, @HyukjinKwon Could you review this?
[GitHub] [spark-website] kiszk opened a new pull request #221: Spark 2.3.4 release
kiszk opened a new pull request #221: Spark 2.3.4 release URL: https://github.com/apache/spark-website/pull/221 This PR aims to add the Apache Spark 2.3.4 release news and update links.
[spark] branch master updated (3d6b33a -> 6378d4b)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3d6b33a  [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
     add 6378d4b  [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/sparkR.R                                   | 7 +-
 R/pkg/tests/fulltests/test_sparkR.R                | 4 +-
 .../main/scala/org/apache/spark/SparkConf.scala    | 17 -
 .../org/apache/spark/deploy/SparkSubmit.scala      | 19 -
 .../HadoopFSDelegationTokenProviderSuite.scala     | 5 +-
 .../spark/scheduler/BlacklistTrackerSuite.scala    | 2 +-
 dev/sparktestsupport/modules.py                    | 1 -
 docs/mllib-evaluation-metrics.md                   | 28 -
 docs/mllib-feature-extraction.md                   | 14 -
 docs/mllib-linear-methods.md                       | 51 --
 docs/sql-migration-guide-upgrade.md                | 5 +
 docs/streaming-kinesis-integration.md              | 2 +-
 docs/streaming-programming-guide.md                | 4 +-
 .../mllib/JavaLinearRegressionWithSGDExample.java  | 81 ---
 .../mllib/JavaRegressionMetricsExample.java        | 83 ---
 .../spark/examples/mllib/LinearRegression.scala    | 138 -
 .../mllib/LinearRegressionWithSGDExample.scala     | 65 ---
 .../apache/spark/examples/mllib/PCAExample.scala   | 75 ---
 .../examples/mllib/RegressionMetricsExample.scala  | 74 ---
 .../streaming/JavaKinesisWordCountASL.java         | 23 +-
 .../spark/streaming/kinesis/KinesisUtils.scala     | 642 -
 .../kinesis/KinesisUtilsPythonHelper.scala         | 93 +++
 .../streaming/kinesis/JavaKinesisStreamSuite.java  | 98
 .../streaming/kinesis/KinesisStreamSuite.scala     | 24 +-
 .../spark/launcher/SparkSubmitCommandBuilder.java  | 2 +-
 .../spark/mllib/api/python/PythonMLLibAPI.scala    | 1 -
 .../mllib/classification/LogisticRegression.scala  | 106
 .../org/apache/spark/mllib/clustering/KMeans.scala | 67 ---
 .../apache/spark/mllib/feature/ChiSqSelector.scala | 11 -
 .../org/apache/spark/mllib/regression/Lasso.scala  | 111
 .../spark/mllib/regression/LinearRegression.scala  | 102
 .../spark/mllib/regression/RidgeRegression.scala   | 108
 .../JavaLogisticRegressionSuite.java               | 9 +-
 .../spark/mllib/clustering/JavaKMeansSuite.java    | 4 +-
 .../spark/mllib/regression/JavaLassoSuite.java     | 7 +-
 .../regression/JavaLinearRegressionSuite.java      | 9 +-
 .../mllib/regression/JavaRidgeRegressionSuite.java | 14 +-
 .../classification/LogisticRegressionSuite.scala   | 22 +-
 .../spark/mllib/clustering/KMeansSuite.scala       | 2 +-
 .../apache/spark/mllib/regression/LassoSuite.scala | 9 +-
 .../mllib/regression/LinearRegressionSuite.scala   | 8 +-
 .../mllib/regression/RidgeRegressionSuite.scala    | 11 +-
 project/MimaExcludes.scala                         | 14 +
 python/pyspark/__init__.py                         | 2 +-
 python/pyspark/ml/tests/test_image.py              | 43 +-
 python/pyspark/mllib/clustering.py                 | 8 +-
 python/pyspark/sql/__init__.py                     | 4 +-
 python/pyspark/sql/catalog.py                      | 20 -
 python/pyspark/sql/context.py                      | 67 +--
 python/pyspark/sql/tests/test_appsubmit.py         | 97
 python/pyspark/sql/tests/test_context.py           | 22 +-
 python/pyspark/streaming/kinesis.py                | 1 -
 .../apache/spark/deploy/yarn/ClientArguments.scala | 2 +-
 sql/README.md                                      | 2 +-
 .../scala/org/apache/spark/sql/SQLContext.scala    | 91 ---
 .../org/apache/spark/sql/catalog/Catalog.scala     | 102 +---
 .../org/apache/spark/sql/hive/HiveContext.scala    | 63 --
 .../scala/org/apache/spark/sql/hive/package.scala  | 3 -
 .../sql/hive/JavaMetastoreDataSourcesSuite.java    | 54 --
 .../apache/spark/sql/hive/CachedTableSuite.scala   | 6 +-
 .../sql/hive/HiveContextCompatibilitySuite.scala   | 103
 .../spark/sql/hive/MetastoreDataSourcesSuite.scala | 8 +-
 .../apache/spark/sql/hive/MultiDatabaseSuite.scala | 8 +-
 .../spark/sql/hive/execution/HiveDDLSuite.scala    | 2 +-
 64 files changed, 224 insertions(+), 2656 deletions(-)
 delete mode 100644 examples/src/main/java/org/apache/spark/examples/mllib/JavaLinearRegressionWithSGDExample.java
 delete mode 100644 examples/src/main/java/org/apache/spark/examples/mllib/JavaRegressionMetricsExample.java
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/LinearRegression.scala
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/LinearRegressionWithSGDExample.scala
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/PCAExample.scala
 delete mode
[spark] branch master updated: [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 3d6b33a  [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
3d6b33a is described below

commit 3d6b33a49a8daba17973994169ee4a9e2507a6d9
Author: Marco Gaido
AuthorDate: Mon Sep 9 21:20:34 2019 +0800

    [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd

    ### What changes were proposed in this pull request?

    The PR proposes to create a custom `RDD` that propagates `SQLConf` also in cases not tracked by SQL execution, as happens when a `Dataset` is converted to an RDD (either via `.rdd` or `.queryExecution.toRdd`) and actions are then invoked on the returned RDD. In this way, SQL configs are effective in these cases too, whereas earlier they were ignored.

    ### Why are the changes needed?

    Without this patch, whenever `.rdd` or `.queryExecution.toRdd` is used, all the SQL configs that were set are ignored. A reproducer:

    ```
    withSQLConf(SQLConf.SUBEXPRESSION_ELIMINATION_ENABLED.key -> "false") {
      val df = spark.range(2).selectExpr((0 to 5000).map(i => s"id as field_$i"): _*)
      df.createOrReplaceTempView("spark64kb")
      val data = spark.sql("select * from spark64kb limit 10")
      // Subexpression elimination is used here, even though it should have been disabled
      data.describe()
    }
    ```

    ### Does this PR introduce any user-facing change?

    When a user calls `.queryExecution.toRdd`, a `SQLExecutionRDD` is returned, wrapping the RDD of the executed plan. When `.rdd` is used, an additional `SQLExecutionRDD` is present in the hierarchy.

    ### How was this patch tested?

    Added a unit test.

    Closes #25643 from mgaido91/SPARK-28939.
Authored-by: Marco Gaido
Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/sql/internal/SQLConf.scala    | 11 +++-
 .../spark/sql/execution/QueryExecution.scala       | 3 +-
 .../spark/sql/execution/SQLExecutionRDD.scala      | 64 ++
 .../sql/internal/ExecutorSideSQLConfSuite.scala    | 46 +++-
 4 files changed, 119 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 6c6cca8..d9b0a72 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -115,7 +115,9 @@ object SQLConf {
    * Returns the active config object within the current scope. If there is an active SparkSession,
    * the proper SQLConf associated with the thread's active session is used. If it's called from
    * tasks in the executor side, a SQLConf will be created from job local properties, which are set
-   * and propagated from the driver side.
+   * and propagated from the driver side, unless a `SQLConf` has been set in the scope by
+   * `withExistingConf` as done for propagating SQLConf for operations performed on RDDs created
+   * from DataFrames.
    *
    * The way this works is a little bit convoluted, due to the fact that config was added initially
    * only for physical plans (and as a result not in sql/catalyst module).
@@ -129,7 +131,12 @@
    */
  def get: SQLConf = {
    if (TaskContext.get != null) {
-      new ReadOnlySQLConf(TaskContext.get())
+      val conf = existingConf.get()
+      if (conf != null) {
+        conf
+      } else {
+        new ReadOnlySQLConf(TaskContext.get())
+      }
    } else {
      val isSchedulerEventLoopThread = SparkContext.getActive
        .map(_.dagScheduler.eventProcessLoop.eventThread)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
index e5e86db..630d062 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
@@ -105,7 +105,8 @@ class QueryExecution(
    * Given QueryExecution is not a public class, end users are discouraged to use this: please
    * use `Dataset.rdd` instead where conversion will be applied.
    */
-  lazy val toRdd: RDD[InternalRow] = executedPlan.execute()
+  lazy val toRdd: RDD[InternalRow] = new SQLExecutionRDD(
+    executedPlan.execute(), sparkSession.sessionState.conf)

  /**
   * Prepares a planned [[SparkPlan]] for execution by inserting shuffle operations and internal
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala
new file
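The mechanism in the patch above is: capture the driver-side `SQLConf` when the wrapper RDD is created, install it into a thread-scoped slot while each partition is being consumed, and have `SQLConf.get` prefer that slot over building a config from task-local properties. The following Python sketch illustrates this pattern in miniature; all names (`ConfPropagatingIterable`, `current_conf`, `_existing_conf`) are invented stand-ins, not Spark's actual API.

```python
import threading

# Thread-local slot standing in for SQLConf.existingConf (illustrative only).
_existing_conf = threading.local()

def current_conf(default):
    """Mirror SQLConf.get's fallback chain: a conf pinned in the current
    scope wins; otherwise fall back to the default lookup."""
    return getattr(_existing_conf, "conf", None) or default

class ConfPropagatingIterable:
    """Wraps partition data and pins the conf captured at wrap time for
    the duration of iteration, like SQLExecutionRDD does per partition."""
    def __init__(self, data, conf):
        self.data = data
        self.conf = conf  # captured on the "driver" side at wrap time

    def __iter__(self):
        _existing_conf.conf = self.conf
        try:
            for row in self.data:
                yield row
        finally:
            _existing_conf.conf = None  # restore on exhaustion

default = {"subexpressionElimination": "true"}
wrapped = ConfPropagatingIterable(
    [1, 2, 3], {"subexpressionElimination": "false"})

seen = []
for row in wrapped:
    seen.append(current_conf(default)["subexpressionElimination"])

print(seen)  # ['false', 'false', 'false'] — captured conf wins inside iteration
print(current_conf(default)["subexpressionElimination"])  # true — default outside
```

Without the wrapper, code consuming the raw iterator would only ever see the default, which is the analogue of SQL configs being ignored after `.rdd`.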
[spark] branch master updated (dadb720 -> abec6d7)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from dadb720  [SPARK-28340][CORE] Noisy exceptions when tasks are killed: "DiskBloc…
     add abec6d7  [SPARK-28341][SQL] create a public API for V2SessionCatalog

No new revisions were added by this update.

Summary of changes:
 .../Transform.java => CatalogExtension.java}       | 28 +++---
 .../sql/catalog/v2/DelegatingCatalogExtension.java | 101 +
 .../spark/sql/catalog/v2/CatalogManager.scala      | 36 ++--
 .../spark/sql/catalog/v2/LookupCatalog.scala       | 2 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala     | 36 +++-
 .../org/apache/spark/sql/internal/SQLConf.scala    | 7 +-
 .../sql/catalyst/catalog/CatalogManagerSuite.scala | 7 +-
 .../org/apache/spark/sql/DataFrameWriter.scala     | 14 ++-
 .../datasources/DataSourceResolution.scala         | 19 +---
 .../datasources/v2/V2SessionCatalog.scala          | 28 ++
 .../sql/internal/BaseSessionStateBuilder.scala     | 6 +-
 .../execution/command/PlanResolutionSuite.scala    | 8 +-
 .../datasources/v2/V2SessionCatalogSuite.scala     | 13 +--
 .../DataSourceV2DataFrameSessionCatalogSuite.scala | 9 +-
 .../v2/DataSourceV2SQLSessionCatalogSuite.scala    | 2 +-
 .../sql/sources/v2/DataSourceV2SQLSuite.scala      | 7 +-
 .../v2/utils/TestV2SessionCatalogBase.scala        | 5 +-
 .../spark/sql/hive/HiveSessionStateBuilder.scala   | 2 +-
 18 files changed, 219 insertions(+), 111 deletions(-)
 copy sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/{expressions/Transform.java => CatalogExtension.java} (53%)
 create mode 100644 sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/DelegatingCatalogExtension.java

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
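The `DelegatingCatalogExtension` added by this commit suggests a classic delegation design: an extension is handed the built-in session catalog as a delegate, forwards every call to it by default, and overrides only what it needs to customize. A minimal illustrative sketch of that pattern follows; the class and method names here are invented Python analogues, not the actual Java interface from the commit.

```python
class CatalogExtension:
    """Interface sketch: an extension that can receive a delegate catalog."""
    def set_delegate_catalog(self, delegate):
        raise NotImplementedError

class DelegatingCatalogExtension(CatalogExtension):
    """Forwards calls to the delegate session catalog by default."""
    def __init__(self):
        self._delegate = None

    def set_delegate_catalog(self, delegate):
        self._delegate = delegate

    def load_table(self, name):
        return self._delegate.load_table(name)

class SessionCatalog:
    """Stand-in for the built-in session catalog."""
    def load_table(self, name):
        return f"table:{name}"

class AuditingExtension(DelegatingCatalogExtension):
    """Example user extension: records loads, delegates the real work."""
    def __init__(self):
        super().__init__()
        self.loaded = []

    def load_table(self, name):
        self.loaded.append(name)
        return super().load_table(name)

ext = AuditingExtension()
ext.set_delegate_catalog(SessionCatalog())
print(ext.load_table("t1"))  # table:t1 — resolution done by the delegate
print(ext.loaded)            # ['t1'] — extension behavior layered on top
```

The benefit of this shape is that an extension stays correct as the underlying catalog gains methods: anything not overridden falls through to the delegate.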
[spark] branch master updated (4a3a6b6 -> dadb720)
srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4a3a6b6  [SPARK-28637][SQL] Thriftserver support interval type
     add dadb720  [SPARK-28340][CORE] Noisy exceptions when tasks are killed: "DiskBloc…

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/storage/DiskBlockObjectWriter.scala | 8 +++-
 .../spark/storage/ShuffleBlockFetcherIterator.scala  | 19 ---
 2 files changed, 23 insertions(+), 4 deletions(-)
svn commit: r35707 - /dev/spark/v2.3.4-rc1-docs/
Author: wenchen
Date: Mon Sep 9 09:18:18 2019
New Revision: 35707

Log:
Remove RC artifacts

Removed:
    dev/spark/v2.3.4-rc1-docs/
svn commit: r35709 - /release/spark/spark-2.3.3/
Author: wenchen
Date: Mon Sep 9 09:18:20 2019
New Revision: 35709

Log:
Remove old release

Removed:
    release/spark/spark-2.3.3/
svn commit: r35708 - /dev/spark/v2.3.4-rc1-bin/ /release/spark/spark-2.3.4/
Author: wenchen
Date: Mon Sep 9 09:18:19 2019
New Revision: 35708

Log:
Apache Spark 2.3.4

Added:
    release/spark/spark-2.3.4/
      - copied from r35707, dev/spark/v2.3.4-rc1-bin/
Removed:
    dev/spark/v2.3.4-rc1-bin/
[spark] branch branch-2.4 updated (0a4b356 -> 483dcf5)
gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 0a4b356  Revert "[SPARK-28912][STREAMING] Fixed MatchError in getCheckpointFiles()"
     add 483dcf5  [SPARK-28912][BRANCH-2.4] Fixed MatchError in getCheckpointFiles()

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/streaming/Checkpoint.scala      | 4 ++--
 .../org/apache/spark/streaming/CheckpointSuite.scala | 20
 2 files changed, 22 insertions(+), 2 deletions(-)
[spark] tag v2.3.4 created (now 8c6f815)
wenchen pushed a change to tag v2.3.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

        at 8c6f815  (commit)

No new revisions were added by this update.
[spark] branch master updated (d4eca7c -> 4a3a6b6)
lixiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from d4eca7c  [SPARK-29000][SQL] Decimal precision overflow when don't allow precision loss
     add 4a3a6b6  [SPARK-28637][SQL] Thriftserver support interval type

No new revisions were added by this update.

Summary of changes:
 .../thriftserver/SparkExecuteStatementOperation.scala   | 9 -
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala | 15 +++
 .../SparkThriftServerProtocolVersionsSuite.scala        | 4 ++--
 3 files changed, 25 insertions(+), 3 deletions(-)