[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529772939

   Thanks, @cloud-fan, @srowen, @dongjoon-hyun, @viirya, and @HyukjinKwon

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (580c626 -> 962e330)
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 580c626  [SPARK-28939][SQL][FOLLOWUP] Fix JDK11 compilation due to ambiguous reference
     add 962e330  [SPARK-26598][SQL] Fix HiveThriftServer2 cannot be modified hiveconf/hivevar variables

No new revisions were added by this update.

Summary of changes:
 .../sql/hive/thriftserver/SparkSQLSessionManager.scala      | 11 +++
 .../hive/thriftserver/server/SparkSQLOperationManager.scala | 12
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala     | 13 ++---
 3 files changed, 21 insertions(+), 15 deletions(-)
[GitHub] [spark-website] kiszk closed pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk closed pull request #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221
[GitHub] [spark-website] kiszk edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529761830

   Got it at https://dist.apache.org/repos/dist/release/spark/KEYS!
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529761830

   Got it!
[spark] branch master updated (c2d8ee9 -> 580c626)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c2d8ee9  [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
     add 580c626  [SPARK-28939][SQL][FOLLOWUP] Fix JDK11 compilation due to ambiguous reference

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[GitHub] [spark-website] cloud-fan commented on issue #221: Add Apache Spark 2.3.4 release news and update links
cloud-fan commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529753165

   I've synced the keys
svn commit: r35721 - /release/spark/KEYS
Author: wenchen
Date: Tue Sep 10 03:27:38 2019
New Revision: 35721

Log:
Update KEYS

Modified:
    release/spark/KEYS

Modified: release/spark/KEYS
==============================================================================
--- release/spark/KEYS (original)
+++ release/spark/KEYS Tue Sep 10 03:27:38 2019
@@ -991,4 +991,62 @@ QRMaCSG2MOvUAI8Zzk6i1Gi5InRlP5v8sQdrMYvS
 meyB5uExVklZg9yaoH2zAFXLkjG1pftpkCb57UIyC+Tk5KAMZXyS2vHNGxsnI3FG
 ZTFPNYvCMMHM8A==
 =PEdD
------END PGP PUBLIC KEY BLOCK-----
\ No newline at end of file
+-----END PGP PUBLIC KEY BLOCK-----
+
+pub   4096R/7F0FEF75 2019-08-19
+uid                  Kazuaki Ishizaki (CODE SIGNING KEY)
+sub   4096R/7C3AEC68 2019-08-19
+
+-----BEGIN PGP PUBLIC KEY BLOCK-----
+Version: GnuPG v1
+
+mQINBF1a3YcBEAC7I6f1jWpY9WlJBkbwvLneYBjnD2BRwG1eKjkz49aUXVKkx4Du
+XB7b+agbhWL7EIPjQHVJf0RVGochOujKfcPxOz5bZwAV078EbsJpiAYIAeVEimQF
+Pv/uqaf9DbIjZAnJtZhKlyXJaXLpuZbqEwBimpfbgvF5ib4ii7a9kY7BO/YsSXXc
+ksLBIHKwNAeKSMIGmCQaxz/tNmRm1tAagFknCEoQ0CMsA8FesjXyS+U6nfJWdK3K
++678joAIhZvdn5k3f/bR94ifeDCh0QsY/zuG95er4Gp0rdr8EmRQbfJAUAwfkn8a
+viQD1FkTs+aJn4MSClb+FDXu7hNrPPdayA5CI6PSMdir//+Z7Haox92mvhQT5pBJ
+X21R4BDqF6bmL2d/RL3e2Zb1rmztDbTq43OL3Jm+x9R3OPg9UVwFJgHUy/xEirve
+Nah5Y6GzV3po/VSJbRIdM/p8OENv6YahFbLr5rT5O9iZns/PXHUpXYXLQDfdFJD2
+oCNFxlQmjfbxIL3PIcdS2gY2o1FmEbYuaLi6Bb9FDTm/J78vHYtR3wLvwufLh3PX
+5en9e6+g7o5w3jN/3J1skwXUUSOHK88mWBGt2B9ZwYS+7TQ0zWcgrXjwHQoi92nA
+JEADyvQSxTB/zd5usCVel8038FSKhawkhrmLBk2UoJR4prhnPC364MnjgQARAQAB
+tDZLYXp1YWtpIElzaGl6YWtpIChDT0RFIFNJR05JTkcgS0VZKSA8a2lzemtAYXBh
+Y2hlLm9yZz6JAjgEEwECACIFAl1a3YcCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4B
+AheAAAoJEOSaBGx/D+91w5AQALB6gff1BuaDyMSiSYaAGBGrBAxs1+ixQxlLX+ld
+KG9y/u41S3s8pBn0GXp1jthdURnPm+raLqJk1lVPUZ4JqNYot0FL/nGBIZjRRG6J
+TfmlWTza1AfgvzcROaO+7jVPMskBx/HZn8XxEOlMcnBv4P/v3m/QUW9/tH8j+6Bc
+JwfiqD3LIaWZTicAMxWE9r7MREDcgkrFROJDDJPMFxoVKomIcc3vzXJeI7BfVtkG
+5NHWYDVn4QTQygv+qes4ke9fcik7T5c9NcOjXgks6eF0z7Z/Rj6DUrIyVKleUwJZ
+AWpBJcbNc8crg623DRaXpGhXsGvnD5PxcPvVjJ9Jud7o884OhVr2abxQ++rIv/+m
+K5K99jbp2E/6Q6tR4ODEoPTGN6fSijziWfhuad26K/grN3878hayGmey57vPH3tx
+LsBkUfc9bz46HjcdhfaU1dS82YOMmrFLLmgBEL1PViK628gk0TR7C6N4kHKGWd1f
+tQz/bTFzoyXOTpS6bvceE88fZ2FSeepP0AgvZPZsUXxrHXo78oECZ9CAoO/q1P1J
+OrKr5oG5om9pB+4SI3FhD2PKxt/+ayMCyA6PVBlw8HDI2XLBmBi9YkiP2ws7gJcF
+A958J3CWc6Q7PstrU7LCmL0Apbl8T2Iqph7jB2Qiko2sOyxe5Vwkwh9vHYnhy1ox
+YZ2quQINBF1a3YcBEADfvUJtKQKQEHl6ug/c2PxDL5pfEhCXQfBIkfPScUgiQCO9
+aiSigMUReiYa/7cau2jmGUcBktjgLwlAGywX6YTGt/ZIWCkGRdK8K3mVRNssGwXs
++oWcNinRbzIV1cvZu9zndzM7lzIMFriIP/Shsi9QPg6SibK1XhgkYr2pTN8i1zmQ
+sd/FGnhEeGZxXDwW7wG6tPXvzQiAZgJEsUh90i9AbQzI/MWG2RqqjKGO423BcpQ8
+nHgUlj7JbgRI2knBjpnxAyKroDGw9dKXNBqYrGjQtbXcCkBTk6vDyOkXUWOz63Bc
+AtVfXwL5+RILvYjzn8bZne5jt8fkNK3z29XTv7N3Ee8HRwPnGp6Ny7jGR/f740gP
+3b8y4A6QI9YlyvOlp2SHIRPHEYKUQCLaTT1/b4DYN5SGtWwXA4GafCLBVBwD3fr+
+jIhCbInX0+MWOZwuTYuwpoE6nnsnWpsAd6ZOMJInULRyW1f7/zXoq2XvtFH8+IQN
+DYtF1lr2C8lm7WUKqSg2bmVy6+gV6KvYqj6oihLQBxlnmrKBQFhkBeOyNYxRW8rf
+c+nZZza/5QMZLD7mYL+BGmgHB2eycSuz7UkZ8H5DD0u7Wz74mmmHOg9EyJuJSa3z
+UXgg1VNtZCW/m7ha5jedQTiXSYX1R7HjjoX6vWm85mRLAFbyW7DaKnfbYlJvjwAR
+AQABiQIfBBgBAgAJBQJdWt2HAhsMAAoJEOSaBGx/D+91YNwQAIY41adyEUHRtwnP
+sT90VjheUdz9++rAet8jstwGK8M3wrnhDet18E7wTxt52Knkw7vMS2wqjm3jxeFs
+/pI/eA6Tq+AWLEySODegM9TGFxAtcP9TAR0bXGspw5LUWUKO+MJ17pyVs0M/0gb0
+GEjbVCjDn/h0Ozr3n81eokVDhvBZ8n2dUGoetmuZ77Wz1liPoV9G0paISKyLsj9d
+iQkE3ExZlGkvX6OiNbJMoo1pHMA4knAo9ch62THofPaoLX5mCKwhNgQDECYd4k89
+ww176ndkrllV8t1v/UDHXPwmDWGK+mLeLk4e+fDJ+bOQrZ543AYk6MB1gRyb94G7
+bQniuoc2YvB+Cn6qOB83ARhDz0zPUGVj/85P8xwmcsZJxlLGpiPAXEQJX2Zk6zFR
+1HLxy831IsHaEktglF9tBH+OxJqBg45fbRhuYclWfo724enVdm/rLtR1n93ybaJS
+eNmw1Lomks7IsX6qdBR36zVB2WgmIcsnxjtMee+YqfFiAbzbm27lV6A7aTDyIPzQ
+R2fSta747XADEy7rzYawV5zuCupmUHp/ZgfQK9xYDnZ+lJHHaipDgmIe4Mfe/3Je
+au2shXGZFmo4V56uCJ5HqZTJJZaMceQx7u8uqZbhtHG+lLhbvHXVylaxxEYpqf2O
+XJ5Dp1pqv9DC6cl9vLSHctRrM2kG
+=mQLW
+-----END PGP PUBLIC KEY BLOCK-----
+
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529751569

   @dongjoon-hyun thanks. Let me wait for an additional hour. When KEY is synced, I will merge this.
[GitHub] [spark-website] HyukjinKwon commented on issue #222: Add Weichen Xu to committer list
HyukjinKwon commented on issue #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222#issuecomment-529749156

   @WeichenXu123, can you try to push it by yourself?
[spark] branch master updated: [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c2d8ee9  [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan
c2d8ee9 is described below

commit c2d8ee9c54adf4a425ce41d8743e24dd8be864c3
Author: Wenchen Fan
AuthorDate: Tue Sep 10 11:01:57 2019 +0800

    [SPARK-28878][SQL][FOLLOWUP] Remove extra project for DSv2 streaming scan

    ### What changes were proposed in this pull request?

    Remove the project node if the streaming scan is columnar

    ### Why are the changes needed?

    This is a followup of https://github.com/apache/spark/pull/25586. Batch and streaming share the
    same DS v2 read API so both can support columnar reads. We should apply #25586 to streaming scan
    as well.

    ### Does this PR introduce any user-facing change?

    no

    ### How was this patch tested?

    existing tests

    Closes #25727 from cloud-fan/follow.

    Authored-by: Wenchen Fan
    Signed-off-by: Wenchen Fan
---
 .../datasources/v2/DataSourceV2Strategy.scala | 29 --
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
index 7cad305..f629f36 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
@@ -155,17 +155,30 @@ object DataSourceV2Strategy extends Strategy with PredicateHelper {
     case r: StreamingDataSourceV2Relation if r.startOffset.isDefined && r.endOffset.isDefined =>
       val microBatchStream = r.stream.asInstanceOf[MicroBatchStream]
-      // ensure there is a projection, which will produce unsafe rows required by some operators
-      ProjectExec(r.output,
-        MicroBatchScanExec(
-          r.output, r.scan, microBatchStream, r.startOffset.get, r.endOffset.get)) :: Nil
+      val scanExec = MicroBatchScanExec(
+        r.output, r.scan, microBatchStream, r.startOffset.get, r.endOffset.get)
+
+      val withProjection = if (scanExec.supportsColumnar) {
+        scanExec
+      } else {
+        // Add a Project here to make sure we produce unsafe rows.
+        ProjectExec(r.output, scanExec)
+      }
+
+      withProjection :: Nil

     case r: StreamingDataSourceV2Relation if r.startOffset.isDefined && r.endOffset.isEmpty =>
       val continuousStream = r.stream.asInstanceOf[ContinuousStream]
-      // ensure there is a projection, which will produce unsafe rows required by some operators
-      ProjectExec(r.output,
-        ContinuousScanExec(
-          r.output, r.scan, continuousStream, r.startOffset.get)) :: Nil
+      val scanExec = ContinuousScanExec(r.output, r.scan, continuousStream, r.startOffset.get)
+
+      val withProjection = if (scanExec.supportsColumnar) {
+        scanExec
+      } else {
+        // Add a Project here to make sure we produce unsafe rows.
+        ProjectExec(r.output, scanExec)
+      }
+
+      withProjection :: Nil

     case WriteToDataSourceV2(writer, query) =>
       WriteToDataSourceV2Exec(writer, planLater(query)) :: Nil
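The heart of this commit is a conditional projection: the planner now returns the scan node unchanged when it reports columnar support, and only wraps it in a Project (which produces the unsafe rows some operators need) otherwise. A minimal Python sketch of that planning pattern, using hypothetical `Scan`/`Project` classes rather than Spark's API:

```python
# Hypothetical mini-planner (not Spark code): wrap a scan in a projection
# only when the scan is row-based, mirroring the pattern in this commit.

class Scan:
    def __init__(self, name, supports_columnar):
        self.name = name
        self.supports_columnar = supports_columnar

class Project:
    """Marker node standing in for the row-format conversion step."""
    def __init__(self, child):
        self.child = child

def plan_scan(scan):
    # Columnar scans are returned as-is; row-based scans get the extra Project.
    return scan if scan.supports_columnar else Project(scan)

assert isinstance(plan_scan(Scan("micro_batch", supports_columnar=True)), Scan)
assert isinstance(plan_scan(Scan("continuous", supports_columnar=False)), Project)
```

The design point is that the projection is pure overhead for a columnar scan, since columnar-to-row conversion is handled by a separate code path.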
[spark] branch branch-2.4 updated: [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 75b902f  [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns
75b902f is described below

commit 75b902f547dfd392f17013210a03dc671f94fcdc
Author: hemanth meka
AuthorDate: Tue Sep 10 10:52:03 2019 +0800

    [SPARK-23519][SQL][2.4] Create view should work from query with duplicate output columns

    **What changes were proposed in this pull request?**

    Backports pull request [25570](https://github.com/apache/spark/pull/25570) to branch-2.4.

    Moves the call to checkColumnNameDuplication out of generateViewProperties. This way we can
    choose whether checkColumnNameDuplication is performed on the analyzed or the aliased plan
    without having to pass an additional argument (aliasedPlan) to generateViewProperties.

    Before this PR, the duplicate-name check was performed on the query output of the SQL below
    (c1, c1); this PR makes it perform the check on the user-provided schema of the view
    definition (c1, c2) instead.

    **Why are the changes needed?**

    The changes fix the SPARK-23519 bug. The queries below would cause an exception. This PR fixes
    them and also adds a test case.

    `CREATE TABLE t23519 AS SELECT 1 AS c1`
    `CREATE VIEW v23519 (c1, c2) AS SELECT c1, c1 FROM t23519`

    **Does this PR introduce any user-facing change?**

    No

    **How was this patch tested?**

    New unit test added in SQLViewSuite.

    Closes #25733 from hem1891/SPARK-23519-backport-to-2.4.

    Lead-authored-by: hemanth meka
    Co-authored-by: hem1891
    Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/sql/execution/command/views.scala | 18 +++---
 .../org/apache/spark/sql/execution/SQLViewSuite.scala  | 10 ++
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
index 5172f32..abc8515 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
@@ -26,7 +26,7 @@ import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable
 import org.apache.spark.sql.catalyst.expressions.{Alias, SubqueryExpression}
 import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, View}
-import org.apache.spark.sql.types.MetadataBuilder
+import org.apache.spark.sql.types.{MetadataBuilder, StructType}
 import org.apache.spark.sql.util.SchemaUtils
@@ -233,14 +233,15 @@ case class CreateViewCommand(
       throw new AnalysisException(
         "It is not allowed to create a persisted view from the Dataset API")
     }
-
-    val newProperties = generateViewProperties(properties, session, analyzedPlan)
+    val aliasedSchema = aliasPlan(session, analyzedPlan).schema
+    val newProperties = generateViewProperties(
+      properties, session, analyzedPlan, aliasedSchema.fieldNames)

     CatalogTable(
       identifier = name,
       tableType = CatalogTableType.VIEW,
       storage = CatalogStorageFormat.empty,
-      schema = aliasPlan(session, analyzedPlan).schema,
+      schema = aliasedSchema,
       properties = newProperties,
       viewText = originalText,
       comment = comment
@@ -294,7 +295,8 @@ case class AlterViewAsCommand(
     val viewIdent = viewMeta.identifier
     checkCyclicViewReference(analyzedPlan, Seq(viewIdent), viewIdent)

-    val newProperties = generateViewProperties(viewMeta.properties, session, analyzedPlan)
+    val newProperties = generateViewProperties(
+      viewMeta.properties, session, analyzedPlan, analyzedPlan.schema.fieldNames)

     val updatedViewMeta = viewMeta.copy(
       schema = analyzedPlan.schema,
@@ -355,13 +357,15 @@ object ViewHelper {
   def generateViewProperties(
       properties: Map[String, String],
       session: SparkSession,
-      analyzedPlan: LogicalPlan): Map[String, String] = {
+      analyzedPlan: LogicalPlan,
+      fieldNames: Array[String]): Map[String, String] = {
+    // for createViewCommand queryOutput may be different from fieldNames
     val queryOutput = analyzedPlan.schema.fieldNames

     // Generate the query column names, throw an AnalysisException if there exists duplicate column
     // names.
     SchemaUtils.checkColumnNameDuplication(
-      queryOutput, "in the view definition", session.sessionState.conf.resolver)
+      fieldNames, "in the view definition", session.sessionState.conf.resolver)

     // Generate the view default database name.
     val viewDefaultDatabase = session.sessionState.catalog.getCurrentDatabase
diff --git
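The fix turns on which column list the duplicate-name check sees: the view's user-provided schema (c1, c2) rather than the raw query output (c1, c1). A plain-Python sketch of that check, as an illustration of the idea only (not Spark's SchemaUtils):

```python
# Illustration (not Spark code): a duplicate-column-name check like the one
# SchemaUtils.checkColumnNameDuplication performs, applied to the two column
# lists from the example in the commit message.

def check_column_name_duplication(names):
    seen = set()
    for n in names:
        key = n.lower()  # Spark's default resolver is case-insensitive
        if key in seen:
            raise ValueError(f"duplicate column name: {n}")
        seen.add(key)

query_output = ["c1", "c1"]   # output of: SELECT c1, c1 FROM t23519
view_columns = ["c1", "c2"]   # schema of: CREATE VIEW v23519 (c1, c2) AS ...

check_column_name_duplication(view_columns)  # passes: this is what the fix checks
try:
    check_column_name_duplication(query_output)  # what the old code effectively checked
    raised = False
except ValueError:
    raised = True
assert raised
```

Checking the aliased schema keeps the real protection (a view cannot have two columns with the same name) while no longer rejecting queries whose raw output merely repeats a column.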
[GitHub] [spark-website] HyukjinKwon commented on a change in pull request #222: Add Weichen Xu to committer list
HyukjinKwon commented on a change in pull request #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222#discussion_r322528559

 ##########
 File path: committers.md
 ##########

 @@ -78,6 +78,7 @@ navigation:
 |Patrick Wendell|Databricks|
 |Andrew Xia|Alibaba|
 |Reynold Xin|Databricks|
+|Weichen Xu|Databricks|

 Review comment:
   @WeichenXu123, HTML also has to be generated by `jekyll build`.
[GitHub] [spark-website] WeichenXu123 opened a new pull request #222: Add Weichen Xu to committer list
WeichenXu123 opened a new pull request #222: Add Weichen Xu to committer list
URL: https://github.com/apache/spark-website/pull/222

   *Make sure that you generate site HTML with `jekyll build`, and include the changes to the HTML in your pull request also. See README.md for more information. Please remove this message.*
[spark] branch master updated (aa805ec -> 86fc890)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from aa805ec  [SPARK-23265][ML] Update multi-column error handling logic in QuantileDiscretizer
     add 86fc890  [SPARK-28988][SQL][TESTS] Fix invalid tests in CliSuite

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
[spark] branch master updated (aafce7e -> aa805ec)
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from aafce7e  [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
     add aa805ec  [SPARK-23265][ML] Update multi-column error handling logic in QuantileDiscretizer

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/feature/QuantileDiscretizer.scala | 43 +++-
 .../ml/feature/QuantileDiscretizerSuite.scala  | 76 +++---
 2 files changed, 96 insertions(+), 23 deletions(-)
[spark] branch branch-2.4 updated: Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 92e5216  Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"
92e5216 is described below

commit 92e5216ea0763e67d369547291e28b61ff5065fd
Author: Sean Owen
AuthorDate: Mon Sep 9 18:44:38 2019 -0500

    Revert "[SPARK-28657][CORE] Fix currentContext Instance failed sometimes"

    This reverts commit df55f3cb120a5fd57aeec9ca3d67434e756e4b1c.
---
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index e7d1613..8f86b47 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2946,8 +2946,7 @@ private[spark] class CallerContext(
     if (CallerContext.callerContextSupported) {
       try {
         val callerContext = Utils.classForName("org.apache.hadoop.ipc.CallerContext")
-        val builder: Class[AnyRef] =
-          Utils.classForName("org.apache.hadoop.ipc.CallerContext$Builder")
+        val builder = Utils.classForName("org.apache.hadoop.ipc.CallerContext$Builder")
         val builderInst = builder.getConstructor(classOf[String]).newInstance(context)
         val hdfsContext = builder.getMethod("build").invoke(builderInst)
         callerContext.getMethod("setCurrent", callerContext).invoke(null, hdfsContext)
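For context, the surrounding `CallerContext` code in `Utils.scala` loads the Hadoop classes by name at runtime, so Spark compiles and runs even against Hadoop versions where the optional `CallerContext` API is absent. A rough Python analogue of that reflective-loading pattern; the module and class names here are hypothetical, used only to show the shape of the technique:

```python
# Sketch of reflective/optional-dependency loading (hypothetical names, not a
# real library): resolve a class by name at runtime and degrade gracefully
# when the optional dependency is missing, like Utils.classForName in Spark.
import importlib

def set_caller_context(context, module="hadoop.ipc", cls="CallerContext"):
    try:
        mod = importlib.import_module(module)   # analogous to Utils.classForName(...)
    except ImportError:
        return False                            # optional dependency not on the path
    # Build and install the context via dynamically resolved attributes,
    # analogous to getConstructor/getMethod/invoke in the Scala code.
    builder = getattr(mod, cls + "Builder")(context)
    getattr(mod, cls).set_current(builder.build())
    return True

# With no such module installed, the call is a no-op that reports failure.
assert set_caller_context("spark_appId_123") is False
```

The benefit, in both languages, is that the happy path uses the dependency when present while the fallback path never even references its symbols.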
[spark] branch master updated: [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new aafce7e  [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array
aafce7e is described below

commit aafce7ebffe1acd8f6022f208beaa9ec6c9f7592
Author: gengjiaan
AuthorDate: Tue Sep 10 08:16:18 2019 +0900

    [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array

    ## What changes were proposed in this pull request?

    This is ANSI SQL feature `T312`:
    ```
    OVERLAY(<string> PLACING <replacement> FROM <start> [ FOR <length> ])
    ```
    This PR is related to https://github.com/apache/spark/pull/24918 and adds support for byte arrays.
    ref: https://www.postgresql.org/docs/11/functions-binarystring.html

    ## How was this patch tested?

    New UT. Some examples of the PR running in my production environment:
    ```
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
    Spark_SQL
    Time taken: 0.285 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
    Spark CORE
    Time taken: 0.202 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
    Spark ANSI SQL
    Time taken: 0.165 s
    spark-sql> select overlay(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
    Structured SQL
    Time taken: 0.141 s
    ```

    Closes #25172 from beliefer/ansi-overlay-byte-array.

    Lead-authored-by: gengjiaan
    Co-authored-by: Jiaan Geng
    Signed-off-by: Takeshi Yamamuro
---
 .../catalyst/expressions/stringExpressions.scala | 60 +++---
 .../expressions/StringExpressionsSuite.scala     | 72 +-
 .../scala/org/apache/spark/sql/functions.scala   | 16 ++---
 .../apache/spark/sql/StringFunctionsSuite.scala  | 33 +++---
 4 files changed, 157 insertions(+), 24 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index d7a5fb2..e4847e9 100755
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -472,6 +472,19 @@ object Overlay {
     builder.append(input.substringSQL(pos + length, Int.MaxValue))
     builder.build()
   }
+
+  def calculate(input: Array[Byte], replace: Array[Byte], pos: Int, len: Int): Array[Byte] = {
+    // If you specify length, it must be a positive whole number or zero.
+    // Otherwise it will be ignored.
+    // The default value for length is the length of replace.
+    val length = if (len >= 0) {
+      len
+    } else {
+      replace.length
+    }
+    ByteArray.concat(ByteArray.subStringSQL(input, 1, pos - 1),
+      replace, ByteArray.subStringSQL(input, pos + length, Int.MaxValue))
+  }
 }

 // scalastyle:off line.size.limit
@@ -487,6 +500,14 @@ object Overlay {
       Spark ANSI SQL
      > SELECT _FUNC_('Spark SQL' PLACING 'tructured' FROM 2 FOR 4);
       Structured SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('_', 'utf-8') FROM 6);
+       Spark_SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('CORE', 'utf-8') FROM 7);
+       Spark CORE
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('ANSI ', 'utf-8') FROM 7 FOR 0);
+       Spark ANSI SQL
+      > SELECT _FUNC_(encode('Spark SQL', 'utf-8') PLACING encode('tructured', 'utf-8') FROM 2 FOR 4);
+       Structured SQL
   """)
 // scalastyle:on line.size.limit
 case class Overlay(input: Expression, replace: Expression, pos: Expression, len: Expression)
@@ -496,19 +517,42 @@ case class Overlay(input: Expression, replace: Expression, pos: Expression, len:
     this(str, replace, pos, Literal.create(-1, IntegerType))
   }

-  override def dataType: DataType = StringType
+  override def dataType: DataType = input.dataType

-  override def inputTypes: Seq[AbstractDataType] =
-    Seq(StringType, StringType, IntegerType, IntegerType)
+  override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection(StringType, BinaryType),
+    TypeCollection(StringType, BinaryType), IntegerType, IntegerType)

   override def children: Seq[Expression] = input :: replace :: pos :: len :: Nil

+  override def checkInputDataTypes(): TypeCheckResult = {
+    val inputTypeCheck = super.checkInputDataTypes()
+    if (inputTypeCheck.isSuccess) {
+      TypeUtils.checkForSameTypeInputExpr(
+        input.dataType :: replace.dataType :: Nil, s"function $prettyName")
+    } else {
+      inputTypeCheck
+    }
+  }
+
+  private lazy val
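The new `calculate` overload above implements OVERLAY on byte arrays using SQL's 1-based positions, with a negative length meaning "default to the replacement's length". A Python sketch of the same semantics (an illustration only, not Spark's code), checked against the examples from the commit message:

```python
# OVERLAY semantics on bytes, with SQL-style 1-based positions (illustration,
# not Spark code). A negative length is ignored and defaults to len(replace),
# matching the comment in the Scala diff above.

def overlay(input_bytes: bytes, replace: bytes, pos: int, length: int = -1) -> bytes:
    if length < 0:
        length = len(replace)
    before = input_bytes[:pos - 1]           # bytes before the overlay point
    after = input_bytes[pos - 1 + length:]   # skip the `length` bytes being replaced
    return before + replace + after

assert overlay(b"Spark SQL", b"_", 6) == b"Spark_SQL"
assert overlay(b"Spark SQL", b"CORE", 7) == b"Spark CORE"
assert overlay(b"Spark SQL", b"ANSI ", 7, 0) == b"Spark ANSI SQL"      # FOR 0 inserts
assert overlay(b"Spark SQL", b"tructured", 2, 4) == b"Structured SQL"  # FOR 4 replaces 4 bytes
```

Note how `FOR 0` turns OVERLAY into a pure insertion, while an explicit `FOR n` can replace a span shorter or longer than the replacement itself.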
[spark] branch branch-2.4 updated (9ef48f7 -> df55f3c)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9ef48f7  [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final
     add df55f3c  [SPARK-28657][CORE] Fix currentContext Instance failed sometimes

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
[spark] branch master updated (e516f7e -> bdc1598)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from e516f7e  [SPARK-28928][SS] Use Kafka delegation token protocol on sources/sinks
     add bdc1598  [SPARK-28657][CORE] Fix currentContext Instance failed sometimes

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
[spark] branch master updated (8018ded -> e516f7e)
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8018ded  [SPARK-28214][STREAMING][TESTS] CheckpointSuite: wait for batch to be fully processed before accessing DStreamCheckpointData
     add e516f7e  [SPARK-28928][SS] Use Kafka delegation token protocol on sources/sinks

No new revisions were added by this update.

Summary of changes:
 docs/structured-streaming-kafka-integration.md     | 4 +++-
 .../sql/kafka010/KafkaDelegationTokenSuite.scala   | 2 --
 .../apache/spark/kafka010/KafkaConfigUpdater.scala | 1 +
 .../spark/kafka010/KafkaTokenSparkConf.scala       | 4 +++-
 .../spark/kafka010/KafkaConfigUpdaterSuite.scala   | 26 +-
 5 files changed, 32 insertions(+), 5 deletions(-)
[spark] branch master updated (125af78d -> 8018ded)
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 125af78d  [SPARK-28831][DOC][SQL] Document CLEAR CACHE statement in SQL Reference
     add 8018ded   [SPARK-28214][STREAMING][TESTS] CheckpointSuite: wait for batch to be fully processed before accessing DStreamCheckpointData

No new revisions were added by this update.

Summary of changes:
 .../spark/streaming/scheduler/JobGenerator.scala | 2 +-
 .../spark/streaming/scheduler/JobScheduler.scala | 3 +-
 .../apache/spark/streaming/CheckpointSuite.scala | 32 --
 3 files changed, 26 insertions(+), 11 deletions(-)
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249

   BTW, @cloud-fan and @kiszk. We need to sync `KEYS`, too~
   - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`)
   - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`)

   Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later.
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249 BTW, @cloud-fan and @kiszk . We need to sync `KEYS`, too. - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`) - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`) Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529680249 BTW, @cloud-fan and @kiszk . We need to sync `KEYS`, too. - https://dist.apache.org/repos/dist/dev/spark/KEYS (`Kazuaki Ishizaki (CODE SIGNING KEY)`) - https://dist.apache.org/repos/dist/release/spark/KEYS (No `Kazuaki Ishizaki (CODE SIGNING KEY)`) Otherwise, [the checker](https://checker.apache.org/projs/spark.html) will complain later.
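The `KEYS` inconsistency described above (a signer present in the dev repository but missing from the release repository) can be caught mechanically. Below is a minimal illustrative sketch; the file contents are made up, and in practice the two `KEYS` files would first be fetched from dist.apache.org (e.g. via `svn cat` or HTTPS):

```python
def key_present(keys_text: str, owner: str) -> bool:
    """Return True if a key block mentioning `owner` exists in a KEYS file."""
    return owner in keys_text

def missing_in_release(dev_keys: str, release_keys: str, owner: str) -> bool:
    """True when the signer is listed in the dev KEYS file but not in the
    release KEYS file -- the inconsistency the Apache checker would flag."""
    return key_present(dev_keys, owner) and not key_present(release_keys, owner)

if __name__ == "__main__":
    # Hypothetical file contents; the real files live on dist.apache.org.
    dev = "pub rsa4096\nuid Kazuaki Ishizaki (CODE SIGNING KEY)\n"
    rel = "pub rsa4096\nuid Some Other Signer (CODE SIGNING KEY)\n"
    print(missing_in_release(dev, rel, "Kazuaki Ishizaki (CODE SIGNING KEY)"))  # True
```

A plain substring check is enough here because `KEYS` files list each signer's `uid` line verbatim next to the ASCII-armored key block.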
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529676525 Thank you for updating all here. Later, we can update `README.md` so that `v4.0.0` is always used. `Jekyll v4.0.0` was released 20 days ago. According to the git log, the `asf-site` branch does not appear to have been generated with `v4.0.0`.
[spark] branch master updated (c839d09 -> 125af78d)
This is an automated email from the ASF dual-hosted git repository. lixiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c839d09 [SPARK-28773][DOC][SQL] Handling of NULL data in Spark SQL add 125af78d [SPARK-28831][DOC][SQL] Document CLEAR CACHE statement in SQL Reference No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-aux-cache-clear-cache.md | 18 +- docs/sql-ref-syntax-aux-cache-uncache-table.md | 4 ++-- docs/sql-ref-syntax-aux-cache.md | 15 +++ 3 files changed, 26 insertions(+), 11 deletions(-)
[GitHub] [spark-website] dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun edited a comment on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529670463 ~Ur, @cloud-fan . It seems that we still don't have 2.3.4 in the Archive directory.~ Never mind. It seems to be due to a delay. - https://archive.apache.org/dist/spark/
[GitHub] [spark-website] dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529670463 Ur, @cloud-fan . It seems that we still don't have 2.3.4 in the Archive directory. - https://archive.apache.org/dist/spark/
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322439479 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Oh, thanks! This seems to be due to my mistake during `jekyll watch`.
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322447723 ## File path: site/examples.html ## @@ -230,36 +230,36 @@ Word Count -text_file = sc.textFile(hdfs://...) -counts = text_file.flatMap(lambda line: line.split( )) \ - .map(lambda word: (word, 1)) \ +text_file = sc.textFile("hdfs://...") +counts = text_file.flatMap(lambda line: line.split(" ")) \ + .map(lambda word: (word, 1)) \ .reduceByKey(lambda a, b: a + b) -counts.saveAsTextFile(hdfs://...) +counts.saveAsTextFile("hdfs://...") Review comment: Oh, so now Jekyll supports `"hdfs://..."` instead of `hdfs://...`?
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322446452 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Yes. Mine was 3.8.5 or 3.8.6, not v4.0.0 (IIRC, the version on my laptop at home). - https://github.com/apache/spark-website/commit/4f850d15a68650384e4c1dd8b74c585ffedc875a (Fix jekyll build before updating) - https://github.com/apache/spark-website/commit/950c65da1e91419b4e4830a0a23dbc6cb732ddaf (Release v2.4.4)
[spark] branch master updated (6378d4b -> c839d09)
This is an automated email from the ASF dual-hosted git repository. lixiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6378d4b [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 add c839d09 [SPARK-28773][DOC][SQL] Handling of NULL data in Spark SQL No new revisions were added by this update. Summary of changes: docs/_data/menu-sql.yaml | 2 + docs/sql-ref-null-semantics.md | 703 + 2 files changed, 705 insertions(+) create mode 100644 docs/sql-ref-null-semantics.md
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322438237 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Interesting. Maybe the last change wasn't done with jekyll v4.0.0.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322435187 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Sure, committed the change with jekyll v4.0.0
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322432836 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Yeah, I realized it. It was introduced unintentionally in the last one.
[spark] branch branch-2.4 updated: [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 9ef48f7 [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final 9ef48f7 is described below commit 9ef48f72a9a0bf19da5aef38d921fc7890ae0a77 Author: Nicholas Marion AuthorDate: Mon Sep 9 15:13:37 2019 -0500 [SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final ### What changes were proposed in this pull request? Upgrade netty-all to latest in the 4.1.x line which is 4.1.39-Final. ### Why are the changes needed? Currency of dependencies. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing unit-tests against 2.4 branch. Closes #25732 from n-marion/branch-2.4. Authored-by: Nicholas Marion Signed-off-by: Sean Owen --- dev/deps/spark-deps-hadoop-2.6 | 2 +- dev/deps/spark-deps-hadoop-2.7 | 2 +- dev/deps/spark-deps-hadoop-3.1 | 2 +- pom.xml| 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/dev/deps/spark-deps-hadoop-2.6 b/dev/deps/spark-deps-hadoop-2.6 index 2974f2b..e53cde9 100644 --- a/dev/deps/spark-deps-hadoop-2.6 +++ b/dev/deps/spark-deps-hadoop-2.6 @@ -149,7 +149,7 @@ metrics-json-3.1.5.jar metrics-jvm-3.1.5.jar minlog-1.3.0.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar objenesis-2.5.1.jar okhttp-3.8.1.jar okio-1.13.0.jar diff --git a/dev/deps/spark-deps-hadoop-2.7 b/dev/deps/spark-deps-hadoop-2.7 index c25648d..c2e6d75 100644 --- a/dev/deps/spark-deps-hadoop-2.7 +++ b/dev/deps/spark-deps-hadoop-2.7 @@ -150,7 +150,7 @@ metrics-json-3.1.5.jar metrics-jvm-3.1.5.jar minlog-1.3.0.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar objenesis-2.5.1.jar okhttp-3.8.1.jar okio-1.13.0.jar diff --git a/dev/deps/spark-deps-hadoop-3.1 b/dev/deps/spark-deps-hadoop-3.1 index 
6ce8287..6ba49fd 100644 --- a/dev/deps/spark-deps-hadoop-3.1 +++ b/dev/deps/spark-deps-hadoop-3.1 @@ -167,7 +167,7 @@ metrics-jvm-3.1.5.jar minlog-1.3.0.jar mssql-jdbc-6.2.1.jre7.jar netty-3.9.9.Final.jar -netty-all-4.1.17.Final.jar +netty-all-4.1.39.Final.jar nimbus-jose-jwt-4.41.1.jar objenesis-2.5.1.jar okhttp-2.7.5.jar diff --git a/pom.xml b/pom.xml index c75d2ca..2ef88af 100644 --- a/pom.xml +++ b/pom.xml @@ -589,7 +589,7 @@ io.netty netty-all -4.1.17.Final +4.1.39.Final io.netty
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322430711 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: If you're using 4.0.0, this should be fine. The difference looks minor.
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322430323 ## File path: site/sitemap.xml ## @@ -139,713 +139,721 @@ Review comment: Oops, did the sitemap get published last time with the localhost:4000 URLs? That's an error. Good change! This one is easy to miss, and GitHub collapses this big diff.
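Leaked `localhost:4000` URLs in a sitemap (the artifact of a `jekyll serve` run being committed) are easy to screen for before publishing. A minimal sketch, assuming a plain `sitemap.xml` with `<loc>` entries; the sample content below is made up, not Spark's actual sitemap:

```python
import re

def find_bad_urls(sitemap_xml: str) -> list:
    """Return all <loc> entries that point at a local dev server
    instead of the production site."""
    locs = re.findall(r"<loc>(.*?)</loc>", sitemap_xml)
    return [u for u in locs if "localhost" in u or "127.0.0.1" in u]

if __name__ == "__main__":
    sample = """
    <urlset>
      <url><loc>https://spark.apache.org/news/index.html</loc></url>
      <url><loc>http://localhost:4000/releases/spark-release-2-3-4.html</loc></url>
    </urlset>
    """
    print(find_bad_urls(sample))  # ['http://localhost:4000/releases/spark-release-2-3-4.html']
```

A check like this could run in a pre-commit hook so a dev-server sitemap never reaches the `asf-site` branch.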
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322429291 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I use v4.0.0. However, the diff looks larger than the current commit.
```
@@ -230,11 +230,11 @@ In this page, we will show examples using RDD API as well as examples using high
-text_file = sc.textFile(hdfs://...)
-counts = text_file.flatMap(lambda line: line.split( )) \
-         .map(lambda word: (word, 1)) \
+text_file = sc.textFile("hdfs://...")
+counts = text_file.flatMap(lambda line: line.split(" ")) \
+         .map(lambda word: (word, 1)) \
          .reduceByKey(lambda a, b: a + b)
-counts.saveAsTextFile(hdfs://...)
+counts.saveAsTextFile("hdfs://...")
```
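For context, the PySpark word count being re-rendered in this diff can be mimicked in plain Python with no Spark installation: the sketch below mirrors the flatMap/map/reduceByKey pipeline over an in-memory list instead of an HDFS file, which is handy for checking what the example is supposed to compute:

```python
from collections import Counter

def word_count(lines):
    """flatMap: split each line into words; map + reduceByKey: count them."""
    words = (word for line in lines for word in line.split(" ") if word)
    return dict(Counter(words))

if __name__ == "__main__":
    lines = ["to be or not", "to be"]
    print(word_count(lines))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

The real example differs only in I/O: `sc.textFile("hdfs://...")` reads the lines and `saveAsTextFile` writes the counts back out.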
[GitHub] [spark-website] srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
srowen commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322412975 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Use the latest jekyll, 4.0.0. I know Hyukjin just updated all the rendering to be consistent with 4.0.0.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322404583 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I used v3.8.6.
```
/usr/local/bin/jekyll -v
jekyll 3.8.6
```
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322402727 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Not sure. @dongjoon-hyun Did you see such a thing when you made the similar change?
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322397252 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: I think so. This is automatically done by `jekyll`. Should we revert this?
[GitHub] [spark-website] viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
viirya commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322379190 ## File path: site/examples.html ## @@ -230,7 +230,7 @@ Word Count -text_file = sc.textFile(hdfs://...) Review comment: Does this remove a redundant span?
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322371987 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: @dongjoon-hyun Thank you. Both `downloads.js` files have been updated.
[GitHub] [spark-website] kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322368525 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: Thank you. I will update the date soon.
[GitHub] [spark-website] dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links
dongjoon-hyun commented on a change in pull request #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#discussion_r322363675 ## File path: js/downloads.js ## @@ -23,7 +23,7 @@ var packagesV8 = [hadoop2p7, hadoop2p6, hadoopFree, sources]; var packagesV9 = [hadoop2p7, hadoop2p6, hadoopFree, scala2p12_hadoopFree, sources]; addRelease("2.4.4", new Date("08/30/2019"), packagesV9, true); -addRelease("2.3.3", new Date("02/15/2019"), packagesV8, true); +addRelease("2.3.4", new Date("09/09/2019"), packagesV8, true); Review comment: Thank you, @kiszk . This should be the VOTE pass day, which is different from the news announcement day.
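The semantics being settled here (the date recorded per release is the day the release VOTE passed, not the day the news was announced) can be illustrated with a small registry sketch. This is a hypothetical Python analogue of the JavaScript `addRelease(...)` call in `js/downloads.js`, not the site's actual code:

```python
from datetime import date

releases = []

def add_release(version: str, vote_pass_day: date, packages, stable: bool = True):
    """Record a release. The date is the day the release VOTE passed,
    which may precede the day the release news is announced."""
    releases.append({"version": version, "date": vote_pass_day,
                     "packages": packages, "stable": stable})

if __name__ == "__main__":
    packages_v8 = ["hadoop2.7", "hadoop2.6", "hadoop-free", "sources"]
    add_release("2.4.4", date(2019, 8, 30), packages_v8)
    add_release("2.3.4", date(2019, 9, 9), packages_v8)
    latest = max(releases, key=lambda r: r["date"])
    print(latest["version"])  # 2.3.4
```

Keeping the VOTE date in the registry means sorting by date reflects when each release actually became official.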
[GitHub] [spark-website] kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links
kiszk commented on issue #221: Add Apache Spark 2.3.4 release news and update links URL: https://github.com/apache/spark-website/pull/221#issuecomment-529583439 Hi, @srowen , @felixcheung , @dongjoon-hyun, @viirya, @gatorsmile , @cloud-fan, @HyukjinKwon Could you review this?
[GitHub] [spark-website] kiszk opened a new pull request #221: Spark 2.3.4 release
kiszk opened a new pull request #221: Spark 2.3.4 release URL: https://github.com/apache/spark-website/pull/221 This PR aims to add the Apache Spark 2.3.4 release news and update links.
[spark] branch master updated (3d6b33a -> 6378d4b)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3d6b33a  [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
     add 6378d4b  [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/sparkR.R                                   | 7 +-
 R/pkg/tests/fulltests/test_sparkR.R                | 4 +-
 .../main/scala/org/apache/spark/SparkConf.scala    | 17 -
 .../org/apache/spark/deploy/SparkSubmit.scala      | 19 -
 .../HadoopFSDelegationTokenProviderSuite.scala     | 5 +-
 .../spark/scheduler/BlacklistTrackerSuite.scala    | 2 +-
 dev/sparktestsupport/modules.py                    | 1 -
 docs/mllib-evaluation-metrics.md                   | 28 -
 docs/mllib-feature-extraction.md                   | 14 -
 docs/mllib-linear-methods.md                       | 51 --
 docs/sql-migration-guide-upgrade.md                | 5 +
 docs/streaming-kinesis-integration.md              | 2 +-
 docs/streaming-programming-guide.md                | 4 +-
 .../mllib/JavaLinearRegressionWithSGDExample.java  | 81 ---
 .../mllib/JavaRegressionMetricsExample.java        | 83 ---
 .../spark/examples/mllib/LinearRegression.scala    | 138 -
 .../mllib/LinearRegressionWithSGDExample.scala     | 65 ---
 .../apache/spark/examples/mllib/PCAExample.scala   | 75 ---
 .../examples/mllib/RegressionMetricsExample.scala  | 74 ---
 .../streaming/JavaKinesisWordCountASL.java         | 23 +-
 .../spark/streaming/kinesis/KinesisUtils.scala     | 642 -
 .../kinesis/KinesisUtilsPythonHelper.scala         | 93 +++
 .../streaming/kinesis/JavaKinesisStreamSuite.java  | 98
 .../streaming/kinesis/KinesisStreamSuite.scala     | 24 +-
 .../spark/launcher/SparkSubmitCommandBuilder.java  | 2 +-
 .../spark/mllib/api/python/PythonMLLibAPI.scala    | 1 -
 .../mllib/classification/LogisticRegression.scala  | 106
 .../org/apache/spark/mllib/clustering/KMeans.scala | 67 ---
 .../apache/spark/mllib/feature/ChiSqSelector.scala | 11 -
 .../org/apache/spark/mllib/regression/Lasso.scala  | 111
 .../spark/mllib/regression/LinearRegression.scala  | 102
 .../spark/mllib/regression/RidgeRegression.scala   | 108
 .../JavaLogisticRegressionSuite.java               | 9 +-
 .../spark/mllib/clustering/JavaKMeansSuite.java    | 4 +-
 .../spark/mllib/regression/JavaLassoSuite.java     | 7 +-
 .../regression/JavaLinearRegressionSuite.java      | 9 +-
 .../mllib/regression/JavaRidgeRegressionSuite.java | 14 +-
 .../classification/LogisticRegressionSuite.scala   | 22 +-
 .../spark/mllib/clustering/KMeansSuite.scala       | 2 +-
 .../apache/spark/mllib/regression/LassoSuite.scala | 9 +-
 .../mllib/regression/LinearRegressionSuite.scala   | 8 +-
 .../mllib/regression/RidgeRegressionSuite.scala    | 11 +-
 project/MimaExcludes.scala                         | 14 +
 python/pyspark/__init__.py                         | 2 +-
 python/pyspark/ml/tests/test_image.py              | 43 +-
 python/pyspark/mllib/clustering.py                 | 8 +-
 python/pyspark/sql/__init__.py                     | 4 +-
 python/pyspark/sql/catalog.py                      | 20 -
 python/pyspark/sql/context.py                      | 67 +--
 python/pyspark/sql/tests/test_appsubmit.py         | 97
 python/pyspark/sql/tests/test_context.py           | 22 +-
 python/pyspark/streaming/kinesis.py                | 1 -
 .../apache/spark/deploy/yarn/ClientArguments.scala | 2 +-
 sql/README.md                                      | 2 +-
 .../scala/org/apache/spark/sql/SQLContext.scala    | 91 ---
 .../org/apache/spark/sql/catalog/Catalog.scala     | 102 +---
 .../org/apache/spark/sql/hive/HiveContext.scala    | 63 --
 .../scala/org/apache/spark/sql/hive/package.scala  | 3 -
 .../sql/hive/JavaMetastoreDataSourcesSuite.java    | 54 --
 .../apache/spark/sql/hive/CachedTableSuite.scala   | 6 +-
 .../sql/hive/HiveContextCompatibilitySuite.scala   | 103
 .../spark/sql/hive/MetastoreDataSourcesSuite.scala | 8 +-
 .../apache/spark/sql/hive/MultiDatabaseSuite.scala | 8 +-
 .../spark/sql/hive/execution/HiveDDLSuite.scala    | 2 +-
 64 files changed, 224 insertions(+), 2656 deletions(-)
 delete mode 100644 examples/src/main/java/org/apache/spark/examples/mllib/JavaLinearRegressionWithSGDExample.java
 delete mode 100644 examples/src/main/java/org/apache/spark/examples/mllib/JavaRegressionMetricsExample.java
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/LinearRegression.scala
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/LinearRegressionWithSGDExample.scala
 delete mode 100644 examples/src/main/scala/org/apache/spark/examples/mllib/PCAExample.scala
 delete mode
[spark] branch master updated: [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 3d6b33a  [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd
3d6b33a is described below

commit 3d6b33a49a8daba17973994169ee4a9e2507a6d9
Author: Marco Gaido
AuthorDate: Mon Sep 9 21:20:34 2019 +0800

    [SPARK-28939][SQL] Propagate SQLConf for plans executed by toRdd

    ### What changes were proposed in this pull request?

    The PR proposes to create a custom `RDD` that propagates `SQLConf` also in cases not tracked by SQL execution, as happens when a `Dataset` is converted to an RDD (either via `.rdd` or `.queryExecution.toRdd`) and actions are then invoked on the returned RDD. In this way, SQL configs are effective in these cases too, whereas earlier they were ignored.

    ### Why are the changes needed?

    Without this patch, whenever `.rdd` or `.queryExecution.toRdd` is used, all the SQL configs that were set are ignored. A reproducer:

    ```
    withSQLConf(SQLConf.SUBEXPRESSION_ELIMINATION_ENABLED.key -> "false") {
      val df = spark.range(2).selectExpr((0 to 5000).map(i => s"id as field_$i"): _*)
      df.createOrReplaceTempView("spark64kb")
      val data = spark.sql("select * from spark64kb limit 10")
      // Subexpression elimination is used here, even though it should have been disabled
      data.describe()
    }
    ```

    ### Does this PR introduce any user-facing change?

    When a user calls `.queryExecution.toRdd`, a `SQLExecutionRDD` is returned, wrapping the RDD of the executed plan. When `.rdd` is used, an additional `SQLExecutionRDD` is present in the hierarchy.

    ### How was this patch tested?

    Added a unit test.

    Closes #25643 from mgaido91/SPARK-28939.
Authored-by: Marco Gaido
Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/sql/internal/SQLConf.scala    | 11 +++-
 .../spark/sql/execution/QueryExecution.scala       | 3 +-
 .../spark/sql/execution/SQLExecutionRDD.scala      | 64 ++
 .../sql/internal/ExecutorSideSQLConfSuite.scala    | 46 +++-
 4 files changed, 119 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 6c6cca8..d9b0a72 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -115,7 +115,9 @@ object SQLConf {
    * Returns the active config object within the current scope. If there is an active SparkSession,
    * the proper SQLConf associated with the thread's active session is used. If it's called from
    * tasks in the executor side, a SQLConf will be created from job local properties, which are set
-   * and propagated from the driver side.
+   * and propagated from the driver side, unless a `SQLConf` has been set in the scope by
+   * `withExistingConf` as done for propagating SQLConf for operations performed on RDDs created
+   * from DataFrames.
    *
    * The way this works is a little bit convoluted, due to the fact that config was added initially
    * only for physical plans (and as a result not in sql/catalyst module).
@@ -129,7 +131,12 @@
    */
  def get: SQLConf = {
    if (TaskContext.get != null) {
-      new ReadOnlySQLConf(TaskContext.get())
+      val conf = existingConf.get()
+      if (conf != null) {
+        conf
+      } else {
+        new ReadOnlySQLConf(TaskContext.get())
+      }
    } else {
      val isSchedulerEventLoopThread = SparkContext.getActive
        .map(_.dagScheduler.eventProcessLoop.eventThread)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
index e5e86db..630d062 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
@@ -105,7 +105,8 @@ class QueryExecution(
    * Given QueryExecution is not a public class, end users are discouraged to use this: please
    * use `Dataset.rdd` instead where conversion will be applied.
    */
-  lazy val toRdd: RDD[InternalRow] = executedPlan.execute()
+  lazy val toRdd: RDD[InternalRow] = new SQLExecutionRDD(
+    executedPlan.execute(), sparkSession.sessionState.conf)

  /**
   * Prepares a planned [[SparkPlan]] for execution by inserting shuffle operations and internal
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecutionRDD.scala
new file
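The mechanism in the patch above is: capture the driver-side `SQLConf` when the wrapper RDD is created, install it into a thread-scoped slot while each partition is being consumed, and have `SQLConf.get` prefer that slot over building a config from task-local properties. The following Python sketch illustrates this pattern in miniature; all names (`ConfPropagatingIterable`, `current_conf`, `_existing_conf`) are invented stand-ins, not Spark's actual API.

```python
import threading

# Thread-local slot standing in for SQLConf.existingConf (illustrative only).
_existing_conf = threading.local()

def current_conf(default):
    """Mirror SQLConf.get's fallback chain: a conf pinned in the current
    scope wins; otherwise fall back to the default lookup."""
    return getattr(_existing_conf, "conf", None) or default

class ConfPropagatingIterable:
    """Wraps partition data and pins the conf captured at wrap time for
    the duration of iteration, like SQLExecutionRDD does per partition."""
    def __init__(self, data, conf):
        self.data = data
        self.conf = conf  # captured on the "driver" side at wrap time

    def __iter__(self):
        _existing_conf.conf = self.conf
        try:
            for row in self.data:
                yield row
        finally:
            _existing_conf.conf = None  # restore on exhaustion

default = {"subexpressionElimination": "true"}
wrapped = ConfPropagatingIterable(
    [1, 2, 3], {"subexpressionElimination": "false"})

seen = []
for row in wrapped:
    seen.append(current_conf(default)["subexpressionElimination"])

print(seen)  # ['false', 'false', 'false'] — captured conf wins inside iteration
print(current_conf(default)["subexpressionElimination"])  # true — default outside
```

Without the wrapper, code consuming the raw iterator would only ever see the default, which is the analogue of SQL configs being ignored after `.rdd`.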
[spark] branch master updated (dadb720 -> abec6d7)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from dadb720  [SPARK-28340][CORE] Noisy exceptions when tasks are killed: "DiskBloc…
     add abec6d7  [SPARK-28341][SQL] create a public API for V2SessionCatalog

No new revisions were added by this update.

Summary of changes:
 .../Transform.java => CatalogExtension.java}       | 28 +++---
 .../sql/catalog/v2/DelegatingCatalogExtension.java | 101 +
 .../spark/sql/catalog/v2/CatalogManager.scala      | 36 ++--
 .../spark/sql/catalog/v2/LookupCatalog.scala       | 2 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala     | 36 +++-
 .../org/apache/spark/sql/internal/SQLConf.scala    | 7 +-
 .../sql/catalyst/catalog/CatalogManagerSuite.scala | 7 +-
 .../org/apache/spark/sql/DataFrameWriter.scala     | 14 ++-
 .../datasources/DataSourceResolution.scala         | 19 +---
 .../datasources/v2/V2SessionCatalog.scala          | 28 ++
 .../sql/internal/BaseSessionStateBuilder.scala     | 6 +-
 .../execution/command/PlanResolutionSuite.scala    | 8 +-
 .../datasources/v2/V2SessionCatalogSuite.scala     | 13 +--
 .../DataSourceV2DataFrameSessionCatalogSuite.scala | 9 +-
 .../v2/DataSourceV2SQLSessionCatalogSuite.scala    | 2 +-
 .../sql/sources/v2/DataSourceV2SQLSuite.scala      | 7 +-
 .../v2/utils/TestV2SessionCatalogBase.scala        | 5 +-
 .../spark/sql/hive/HiveSessionStateBuilder.scala   | 2 +-
 18 files changed, 219 insertions(+), 111 deletions(-)
 copy sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/{expressions/Transform.java => CatalogExtension.java} (53%)
 create mode 100644 sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/DelegatingCatalogExtension.java

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
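The `DelegatingCatalogExtension` added by this commit suggests a classic delegation design: an extension is handed the built-in session catalog as a delegate, forwards every call to it by default, and overrides only what it needs to customize. A minimal illustrative sketch of that pattern follows; the class and method names here are invented Python analogues, not the actual Java interface from the commit.

```python
class CatalogExtension:
    """Interface sketch: an extension that can receive a delegate catalog."""
    def set_delegate_catalog(self, delegate):
        raise NotImplementedError

class DelegatingCatalogExtension(CatalogExtension):
    """Forwards calls to the delegate session catalog by default."""
    def __init__(self):
        self._delegate = None

    def set_delegate_catalog(self, delegate):
        self._delegate = delegate

    def load_table(self, name):
        return self._delegate.load_table(name)

class SessionCatalog:
    """Stand-in for the built-in session catalog."""
    def load_table(self, name):
        return f"table:{name}"

class AuditingExtension(DelegatingCatalogExtension):
    """Example user extension: records loads, delegates the real work."""
    def __init__(self):
        super().__init__()
        self.loaded = []

    def load_table(self, name):
        self.loaded.append(name)
        return super().load_table(name)

ext = AuditingExtension()
ext.set_delegate_catalog(SessionCatalog())
print(ext.load_table("t1"))  # table:t1 — resolution done by the delegate
print(ext.loaded)            # ['t1'] — extension behavior layered on top
```

The benefit of this shape is that an extension stays correct as the underlying catalog gains methods: anything not overridden falls through to the delegate.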
[spark] branch master updated (4a3a6b6 -> dadb720)
srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4a3a6b6  [SPARK-28637][SQL] Thriftserver support interval type
     add dadb720  [SPARK-28340][CORE] Noisy exceptions when tasks are killed: "DiskBloc…

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/storage/DiskBlockObjectWriter.scala | 8 +++-
 .../spark/storage/ShuffleBlockFetcherIterator.scala  | 19 ---
 2 files changed, 23 insertions(+), 4 deletions(-)
svn commit: r35707 - /dev/spark/v2.3.4-rc1-docs/
Author: wenchen
Date: Mon Sep 9 09:18:18 2019
New Revision: 35707

Log:
Remove RC artifacts

Removed:
    dev/spark/v2.3.4-rc1-docs/
svn commit: r35709 - /release/spark/spark-2.3.3/
Author: wenchen
Date: Mon Sep 9 09:18:20 2019
New Revision: 35709

Log:
Remove old release

Removed:
    release/spark/spark-2.3.3/
svn commit: r35708 - /dev/spark/v2.3.4-rc1-bin/ /release/spark/spark-2.3.4/
Author: wenchen
Date: Mon Sep 9 09:18:19 2019
New Revision: 35708

Log:
Apache Spark 2.3.4

Added:
    release/spark/spark-2.3.4/
      - copied from r35707, dev/spark/v2.3.4-rc1-bin/
Removed:
    dev/spark/v2.3.4-rc1-bin/
[spark] branch branch-2.4 updated (0a4b356 -> 483dcf5)
gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 0a4b356  Revert "[SPARK-28912][STREAMING] Fixed MatchError in getCheckpointFiles()"
     add 483dcf5  [SPARK-28912][BRANCH-2.4] Fixed MatchError in getCheckpointFiles()

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/streaming/Checkpoint.scala      | 4 ++--
 .../org/apache/spark/streaming/CheckpointSuite.scala | 20
 2 files changed, 22 insertions(+), 2 deletions(-)
[spark] tag v2.3.4 created (now 8c6f815)
wenchen pushed a change to tag v2.3.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

        at 8c6f815  (commit)

No new revisions were added by this update.
[spark] branch master updated (d4eca7c -> 4a3a6b6)
lixiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from d4eca7c  [SPARK-29000][SQL] Decimal precision overflow when don't allow precision loss
     add 4a3a6b6  [SPARK-28637][SQL] Thriftserver support interval type

No new revisions were added by this update.

Summary of changes:
 .../thriftserver/SparkExecuteStatementOperation.scala   | 9 -
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala | 15 +++
 .../SparkThriftServerProtocolVersionsSuite.scala        | 4 ++--
 3 files changed, 25 insertions(+), 3 deletions(-)