[spark] branch master updated: [SPARK-27001][SQL] Refactor "serializerFor" method between ScalaReflection and JavaTypeInference
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 34f6066  [SPARK-27001][SQL] Refactor "serializerFor" method between ScalaReflection and JavaTypeInference
34f6066 is described below

commit 34f606678a90e860711a5f9f9618cf00788c9eb0
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Mon Mar 4 10:45:48 2019 +0800

    [SPARK-27001][SQL] Refactor "serializerFor" method between ScalaReflection and JavaTypeInference

    ## What changes were proposed in this pull request?

    This patch refactors the `serializerFor` method between `ScalaReflection` and `JavaTypeInference`, making it consistent with what was refactored for `deserializerFor` in #23854.

    This patch also extracts the logic for recording the walked type path, since that logic is duplicated across `serializerFor` and `deserializerFor` in both `ScalaReflection` and `JavaTypeInference`.

    ## How was this patch tested?

    Existing tests.

    Closes #23908 from HeartSaVioR/SPARK-27001.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Wenchen Fan
---
 .../sql/catalyst/DeserializerBuildHelper.scala     |  32 ++-
 .../spark/sql/catalyst/JavaTypeInference.scala     | 143 +-
 .../spark/sql/catalyst/ScalaReflection.scala       | 220 +++--
 .../spark/sql/catalyst/SerializerBuildHelper.scala | 198 +++
 .../apache/spark/sql/catalyst/WalkedTypePath.scala |  57 ++
 .../spark/sql/catalyst/analysis/Analyzer.scala     |   2 +-
 .../sql/catalyst/encoders/ExpressionEncoder.scala  |   2 +-
 .../spark/sql/catalyst/encoders/RowEncoder.scala   |   2 +-
 .../spark/sql/catalyst/expressions/Cast.scala      |   4 +-
 .../catalyst/expressions/CodeGenerationSuite.scala |   2 +-
 .../expressions/NullExpressionsSuite.scala         |   2 +-
 11 files changed, 394 insertions(+), 270 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala
index d75d3ca..e55c25c 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/DeserializerBuildHelper.scala
@@ -29,7 +29,7 @@ object DeserializerBuildHelper {
       path: Expression,
       part: String,
       dataType: DataType,
-      walkedTypePath: Seq[String]): Expression = {
+      walkedTypePath: WalkedTypePath): Expression = {
     val newPath = UnresolvedExtractValue(path, expressions.Literal(part))
     upCastToExpectedType(newPath, dataType, walkedTypePath)
   }
@@ -39,40 +39,30 @@ object DeserializerBuildHelper {
       path: Expression,
       ordinal: Int,
       dataType: DataType,
-      walkedTypePath: Seq[String]): Expression = {
+      walkedTypePath: WalkedTypePath): Expression = {
     val newPath = GetStructField(path, ordinal)
     upCastToExpectedType(newPath, dataType, walkedTypePath)
   }

-  def deserializerForWithNullSafety(
-      expr: Expression,
-      dataType: DataType,
-      nullable: Boolean,
-      walkedTypePath: Seq[String],
-      funcForCreatingNewExpr: (Expression, Seq[String]) => Expression): Expression = {
-    val newExpr = funcForCreatingNewExpr(expr, walkedTypePath)
-    expressionWithNullSafety(newExpr, nullable, walkedTypePath)
-  }
-
   def deserializerForWithNullSafetyAndUpcast(
       expr: Expression,
       dataType: DataType,
       nullable: Boolean,
-      walkedTypePath: Seq[String],
-      funcForCreatingNewExpr: (Expression, Seq[String]) => Expression): Expression = {
+      walkedTypePath: WalkedTypePath,
+      funcForCreatingDeserializer: (Expression, WalkedTypePath) => Expression): Expression = {
     val casted = upCastToExpectedType(expr, dataType, walkedTypePath)
-    deserializerForWithNullSafety(casted, dataType, nullable, walkedTypePath,
-      funcForCreatingNewExpr)
+    expressionWithNullSafety(funcForCreatingDeserializer(casted, walkedTypePath),
+      nullable, walkedTypePath)
   }

-  private def expressionWithNullSafety(
+  def expressionWithNullSafety(
       expr: Expression,
       nullable: Boolean,
-      walkedTypePath: Seq[String]): Expression = {
+      walkedTypePath: WalkedTypePath): Expression = {
     if (nullable) {
       expr
     } else {
-      AssertNotNull(expr, walkedTypePath)
+      AssertNotNull(expr, walkedTypePath.getPaths)
     }
   }

@@ -167,10 +157,10 @@ object DeserializerBuildHelper {
   private def upCastToExpectedType(
       expr: Expression,
       expected: DataType,
-      walkedTypePath: Seq[String]): Expression = expected match {
+      walkedTypePath: WalkedTypePath): Expression = expected match {
     case _: StructType => expr
     case _: ArrayType => expr
     case _:
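The `WalkedTypePath` extracted by this commit is, at heart, an immutable log of the type positions visited while deriving a serializer or deserializer, so that null-safety failures can report where in a nested type the problem sits. A minimal sketch of the idea in Scala follows; only `getPaths` is confirmed by the diff above, and the `record*` method names are illustrative assumptions about the real org.apache.spark.sql.catalyst.WalkedTypePath:

    // Sketch of a walked-type-path recorder. Each record* call returns a new
    // instance, so recursive schema walks never share mutable state.
    class WalkedTypePath(private val walkedPaths: Seq[String] = Nil) {
      private def newInstance(record: String): WalkedTypePath =
        new WalkedTypePath(record +: walkedPaths)

      def recordRoot(className: String): WalkedTypePath =
        newInstance(s"- root class: $className")

      def recordField(className: String, fieldName: String): WalkedTypePath =
        newInstance(s"- field (class: $className, name: $fieldName)")

      def recordArray(elementClassName: String): WalkedTypePath =
        newInstance(s"- array element class: $elementClassName")

      // The diff above shows AssertNotNull(expr, walkedTypePath.getPaths):
      // the recorder owns the path strings, the expression only needs the
      // rendered sequence.
      def getPaths: Seq[String] = walkedPaths

      override def toString: String = walkedPaths.mkString("\n")
    }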
[spark] branch master updated: [SPARK-26893][SQL] Allow partition pruning with subquery filters on file source
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 82820f8  [SPARK-26893][SQL] Allow partition pruning with subquery filters on file source
82820f8 is described below

commit 82820f8e2574ca18faa00b54dff1a3a44d105359
Author: Peter Toth
AuthorDate: Mon Mar 4 13:38:22 2019 +0800

    [SPARK-26893][SQL] Allow partition pruning with subquery filters on file source

    ## What changes were proposed in this pull request?

    This PR introduces leveraging subquery filters for partition pruning in the file source. Currently, subquery expressions are not allowed to be used for partition pruning in `FileSourceStrategy`; instead, a `FilterExec` is added around the `FileSourceScanExec` to do the job. This PR optimizes the process by allowing partition-pruning subquery expressions as partition filters.

    ## How was this patch tested?

    Added a new UT and ran existing UTs, especially the SPARK-25482 and SPARK-24085 related ones.

    Closes #23802 from peter-toth/SPARK-26893.

    Authored-by: Peter Toth
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/execution/DataSourceScanExec.scala   | 22 +++---
 .../execution/datasources/DataSourceStrategy.scala |  2 +-
 .../execution/datasources/FileSourceStrategy.scala | 10 --
 .../datasources/PruneFileSourcePartitions.scala    | 11 ++-
 .../datasources/v2/DataSourceV2Strategy.scala      |  5 +++--
 .../org/apache/spark/sql/execution/subquery.scala  | 12
 .../scala/org/apache/spark/sql/SubquerySuite.scala | 21 +
 7 files changed, 62 insertions(+), 21 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
index 3aed2ce..92f7d66 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
@@ -187,6 +187,14 @@ case class FileSourceScanExec(
     ret
   }

+  /**
+   * [[partitionFilters]] can contain subqueries whose results are available only at runtime so
+   * accessing [[selectedPartitions]] should be guarded by this method during planning
+   */
+  private def hasPartitionsAvailableAtRunTime: Boolean = {
+    partitionFilters.exists(ExecSubqueryExpression.hasSubquery)
+  }
+
   override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = {
     val bucketSpec = if (relation.sparkSession.sessionState.conf.bucketingEnabled) {
       relation.bucketSpec
@@ -223,7 +231,7 @@ case class FileSourceScanExec(
       val sortColumns =
         spec.sortColumnNames.map(x => toAttribute(x)).takeWhile(x => x.isDefined).map(_.get)

-      val sortOrder = if (sortColumns.nonEmpty) {
+      val sortOrder = if (sortColumns.nonEmpty && !hasPartitionsAvailableAtRunTime) {
         // In case of bucketing, its possible to have multiple files belonging to the
         // same bucket in a given relation. Each of these files are locally sorted
         // but those files combined together are not globally sorted. Given that,
@@ -272,12 +280,12 @@ case class FileSourceScanExec(
       "PushedFilters" -> seqToString(pushedDownFilters),
       "DataFilters" -> seqToString(dataFilters),
       "Location" -> locationDesc)
-    val withOptPartitionCount =
-      relation.partitionSchemaOption.map { _ =>
-        metadata + ("PartitionCount" -> selectedPartitions.size.toString)
-      } getOrElse {
-        metadata
-      }
+    val withOptPartitionCount = if (relation.partitionSchemaOption.isDefined &&
+        !hasPartitionsAvailableAtRunTime) {
+      metadata + ("PartitionCount" -> selectedPartitions.size.toString)
+    } else {
+      metadata
+    }

     val withSelectedBucketsCount = relation.bucketSpec.map { spec =>
       val numSelectedBuckets = optionalBucketSet.map { b =>
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
index b73dc30..a1252ee 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
@@ -434,7 +434,7 @@ object DataSourceStrategy {
   protected[sql] def normalizeFilters(
       filters: Seq[Expression],
       attributes: Seq[AttributeReference]): Seq[Expression] = {
-    filters.filterNot(SubqueryExpression.hasSubquery).map { e =>
+    filters.map { e =>
       e transform {
         case a: AttributeReference =>
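To make the behavior change concrete, here is a hypothetical spark-shell session (table name, path, and columns are illustrative, not from the PR). Before this change, a scalar-subquery partition filter was dropped from the scan and applied in a `FilterExec` above it, so every `day=...` directory was read; with this change the scan can prune to the matching partitions once the subquery result is available at runtime:

    // Write a small table partitioned by `day` (10 partition directories).
    spark.range(100)
      .selectExpr("id AS value", "CAST(id % 10 AS INT) AS day")
      .write.mode("overwrite").partitionBy("day").parquet("/tmp/events")

    spark.read.parquet("/tmp/events").createOrReplaceTempView("events")

    // The partition filter is a scalar subquery, evaluated at runtime.
    // With SPARK-26893 only the day=9 directory should be scanned.
    spark.sql(
      "SELECT count(*) FROM events WHERE day = (SELECT max(day) FROM events)"
    ).show()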
[spark] branch master updated: [SPARK-26956][SS] remove streaming output mode from data source v2 APIs
This is an automated email from the ASF dual-hosted git repository.

lixiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 382d5a8  [SPARK-26956][SS] remove streaming output mode from data source v2 APIs
382d5a8 is described below

commit 382d5a82b0e8ef1dd01209e828eae67fe7993a56
Author: Wenchen Fan
AuthorDate: Sun Mar 3 22:20:31 2019 -0800

    [SPARK-26956][SS] remove streaming output mode from data source v2 APIs

    ## What changes were proposed in this pull request?

    Similar to `SaveMode`, we should remove streaming `OutputMode` from the data source v2 API and use operations with clear semantics. The changes are:

    1. append mode: create `StreamingWrite` directly. By default, the `WriteBuilder` will create a `Write` that appends data.
    2. complete mode: call `SupportsTruncate#truncate`. Complete mode means truncating all the old data and appending the new data of the current epoch. `SupportsTruncate` has exactly the same semantics.
    3. update mode: fail. The current streaming framework can't propagate the update keys, so v2 sinks are not able to implement update mode. In the future we can introduce a `SupportsUpdate` trait.

    The behavior changes:

    1. All the v2 sinks (foreach, console, memory, kafka, noop) no longer support update mode. In fact, all the v2 sinks previously implemented update mode incorrectly; none of them could really support it.
    2. The kafka sink no longer supports complete mode. In fact, the kafka sink can only append data.

    ## How was this patch tested?

    Existing tests.

    Closes #23859 from cloud-fan/update.

    Authored-by: Wenchen Fan
    Signed-off-by: gatorsmile
---
 .../spark/sql/kafka010/KafkaSourceProvider.scala   |  6 +--
 .../v2/writer/streaming/SupportsOutputMode.java    | 29
 .../datasources/noop/NoopDataSource.scala          |  7 ++-
 .../execution/streaming/MicroBatchExecution.scala  | 10 +
 .../sql/execution/streaming/StreamExecution.scala  | 34 +++
 .../spark/sql/execution/streaming/console.scala    | 10 ++---
 .../streaming/continuous/ContinuousExecution.scala | 10 +
 .../streaming/sources/ForeachWriterTable.scala     | 11 ++---
 .../sql/execution/streaming/sources/memoryV2.scala | 51 +-
 .../execution/streaming/MemorySinkV2Suite.scala    |  4 +-
 10 files changed, 75 insertions(+), 97 deletions(-)

diff --git a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
index a139573..4dc6955 100644
--- a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
+++ b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
@@ -34,7 +34,7 @@ import org.apache.spark.sql.sources.v2._
 import org.apache.spark.sql.sources.v2.reader.{Scan, ScanBuilder}
 import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousStream, MicroBatchStream}
 import org.apache.spark.sql.sources.v2.writer.WriteBuilder
-import org.apache.spark.sql.sources.v2.writer.streaming.{StreamingWrite, SupportsOutputMode}
+import org.apache.spark.sql.sources.v2.writer.streaming.StreamingWrite
 import org.apache.spark.sql.streaming.OutputMode
 import org.apache.spark.sql.types.StructType

@@ -362,7 +362,7 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
   }

   override def newWriteBuilder(options: DataSourceOptions): WriteBuilder = {
-    new WriteBuilder with SupportsOutputMode {
+    new WriteBuilder {
       private var inputSchema: StructType = _

       override def withInputDataSchema(schema: StructType): WriteBuilder = {
@@ -370,8 +370,6 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
         this
       }

-      override def outputMode(mode: OutputMode): WriteBuilder = this
-
       override def buildForStreaming(): StreamingWrite = {
         import scala.collection.JavaConverters._
diff --git a/sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/SupportsOutputMode.java b/sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/SupportsOutputMode.java
deleted file mode 100644
index 832dcfa..000
--- a/sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/SupportsOutputMode.java
+++ /dev/null
@@ -1,29 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may
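The mapping from output modes to builder operations described in points 1–3 above can be summarized in a small self-contained Scala sketch. The traits here are simplified stand-ins, not the real org.apache.spark.sql.sources.v2.writer interfaces (whose exact signatures differ), and `createStreamingWrite` is a hypothetical helper:

    object StreamingWriteSupport {
      trait StreamingWrite
      trait WriteBuilder { def buildForStreaming(): StreamingWrite }
      // A sink that can drop existing data mixes this in; truncate-then-append
      // has the same semantics as the old "complete" output mode.
      trait SupportsTruncate extends WriteBuilder { def truncate(): WriteBuilder }

      sealed trait OutputMode
      case object Append extends OutputMode
      case object Complete extends OutputMode
      case object Update extends OutputMode

      def createStreamingWrite(builder: WriteBuilder, mode: OutputMode): StreamingWrite =
        mode match {
          case Append =>
            // The default write appends, so just build it.
            builder.buildForStreaming()
          case Complete =>
            // Truncate the old data, then append the new epoch.
            builder match {
              case t: SupportsTruncate => t.truncate().buildForStreaming()
              case _ => throw new IllegalArgumentException(
                "sink cannot truncate; complete mode unsupported")
            }
          case Update =>
            // No v2 sink can receive update keys from the engine yet; a future
            // SupportsUpdate trait would slot in here.
            throw new IllegalArgumentException("v2 sinks do not support update mode")
        }
    }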
svn commit: r32737 - /dev/spark/KEYS
Author: dbtsai
Date: Mon Mar 4 04:33:15 2019
New Revision: 32737

Log:
Update KEYS

Modified:
    dev/spark/KEYS

Modified: dev/spark/KEYS
==============================================================================
--- dev/spark/KEYS (original)
+++ dev/spark/KEYS Mon Mar 4 04:33:15 2019
@@ -888,30 +888,49 @@
 pqnGj9s6Uudh/FXfVN5MC0/pH/ySSACkXwCmKXAh
 =4noL
 -----END PGP PUBLIC KEY BLOCK-----
-pub   nistp384 2019-02-21 [SC]
-      F8E155E845606E350C4147710359BC9965359766
+pub   rsa4096 2019-03-04 [SCEA]
+      5566B502701B3A5F5ABD5B1F42E5B25A8F7A82C1
 uid           [ultimate] DB Tsai
 uid           [ultimate] DB Tsai
-sub   nistp384 2019-02-21 [E]
 -----BEGIN PGP PUBLIC KEY BLOCK-----
-mG8EXG3xSBMFK4EEACIDAwTFUItdmiASsQ34SfIfDCGtSSqpGQHlSfDB801cJRSK
-tAxO51Xu3E6BSpSTcImHzstwxGj7rkSDSHhiwZG18314+ykKVNHSuIFDdYEUi2aR
-UQ0RWSiGhVS+Eg51v07Zc2q0G0RCIFRzYWkgPGRidHNhaUBkYnRzYWkuY29tPoiz
-BBMTCQA7AhsDBQsJCAcCBhUKCQgLAgQWAgMBAh4BAheAFiEE+OFV6EVgbjUMQUdx
-A1m8mWU1l2YFAlxt8YECGQEACgkQA1m8mWU1l2Y+tgGAqzmo6XJZ9D0HxKGmvqu3
-HF/DHN2rEkutYfoGk8Wv3I3ALVVa8Qnh8zikO5y4hL4tAX90IL/NDXHoUR2sdsbJ
-tf+Eidnqc6BARrJiMXueH5KvlmtM3nmL8f28+ht8J09SPLm0G0RCIFRzYWkgPGRi
-dHNhaUBhcGFjaGUub3JnPoiwBBMTCQA4FiEE+OFV6EVgbjUMQUdxA1m8mWU1l2YF
-Alxt8ZcCGwMFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQA1m8mWU1l2bZFwF/
-b46arpjADavN31useQj7cEPPjaPLgTsTKmG51lYZaDZlOXZLFVXKp/S++2IqO3C+
-AX46fL0wL+yPZ1XJ8hz0vkDIzUstC6AH6EjtmD0x/JLOei4wFB/ke631GUVkHrQZ
-lsS4cwRcbfFIEgUrgQQAIgMDBGKOAw+vHLvIS0OQ+U8M4360AFE/anh7M/wn1bFW
-50IdrOOW0Ss19Rib9UzBBW9FHU1udMGJmVOH4aBFRTXr2VOTPcswI1ArG7DoB/bs
-R2uwKt67grmq3MqQj5EJNKUzFwMBCQmImAQYEwkAIBYhBPjhVehFYG41DEFHcQNZ
-vJllNZdmBQJcbfFIAhsMAAoJEANZvJllNZdmUowBf3uHuFhO3IVozWsAnEfB8wdN
-vGQkKNb5uKWrMUAAkeUuY7+AeFDiZFxIcdfrx7B7tgF/c/CS4sgVUAIpDbw/gsfl
-UwauMsN1CWpCIaMEFZYNo//B+auqTseNXMtqqGaIgTXv
-=NYIb
+mQINBFx8pR4BEADHjMDto5wSHrIng7vJn53qKeyaYb3evfppJGw1roUZFlV0USwx
+ADOpLgtFR3AxOJRrw9NzGcPtx+G+D7LQl6zD+uGZlgjbJfzfOjSvwHXhuf+MJbUp
+FU0e/IuT/7+FBo2C9j6ITVo8knWyOXDMIlCckgkXaQ3AyILeJpxeN5vJR5Q2PBal
+hCXreVY0Bny9+pJlpl4xQ3tCpBpBD9ouqdR9Zhhh4QFmxBgg2VsDYsbpixsGB8Ce
+VzL6yQkqDxTL6PcmnMspHF2rvlSA68ikK20au+PTY9VYIb9Iry84ZGrhVQSZatAx
+icTD5GLggC5fWV5lFV1INF3+syhFPXuDNJ64uLCMDzDXfMUUXAV1538XyWV2WNwR
+ZLEr73bYkTTfgmkuRlDvLqz7b9iJLrXqAPVu1Sv+NpEgowU/GRxWd/Ekt+u1XkL1
+OnzyvHoFKjntY+Z2tQEkr7Uk/4u8N2d10XuFgEc07FSKYxz0QvPiHkiQD8I2v7hL
+Fp0YHC6mSeoKuIXnYj6yf8V9vxVTBj0pWhQPFUWn5/vIEdpm8D/c1BONsh4/FZ3U
+QIwm58v6e/Fw84jFtAFBPoFj3M8bQk559ZOESaHFjM+ZD+AFsA8vRkV0jiz5+VAr
+8R1QkoAslZKD6XiTbLAkSvvvg2QUSW2sK2g4iXj0d6J3KbLq8KPN+eMnxwARAQAB
+tBtEQiBUc2FpIDxkYnRzYWlAZGJ0c2FpLmNvbT6JAlEEEwEIADsCGy8FCwkIBwIG
+FQoJCAsCBBYCAwECHgECF4AWIQRVZrUCcBs6X1q9Wx9C5bJaj3qCwQUCXHylkQIZ
+AQAKCRBC5bJaj3qCwWUjEADHFI07aYBkpAiddTQt++fuWOFDAcKTutT3LVxQmNjO
+7ADEcZhinKBKKYEehKiwmBGe5Gp1/nRQ/yGganO/NlghEyMU59Q+s07cipvYNJSk
+gO1OvcOyziDOFGvu1/1A/AeM9wJhKqVl/ELbtiwTozMv/hpG37ybFOn05oGKsjI
+JrZHiwAhPk9k5Z2KYarxaqEVu7Uon+0gvG+atORq7yF4zybKFhYmdXVy5pMJAmKs
+acFYO/bvwhMA/0zgjS96jD+CoisYkeRub6JOLTtGLAQKdUNdiCXEfIxpOUG0NRRR
+PH+mf+pmZfPZRKtksPoaNHfrDyEk2sFQe2TVCUEFGwSAw99u0y623ewwWatjoywg
+uALOIVwqzp/J1dWrN225zgXt7X93dHwooRcN64rhRQQkvtnhE4Wka3u7f2pJRDC9
+wwpsENvWYfJIgoZPvbLIBNKcAAWu94Uv+k3wgkxj9CITc5J7XG5in8ndM0ioSCDa
+zs5SCicLqaW1PLJgBJnZaKXTPtV7pNtIb3Qi03SVrsvPiDpJHCXxMlFo9uDnWfmB
+sTaylTga7slAZNp+DcAH2I849tamOopC1kvawocpEZI1dDOs+XR5VPo8Jw+2Loyr
+poHbA3GJrv1b6eA3Hlma/u3Hp5qOYxCpawDeRIMmTSJca43+6Od4IX/yJbaSIyzO
+S7QbREIgVHNhaSA8ZGJ0c2FpQGFwYWNoZS5vcmc+iQJOBBMBCAA4FiEEVWa1AnAb
+Ol9avVsfQuWyWo96gsEFAlx8pYkCGy8FCwkIBwIGFQoJCAsCBBYCAwECHgECF4AA
+CgkQQuWyWo96gsEL8w/8CMPZxMVia2kXthXN70ptIHCc9dFxXdSwfBbFj3lIx1Qn
+I3IZfxKqxi44UeUW9hYcSzgDyCP3IRv0yOPeqkOJtHkgctSkCKC0rWctH0EhvuED
+pqudTRUkpH+tYdcU1qlnOb/4altsvJHguBtxQF9LKi3N0PZDuMPYlHbW2EMsBvLF
+QfowjZ+sBS8gjoq2N7phLqnNjepuhXLJMWeIVZRUIzFMzJszv65wCaAA4u/qJ4ac
+lKUkMiqjupDJcFE6A/JMwr6xXCvPbCAKcaY09i2i1UuG+Q1MN9yzM9ipcYkgjcCg
+KLo7iT2cVKo9ISGbh1VILSJrWqnVzevmJDy3o0z0g9WzcVnVqynvvZ7/A+8dViMW
+K10F7u2vg8Y0BDmw+vcVTvF7uhPoxfqIeTDJjeZw0fCVJFVzUsZIa7AqtM/Gzi0J
+jdEF8LyZQUC4t3lgdUoiBf83cz0/JtEr8oZZ02rKWiXdvdKLcMy7qYEVF/5gtE/g
+sjrN9LcM3vhv1//pkn017uTeL6/aOosSxJFPO2YWGMlFZUfmGj1dWCXK+LBRi4JO
+a9HbpFNXEIRMvNI9f183iSWfyMtFXhLrxHETX87IHx3/SztViTl7D1ebnzBwEUR4
+QSiz4fEiq6agEwb78fyEEaBsMFNFM4seVz2V1WvKMemME5DRCzaPyqHMGXEzenk=
+=KULP
 -----END PGP PUBLIC KEY BLOCK-----
[spark] branch master updated: [SPARK-27032][TEST] De-flake org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.HDFSMetadataLog: metadata directory collision
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b76f262  [SPARK-27032][TEST] De-flake org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.HDFSMetadataLog: metadata directory collision
b76f262 is described below

commit b76f262fc84a1fbe52bbc6bf74573dec4d1ae4df
Author: Sean Owen
AuthorDate: Mon Mar 4 13:36:41 2019 +0900

    [SPARK-27032][TEST] De-flake org.apache.spark.sql.execution.streaming.HDFSMetadataLogSuite.HDFSMetadataLog: metadata directory collision

    ## What changes were proposed in this pull request?

    Reduce the work done in the HDFSMetadataLogSuite collision test (fewer writer threads and fewer batches per writer) to possibly de-flake it.

    ## How was this patch tested?

    Existing tests.

    Closes #23937 from srowen/SPARK-27032.

    Authored-by: Sean Owen
    Signed-off-by: Hyukjin Kwon
---
 .../spark/sql/execution/streaming/HDFSMetadataLogSuite.scala | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLogSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLogSuite.scala
index 0e36e7f..3706835 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLogSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLogSuite.scala
@@ -131,9 +131,10 @@ class HDFSMetadataLogSuite extends SparkFunSuite with SharedSQLContext {

   testQuietly("HDFSMetadataLog: metadata directory collision") {
     withTempDir { temp =>
-      val waiter = new Waiter
-      val maxBatchId = 100
-      for (id <- 0 until 10) {
+      val waiter = new Waiter()
+      val maxBatchId = 10
+      val numThreads = 5
+      for (id <- 0 until numThreads) {
         new UninterruptibleThread(s"HDFSMetadataLog: metadata directory collision - thread $id") {
           override def run(): Unit = waiter {
             val metadataLog =
@@ -146,7 +147,7 @@ class HDFSMetadataLogSuite extends SparkFunSuite with SharedSQLContext {
                 nextBatchId += 1
               }
             } catch {
-              case e: ConcurrentModificationException =>
+              case _: ConcurrentModificationException =>
                 // This is expected since there are multiple writers
             } finally {
               waiter.dismiss()
@@ -155,7 +156,7 @@ class HDFSMetadataLogSuite extends SparkFunSuite with SharedSQLContext {
         }.start()
       }

-      waiter.await(timeout(10.seconds), dismissals(10))
+      waiter.await(timeout(10.seconds), dismissals(numThreads))
       val metadataLog = new HDFSMetadataLog[String](spark, temp.getAbsolutePath)
       assert(metadataLog.getLatest() === Some(maxBatchId -> maxBatchId.toString))
       assert(
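The invariant the test checks survives the reduction: however many writers race, batch ids are claimed monotonically, so the highest id must end up written by someone, and losing a race is expected contention rather than a failure. A self-contained toy model of that invariant in Scala, using plain JDK primitives as stand-ins for HDFSMetadataLog and ScalaTest's Waiter (nothing here is Spark API):

    import java.util.concurrent.CountDownLatch
    import java.util.concurrent.atomic.AtomicInteger

    object CollisionToy {
      def main(args: Array[String]): Unit = {
        val maxBatchId = 10
        val numThreads = 5
        val done = new CountDownLatch(numThreads)   // stands in for Waiter/dismissals
        val latest = new AtomicInteger(-1)          // stands in for the metadata log

        for (id <- 0 until numThreads) {
          new Thread(s"writer-$id") {
            override def run(): Unit = {
              try {
                var next = latest.get() + 1
                while (next <= maxBatchId) {
                  // compareAndSet plays the role of creating a batch file:
                  // only one writer can claim a given batch id.
                  latest.compareAndSet(next - 1, next)
                  next = latest.get() + 1
                }
              } finally done.countDown()
            }
          }.start()
        }

        done.await()
        // Regardless of who won each race, the highest batch id was written.
        assert(latest.get() == maxBatchId)
      }
    }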
[spark-website] branch asf-site updated: Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new c0f7dd2  Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
c0f7dd2 is described below

commit c0f7dd27612d14cfd57e19bfeeecf683246ba40f
Author: Sean Owen
AuthorDate: Sun Mar 3 09:40:37 2019 -0600

    Link to sigs on www.apache.org; provide direct link to release's sigs, checksums

    See https://issues.apache.org/jira/browse/SPARK-26274

    This also cleans up some weirdness in the JS. I tested it manually and all the links appear to work fine.

    Author: Sean Owen

    Closes #184 from srowen/SPARK-26274.
---
 js/downloads.js      | 36 +++-
 site/js/downloads.js | 36 +++-
 2 files changed, 30 insertions(+), 42 deletions(-)

diff --git a/js/downloads.js b/js/downloads.js
index be6c29c..49ceb36 100644
--- a/js/downloads.js
+++ b/js/downloads.js
@@ -74,10 +74,8 @@ function onPackageSelect() {

 function onVersionSelect() {
   var versionSelect = document.getElementById("sparkVersionSelect");
   var packageSelect = document.getElementById("sparkPackageSelect");
-  var verifyLink = document.getElementById("sparkDownloadVerify");

   empty(packageSelect);
-  empty(verifyLink);

   var version = getSelectedValue(versionSelect);
   var packages = releases[version]["packages"];
@@ -88,10 +86,6 @@ function onVersionSelect() {
     append(packageSelect, option);
   }

-  var href = "https://archive.apache.org/dist/spark/spark-" + version + "/";
-  var link = "<a href=\"" + href + "\">" + versionShort(version) + " signatures and checksums</a>";
-  append(verifyLink, link);
-
   // Populate releases
   updateDownloadLink(releases[version].mirrored);
 }

@@ -100,32 +94,32 @@ function updateDownloadLink(isMirrored) {
   var versionSelect = document.getElementById("sparkVersionSelect");
   var packageSelect = document.getElementById("sparkPackageSelect");
   var downloadLink = document.getElementById("spanDownloadLink");
+  var verifyLink = document.getElementById("sparkDownloadVerify");

   empty(downloadLink);
+  empty(verifyLink);

   var version = getSelectedValue(versionSelect);
   var pkg = getSelectedValue(packageSelect);

-  var artifactName = "spark-$ver-bin-$pkg.tgz"
-    .replace(/\$ver/g, version)
-    .replace(/\$pkg/g, pkg)
+  var artifactName = "spark-" + version + "-bin-" + pkg + ".tgz"
     .replace(/-bin-sources/, ""); // special case for source packages

-  var link = "";
+  var downloadHref = "";
   if (isMirrored) {
-    link = "https://www.apache.org/dyn/closer.lua/spark/spark-$ver/$artifact";
+    downloadHref = "https://www.apache.org/dyn/closer.lua/spark/spark-" + version + "/" + artifactName;
   } else {
-    link = "https://archive.apache.org/dist/spark/spark-$ver/$artifact";
+    downloadHref = "https://archive.apache.org/dist/spark/spark-" + version + "/" + artifactName;
   }
-  link = link
-    .replace(/\$ver/, version)
-    .replace(/\$artifact/, artifactName);
-  var text = link.split("/").reverse()[0];
-
+  var text = downloadHref.split("/").reverse()[0];
   var onClick =
-    "trackOutboundLink(this, 'Release Download Links', 'apache_$artifact'); return false;"
-      .replace(/\$artifact/, artifactName);
-
-  var contents = "<a href=\"" + link + "\" onclick=\"" + onClick + "\">" + text + "</a>";
+    "trackOutboundLink(this, 'Release Download Links', 'apache_" + artifactName + "'); return false;";
+  var contents = "<a href=\"" + downloadHref + "\" onclick=\"" + onClick + "\">" + text + "</a>";
   append(downloadLink, contents);
+
+  var sigHref = "https://www.apache.org/dist/spark/spark-" + version + "/" + artifactName + ".asc";
+  var checksumHref = "https://www.apache.org/dist/spark/spark-" + version + "/" + artifactName + ".sha512";
+  var verifyLinks = versionShort(version) + " <a href=\"" + sigHref + "\">signatures</a>, <a href=\"" + checksumHref + "\">checksums</a>";
+  append(verifyLink, verifyLinks);
 }
diff --git a/site/js/downloads.js b/site/js/downloads.js
index be6c29c..49ceb36 100644
--- a/site/js/downloads.js
+++ b/site/js/downloads.js
@@ -74,10 +74,8 @@ function onPackageSelect() {

 function onVersionSelect() {
   var versionSelect = document.getElementById("sparkVersionSelect");
   var packageSelect = document.getElementById("sparkPackageSelect");
-  var verifyLink = document.getElementById("sparkDownloadVerify");

   empty(packageSelect);
-  empty(verifyLink);

   var version = getSelectedValue(versionSelect);
   var packages = releases[version]["packages"];
@@ -88,10 +86,6 @@ function onVersionSelect() {
     append(packageSelect, option);
   }

-  var href = "https://archive.apache.org/dist/spark/spark-" + version + "/";
-  var link = "<a href=\"" + href + "\">" + versionShort(version) + " signatures and checksums</a>";
-  append(verifyLink, link);
-
   // Populate releases
   updateDownloadLink(releases[version].mirrored);
 }

@@ -100,32 +94,32 @@ function updateDownloadLink(isMirrored) {
   var versionSelect =
[GitHub] [spark-website] srowen closed pull request #184: Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
srowen closed pull request #184: Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
URL: https://github.com/apache/spark-website/pull/184
[spark] branch master updated: [SPARK-27016][SQL][BUILD] Treat all antlr warnings as errors while generating parser from the sql grammar file.
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 04ad559  [SPARK-27016][SQL][BUILD] Treat all antlr warnings as errors while generating parser from the sql grammar file.
04ad559 is described below

commit 04ad559ab64df19ddfdf9e6fe0de3f8484daf86e
Author: Dilip Biswal
AuthorDate: Sun Mar 3 10:02:25 2019 -0600

    [SPARK-27016][SQL][BUILD] Treat all antlr warnings as errors while generating parser from the sql grammar file.

    ## What changes were proposed in this pull request?

    Use the maven plugin option `treatWarningsAsErrors` to make sure warnings are treated as errors while generating the parser file. In its absence, we may inadvertently introduce problems while making grammar changes. Please refer to [PR-23897](https://github.com/apache/spark/pull/23897) for more context.

    ## How was this patch tested?

    Spark can be built in two ways: 1) sbt and 2) Maven. This PR configures the maven antlr plugin to include a parameter that makes antlr4 report an error on warning. However, when Spark is built using sbt, we use the sbt-antlr plugin, which does not allow us to pass this additional compilation flag. More info on the sbt-antlr plugin can be found at [link](https://github.com/ihji/sbt-antlr4/blob/master/src/main/scala/com/simplytyped/Antlr4Plugin.scala). In summary, this fix is only applicable when we use maven to build.

    Closes #23925 from dilipbiswal/antlr_fix.

    Authored-by: Dilip Biswal
    Signed-off-by: Sean Owen
---
 sql/catalyst/pom.xml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sql/catalyst/pom.xml b/sql/catalyst/pom.xml
index 20cc5d0..15fc5e3 100644
--- a/sql/catalyst/pom.xml
+++ b/sql/catalyst/pom.xml
@@ -156,6 +156,7 @@
           <configuration>
             <visitor>true</visitor>
            <sourceDirectory>../catalyst/src/main/antlr4</sourceDirectory>
+            <treatWarningsAsErrors>true</treatWarningsAsErrors>
           </configuration>
[GitHub] [spark-website] srowen commented on issue #184: Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
srowen commented on issue #184: Link to sigs on www.apache.org; provide direct link to release's sigs, checksums
URL: https://github.com/apache/spark-website/pull/184#issuecomment-469034747

   Merged. The download page now has separate direct links for sigs and checksum files for the selected release.