[GitHub] SparkQA commented on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
SparkQA commented on issue #23521: [SPARK-26604][CORE] Register channel for 
stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454030918
 
 
   **[Test build #101187 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101187/testReport)**
 for PR 23521 at commit 
[`9082f01`](https://github.com/apache/spark/commit/9082f01b9fec637399a1cf43eb55141cc0dcee6b).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
SparkQA removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454030918
 
 
   **[Test build #101187 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101187/testReport)**
 for PR 23521 at commit 
[`9082f01`](https://github.com/apache/spark/commit/9082f01b9fec637399a1cf43eb55141cc0dcee6b).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up 
channel registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454032368
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] viirya commented on a change in pull request #22807: [SPARK-25811][PySpark] Raise a proper error when unsafe cast is detected by PyArrow

2019-01-14 Thread GitBox
viirya commented on a change in pull request #22807: [SPARK-25811][PySpark] 
Raise a proper error when unsafe cast is detected by PyArrow
URL: https://github.com/apache/spark/pull/22807#discussion_r247521892
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##
 @@ -1331,6 +1331,16 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val PANDAS_ARROW_SAFE_TYPE_CONVERSION =
+buildConf("spark.sql.execution.pandas.arrowSafeTypeConversion")
+  .internal()
+  .doc("When true, enabling Arrow do safe type conversion check when 
converting" +
+"Pandas.Series to Arrow Array during serialization. Arrow will raise 
errors " +
+"when detecting unsafe type conversion. When false, disabling Arrow's 
type " +
+"check and do type conversions anyway.")
+  .booleanConf
+  .createWithDefault(true)
 
 Review comment:
   Thanks @kszucs


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23486: [SPARK-26457] Show hadoop configurations in HistoryServer environment tab

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23486: [SPARK-26457] Show hadoop 
configurations in HistoryServer environment tab
URL: https://github.com/apache/spark/pull/23486#discussion_r247521171
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/SparkEnv.scala
 ##
 @@ -442,9 +445,13 @@ object SparkEnv extends Logging {
 val addedJarsAndFiles = (addedJars ++ addedFiles).map((_, "Added By User"))
 val classPaths = (addedJarsAndFiles ++ classPathEntries).sorted
 
+// Add Hadoop properties, it will not ignore configs including in Spark. 
Some spark
+// conf starting with "spark.hadoop" may overwrite it.
+val hadoopProperties = hadoopConf.asScala.map(entry => (entry.getKey, 
entry.getValue)).toSeq
 
 Review comment:
   I think `.asScala.toSeq` works directly too, no big deal


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23445: [SPARK-26530]Validate heartheat arguments in SparkSubmitArguments

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23445: [SPARK-26530]Validate 
heartheat arguments in SparkSubmitArguments
URL: https://github.com/apache/spark/pull/23445#discussion_r247524504
 
 

 ##
 File path: 
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
 ##
 @@ -301,6 +301,28 @@ private[deploy] class SparkSubmitArguments(args: 
Seq[String], env: Map[String, S
 if (proxyUser != null && principal != null) {
   error("Only one of --proxy-user or --principal can be provided.")
 }
+
+// Ensure heartbeat arguments is valid.
+val executorTimeoutMs =
+  Utils.timeStringAsMs(sparkProperties.getOrElse(
+"spark.storage.blockManagerSlaveTimeoutMs",
+s"${Utils.timeStringAsSeconds(
+  sparkProperties.getOrElse("spark.network.timeout", "120s"))}s"))
+val executorHeartbeatIntervalMs = Utils.timeStringAsMs(
+  sparkProperties.getOrElse("spark.executor.heartbeatInterval", "10s"))
+val checkTimeoutIntervalMs =
+  Utils.timeStringAsSeconds(sparkProperties.getOrElse(
+"spark.network.timeoutInterval",
+s"${Utils.timeStringAsMs(sparkProperties.getOrElse(
+  "spark.storage.blockManagerTimeoutIntervalMs", "60s"))}ms")) * 1000
+if (checkTimeoutIntervalMs >= executorTimeoutMs) {
+  error(s"Incorrect heartbeat arguments, checkTimeoutIntervalMs should " +
+s"less than executorTimeoutMs.")
+}
+if (executorHeartbeatIntervalMs >= checkTimeoutIntervalMs) {
 
 Review comment:
   Yeah good point; this is what makes me wonder if this is the right place to 
validate this. Similar checks aren't done here in SparkSubmitArguments. For 
example, if these properties are mainly used in `HeartbeatReceiver`, that could 
be the right place to validate them.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23447: [SPARK-26462][CORE] Use 
ConfigEntry for hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#discussion_r247528412
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
 ##
 @@ -187,6 +188,105 @@ package object config {
 .checkValue(_ >= 0, "The off-heap memory size must not be negative")
 .createWithDefault(0)
 
+  private[spark] val MEMORY_STORAGE_FRACTION = 
ConfigBuilder("spark.memory.storageFraction")
+.doc("Amount of storage memory immune to eviction, expressed as a fraction 
of the " +
+  "size of the region set aside by spark.memory.fraction. The higher this 
is, the " +
+  "less working memory may be available to execution and tasks may spill 
to disk more " +
+  "often. Leaving this at the default value is recommended. ")
+.doubleConf
+.createWithDefault(0.5)
+
+  private[spark] val MEMORY_FRACTION = ConfigBuilder("spark.memory.fraction")
+.doc("Fraction of (heap space - 300MB) used for execution and storage. The 
" +
+  "lower this is, the more frequently spills and cached data eviction 
occur. " +
+  "The purpose of this config is to set aside memory for internal 
metadata, " +
+  "user data structures, and imprecise size estimation in the case of 
sparse, " +
+  "unusually large records. Leaving this at the default value is 
recommended.  ")
+.doubleConf
+.createWithDefault(0.6)
+
+  private[spark] val MEMORY_USE_LEGACY_MODE = 
ConfigBuilder("spark.memory.useLegacyMode")
 
 Review comment:
   This needs to be removed now; the legacy mode was deleted recently. Same 
with `spark.shuffle.memoryFraction`, `spark.storage.memoryFraction`, 
`spark.storage.unrollFraction`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247539199
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   I see, my point is that this is a very critical method, so I'd prefer to 
leave it unchanged. Do we really need that makeDeepCopy? If so, why do we need 
it? I hope we can rid of it somehow...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247537242
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
 ##
 @@ -225,6 +226,57 @@ case class FilterExec(condition: Expression, child: 
SparkPlan)
   override def outputPartitioning: Partitioning = child.outputPartitioning
 }
 
+/** Physical plan for RecursiveTable. */
+case class RecursiveTableExec(
+name: String,
+anchorTerm: SparkPlan,
+recursiveTerm: SparkPlan) extends SparkPlan {
+  override def children: Seq[SparkPlan] = Seq(anchorTerm, recursiveTerm)
+
+  override def output: Seq[Attribute] = anchorTerm.output
+
+  override val doNotPrepareInAdvance: Seq[SparkPlan] = Seq(recursiveTerm)
+
+  override protected def doExecute(): RDD[InternalRow] = {
+var temp = anchorTerm.execute().map(_.copy()).cache()
+var tempCount = temp.count()
+var result = temp
+var level = 0
+do {
+  if (level > conf.getConf(SQLConf.RECURSION_LEVEL_LIMIT)) {
 
 Review comment:
   shall we get it into a val member of the node in order to avoid getting it 
multiple times during execution?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247537386
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
 ##
 @@ -225,6 +226,57 @@ case class FilterExec(condition: Expression, child: 
SparkPlan)
   override def outputPartitioning: Partitioning = child.outputPartitioning
 }
 
+/** Physical plan for RecursiveTable. */
+case class RecursiveTableExec(
+name: String,
+anchorTerm: SparkPlan,
+recursiveTerm: SparkPlan) extends SparkPlan {
+  override def children: Seq[SparkPlan] = Seq(anchorTerm, recursiveTerm)
+
+  override def output: Seq[Attribute] = anchorTerm.output
+
+  override val doNotPrepareInAdvance: Seq[SparkPlan] = Seq(recursiveTerm)
+
+  override protected def doExecute(): RDD[InternalRow] = {
+var temp = anchorTerm.execute().map(_.copy()).cache()
+var tempCount = temp.count()
+var result = temp
+var level = 0
+do {
+  if (level > conf.getConf(SQLConf.RECURSION_LEVEL_LIMIT)) {
+throw new SparkException(s"Recursion level limit reached but query 
hasn't exhausted, try " +
 
 Review comment:
   nit: no `s`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode

2019-01-14 Thread GitBox
SparkQA commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode
URL: https://github.com/apache/spark/pull/23533#issuecomment-454047894
 
 
   **[Test build #4513 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4513/testReport)**
 for PR 23533 at commit 
[`4975d5a`](https://github.com/apache/spark/commit/4975d5a0c1e2a2681c8244d87044eeae873bbb7c).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] HyukjinKwon commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala.

2019-01-14 Thread GitBox
HyukjinKwon commented on a change in pull request #23534: [SPARK-26610][PYTHON] 
Fix inconsistency between toJSON Method in Python and Scala.
URL: https://github.com/apache/spark/pull/23534#discussion_r247553642
 
 

 ##
 File path: python/pyspark/sql/dataframe.py
 ##
 @@ -109,15 +109,18 @@ def stat(self):
 @ignore_unicode_prefix
 @since(1.3)
 def toJSON(self, use_unicode=True):
-"""Converts a :class:`DataFrame` into a :class:`RDD` of string.
+"""Converts a :class:`DataFrame` into a :class:`DataFrame` of JSON 
string.
 
-Each row is turned into a JSON document as one element in the returned 
RDD.
+Each row is turned into a JSON document as one element in the returned 
DataFrame.
 
 >>> df.toJSON().first()
-u'{"age":2,"name":"Alice"}'
+Row(value=u'{"age":2,"name":"Alice"}')
 """
-rdd = self._jdf.toJSON()
-return RDD(rdd.toJavaRDD(), self._sc, UTF8Deserializer(use_unicode))
+jdf = self._jdf.toJSON()
+if self.sql_ctx._conf.pysparkDataFrameToJSONShouldReturnDataFrame():
+return DataFrame(jdf, self.sql_ctx)
+else:
+return RDD(jdf.toJavaRDD(), self._sc, 
UTF8Deserializer(use_unicode))
 
 Review comment:
   Also, returning RDD and DataFrame are different. Returning RDD was the 
previous way even in Scala / Java side whereas DataFrame is not.
   
   Keeping RDD as return has a legacy reason but returning DataFrame is 
basically we allow another incomplete way for `toJSON`.
   
   Workaround is pretty easy anyway.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] imatiach-msft commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
imatiach-msft commented on a change in pull request #21632: 
[SPARK-19591][ML][MLlib] Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r246996910
 
 

 ##
 File path: project/MimaExcludes.scala
 ##
 @@ -241,7 +241,10 @@ object MimaExcludes {
 
 // [SPARK-26216][SQL] Do not use case class as public API 
(UserDefinedFunction)
 
ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.expressions.UserDefinedFunction$"),
-
ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.sql.expressions.UserDefinedFunction")
+
ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.sql.expressions.UserDefinedFunction"),
+
+// [SPARK-19591][ML] Add sample weights to decision trees
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.this")
 
 Review comment:
   here is an example of the error:
   
   ```
   [error] val strategy = new Strategy(algo, impurity, maxDepth, 
numClasses, maxBins,
   [error]^
   [error] 
/home/ilya/ilmat-spark/mllib/src/main/scala/org/apache/spark/mllib/tree/RandomForest.scala:170:
 overloaded method constructor Strategy with alternatives:
   [error]   (algo: 
org.apache.spark.mllib.tree.configuration.Algo.Algo,impurity: 
org.apache.spark.mllib.tree.impurity.Impurity,maxDepth: Int,numClasses: 
Int,maxBins: Int,quantileCalculationStrategy: 
org.apache.spark.mllib.tree.configuration.QuantileStrategy.QuantileStrategy,categoricalFeaturesInfo:
 Map[Int,Int],minInstancesPerNode: Int,minInfoGain: Double,maxMemoryInMB: 
Int,subsamplingRate: Double,useNodeIdCache: Boolean,checkpointInterval: 
Int)org.apache.spark.mllib.tree.configuration.Strategy 
   [error]   (algo: 
org.apache.spark.mllib.tree.configuration.Algo.Algo,impurity: 
org.apache.spark.mllib.tree.impurity.Impurity,maxDepth: Int,numClasses: 
Int,maxBins: Int,quantileCalculationStrategy: 
org.apache.spark.mllib.tree.configuration.QuantileStrategy.QuantileStrategy,categoricalFeaturesInfo:
 Map[Int,Int],minInstancesPerNode: Int,minInfoGain: Double,maxMemoryInMB: 
Int,subsamplingRate: Double,useNodeIdCache: Boolean,checkpointInterval: 
Int,minWeightFractionPerNode: 
Double)org.apache.spark.mllib.tree.configuration.Strategy
   [error]  cannot be applied to 
(org.apache.spark.mllib.tree.configuration.Algo.Value, 
org.apache.spark.mllib.tree.impurity.Impurity, Int, Int, Int, 
org.apache.spark.mllib.tree.configuration.QuantileStrategy.Value, Map[Int,Int])
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23519: [SPARK-26601][SQL] Make 
broadcast-exchange thread pool configurable
URL: https://github.com/apache/spark/pull/23519#issuecomment-454070451
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] gatorsmile closed pull request #23535: [SPARK-26613][SQL] Add another rename table grammar for spark sql

2019-01-14 Thread GitBox
gatorsmile closed pull request #23535: [SPARK-26613][SQL] Add another rename 
table grammar for spark sql
URL: https://github.com/apache/spark/pull/23535
 
 
   

As this is a foreign pull request (from a fork), the diff has been
sent to your commit mailing list, comm...@spark.apache.org


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] gatorsmile commented on a change in pull request #23535: [SPARK-26613][SQL] Add another rename table grammar for spark sql

2019-01-14 Thread GitBox
gatorsmile commented on a change in pull request #23535: [SPARK-26613][SQL] Add 
another rename table grammar for spark sql
URL: https://github.com/apache/spark/pull/23535#discussion_r247579058
 
 

 ##
 File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
 ##
 @@ -105,6 +105,7 @@ statement
 ADD COLUMNS '(' columns=colTypeList ')'
#addTableColumns
 | ALTER (TABLE | VIEW) from=tableIdentifier
 RENAME TO to=tableIdentifier   
#renameTable
+| RENAME (TABLE | VIEW) from=tableIdentifier TO to=tableIdentifier 
#renameTable
 
 Review comment:
   Thanks for your contribution! I know, MySQL has both this syntax support and 
ALTER TABLE ... RENAME... 
   
   In the current stage, we are not randomly adding more syntax to Spark SQL 
for avoiding extra complexity. In the future, we might support different 
dialects. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23499: [SPARK-26254][CORE] Extract Hive + Kafka dependencies from Core.

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23499: [SPARK-26254][CORE] 
Extract Hive + Kafka dependencies from Core.
URL: https://github.com/apache/spark/pull/23499#discussion_r247534502
 
 

 ##
 File path: pom.xml
 ##
 @@ -101,6 +101,7 @@
 examples
 repl
 launcher
+external/kafka-0-10-token-provider
 
 Review comment:
   Pardon if I'm not understanding, but is the new module specific to Kafka? 
what depends on it? I don't see that change here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23533: [CORE][MINOR] Fix some typos 
about MemoryMode
URL: https://github.com/apache/spark/pull/23533#issuecomment-454047251
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23399: [SPARK-26497][DOCS][KUBERNETES] Show users where the pre-packaged SparkR and PySpark Dockerfiles

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23399: 
[SPARK-26497][DOCS][KUBERNETES] Show users where the pre-packaged SparkR and 
PySpark Dockerfiles
URL: https://github.com/apache/spark/pull/23399#discussion_r247538200
 
 

 ##
 File path: bin/docker-image-tool.sh
 ##
 @@ -208,8 +208,16 @@ Options:
   -f file   Dockerfile to build for JVM based Jobs. By default 
builds the Dockerfile shipped with Spark.
   -p file   (Optional) Dockerfile to build for PySpark Jobs. 
Builds Python dependencies and ships with Spark.
 Skips building PySpark docker image if not specified.
+A sample PySpark Dockerfile is at:
+kubernetes/dockerfiles/spark/bindings/python/Dockerfile
+File must be relative the build path, in dev this is 
copied from:
 
 Review comment:
   Super nit: ; or . instead of the , here


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23533: [CORE][MINOR] Fix some typos about 
MemoryMode
URL: https://github.com/apache/spark/pull/23533#issuecomment-454047251
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247536393
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   Well, I need this for `makeDeepCopy()` and didn't want to copy the logic. 
Some of the plan nodes can't be reused in iterations so I think the most 
straightforward way is to deep copy the tree. Any suggestion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454049332
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454049348
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7040/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247546282
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   ok, now I see the issue here. But I think this is not the best way to go for 
that. I think this is the same reason why you are adding some exclusions in the 
prepare method (please correct me if I am wrong). I'd rather introduce a 
`clean` method in `SparkPlan` which by default does nothing, but in case of 
broadcasts cleans the broadcast and hence force recomputation at the next step. 
What do you think? cc @cloud-fan @dongjoon-hyun  @gatorsmile fot their opinion 
on this too


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247552872
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   You are right @mgaido91, `BroadcastExchangeExec` and similar nodes cause 
some issues there. We can't prepare them until we have the previous iteration 
results (or the anchor) This is why I introduced `doNotPrepareInAdvance`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23510: [SPARK-26590][CORE] make 
fetch-block-to-disk backward compatible
URL: https://github.com/apache/spark/pull/23510#issuecomment-454064133
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101181/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23510: [SPARK-26590][CORE] make 
fetch-block-to-disk backward compatible
URL: https://github.com/apache/spark/pull/23510#issuecomment-454064124
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] Deegue commented on a change in pull request #23506: [SPARK-26577][SQL] Add input optimizer when reading Hive table by SparkSQL

2019-01-14 Thread GitBox
Deegue commented on a change in pull request #23506: [SPARK-26577][SQL] Add 
input optimizer when reading Hive table by SparkSQL
URL: https://github.com/apache/spark/pull/23506#discussion_r247558744
 
 

 ##
 File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala
 ##
 @@ -311,6 +309,36 @@ class HadoopTableReader(
 // Only take the value (skip the key) because Hive works only with values.
 rdd.map(_._2)
   }
+
+  /**
+   * If `spark.sql.hive.fileInputFormat.enabled` is true, this function will 
optimize the input
+   * method(including format and the size of splits) while reading Hive tables.
+   */
+  private def getAndOptimizeInput(
+inputClassName: String): Class[InputFormat[Writable, Writable]] = {
+
+var ifc = Utils.classForName(inputClassName)
+  .asInstanceOf[java.lang.Class[InputFormat[Writable, Writable]]]
+if (conf.getConf(HiveUtils.HIVE_FILE_INPUT_FORMAT_ENABLED)) {
+  hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize",
 
 Review comment:
   Thank you for your review! We named it as 
`spark.sql.hive.fileInputFormat.split.maxsize` now and it can take effect by 
replacing `mapreduce.input.fileinputformat.split.maxsize` of hadoopConf. 
Because we are not able to change the configurations of hadoop when use 
SparkSQL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-14 Thread GitBox
SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange 
thread pool configurable
URL: https://github.com/apache/spark/pull/23519#issuecomment-454027099
 
 
   **[Test build #101185 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101185/testReport)**
 for PR 23519 at commit 
[`ac2ec82`](https://github.com/apache/spark/commit/ac2ec82c7dc152acd6b200dc14ba8afea6abf5ad).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] gengliangwang commented on a change in pull request #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
gengliangwang commented on a change in pull request #23383: [SPARK-23817][SQL] 
Create file source V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#discussion_r247515861
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/PartitionedFileUtil.scala
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.hadoop.fs.{BlockLocation, FileStatus, LocatedFileStatus, 
Path}
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.execution.datasources._
+
+object PartitionedFileUtil {
 
 Review comment:
   I have renamed `FilePartitionUtil` as `FilePartition`, also put the class 
`FilePartition` into the same file. I think they are much more clear now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] ueshin commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala.

2019-01-14 Thread GitBox
ueshin commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix 
inconsistency between toJSON Method in Python and Scala.
URL: https://github.com/apache/spark/pull/23534#discussion_r247518316
 
 

 ##
 File path: python/pyspark/sql/dataframe.py
 ##
 @@ -109,15 +109,18 @@ def stat(self):
 @ignore_unicode_prefix
 @since(1.3)
 def toJSON(self, use_unicode=True):
-"""Converts a :class:`DataFrame` into a :class:`RDD` of string.
+"""Converts a :class:`DataFrame` into a :class:`DataFrame` of JSON 
string.
 
-Each row is turned into a JSON document as one element in the returned 
RDD.
+Each row is turned into a JSON document as one element in the returned 
DataFrame.
 
 >>> df.toJSON().first()
-u'{"age":2,"name":"Alice"}'
+Row(value=u'{"age":2,"name":"Alice"}')
 """
-rdd = self._jdf.toJSON()
-return RDD(rdd.toJavaRDD(), self._sc, UTF8Deserializer(use_unicode))
+jdf = self._jdf.toJSON()
+if self.sql_ctx._conf.pysparkDataFrameToJSONShouldReturnDataFrame():
+return DataFrame(jdf, self.sql_ctx)
+else:
+return RDD(jdf.toJavaRDD(), self._sc, 
UTF8Deserializer(use_unicode))
 
 Review comment:
   That sounds interesting. Maybe we should fix `DataFrameReader.csv` and 
`DataFrameReader.json` to accept DataFrame of string in Python, regardless of 
the discussion here.
   Let me try.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
SparkQA commented on issue #23383: [SPARK-23817][SQL] Create file source V2 
framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454029022
 
 
   **[Test build #101186 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101186/testReport)**
 for PR 23383 at commit 
[`7e5a158`](https://github.com/apache/spark/commit/7e5a158773bf05dec7aa837bd2ff8c9116f7da00).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23445: [SPARK-26530]Validate heartheat arguments in SparkSubmitArguments

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23445: [SPARK-26530]Validate 
heartheat arguments in SparkSubmitArguments
URL: https://github.com/apache/spark/pull/23445#discussion_r247522677
 
 

 ##
 File path: 
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
 ##
 @@ -301,6 +301,28 @@ private[deploy] class SparkSubmitArguments(args: 
Seq[String], env: Map[String, S
 if (proxyUser != null && principal != null) {
   error("Only one of --proxy-user or --principal can be provided.")
 }
+
+// Ensure heartbeat arguments is valid.
+val executorTimeoutMs =
+  Utils.timeStringAsMs(sparkProperties.getOrElse(
+"spark.storage.blockManagerSlaveTimeoutMs",
+s"${Utils.timeStringAsSeconds(
+  sparkProperties.getOrElse("spark.network.timeout", "120s"))}s"))
+val executorHeartbeatIntervalMs = Utils.timeStringAsMs(
+  sparkProperties.getOrElse("spark.executor.heartbeatInterval", "10s"))
+val checkTimeoutIntervalMs =
+  Utils.timeStringAsSeconds(sparkProperties.getOrElse(
+"spark.network.timeoutInterval",
+s"${Utils.timeStringAsMs(sparkProperties.getOrElse(
+  "spark.storage.blockManagerTimeoutIntervalMs", "60s"))}ms")) * 1000
 
 Review comment:
   OK, yeah that's why we need to centralize the definition of the config. This 
should wait until the related PR to update all these configs' declaration goes 
in, then can use those.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up 
channel registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454032378
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101187/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23518: [SPARK-26600] Update spark-submit usage message

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23518: [SPARK-26600] Update 
spark-submit usage message
URL: https://github.com/apache/spark/pull/23518#issuecomment-453457279
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23518: [SPARK-26600] Update spark-submit usage message

2019-01-14 Thread GitBox
SparkQA commented on issue #23518: [SPARK-26600] Update spark-submit usage 
message
URL: https://github.com/apache/spark/pull/23518#issuecomment-454033005
 
 
   **[Test build #4511 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4511/testReport)**
 for PR 23518 at commit 
[`930f242`](https://github.com/apache/spark/commit/930f242f366428515be8f747e4130e0eec036580).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
SparkQA commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454037188
 
 
   **[Test build #101188 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101188/testReport)**
 for PR 23521 at commit 
[`6b028a9`](https://github.com/apache/spark/commit/6b028a9a17e0c60fed11354d49c9dbe325202e36).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up 
channel registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454040546
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Clean up 
channel registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454040554
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7039/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454040554
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7039/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454040546
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry 
for hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#issuecomment-454043387
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101179/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL 
query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454043473
 
 
   **[Test build #101190 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101190/testReport)**
 for PR 23531 at commit 
[`a69473f`](https://github.com/apache/spark/commit/a69473fa2c14880b9b6a0fd6d862e8012dd8360c).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23447: [SPARK-26462][CORE] Use 
ConfigEntry for hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#issuecomment-454043374
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23447: [SPARK-26462][CORE] Use 
ConfigEntry for hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#issuecomment-454043387
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101179/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry 
for hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#issuecomment-454043374
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only 
metrics support
URL: https://github.com/apache/spark/pull/23396#issuecomment-454046243
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454049348
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7040/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454049332
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247552872
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   You are right @mgaido91, `BroadcastExchangeExec` and similar nodes cause 
some issues there. We can't prepare them until we have the result of the 
previous iteration (or the result of the anchor). This is why I introduced 
`doNotPrepareInAdvance`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23510: [SPARK-26590][CORE] make 
fetch-block-to-disk backward compatible
URL: https://github.com/apache/spark/pull/23510#issuecomment-454064124
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23510: [SPARK-26590][CORE] make 
fetch-block-to-disk backward compatible
URL: https://github.com/apache/spark/pull/23510#issuecomment-454064133
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101181/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23519: [SPARK-26601][SQL] Make 
broadcast-exchange thread pool configurable
URL: https://github.com/apache/spark/pull/23519#issuecomment-454070468
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101185/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-14 Thread GitBox
SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange 
thread pool configurable
URL: https://github.com/apache/spark/pull/23519#issuecomment-454025284
 
 
   **[Test build #101184 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101184/testReport)**
 for PR 23519 at commit 
[`9669569`](https://github.com/apache/spark/commit/9669569dddb4c4d36026cef8b0e79440c445e8ab).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23538: [MINOR][BUILD] Remove binary license/notice files in a source release for branch-2.4+ only

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23538: [MINOR][BUILD] Remove 
binary license/notice files in a source release for branch-2.4+ only
URL: https://github.com/apache/spark/pull/23538#discussion_r247520255
 
 

 ##
 File path: dev/create-release/release-build.sh
 ##
 @@ -176,10 +176,14 @@ if [[ "$1" == "package" ]]; then
   # Source and binary tarballs
   echo "Packaging release source tarballs"
   cp -r spark spark-$SPARK_VERSION
-  # For source release, exclude copy of binary license/notice
-  rm spark-$SPARK_VERSION/LICENSE-binary
-  rm spark-$SPARK_VERSION/NOTICE-binary
-  rm -r spark-$SPARK_VERSION/licenses-binary
+
+  # For source release in v2.4+, exclude copy of binary license/notice
+  if [[ $SPARK_VERSION > "2.4" ]]; then
 
 Review comment:
   This seems OK, but do the `rm`s cause a problem? if they fail or print an 
error, add `-f`? But this is fine too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] viirya commented on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
viirya commented on issue #23521: [SPARK-26604][CORE] Register channel for 
stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454030949
 
 
   @cloud-fan Since `StreamManager` doesn't register channel but only 
`OneForOneStreamManager` does it, I remove `registerChannel` from 
`StreamManager`. When `OneForOneStreamManager` goes to serve chunk or stream 
request, it will register channel for the stream.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Register channel 
for stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454031389
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7038/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source 
V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454031436
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7037/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on issue #22807: [SPARK-25811][PySpark] Raise a proper error when unsafe cast is detected by PyArrow

2019-01-14 Thread GitBox
srowen commented on issue #22807: [SPARK-25811][PySpark] Raise a proper error 
when unsafe cast is detected by PyArrow
URL: https://github.com/apache/spark/pull/22807#issuecomment-454036714
 
 
   So, in these situations, we agree some error should occur. The current one 
is an overflow error -- right? or does it silently overfloat? Non-arrow UDFs 
seem to do the latter.
   
   Based on that, I'm not clear the Arrow behavior should be different, 
especially if it already matches that behavior, and is what Arrow users expect.
   
   No, this isn't a case for a flag. That is a failure to decide, punted to the 
user.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support

2019-01-14 Thread GitBox
SparkQA commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics 
support
URL: https://github.com/apache/spark/pull/23396#issuecomment-454041371
 
 
   **[Test build #101189 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101189/testReport)**
 for PR 23396 at commit 
[`1019762`](https://github.com/apache/spark/commit/10197623309aac89c095f4f765f8ff1ed28a9452).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247536393
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   Well, I need this for `makeDeepCopy()` and didn't want to copy the logic. 
Any suggestion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247536528
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -1530,6 +1689,19 @@ class Analyzer(
   object GlobalAggregates extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators {
   case Project(projectList, child) if containsAggregates(projectList) =>
+
+def traversePlanAndCheck(plan: LogicalPlan, inRecursiveTable: Boolean 
= false): Unit =
+  plan match {
+case RecursiveReference(name, _) if !inRecursiveTable =>
+  throw new AnalysisException(s"Recursive reference ${name} can't 
be used in an " +
+s"aggregate")
+case RecursiveTable(_, _, recursiveTerm, _) =>
+  traversePlanAndCheck(recursiveTerm, true)
 
 Review comment:
   thanks, fixed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
mgaido91 commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247442316
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##
 @@ -1625,6 +1625,11 @@ object SQLConf {
 "a SparkConf entry.")
   .booleanConf
   .createWithDefault(true)
+
+  val RECURSION_LEVEL_LIMIT = buildConf("spark.sql.recursion.level.limit")
 
 Review comment:
   nit: probably we can make this internal


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247536393
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 ##
 @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] 
extends Product {
   /**
* Returns a copy of this node where `f` has been applied to all the nodes 
children.
*/
-  def mapChildren(f: BaseType => BaseType): BaseType = {
-if (children.nonEmpty) {
+  def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): 
BaseType = {
 
 Review comment:
   Well, I need this for `makeDeepCopy()` and didn't want to copy the logic. 
Some of the plan nodes can't be reused in iterations so I think the most 
straightforward way is to deep copy the subtree. Any suggestion?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL 
query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454052123
 
 
   **[Test build #101191 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101191/testReport)**
 for PR 23531 at commit 
[`f6d8c17`](https://github.com/apache/spark/commit/f6d8c17aa6eb4f406ccb326c5a530c15cfdc1a83).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454057027
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454057027
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-454057036
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7042/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
SparkQA commented on issue #23383: [SPARK-23817][SQL] Create file source V2 
framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454069631
 
 
   **[Test build #101192 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101192/testReport)**
 for PR 23383 at commit 
[`01b3ca9`](https://github.com/apache/spark/commit/01b3ca97a675ff3059a9b16ebde99ee929384a2a).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247562160
 
 

 ##
 File path: 
mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala
 ##
 @@ -167,7 +167,7 @@ class DecisionTreeSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 val rdd = sc.parallelize(arr)
 val strategy = new Strategy(Classification, Entropy, maxDepth = 3,
   numClasses = 2, maxBins = 100)
-val metadata = DecisionTreeMetadata.buildMetadata(rdd.map(_.asML), 
strategy)
+val metadata = 
DecisionTreeMetadata.buildMetadata(rdd.map(_.asML.toInstance), strategy)
 
 Review comment:
   Rather than call `.toInstance` everywhere this is called, why not overload 
`buildMetadata` to accept an RDD of either one?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247563305
 
 

 ##
 File path: mllib/src/test/scala/org/apache/spark/ml/tree/impl/TreeTests.scala
 ##
 @@ -18,37 +18,46 @@
 package org.apache.spark.ml.tree.impl
 
 import scala.collection.JavaConverters._
+import scala.util.Random
 
 import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.api.java.JavaRDD
 import org.apache.spark.ml.attribute.{AttributeGroup, NominalAttribute, 
NumericAttribute}
-import org.apache.spark.ml.feature.LabeledPoint
+import org.apache.spark.ml.feature.{Instance, LabeledPoint}
 import org.apache.spark.ml.linalg.Vectors
 import org.apache.spark.ml.tree._
+import org.apache.spark.mllib.util.TestingUtils._
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, SparkSession}
 
 private[ml] object TreeTests extends SparkFunSuite {
 
   /**
* Convert the given data to a DataFrame, and set the features and label 
metadata.
+   *
* @param data  Dataset.  Categorical features and labels must already have 
0-based indices.
*  This must be non-empty.
* @param categoricalFeatures  Map: categorical feature index to number of 
distinct values
* @param numClasses  Number of classes label can take.  If 0, mark as 
continuous.
* @return DataFrame with metadata
*/
   def setMetadata(
-  data: RDD[LabeledPoint],
+  data: RDD[_],
 
 Review comment:
   Being test code, this isn't a big deal, but this could also be overloaded 
with two versions that accept RDDs of different types, and then call to a 
common private function to proceed from there. I don't feel strongly about it, 
just dislike seeing `[_]` in a method signature


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source 
V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454070024
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7043/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247563937
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/ml/feature/LabeledPoint.scala
 ##
 @@ -37,4 +37,9 @@ case class LabeledPoint(@Since("2.0.0") label: Double, 
@Since("2.0.0") features:
   override def toString: String = {
 s"($label,$features)"
   }
+
+  private[spark] def toInstance(weight: Double): Instance = {
+Instance(label, weight, features)
 
 Review comment:
   OK, I'm persuaded. Maybe One Day we can clean up the relationship between 
these classes a bit more but this isn't particularly worse.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Register channel 
for stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454031383
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Register 
channel for stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454031389
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7038/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file 
source V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454031436
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7037/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source 
V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454031424
 
 
   Build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Register channel for stream request

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23521: [SPARK-26604][CORE] Register 
channel for stream request
URL: https://github.com/apache/spark/pull/23521#issuecomment-454031383
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247535248
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -205,30 +206,184 @@ class Analyzer(
   CleanupAliases)
   )
 
+  object ResolveRecursiveReferneces extends Rule[LogicalPlan] {
+def apply(plan: LogicalPlan): LogicalPlan = {
+  val recursiveTables = plan.collect {
+case rt @ RecursiveTable(name, _, _, _) if rt.anchorResolved => name 
-> rt
+  }.toMap
+
+  plan.resolveOperatorsUp {
+case UnresolvedRecursiveReference(name) if 
recursiveTables.contains(name) =>
+  RecursiveReference(name, recursiveTables(name).output)
+case other => other
+  }
+}
+  }
+
   /**
* Analyze cte definitions and substitute child plan with analyzed cte 
definitions.
*/
   object CTESubstitution extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
-  case With(child, relations) =>
+  case With(child, relations, allowRecursion) =>
 substituteCTE(child, relations.foldLeft(Seq.empty[(String, 
LogicalPlan)]) {
   case (resolved, (name, relation)) =>
-resolved :+ name -> executeSameContext(substituteCTE(relation, 
resolved))
-})
+val recursiveTableName = if (allowRecursion) Some(name) else None
+resolved :+
+  name -> executeSameContext(substituteCTE(relation, resolved, 
recursiveTableName))
+}, None)
   case other => other
 }
 
-def substituteCTE(plan: LogicalPlan, cteRelations: Seq[(String, 
LogicalPlan)]): LogicalPlan = {
-  plan resolveOperatorsDown {
-case u: UnresolvedRelation =>
-  cteRelations.find(x => resolver(x._1, u.tableIdentifier.table))
-.map(_._2).getOrElse(u)
-case other =>
-  // This cannot be done in ResolveSubquery because ResolveSubquery 
does not know the CTE.
-  other transformExpressions {
-case e: SubqueryExpression =>
-  e.withNewPlan(substituteCTE(e.plan, cteRelations))
+def substituteCTE(
+plan: LogicalPlan,
+cteRelations: Seq[(String, LogicalPlan)],
+recursiveTableName: Option[String]): LogicalPlan = {
+  def substitute(
+  plan: LogicalPlan,
+  inSubQuery: Boolean = false): (LogicalPlan, Boolean) = {
+val references = mutable.Set.empty[UnresolvedRecursiveReference]
+
+def newReference(recursiveTableName: String) = {
+  val recursiveReference = 
UnresolvedRecursiveReference(recursiveTableName)
+  references += recursiveReference
+
+  recursiveReference
+}
+
+val newPlan = plan resolveOperatorsDown {
+  case u: UnresolvedRelation =>
+val table = u.tableIdentifier.table
+
+val recursiveReference = recursiveTableName.find(resolver(_, 
table)).map { name =>
+  if (inSubQuery) {
+throw new AnalysisException(
+  s"Recursive reference ${name} can't be used in a subquery")
+  }
+
+  newReference(name)
+}
+
+recursiveReference
+  .orElse(cteRelations.find(x => resolver(x._1, table)).map(_._2))
+  .getOrElse(u)
+
+  case other =>
+// This cannot be done in ResolveSubquery because ResolveSubquery 
does not know the CTE.
+other transformExpressions {
+  case e: SubqueryExpression => e.withNewPlan(substitute(e.plan, 
true)._1)
+}
+}
+
+(newPlan, !references.isEmpty)
+  }
+
+  plan match {
+case SubqueryAlias(name, u: Union) if recursiveTableName.isDefined =>
+  def combineUnions(union: Union): Seq[LogicalPlan] = 
union.children.flatMap {
+case u: Union => combineUnions(u)
+case o => Seq(o)
   }
+
+  val substitutedTerms = combineUnions(u).map(substitute(_))
+  val (anchorTerms, recursiveTerms) = substitutedTerms.partition(!_._2)
+
+  if (!recursiveTerms.isEmpty) {
+if (anchorTerms.isEmpty) {
+  throw new AnalysisException(s"There should be at least 1 anchor 
term defined in a " +
+s"recursive query $name")
+}
+
+case class PlanTraverseStatus(
 
 Review comment:
   Indeed this was a bit ugly. Changed as you suggested.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] srowen commented on issue #23535: [SPARK-26613][SQL] Add another rename table grammar for spark sql

2019-01-14 Thread GitBox
srowen commented on issue #23535: [SPARK-26613][SQL] Add another rename table 
grammar for spark sql
URL: https://github.com/apache/spark/pull/23535#issuecomment-454044393
 
 
   Is it standard SQL or something Hive does? otherwise I don't think we'd 
generally add it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23412: [SPARK-26477][CORE] Use 
ConfigEntry for hardcoded configs for unsafe category 
URL: https://github.com/apache/spark/pull/23412#discussion_r247537355
 
 

 ##
 File path: 
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
 ##
 @@ -59,21 +60,22 @@ public UnsafeSorterSpillReader(
   File file,
   BlockId blockId) throws IOException {
 assert (file.length() > 0);
+final ConfigEntry bufferSizeConfigEntry =
+package$.MODULE$.UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE();
 
 Review comment:
   This can't be `ConfigEntry` because the Scala side has the generic 
type as a `long`? It seems like it ends up a `Long` anyway as you're able to 
`.get()` and cast it to `long`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247537146
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ##
 @@ -591,7 +591,8 @@ object FoldablePropagation extends Rule[LogicalPlan] {
 } else {
   CleanupAliases(plan.transformUp {
 // We can only propagate foldables for a subset of unary nodes.
-case u: UnaryNode if foldableMap.nonEmpty && canPropagateFoldables(u) 
=>
+case u: UnaryNode if foldableMap.nonEmpty && canPropagateFoldables(u)
 
 Review comment:
   I reverted this part and changed `ResolveRecursiveReferneces` to use new 
attributes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side 
only metrics support
URL: https://github.com/apache/spark/pull/23396#issuecomment-454046255
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7041/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only 
metrics support
URL: https://github.com/apache/spark/pull/23396#issuecomment-454046255
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7041/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on issue #23507: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table

2019-01-14 Thread GitBox
srowen commented on issue #23507: [SPARK-26576][SQL] Broadcast hint not applied 
to partitioned table
URL: https://github.com/apache/spark/pull/23507#issuecomment-454030131
 
 
   It's possible it merely uncovered a different problem, in the test or main 
code, that only exists in 2.3.
   
   How important is the change for 2.3? given that there is a release imminent, 
would it be OK to revert it now in 2.3, or does that lose an important fix?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file 
source V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454031424
 
 
   Build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
SparkQA commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454032343
 
 
   **[Test build #101187 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101187/testReport)**
 for PR 23521 at commit 
[`9082f01`](https://github.com/apache/spark/commit/9082f01b9fec637399a1cf43eb55141cc0dcee6b).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454032368
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel registration for StreamManager

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23521: [SPARK-26604][CORE] Clean up channel 
registration for StreamManager
URL: https://github.com/apache/spark/pull/23521#issuecomment-454032378
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101187/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala.

2019-01-14 Thread GitBox
srowen commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix 
inconsistency between toJSON Method in Python and Scala.
URL: https://github.com/apache/spark/pull/23534#discussion_r247525774
 
 

 ##
 File path: python/pyspark/sql/dataframe.py
 ##
 @@ -109,15 +109,18 @@ def stat(self):
 @ignore_unicode_prefix
 @since(1.3)
 def toJSON(self, use_unicode=True):
-"""Converts a :class:`DataFrame` into a :class:`RDD` of string.
+"""Converts a :class:`DataFrame` into a :class:`DataFrame` of JSON 
string.
 
-Each row is turned into a JSON document as one element in the returned 
RDD.
+Each row is turned into a JSON document as one element in the returned 
DataFrame.
 
 >>> df.toJSON().first()
-u'{"age":2,"name":"Alice"}'
+Row(value=u'{"age":2,"name":"Alice"}')
 """
-rdd = self._jdf.toJSON()
-return RDD(rdd.toJavaRDD(), self._sc, UTF8Deserializer(use_unicode))
+jdf = self._jdf.toJSON()
+if self.sql_ctx._conf.pysparkDataFrameToJSONShouldReturnDataFrame():
+return DataFrame(jdf, self.sql_ctx)
+else:
+return RDD(jdf.toJavaRDD(), self._sc, 
UTF8Deserializer(use_unicode))
 
 Review comment:
   I agree it's not clear which one is more consistent. I do not think we 
should add a config -- config options are, as someone once said, often a 
failure of design, that punts unresolved questions to the user. I think I'd 
slightly prefer keeping an RDD here if there's no compelling reason to 
significantly change the Pyspark API.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for hardcoded configs for execution categories

2019-01-14 Thread GitBox
SparkQA commented on issue #23447: [SPARK-26462][CORE] Use ConfigEntry for 
hardcoded configs for execution categories
URL: https://github.com/apache/spark/pull/23447#issuecomment-454042522
 
 
   **[Test build #101179 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101179/testReport)**
 for PR 23447 at commit 
[`d0ecdb6`](https://github.com/apache/spark/commit/d0ecdb6b03b452eee6942b5513bb2ad220c35616).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-14 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] 
Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r247535564
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 ##
 @@ -47,6 +48,41 @@ case class Subquery(child: LogicalPlan) extends 
OrderPreservingUnaryNode {
   override def output: Seq[Attribute] = child.output
 }
 
+case class RecursiveTable(
+name: String,
+anchorTerm: LogicalPlan,
+recursiveTerm: LogicalPlan,
+levelLimit: Int) extends LogicalPlan {
 
 Review comment:
   thanks, removed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.

2019-01-14 Thread GitBox
SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for 
hardcoded configs for scheduler categories.
URL: https://github.com/apache/spark/pull/23416#issuecomment-454045040
 
 
   **[Test build #4512 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4512/testReport)**
 for PR 23416 at commit 
[`af8e003`](https://github.com/apache/spark/commit/af8e003e7fc8a5b0f6a4db724387f26ed8f0fb43).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins removed a comment on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode

2019-01-14 Thread GitBox
AmplabJenkins removed a comment on issue #23533: [CORE][MINOR] Fix some typos 
about MemoryMode
URL: https://github.com/apache/spark/pull/23533#issuecomment-454047258
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101180/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible

2019-01-14 Thread GitBox
SparkQA commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk 
backward compatible
URL: https://github.com/apache/spark/pull/23510#issuecomment-454063335
 
 
   **[Test build #101181 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101181/testReport)**
 for PR 23510 at commit 
[`6e1a05b`](https://github.com/apache/spark/commit/6e1a05b088cabddc986e0a3d8e126095f0f7b32b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247562734
 
 

 ##
 File path: mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala
 ##
 @@ -268,4 +269,20 @@ object MLTestingUtils extends SparkFunSuite {
 assert(newDatasetF.schema(featuresColName).dataType.equals(new 
ArrayType(FloatType, false)))
 (newDataset, newDatasetD, newDatasetF)
   }
+
+  def modelPredictionEquals[M <: PredictionModel[_, M]](
 
 Review comment:
   This is asserting that some subset are within tolerances... is it necessary 
to loosen it this much? I haven't read carefully; why might some be out of 
tolerances and that's OK? what about a looser tolerance instead?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-14 Thread GitBox
SparkQA commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange 
thread pool configurable
URL: https://github.com/apache/spark/pull/23519#issuecomment-454070170
 
 
   **[Test build #101185 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101185/testReport)**
 for PR 23519 at commit 
[`ac2ec82`](https://github.com/apache/spark/commit/ac2ec82c7dc152acd6b200dc14ba8afea6abf5ad).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247561861
 
 

 ##
 File path: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/DecisionTreeMetadata.scala
 ##
 @@ -115,7 +122,11 @@ private[spark] object DecisionTreeMetadata extends 
Logging {
 }
 require(numFeatures > 0, s"DecisionTree requires number of features > 0, " 
+
   s"but was given an empty features vector")
-val numExamples = input.count()
+val (numExamples, weightSum) = input.aggregate((0L, 0.0))(
+  seqOp = (cw, instance) => (cw._1 + 1L, cw._2 + instance.weight),
 
 Review comment:
   This can be taken further: `((count, weight), instance) =>`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-14 Thread GitBox
srowen commented on a change in pull request #21632: [SPARK-19591][ML][MLlib] 
Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r247560135
 
 

 ##
 File path: project/MimaExcludes.scala
 ##
 @@ -241,7 +241,10 @@ object MimaExcludes {
 
 // [SPARK-26216][SQL] Do not use case class as public API 
(UserDefinedFunction)
 
ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.expressions.UserDefinedFunction$"),
-
ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.sql.expressions.UserDefinedFunction")
+
ProblemFilters.exclude[IncompatibleTemplateDefProblem]("org.apache.spark.sql.expressions.UserDefinedFunction"),
+
+// [SPARK-19591][ML] Add sample weights to decision trees
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.this")
 
 Review comment:
   Hm, yeah, the problem is that both constructors define a bunch of optional 
arguments, and this call doesn't specify those optional argument. So both 
constructors apply. The 'fix' would be to make the backwards-compatible 
constructor not take any optional arguments. It was already source-compatible 
without this additional constructor, but would make it binary-compatible too.
   
   I'm on the fence about it. It's not something apps would normally specify 
directly, but it's also pretty trivial to write in this additional constructor. 
If it works, I'd keep the backwards-compatible constructor. Little benefit, 
very little cost.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path

2019-01-14 Thread GitBox
AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source 
V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#issuecomment-454070005
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   6   7   8   9   10   >