[GitHub] [spark] SparkQA commented on issue #26263: [SPARK-29570][WEBUI] Improve tooltip for Executor Tab for Shuffle Write, Blacklisted, Logs, Threaddump columns
SparkQA commented on issue #26263: [SPARK-29570][WEBUI] Improve tooltip for Executor Tab for Shuffle Write,Blacklisted,Logs,Threaddump columns URL: https://github.com/apache/spark/pull/26263#issuecomment-552232572 **[Test build #4918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4918/testReport)** for PR 26263 at commit [`1a75e4d`](https://github.com/apache/spark/commit/1a75e4d63c840eb4cf170dce1a909be8d5430e7e). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #26263: [SPARK-29570][WEBUI] Improve tooltip for Executor Tab for Shuffle Write, Blacklisted, Logs, Threaddump columns
SparkQA removed a comment on issue #26263: [SPARK-29570][WEBUI] Improve tooltip for Executor Tab for Shuffle Write, Blacklisted, Logs, Threaddump columns URL: https://github.com/apache/spark/pull/26263#issuecomment-55854 **[Test build #4918 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4918/testReport)** for PR 26263 at commit [`1a75e4d`](https://github.com/apache/spark/commit/1a75e4d63c840eb4cf170dce1a909be8d5430e7e).
[GitHub] [spark] SparkQA commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
SparkQA commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd URL: https://github.com/apache/spark/pull/25734#issuecomment-552237974 **[Test build #113545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113545/testReport)** for PR 25734 at commit [`1b145e2`](https://github.com/apache/spark/commit/1b145e2158679dc27fce07a8ddf17f6341175afe). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
SparkQA removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd URL: https://github.com/apache/spark/pull/25734#issuecomment-552220925 **[Test build #113545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113545/testReport)** for PR 25734 at commit [`1b145e2`](https://github.com/apache/spark/commit/1b145e2158679dc27fce07a8ddf17f6341175afe).
[GitHub] [spark] maropu edited a comment on issue #26458: [SPARK-29821] Allow calling non-aggregate SQL functions with column name
maropu edited a comment on issue #26458: [SPARK-29821] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26458#issuecomment-552192167 IIRC we don't actively add interfaces for string column names in functions; please use `selectExpr` instead. cc: @HyukjinKwon @srowen
[GitHub] [spark] maropu commented on a change in pull request #26459: [SPARK-29825][SQL][TESTS] Add join conditions in join-related tests of SQLQueryTestSuite
maropu commented on a change in pull request #26459: [SPARK-29825][SQL][TESTS] Add join conditions in join-related tests of SQLQueryTestSuite URL: https://github.com/apache/spark/pull/26459#discussion_r344526125 ## File path: sql/core/src/test/resources/sql-tests/inputs/postgreSQL/join.sql ## @@ -6,6 +6,11 @@ -- Test JOIN clauses -- https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/join.sql -- + +--SET spark.sql.autoBroadcastJoinThreshold=10485760 Review comment: I just copied them from the other join-related tests (e.g., https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/inputs/natural-join.sql#L2-L4), so I'm not sure why that value was chosen. I think this configuration is just there to prohibit broadcast hash joins in tests.
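For reference, the `--SET` header convention discussed above looks like the fragment below. Both lines are illustrative assumptions, not copied from the PR: 10485760 bytes is Spark's default 10 MB broadcast threshold, while `-1` is what actually disables broadcast hash joins outright.

```sql
-- Hypothetical SQLQueryTestSuite input-file header (illustrative, not from this PR)
--SET spark.sql.autoBroadcastJoinThreshold=10485760
-- To actually prohibit broadcast hash joins, the threshold is disabled instead:
--SET spark.sql.autoBroadcastJoinThreshold=-1
```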
[GitHub] [spark] AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552265005 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113555/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552268405 **[Test build #113559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113559/testReport)** for PR 26097 at commit [`4008073`](https://github.com/apache/spark/commit/40080738b987c646e0ac8fde7c436539edfdae01).
[GitHub] [spark] LantaoJin commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552268279 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
AmplabJenkins removed a comment on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552289217 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
AmplabJenkins removed a comment on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552289221 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113557/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
AmplabJenkins commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552289221 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113557/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
AmplabJenkins commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552289217 Merged build finished. Test PASSed.
[GitHub] [spark] imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions URL: https://github.com/apache/spark/pull/26441#discussion_r344557807 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -949,14 +949,19 @@ class Analyzer( if oldVersion.outputSet.intersect(conflictingAttributes).nonEmpty => (oldVersion, oldVersion.copy(serializer = oldVersion.serializer.map(_.newInstance( -// Handle projects that create conflicting aliases. case oldVersion @ Project(projectList, _) -if findAliases(projectList).intersect(conflictingAttributes).nonEmpty => - (oldVersion, oldVersion.copy(projectList = newAliases(projectList))) +if hasConflict(projectList, conflictingAttributes) => + (oldVersion, +oldVersion.copy( + projectList = +newNamedExpression(projectList, conflictingAttributes))) case oldVersion @ Aggregate(_, aggregateExpressions, _) -if findAliases(aggregateExpressions).intersect(conflictingAttributes).nonEmpty => - (oldVersion, oldVersion.copy(aggregateExpressions = newAliases(aggregateExpressions))) +if hasConflict(aggregateExpressions, conflictingAttributes) => + (oldVersion, +oldVersion.copy( + aggregateExpressions = +newNamedExpression(aggregateExpressions, conflictingAttributes))) Review comment: updated as suggested. thanks!
[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression. URL: https://github.com/apache/spark/pull/26420#discussion_r344557825 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala ## @@ -33,7 +33,7 @@ import org.apache.spark.sql.types.DataType case class ResolveHigherOrderFunctions(catalog: SessionCatalog) extends Rule[LogicalPlan] { override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveExpressions { -case u @ UnresolvedFunction(fn, children, false) +case u @ UnresolvedFunction(fn, children, false, _) Review comment: @cloud-fan Thanks for the reminder. I now throw an exception in `Analyzer`.
[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression. URL: https://github.com/apache/spark/pull/26420#discussion_r344557706 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1574,7 +1579,7 @@ class Analyzer( s"its class is ${other.getClass.getCanonicalName}, which is not a generator.") } } - case u @ UnresolvedFunction(funcId, children, isDistinct) => + case u @ UnresolvedFunction(funcId, children, isDistinct, filter) => Review comment: @maropu Thanks for the reminder. I will add it.
[GitHub] [spark] beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
beliefer commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression. URL: https://github.com/apache/spark/pull/26420#discussion_r344557683 ## File path: docs/sql-keywords.md ## @@ -115,6 +115,7 @@ Below is a list of all the keywords in Spark SQL. FALSE reserved non-reserved reserved FETCH reserved non-reserved reserved FIELDS non-reserved non-reserved non-reserved + FILTER reserved non-reserved non-reserved Review comment: @maropu Thanks for the reminder. I will add it.
[GitHub] [spark] imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions URL: https://github.com/apache/spark/pull/26441#discussion_r344557828 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -949,14 +949,19 @@ class Analyzer( if oldVersion.outputSet.intersect(conflictingAttributes).nonEmpty => (oldVersion, oldVersion.copy(serializer = oldVersion.serializer.map(_.newInstance( -// Handle projects that create conflicting aliases. case oldVersion @ Project(projectList, _) -if findAliases(projectList).intersect(conflictingAttributes).nonEmpty => - (oldVersion, oldVersion.copy(projectList = newAliases(projectList))) +if hasConflict(projectList, conflictingAttributes) => + (oldVersion, +oldVersion.copy( + projectList = +newNamedExpression(projectList, conflictingAttributes))) Review comment: updated as suggested. thanks!
[GitHub] [spark] maropu commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql`
maropu commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql` URL: https://github.com/apache/spark/pull/26459#issuecomment-552252750 > Hi, @maropu. This PR adds comments. Is `Add join conditions in join-related tests` correct? Ambiguous? I updated it.
[GitHub] [spark] LantaoJin edited a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin edited a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552254605 > we can add an extra check `DDLUtils.isHiveProvider`, to make it work

I think we can reuse `DDLUtils.isHiveTable(provider: Option[String])`:

```scala
def isHiveTable(provider: Option[String]): Boolean = {
  provider.isDefined && provider.get.toLowerCase(Locale.ROOT) == HIVE_PROVIDER
}
```
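For readers who don't follow Scala, the helper quoted above boils down to a case-insensitive provider check. A plain-Python paraphrase (the name `is_hive_provider` is mine; `None` stands in for an empty `Option`):

```python
def is_hive_provider(provider):
    # Mirrors DDLUtils.isHiveTable(provider: Option[String]):
    # true only when a provider is present and equals "hive", ignoring case.
    return provider is not None and provider.lower() == "hive"

print(is_hive_provider("Hive"))     # True
print(is_hive_provider(None))       # False
print(is_hive_provider("parquet"))  # False
```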
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552258689 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18441/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344531175 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ## @@ -1082,6 +1082,14 @@ object functions { */ def isnan(e: Column): Column = withExpr { IsNaN(e.expr) } + /** + * Return true iff the column is NaN. + * + * @group normal_funcs + * @since 1.6.0 + */ + def isnan(columnName: String): Column = isnan(Column(columnName)) Review comment: We won't add this, per the comments at the top of this file.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552258687 Merged build finished. Test PASSed.
[GitHub] [spark] Icysandwich commented on a change in pull request #26454: [SPARK-29818][MLLIB] Missing persist on RDD
Icysandwich commented on a change in pull request #26454: [SPARK-29818][MLLIB] Missing persist on RDD URL: https://github.com/apache/spark/pull/26454#discussion_r344531250 ## File path: mllib/src/main/scala/org/apache/spark/ml/evaluation/MultilabelClassificationEvaluator.scala ## @@ -96,6 +96,7 @@ class MultilabelClassificationEvaluator (override val uid: String) .rdd.map { row => (row.getSeq[Double](0).toArray, row.getSeq[Double](1).toArray) } +predictionAndLabels.persist() Review comment: It is used multiple times in `new MultilabelMetrics(predictionAndLabels)` to initialize fields.
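The rationale for the `persist()` call can be sketched with a toy lazy collection (an illustrative stand-in I wrote, not the Spark API): each downstream use re-runs the expensive map unless the result is cached first.

```python
class LazySeq:
    """Toy stand-in for an RDD: recomputes on every use unless persisted."""
    def __init__(self, compute):
        self._compute = compute
        self._cache = None
        self.computations = 0  # counts how many times the mapping actually ran

    def persist(self):
        self._cache = self._compute()
        self.computations += 1
        return self

    def collect(self):
        if self._cache is not None:
            return self._cache  # served from cache, no recomputation
        self.computations += 1
        return self._compute()

rows = [([1.0], [1.0]), ([2.0], [2.0])]

# Without persist: two metric initializations -> two computations.
rdd = LazySeq(lambda: [(p, l) for p, l in rows])
rdd.collect(); rdd.collect()
print(rdd.computations)  # 2

# With persist: computed once, reused afterwards.
cached = LazySeq(lambda: [(p, l) for p, l in rows]).persist()
cached.collect(); cached.collect()
print(cached.computations)  # 1
```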
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344531088 ## File path: python/pyspark/sql/functions.py ## @@ -513,6 +513,8 @@ def isnan(col): [Row(r1=False, r2=False), Row(r1=True, r2=True)] """ sc = SparkContext._active_spark_context +if type(col) is str: Review comment: This seems to work already.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552258689 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18441/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552258687 Merged build finished. Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344530892 ## File path: python/pyspark/sql/functions.py ## @@ -513,6 +513,8 @@ def isnan(col): [Row(r1=False, r2=False), Row(r1=True, r2=True)] """ sc = SparkContext._active_spark_context +if type(col) is str: +return Column(sc._jvm.functions.isnan(col)) Review comment: Can you add a doctest?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344531140 ## File path: python/pyspark/sql/functions.py ## @@ -513,6 +513,8 @@ def isnan(col): [Row(r1=False, r2=False), Row(r1=True, r2=True)] """ sc = SparkContext._active_spark_context +if type(col) is str: Review comment:

```python
>>> from pyspark.sql.functions import isnan
>>> df = spark.createDataFrame([(1.0, float('nan')), (float('nan'), 2.0)], ("a", "b"))
>>> df.select(isnan("a")).collect()
[Row(isnan(a)=False), Row(isnan(a)=True)]
```
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344531286 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ## @@ -1082,6 +1082,14 @@ object functions { */ def isnan(e: Column): Column = withExpr { IsNaN(e.expr) } + /** + * Return true iff the column is NaN. + * + * @group normal_funcs + * @since 1.6.0 + */ + def isnan(columnName: String): Column = isnan(Column(columnName)) Review comment: https://github.com/apache/spark/blob/f8b1424d2f51bc8a5b500c70742be8a9dfffa1df/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L58-L60
[GitHub] [spark] Icysandwich commented on a change in pull request #26454: [SPARK-29818][MLLIB] Missing persist on RDD
Icysandwich commented on a change in pull request #26454: [SPARK-29818][MLLIB] Missing persist on RDD URL: https://github.com/apache/spark/pull/26454#discussion_r344531250 ## File path: mllib/src/main/scala/org/apache/spark/ml/evaluation/MultilabelClassificationEvaluator.scala ## @@ -96,6 +96,7 @@ class MultilabelClassificationEvaluator (override val uid: String) .rdd.map { row => (row.getSeq[Double](0).toArray, row.getSeq[Double](1).toArray) } +predictionAndLabels.persist() Review comment: It is used multiple times in `new MultilabelMetrics(predictionAndLabels)` to initialize fields.
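For context on why the persist matters: `MultilabelMetrics` makes one pass over `predictionAndLabels` per metric field, and without caching each pass recomputes the RDD lineage. A plain-Python sketch of the effect (no Spark involved; the counter simply stands in for recomputation cost):

```python
# Illustrative sketch: a lazily computed dataset consumed several times.
# Without caching, the expensive computation reruns on every pass;
# materializing ("persisting") it runs it once.
compute_calls = 0

def expensive_rows():
    global compute_calls
    compute_calls += 1  # stands in for re-reading and re-mapping the RDD
    return [([1.0], [1.0]), ([0.0], [1.0])]

# Uncached: each metric triggers a fresh computation.
for _ in range(3):
    expensive_rows()
uncached_calls = compute_calls

# "Persisted": compute once, then reuse the materialized result.
compute_calls = 0
cached = expensive_rows()
for _ in range(3):
    len(cached)
cached_calls = compute_calls
```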
[GitHub] [spark] HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name
HyukjinKwon commented on a change in pull request #26435: [SPARK-29821][SQL] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26435#discussion_r344530920 ## File path: python/pyspark/sql/functions.py ## @@ -513,6 +513,8 @@ def isnan(col): [Row(r1=False, r2=False), Row(r1=True, r2=True)] """ sc = SparkContext._active_spark_context +if type(col) is str: Review comment: Shall we use `isinstance`?
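The reason `isinstance` is usually preferred over an exact `type(...) is str` check, shown in plain Python: the exact-type check silently rejects subclasses.

```python
# `type(x) is str` fails for str subclasses; `isinstance` accepts them.
class ColumnName(str):
    pass

name = ColumnName("a")
by_type = type(name) is str            # False: exact-type check rejects subclass
by_isinstance = isinstance(name, str)  # True: subclass of str still matches
```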
[GitHub] [spark] stczwd commented on a change in pull request #26433: [SPARK-29771][K8S] Add configure to limit executor failures
stczwd commented on a change in pull request #26433: [SPARK-29771][K8S] Add configure to limit executor failures URL: https://github.com/apache/spark/pull/26433#discussion_r344532781 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala ## @@ -37,6 +37,8 @@ private[spark] class ExecutorPodsAllocator( snapshotsStore: ExecutorPodsSnapshotsStore, clock: Clock) extends Logging { + private val EXIT_MAX_EXECUTOR_FAILURES = 10 Review comment: Maybe it's better to reuse YARN's EXIT_MAX_EXECUTOR_FAILURES=11.
[GitHub] [spark] viirya opened a new pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
viirya opened a new pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461 ### What changes were proposed in this pull request? The Hive table scan operator reads each Hive partition as a HadoopRDD and unions all the RDDs. The data parallelism of the result RDD can be dramatically increased when reading many partitions with many files. This patch proposes to add a config that limits the maximum data parallelism for scanning a Hive partitioned table. ### Why are the changes needed? Although users can also coalesce by themselves, this patch proposes a config to limit the maximum data parallelism, because: 1. end-users might not understand the details and get confused by the large partition number; they might not know why/when/where to add coalesce. 2. end-users would need to add coalesce to every Hive table scan, which is tedious. From the perspective of a cluster operator, it is much easier to set one config than to ask each end-user to know the details and add coalesce. ### Does this PR introduce any user-facing change? No, if the config is not set. If a maximum value is set via the config, then when scanning a Hive partitioned table, once the number of partitions exceeds the maximum, Spark coalesces the result RDD. ### How was this patch tested?
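The proposed behavior can be sketched in plain Python: sum the partition counts of the per-Hive-partition RDDs, and coalesce the union down to the configured maximum when it is exceeded. The function and parameter names below are illustrative, not Spark's actual config or API:

```python
# Hedged sketch of the PR's capping logic (names are illustrative).
def effective_parallelism(per_partition_counts, max_parallelism=None):
    """Total partitions of the unioned RDD, capped at max_parallelism."""
    total = sum(per_partition_counts)  # UnionRDD sums all child partition counts
    if max_parallelism is not None and total > max_parallelism:
        return max_parallelism  # result RDD would be coalesced down to the cap
    return total
```

For example, 100 Hive partitions of 200 files each would otherwise yield 20,000 tasks; with a cap of 1,000 the scan is coalesced to 1,000.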
[GitHub] [spark] dongjoon-hyun commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql`
dongjoon-hyun commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql` URL: https://github.com/apache/spark/pull/26459#issuecomment-552266698 Shall we add the following line as a first line? This PR is adding a comment describing a running environment instead of adding the real configuration. I mean, that's confusing. ``` -- List of configuration the test suite is run against: ```
[GitHub] [spark] viirya commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
viirya commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552266990 cc @cloud-fan @dongjoon-hyun @felixcheung
[GitHub] [spark] maropu commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql`
maropu commented on issue #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql` URL: https://github.com/apache/spark/pull/26459#issuecomment-552266911 Ah, ok.
[GitHub] [spark] mob-ai commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component
mob-ai commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r344544387 ## File path: mllib/src/main/scala/org/apache/spark/ml/regression/FactorizationMachines.scala ## @@ -0,0 +1,757 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.ml.regression + +import scala.util.Random + +import breeze.linalg.{axpy => brzAxpy, norm => brzNorm, Vector => BV} +import breeze.numerics.{sqrt => brzSqrt} +import org.apache.hadoop.fs.Path + +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.{PredictionModel, Predictor, PredictorParams} +import org.apache.spark.ml.linalg._ +import org.apache.spark.ml.linalg.BLAS._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.param.shared._ +import org.apache.spark.ml.util._ +import org.apache.spark.ml.util.Instrumentation.instrumented +import org.apache.spark.mllib.{linalg => OldLinalg} +import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors} +import org.apache.spark.mllib.linalg.VectorImplicits._ +import org.apache.spark.mllib.optimization.{Gradient, GradientDescent, SquaredL2Updater, Updater} +import org.apache.spark.mllib.regression.{LabeledPoint => OldLabeledPoint} +import org.apache.spark.mllib.util.MLUtils +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{Dataset, Row} +import org.apache.spark.sql.functions.col +import org.apache.spark.storage.StorageLevel + +/** + * Params for Factorization Machines + */ +private[regression] trait FactorizationMachinesParams + extends PredictorParams + with HasMaxIter with HasStepSize with HasTol with HasSolver with HasLoss { + + import FactorizationMachines._ + + /** + * Param for dimensionality of the factors (= 0) + * @group param + */ + @Since("3.0.0") + final val numFactors: IntParam = new IntParam(this, "numFactors", +"dimensionality of the factor vectors, " + + "which are used to get pairwise interactions between variables", +ParamValidators.gt(0)) + + /** @group getParam */ + @Since("3.0.0") + final def getNumFactors: Int = $(numFactors) + + /** + * Param for whether to fit global bias term + * @group param + */ + @Since("3.0.0") + final val fitBias: BooleanParam = new 
BooleanParam(this, "fitBias", +"whether to fit global bias term") + + /** @group getParam */ + @Since("3.0.0") + final def getFitBias: Boolean = $(fitBias) + + /** + * Param for whether to fit linear term (aka 1-way term) + * @group param + */ + @Since("3.0.0") + final val fitLinear: BooleanParam = new BooleanParam(this, "fitLinear", +"whether to fit linear term (aka 1-way term)") + + /** @group getParam */ + @Since("3.0.0") + final def getFitLinear: Boolean = $(fitLinear) + + /** + * Param for L2 regularization parameter (= 0) + * @group param + */ + @Since("3.0.0") + final val regParam: DoubleParam = new DoubleParam(this, "regParam", +"the parameter of l2-regularization term, " + + "which prevents overfitting by adding sum of squares of all the parameters", +ParamValidators.gtEq(0)) + + /** @group getParam */ + @Since("3.0.0") + final def getRegParam: Double = $(regParam) + + /** + * Param for mini-batch fraction, must be in range (0, 1] + * @group param + */ + @Since("3.0.0") + final val miniBatchFraction: DoubleParam = new DoubleParam(this, "miniBatchFraction", +"fraction of the input data set that should be used for one iteration of gradient descent", +ParamValidators.inRange(0, 1, false, true)) + + /** @group getParam */ + @Since("3.0.0") + final def getMiniBatchFraction: Double = $(miniBatchFraction) + + /** + * Param for standard deviation of initial coefficients + * @group param + */ + @Since("3.0.0") + final val initStd: DoubleParam = new DoubleParam(this, "initStd", +"standard deviation of initial coefficients", ParamValidators.gt(0)) + + /** @group getParam */ + @Since("3.0.0") + final def getInitStd: Double = $(initStd) + + /** + * The solver algorithm for optimization. + * Supported
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552276404 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18450/ Test PASSed.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql`
AngersZh commented on a change in pull request #26459: [SPARK-29825][SQL][TESTS] Add join-related configs in `inner-join.sql` and `postgreSQL/join.sql` URL: https://github.com/apache/spark/pull/26459#discussion_r344544527 ## File path: sql/core/src/test/resources/sql-tests/inputs/postgreSQL/join.sql ## @@ -6,6 +6,11 @@ -- Test JOIN clauses -- https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/join.sql -- + +--SET spark.sql.autoBroadcastJoinThreshold=10485760 Review comment: @dongjoon-hyun @maropu `10 * 1024 * 1024 = 10485760` This config's default value is `10m`
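The arithmetic quoted above can be checked directly: the default broadcast threshold of 10 MB, expressed in bytes.

```python
# 10 MB (the quoted default of spark.sql.autoBroadcastJoinThreshold) in bytes.
threshold_bytes = 10 * 1024 * 1024
```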
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552276390 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552276404 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18450/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552276390 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552284932 Merged build finished. Test FAILed.
[GitHub] [spark] iRakson commented on a change in pull request #26315: [SPARK-29152][CORE] Executor Plugin shutdown when dynamic allocation is ena…
iRakson commented on a change in pull request #26315: [SPARK-29152][CORE] Executor Plugin shutdown when dynamic allocation is ena… URL: https://github.com/apache/spark/pull/26315#discussion_r344551590 ## File path: core/src/main/scala/org/apache/spark/executor/Executor.scala ## @@ -65,6 +65,12 @@ private[spark] class Executor( logInfo(s"Starting executor ID $executorId on host $executorHostname") + @volatile private var executorShutdown = false + ShutdownHookManager.addShutdownHook( +() => if (!executorShutdown) { Review comment: If I don't check it here, then stop() will be called twice in case of a graceful shutdown.
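The guard being discussed is the standard idempotent-shutdown pattern: `stop()` may be invoked both by normal teardown and by a shutdown hook, so a flag ensures the cleanup runs only once. A plain-Python sketch (the real Executor uses a Scala `@volatile` field and `ShutdownHookManager`; `atexit` here is just the analogue):

```python
# Sketch of an idempotent stop() guarded by a flag, so a shutdown hook
# firing after a graceful shutdown does not run the cleanup twice.
import atexit

class Plugin:
    def __init__(self):
        self.stopped = False
        self.stop_count = 0
        atexit.register(self.stop)  # shutdown-hook analogue

    def stop(self):
        if self.stopped:
            return  # cleanup already ran; make the second call a no-op
        self.stopped = True
        self.stop_count += 1  # stands in for the actual cleanup work

p = Plugin()
p.stop()  # graceful shutdown
p.stop()  # later hook invocation; guard makes it a no-op
```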
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552284941 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113552/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552297321 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18455/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552297316 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552297316 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552297321 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18455/ Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #25024: [SPARK-27296][SQL] User Defined Aggregators that do not ser/de on each input row
cloud-fan commented on issue #25024: [SPARK-27296][SQL] User Defined Aggregators that do not ser/de on each input row URL: https://github.com/apache/spark/pull/25024#issuecomment-552303072 UDAF should work in Java, and I don't think putting a Scala implicit in the public API is a good idea.
[GitHub] [spark] viirya commented on a change in pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
viirya commented on a change in pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#discussion_r344565178 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ## @@ -155,6 +155,16 @@ private[spark] object HiveUtils extends Logging { .booleanConf .createWithDefault(true) + val HIVE_TABLE_SCAN_MAX_PARALLELISM = buildConf("spark.sql.hive.tableScan.maxParallelism") Review comment: When reading a Hive partitioned table, users could get an unreasonable number of partitions, like tens of thousands. The Hive scan node returns a UnionRDD over the Hive table partitions, each read as a HadoopRDD whose parallelism depends on its data size. The final UnionRDD sums up the parallelism of all Hive table partitions.
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573164 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, PARSE_SIGN, +TRIM_VALUE, Review comment: can we make the name clearer? e.g. `TRIM_BEFORE_PARSE_SIGN`
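The fix under review adds a trim state so whitespace between a sign and its value no longer breaks the cast. A minimal plain-Python sketch of the idea (state names and the helper are illustrative, not Spark's `IntervalUtils`):

```python
# Sketch: a signed-number parser that, like the PR, explicitly skips
# whitespace between the sign and the digits instead of failing on it.
def parse_signed_int(s):
    s = s.strip()
    i, sign = 0, 1
    if s and s[i] in "+-":             # PARSE_SIGN state
        sign = -1 if s[i] == "-" else 1
        i += 1
    while i < len(s) and s[i] == " ":  # trim state added by the fix
        i += 1
    if i == len(s) or not s[i:].isdigit():
        raise ValueError("cannot parse {!r}".format(s))
    return sign * int(s[i:])
```

Without the trim step, an input like `"- 3"` would fail at the first space.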
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573164 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, PARSE_SIGN, +TRIM_VALUE, Review comment: can we make the name clearer? e.g. `TRIM_BEFORE_UNIT_VALUE`
[GitHub] [spark] dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344582313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ## @@ -106,12 +106,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper { // Filter the plan by applying left semi and left anti joins. withSubquery.foldLeft(newFilter) { -case (p, Exists(sub, conditions, _)) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftSemi, joinCond) -case (p, Not(Exists(sub, conditions, _))) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftAnti, joinCond) +case (p, exists @ Exists(sub, conditions, _)) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftSemi, joinCond) + } else { +Filter(exists, newFilter) + } +case (p, Not(exists @ Exists(sub, conditions, _))) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftAnti, joinCond) + } else { +Filter(Not(exists), newFilter) + } Review comment: @AngersZh I discussed this with Wenchen. Do you think we can safely inject a "LIMIT 1" into our subplan to expedite its execution? Please let us know what you think.
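The intuition behind the "LIMIT 1" suggestion is that an uncorrelated EXISTS only asks whether the subquery produces any row at all, so limiting it to one row cannot change the answer. This can be checked with a stand-in engine (stdlib sqlite3 here, not Spark):

```python
# Demonstration that EXISTS(subquery) == EXISTS(subquery LIMIT 1) for
# uncorrelated subqueries, using sqlite3 as a convenient SQL engine.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

plain = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM t WHERE x > 1)").fetchone()[0]
limited = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM t WHERE x > 1 LIMIT 1)").fetchone()[0]
empty = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM t WHERE x > 9 LIMIT 1)").fetchone()[0]
```

Both forms agree on matching and non-matching predicates; the limited form simply lets the engine stop after the first row.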
[GitHub] [spark] dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344582313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ## @@ -106,12 +106,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper { // Filter the plan by applying left semi and left anti joins. withSubquery.foldLeft(newFilter) { -case (p, Exists(sub, conditions, _)) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftSemi, joinCond) -case (p, Not(Exists(sub, conditions, _))) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftAnti, joinCond) +case (p, exists @ Exists(sub, conditions, _)) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftSemi, joinCond) + } else { +Filter(exists, newFilter) + } +case (p, Not(exists @ Exists(sub, conditions, _))) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftAnti, joinCond) + } else { +Filter(Not(exists), newFilter) + } Review comment: @AngersZh I discussed this with Wenchen briefly. Do you think we can safely inject a "LIMIT 1" into our subplan to expedite its execution? Please let us know what you think.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
AmplabJenkins removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552230486 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
AmplabJenkins removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552230489 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113550/
[GitHub] [spark] imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
imback82 commented on a change in pull request #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions URL: https://github.com/apache/spark/pull/26441#discussion_r344516803 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
```diff
@@ -949,14 +949,19 @@ class Analyzer(
             if oldVersion.outputSet.intersect(conflictingAttributes).nonEmpty =>
           (oldVersion, oldVersion.copy(serializer = oldVersion.serializer.map(_.newInstance())))

-        // Handle projects that create conflicting aliases.
         case oldVersion @ Project(projectList, _)
-            if findAliases(projectList).intersect(conflictingAttributes).nonEmpty =>
-          (oldVersion, oldVersion.copy(projectList = newAliases(projectList)))
+            if hasConflict(projectList, conflictingAttributes) =>
+          (oldVersion,
+            oldVersion.copy(
+              projectList =
+                newNamedExpression(projectList, conflictingAttributes)))

         case oldVersion @ Aggregate(_, aggregateExpressions, _)
```
Review comment: > Could we fix this issue in an easier way than the current fix?

I don't think it is robust enough. For example, the following test fails with the suggested fix:
```
[info] - [SPARK-6231] join - self join auto resolve ambiguity *** FAILED *** (251 milliseconds)
[info]   Failed to analyze query: org.apache.spark.sql.AnalysisException: Resolved attribute(s) key#4619 missing from key#4518,value#4519 in operator !Aggregate [key#4619], [key#4619, sum(cast(key#4619 as bigint)) AS sum(key)#4620L]. Attribute(s) with the same name appear in the operation: key.
[info]   Please check if the right attribute(s) are used.;;
[info]   Join Inner, (key#4518 = key#4518)
[info]   :- Aggregate [key#4518], [key#4518, count(1) AS count(1)#4610L]
[info]   :  +- Project [_1#4513 AS key#4518, _2#4514 AS value#4519]
[info]   :     +- LocalRelation [_1#4513, _2#4514]
[info]   +- !Aggregate [key#4619], [key#4619, sum(cast(key#4619 as bigint)) AS sum(key)#4620L]
[info]      +- Project [_1#4513 AS key#4518, _2#4514 AS value#4519]
[info]         +- LocalRelation [_1#4513, _2#4514]
[info]
```
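The failure quoted above comes from deduplicating attribute ids on only part of a plan: when a self-join reuses a subtree, every occurrence of a conflicting attribute must be rewritten with the same fresh id, or a reference like `key#4619` ends up "missing from" the child's output. A toy Python sketch of that invariant (illustrative only, not the analyzer's actual data structures):

```python
import itertools

# Counter standing in for Spark's global expression-id allocator.
_fresh_id = itertools.count(5000)

def dedup(attr_ids, conflicting):
    """Map each conflicting attribute id to one fresh id; the rewrite must
    then apply this mapping consistently to the whole duplicated subtree."""
    return {a: next(_fresh_id) for a in attr_ids if a in conflicting}
```

Applying the mapping to the Aggregate but not to its Project child is exactly the inconsistency the AnalysisException reports.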
[GitHub] [spark] dongjoon-hyun closed pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd
dongjoon-hyun closed pull request #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd URL: https://github.com/apache/spark/pull/25734
[GitHub] [spark] AmplabJenkins commented on issue #26446: [SPARK-29393][SQL] Add `make_interval` function
AmplabJenkins commented on issue #26446: [SPARK-29393][SQL] Add `make_interval` function URL: https://github.com/apache/spark/pull/26446#issuecomment-552241564 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113546/
[GitHub] [spark] SparkQA removed a comment on issue #26446: [SPARK-29393][SQL] Add `make_interval` function
SparkQA removed a comment on issue #26446: [SPARK-29393][SQL] Add `make_interval` function URL: https://github.com/apache/spark/pull/26446#issuecomment-552221679 **[Test build #113546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113546/testReport)** for PR 26446 at commit [`0f9a3bb`](https://github.com/apache/spark/commit/0f9a3bb846f2f7f40a3be4c13ccc201a09aaf554).
[GitHub] [spark] AmplabJenkins commented on issue #26446: [SPARK-29393][SQL] Add `make_interval` function
AmplabJenkins commented on issue #26446: [SPARK-29393][SQL] Add `make_interval` function URL: https://github.com/apache/spark/pull/26446#issuecomment-552241563 Merged build finished. Test PASSed.
[GitHub] [spark] holdenk commented on issue #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
holdenk commented on issue #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file URL: https://github.com/apache/spark/pull/26312#issuecomment-552246303 Sure, I've got some review cycles on Tuesday; I'll take a look then, unless it's blocking something.
[GitHub] [spark] maropu commented on issue #26458: [SPARK-29821] Allow calling non-aggregate SQL functions with column name
maropu commented on issue #26458: [SPARK-29821] Allow calling non-aggregate SQL functions with column name URL: https://github.com/apache/spark/pull/26458#issuecomment-552249017 Similar PRs to support string columns come up periodically, so it might be worth leaving some notes about this historical policy for future reference...
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552257101 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18440/
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552257094 Build finished. Test PASSed.
[GitHub] [spark] hahadsg commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component
hahadsg commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r344534187 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala ## @@ -0,0 +1,326 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.classification + +import org.apache.hadoop.fs.Path + +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.regression.{FactorizationMachines, FactorizationMachinesParams} +import org.apache.spark.ml.regression.FactorizationMachines._ +import org.apache.spark.ml.util._ +import org.apache.spark.ml.util.Instrumentation.instrumented +import org.apache.spark.mllib.linalg.{Vector => OldVector} +import org.apache.spark.mllib.linalg.VectorImplicits._ +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{Dataset, Row} +import org.apache.spark.sql.functions.col +import org.apache.spark.storage.StorageLevel + +/** + * Params for FMClassifier. 
+ */ +private[classification] trait FMClassifierParams extends ProbabilisticClassifierParams + with FactorizationMachinesParams { +} + +/** + * Factorization Machines learning algorithm for classification. + * It supports normal gradient descent and AdamW solver. + * + * The implementation is based upon: + * https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf;> + * S. Rendle. "Factorization machines" 2010. + * + * FM is able to estimate interactions even in problems with huge sparsity + * (like advertising and recommendation system). + * FM formula is: + * {{{ + * y = w_0 + \sum\limits^n_{i-1} w_i x_i + + * \sum\limits^n_{i=1} \sum\limits^n_{j=i+1} \langle v_i, v_j \rangle x_i x_j + * }}} + * First two terms denote global bias and linear term (as same as linear regression), + * and last term denotes pairwise interactions term. {{{v_i}}} describes the i-th variable + * with k factors. + * + * FM classification model uses logistic loss which can be solved by gradient descent method, and + * regularization terms like L2 are usually added to the loss function to prevent overfitting. + * + * @note Multiclass labels are not currently supported. + */ +@Since("3.0.0") +class FMClassifier @Since("3.0.0") ( +@Since("3.0.0") override val uid: String) + extends ProbabilisticClassifier[Vector, FMClassifier, FMClassifierModel] + with FactorizationMachines with FMClassifierParams with DefaultParamsWritable with Logging { + + @Since("3.0.0") + def this() = this(Identifiable.randomUID("fmc")) + + /** + * Set the dimensionality of the factors. + * Default is 8. + * + * @group setParam + */ + @Since("3.0.0") + def setNumFactors(value: Int): this.type = set(numFactors, value) + setDefault(numFactors -> 8) + + /** + * Set whether to fit global bias term. + * Default is true. + * + * @group setParam + */ + @Since("3.0.0") + def setFitBias(value: Boolean): this.type = set(fitBias, value) + setDefault(fitBias -> true) + + /** + * Set whether to fit linear term. + * Default is true. 
+ * + * @group setParam + */ + @Since("3.0.0") + def setFitLinear(value: Boolean): this.type = set(fitLinear, value) + setDefault(fitLinear -> true) + + /** + * Set the L2 regularization parameter. + * Default is 0.0. + * + * @group setParam + */ + @Since("3.0.0") + def setRegParam(value: Double): this.type = set(regParam, value) + setDefault(regParam -> 0.0) + + /** + * Set the mini-batch fraction parameter. + * Default is 1.0. + * + * @group setParam + */ + @Since("3.0.0") + def setMiniBatchFraction(value: Double): this.type = set(miniBatchFraction, value) + setDefault(miniBatchFraction -> 1.0) + + /** + * Set the standard deviation of initial coefficients. + * Default is 0.01. + * + * @group setParam + */ + @Since("3.0.0") + def setInitStd(value: Double): this.type = set(initStd, value) + setDefault(initStd -> 0.01) + + /** + * Set the maximum number of iterations. + * Default is 100. + * + *
[GitHub] [spark] mob-ai commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component
mob-ai commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r344536192 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala ## @@ -0,0 +1,326 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.ml.classification + +import org.apache.hadoop.fs.Path + +import org.apache.spark.annotation.Since +import org.apache.spark.internal.Logging +import org.apache.spark.ml.linalg._ +import org.apache.spark.ml.param._ +import org.apache.spark.ml.regression.{FactorizationMachines, FactorizationMachinesParams} +import org.apache.spark.ml.regression.FactorizationMachines._ +import org.apache.spark.ml.util._ +import org.apache.spark.ml.util.Instrumentation.instrumented +import org.apache.spark.mllib.linalg.{Vector => OldVector} +import org.apache.spark.mllib.linalg.VectorImplicits._ +import org.apache.spark.rdd.RDD +import org.apache.spark.sql.{Dataset, Row} +import org.apache.spark.sql.functions.col +import org.apache.spark.storage.StorageLevel + +/** + * Params for FMClassifier. 
+ */ +private[classification] trait FMClassifierParams extends ProbabilisticClassifierParams + with FactorizationMachinesParams { +} + +/** + * Factorization Machines learning algorithm for classification. + * It supports normal gradient descent and AdamW solver. + * + * The implementation is based upon: + * https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf;> + * S. Rendle. "Factorization machines" 2010. + * + * FM is able to estimate interactions even in problems with huge sparsity + * (like advertising and recommendation system). + * FM formula is: + * {{{ + * y = w_0 + \sum\limits^n_{i-1} w_i x_i + + * \sum\limits^n_{i=1} \sum\limits^n_{j=i+1} \langle v_i, v_j \rangle x_i x_j + * }}} + * First two terms denote global bias and linear term (as same as linear regression), + * and last term denotes pairwise interactions term. {{{v_i}}} describes the i-th variable + * with k factors. + * + * FM classification model uses logistic loss which can be solved by gradient descent method, and + * regularization terms like L2 are usually added to the loss function to prevent overfitting. + * + * @note Multiclass labels are not currently supported. + */ +@Since("3.0.0") +class FMClassifier @Since("3.0.0") ( +@Since("3.0.0") override val uid: String) + extends ProbabilisticClassifier[Vector, FMClassifier, FMClassifierModel] + with FactorizationMachines with FMClassifierParams with DefaultParamsWritable with Logging { + + @Since("3.0.0") + def this() = this(Identifiable.randomUID("fmc")) + + /** + * Set the dimensionality of the factors. + * Default is 8. + * + * @group setParam + */ + @Since("3.0.0") + def setNumFactors(value: Int): this.type = set(numFactors, value) + setDefault(numFactors -> 8) + + /** + * Set whether to fit global bias term. + * Default is true. + * + * @group setParam + */ + @Since("3.0.0") + def setFitBias(value: Boolean): this.type = set(fitBias, value) + setDefault(fitBias -> true) + + /** + * Set whether to fit linear term. + * Default is true. 
+ * + * @group setParam + */ + @Since("3.0.0") + def setFitLinear(value: Boolean): this.type = set(fitLinear, value) + setDefault(fitLinear -> true) + + /** + * Set the L2 regularization parameter. + * Default is 0.0. + * + * @group setParam + */ + @Since("3.0.0") + def setRegParam(value: Double): this.type = set(regParam, value) + setDefault(regParam -> 0.0) + + /** + * Set the mini-batch fraction parameter. + * Default is 1.0. + * + * @group setParam + */ + @Since("3.0.0") + def setMiniBatchFraction(value: Double): this.type = set(miniBatchFraction, value) + setDefault(miniBatchFraction -> 1.0) + + /** + * Set the standard deviation of initial coefficients. + * Default is 0.01. + * + * @group setParam + */ + @Since("3.0.0") + def setInitStd(value: Double): this.type = set(initStd, value) + setDefault(initStd -> 0.01) + + /** + * Set the maximum number of iterations. + * Default is 100. + * + * @group
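The FM formula quoted in the patch's scaladoc can be evaluated directly. A pure-Python sketch, illustrative only and not the mllib implementation: `w0` is the global bias, `w` the linear weights, and `v` one k-dimensional factor vector per feature.

```python
def fm_predict(x, w0, w, v):
    """Raw Factorization Machines score (before the logistic link used by
    the classifier):
        y = w0 + sum_i w_i * x_i + sum_{i<j} <v_i, v_j> * x_i * x_j
    """
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(
        sum(vi_f * vj_f for vi_f, vj_f in zip(v[i], v[j])) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return w0 + linear + pairwise
```

With two features and one factor each, `fm_predict([1.0, 2.0], 0.5, [0.1, 0.2], [[1.0], [1.0]])` gives 0.5 + 0.5 + 2.0 = 3.0; the pairwise term is what lets FM estimate interactions under heavy sparsity.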
[GitHub] [spark] SparkQA commented on issue #26416: [WIP][SPARK-29779][CORE] Compact old event log files and cleanup
SparkQA commented on issue #26416: [WIP][SPARK-29779][CORE] Compact old event log files and cleanup URL: https://github.com/apache/spark/pull/26416#issuecomment-552265549 **[Test build #113556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113556/testReport)** for PR 26416 at commit [`404e747`](https://github.com/apache/spark/commit/404e7477e0a8063442f6d0c464c16e8ae4d75e08).
[GitHub] [spark] AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552273207 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552273213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18448/
[GitHub] [spark] AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552273207 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552273213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18448/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26439: [SPARK-29801][ML] ML models unify toString method
dongjoon-hyun commented on a change in pull request #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#discussion_r344539537 ## File path: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ##
```diff
@@ -89,6 +89,9 @@ class GaussianMixtureModel private[ml] (
   extends Model[GaussianMixtureModel] with GaussianMixtureParams with MLWritable
     with HasTrainingSummary[GaussianMixtureSummary] {

+  @Since("3.0.0")
+  val numFeatures: Int = gaussians.head.mean.size
```
Review comment: This PR seems to add at least 4 `numFeatures` instances. Could you add this into the PR description explicitly?
[GitHub] [spark] SparkQA commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
SparkQA commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552272984 **[Test build #113560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113560/testReport)** for PR 26359 at commit [`116d92e`](https://github.com/apache/spark/commit/116d92ece2769acf2da22e6de861f16a24c45168).
[GitHub] [spark] SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552274572 **[Test build #113561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113561/testReport)** for PR 26097 at commit [`0ae26a6`](https://github.com/apache/spark/commit/0ae26a627060c576d9daea23bd2eb17e4ec81b55).
[GitHub] [spark] fuwhu commented on a change in pull request #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution.
fuwhu commented on a change in pull request #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution. URL: https://github.com/apache/spark/pull/26176#discussion_r344542958

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

```diff
@@ -691,6 +691,11 @@ class Analyzer(
         .map(rel => alter.copy(table = rel))
         .getOrElse(alter)

+      case show @ ShowTableProperties(u: UnresolvedV2Relation, _) =>
+        CatalogV2Util.loadRelation(u.catalog, u.tableName)
+          .map(rel => show.copy(table = rel))
+          .getOrElse(u)
+
```

Review comment: Why is it `.getOrElse(u)` instead of `.getOrElse(show)` here?
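The question points at the `Option.map(...).getOrElse(...)` resolution idiom: on `Some` the node is rewritten with the resolved table, on `None` the analyzer should return the original node unchanged so a later rule (or error reporting) can handle it. A hedged sketch with toy types (not Spark's real `Analyzer` classes) showing why returning the whole node, rather than only the unresolved child, preserves the operator:

```scala
// Toy plan nodes standing in for Spark's logical plans.
sealed trait Plan
case class Unresolved(name: String) extends Plan
case class Resolved(name: String) extends Plan
case class ShowTableProps(table: Plan) extends Plan

// Stand-in for a catalog lookup that may fail.
def loadRelation(name: String): Option[Plan] =
  if (name == "known") Some(Resolved(name)) else None

def resolve(show: ShowTableProps): Plan = show.table match {
  case u: Unresolved =>
    loadRelation(u.name)
      .map(rel => show.copy(table = rel)) // rewrite with the resolved table
      .getOrElse(show)                    // lookup failed: keep the whole node
  case _ => show
}
```

With `.getOrElse(show)`, a failed lookup leaves `ShowTableProps(Unresolved(...))` intact; returning only the child would silently replace the SHOW operator with its unresolved relation.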
[GitHub] [spark] SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552275701 **[Test build #113551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113551/testReport)** for PR 26444 at commit [`1b8a0da`](https://github.com/apache/spark/commit/1b8a0da95f4c8b6aedb39d247231afbdf783c805).
* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552256924 **[Test build #113551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113551/testReport)** for PR 26444 at commit [`1b8a0da`](https://github.com/apache/spark/commit/1b8a0da95f4c8b6aedb39d247231afbdf783c805).
[GitHub] [spark] stczwd commented on issue #26433: [SPARK-29771][K8S] Add configure to limit executor failures
stczwd commented on issue #26433: [SPARK-29771][K8S] Add configure to limit executor failures URL: https://github.com/apache/spark/pull/26433#issuecomment-552275557 @dongjoon-hyun Thanks for paying attention to this patch; I have made changes per the comments.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26378: [SPARK-29724][SPARK-29726][WEBUI][SQL] Support JDBC/ODBC tab for HistoryServer WebUI
AngersZhuuuu commented on a change in pull request #26378: [SPARK-29724][SPARK-29726][WEBUI][SQL] Support JDBC/ODBC tab for HistoryServer WebUI URL: https://github.com/apache/spark/pull/26378#discussion_r344548231

File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerTab.scala

```diff
@@ -19,28 +19,26 @@ package org.apache.spark.sql.hive.thriftserver.ui

 import org.apache.spark.{SparkContext, SparkException}
 import org.apache.spark.internal.Logging
-import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
-import org.apache.spark.sql.hive.thriftserver.ui.ThriftServerTab._
+import org.apache.spark.sql.hive.thriftserver.{HiveThriftServer2, HiveThriftServer2Listener}
 import org.apache.spark.ui.{SparkUI, SparkUITab}

 /**
  * Spark Web UI tab that shows statistics of jobs running in the thrift server.
  * This assumes the given SparkContext has enabled its SparkUI.
  */
-private[thriftserver] class ThriftServerTab(sparkContext: SparkContext)
-  extends SparkUITab(getSparkUI(sparkContext), "sqlserver") with Logging {
-
+private[thriftserver] class ThriftServerTab(
+    val store: HiveThriftServer2AppStatusStore,
+    sparkUI: SparkUI) extends SparkUITab(sparkUI, "sqlserver") with Logging {
```

Review comment: Why do we need to move `getSparkUI` to `HiveThriftServer2`?
[GitHub] [spark] AmplabJenkins commented on issue #26462: [SPARK-29833][YARN] Add FileNotFoundException check for spark.yarn.jars
AmplabJenkins commented on issue #26462: [SPARK-29833][YARN] Add FileNotFoundException check for spark.yarn.jars URL: https://github.com/apache/spark/pull/26462#issuecomment-552291173 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26462: [SPARK-29833][YARN] Add FileNotFoundException check for spark.yarn.jars
AmplabJenkins commented on issue #26462: [SPARK-29833][YARN] Add FileNotFoundException check for spark.yarn.jars URL: https://github.com/apache/spark/pull/26462#issuecomment-552290926 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552293293 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113562/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552293290 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
SparkQA commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552293178 **[Test build #113562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113562/testReport)** for PR 25964 at commit [`aac0b00`](https://github.com/apache/spark/commit/aac0b00260374bb89c1006cdeabe1a55d8b4fb20).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class LaunchedExecutor(executorId: String) extends CoarseGrainedClusterMessage`
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552293293 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113562/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
SparkQA removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552275998 **[Test build #113562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113562/testReport)** for PR 25964 at commit [`aac0b00`](https://github.com/apache/spark/commit/aac0b00260374bb89c1006cdeabe1a55d8b4fb20).
[GitHub] [spark] SparkQA commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
SparkQA commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression. URL: https://github.com/apache/spark/pull/26420#issuecomment-552293709 **[Test build #113566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113566/testReport)** for PR 26420 at commit [`f32ac4d`](https://github.com/apache/spark/commit/f32ac4de90e4cb78918a8cefc251ea8872a60276).
[GitHub] [spark] SparkQA commented on issue #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
SparkQA commented on issue #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions URL: https://github.com/apache/spark/pull/26441#issuecomment-552293705 **[Test build #113565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113565/testReport)** for PR 26441 at commit [`7a295cd`](https://github.com/apache/spark/commit/7a295cd05dc0d6f028c2feaf376ff9e55d90926f).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552293290 Merged build finished. Test FAILed.
[GitHub] [spark] cloud-fan commented on issue #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
cloud-fan commented on issue #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions URL: https://github.com/apache/spark/pull/26441#issuecomment-552298083 It's better to explain why the bug happens in the PR description. I don't understand the current fix, but just FYI on why we only handle aliases in `Project`: the self-join dedup logic tries to find the root node that causes the conflicts. Sometimes it's an alias in `Project`, sometimes it's a leaf node. For attributes in `Project`, there must be other nodes under the `Project` that cause the conflicts.
[GitHub] [spark] SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318069 **[Test build #113561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113561/testReport)** for PR 26097 at commit [`0ae26a6`](https://github.com/apache/spark/commit/0ae26a627060c576d9daea23bd2eb17e4ec81b55).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344583580

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

```diff
@@ -425,11 +425,15 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
       BEGIN_VALUE,
```

Review comment: or `NEXT_VALUE_UNIT`?
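The `type ParseState = Value` line in the diff uses the standard Scala `Enumeration` idiom: declaring a type alias inside the object lets callers write `ParseState` in signatures instead of `ParseState.Value`. A standalone sketch of the idiom (the state names beyond those in the snippet are illustrative):

```scala
// Standard Scala Enumeration idiom, mirroring the snippet under review.
object ParseState extends Enumeration {
  type ParseState = Value            // alias so signatures can say ParseState
  val PREFIX, BEGIN_VALUE, PARSE_SIGN = Value
}
import ParseState._

// Thanks to the alias, the parameter type reads naturally.
def describe(s: ParseState): String = s match {
  case PREFIX      => "reading the 'interval' prefix"
  case BEGIN_VALUE => "expecting a value"
  case PARSE_SIGN  => "reading a sign"
}
```

For example, `describe(ParseState.PREFIX)` returns `"reading the 'interval' prefix"`; without the alias the signature would have to be `def describe(s: ParseState.Value)`.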
[GitHub] [spark] SparkQA removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
SparkQA removed a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552228093 **[Test build #113550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113550/testReport)** for PR 26118 at commit [`c262689`](https://github.com/apache/spark/commit/c262689470655244234d1ff26764697d39b3f752).
[GitHub] [spark] AmplabJenkins commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
AmplabJenkins commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552230489 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113550/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
AmplabJenkins commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552230486 Merged build finished. Test PASSed.