date:20190606

[GitHub] [spark] SparkQA commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

SparkQA commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't 
produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499767122
 
 
   **[Test build #106266 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106266/testReport)**
 for PR 24805 at commit 
[`69996a6`](https://github.com/apache/spark/commit/69996a61a8f1c8e0cba6a50f5f93f00e40d23c3b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class ExprReuseOutput(child: Expression) extends UnaryExpression `


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] ketank-new commented on issue #24788: [SPARK-26985] [core]

2019-06-06 Thread GitBox

ketank-new commented on issue #24788: [SPARK-26985] [core]
URL: https://github.com/apache/spark/pull/24788#issuecomment-499766766
 
 
   > If that's the exact same problem, please link that JIRA to the PR title 
(see https://spark.apache.org/contributing.html).
   
   Linked: I changed the title above so that it now includes the JIRA number.
   
   > Also, please clarify why changing from little endian to big endian is safe 
in little endian OSes.
   
   Clarification for the changes:
   
   putFloats() and putDoubles() in OffHeapColumnVector.java and 
OnHeapColumnVector.java are called whenever test cases use float and double 
data, respectively.
   
   On a BIG_ENDIAN system, putFloats() and putDoubles() take the else block. 
Within that else block the byte order is set to LITTLE_ENDIAN, which is the 
exact opposite of what is expected on a BIG_ENDIAN system.
   
   Changing this byte order to BIG_ENDIAN represents float and double values as 
expected on a BIG_ENDIAN processor, and so the written test cases pass there.
   
   With these changes in place, running the test cases on a LITTLE_ENDIAN 
system takes the if block, so the test cases pass on LITTLE_ENDIAN as well.
   
   I therefore conclude that the changes work and are tested on both types of 
systems, LITTLE_ENDIAN and BIG_ENDIAN.
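
   To make the byte-order point above concrete, here is a minimal sketch. It is not the code from OffHeapColumnVector.java or OnHeapColumnVector.java: the class `ByteOrderSketch` and its `encode`/`decode` helpers are hypothetical names, and only `java.nio.ByteBuffer` and `java.nio.ByteOrder` are real JDK APIs. It shows that the same four bytes decode to different float values depending on the byte order set on the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Spark.
class ByteOrderSketch {
    // Encode a float into 4 bytes using the given byte order.
    static byte[] encode(float value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putFloat(value).array();
    }

    // Decode 4 bytes into a float, interpreting them in the given byte order.
    static float decode(byte[] src, ByteOrder order) {
        return ByteBuffer.wrap(src).order(order).getFloat(0);
    }

    public static void main(String[] args) {
        byte[] littleEndianBytes = encode(1.5f, ByteOrder.LITTLE_ENDIAN);

        // Decoding with the order the bytes were written in recovers the value.
        System.out.println(decode(littleEndianBytes, ByteOrder.LITTLE_ENDIAN));

        // Decoding the same bytes with the other order yields a different float.
        System.out.println(decode(littleEndianBytes, ByteOrder.BIG_ENDIAN));

        // The platform's own order, which is what an if/else branch would test.
        System.out.println(ByteOrder.nativeOrder());
    }
}
```

   Which order is right in a given branch depends on the order in which the source bytes were written; this sketch does not settle that question for the PR.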
   



[GitHub] [spark] ketank-new commented on issue #24788: [SPARK-26985] [core]

2019-06-06 Thread GitBox

ketank-new commented on issue #24788: [SPARK-26985] [core]
URL: https://github.com/apache/spark/pull/24788#issuecomment-499765576
 
 
   > Could you please add JIRA number `[SPARK-]` and `[core]` as a prefix 
of the title?
   
   done



[GitHub] [spark] AmplabJenkins removed a comment on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24335: [SPARK-27425][SQL] Add 
count_if function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499764312
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24335: [SPARK-27425][SQL] Add 
count_if function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499764318
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11515/
   Test PASSed.



[GitHub] [spark] zsxwing commented on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-06 Thread GitBox

zsxwing commented on issue #24796: [SPARK-27900][CORE] Add uncaught exception 
handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499764530
 
 
   > `onReceive()` is interrupted before `onStop()`
   
   But there will be a race condition if we remove `join`. We cannot guarantee 
that `onReceive` returns immediately when it receives the interrupt signal.
   
   By the way, is there any theory about how this deadlock can happen? As I 
mentioned here: 
https://github.com/apache/spark/pull/24796#discussion_r290908122 I could not 
reproduce it.
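
   As a rough illustration of the race described above (a sketch only, not Spark's receiver code; `InterruptJoinSketch` and `workerDoneWhenStopContinues` are hypothetical names), the worker below needs some time to wind down after it is interrupted. If the stopping thread does not call `join`, it can continue while the worker is still running.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration class; not part of Spark.
class InterruptJoinSketch {
    // Reports whether the worker had finished at the moment the stopping
    // thread moved on after calling interrupt().
    static boolean workerDoneWhenStopContinues(boolean joinAfterInterrupt) {
        AtomicBoolean finished = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000L); // stands in for blocking work
            } catch (InterruptedException e) {
                // Interrupted: fall through to the wind-down below.
            }
            // Wind-down that takes a while, so the worker does not
            // return immediately after the interrupt (about 200 ms).
            long deadline = System.nanoTime() + 200_000_000L;
            while (deadline - System.nanoTime() > 0) {
                // busy wait
            }
            finished.set(true);
        });

        try {
            worker.start();
            started.await();
            worker.interrupt();
            if (joinAfterInterrupt) {
                worker.join(); // wait until the worker has really returned
            }
            boolean observed = finished.get();
            worker.join(); // clean up before returning in either case
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller was interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with join:    " + workerDoneWhenStopContinues(true));
        System.out.println("without join: " + workerDoneWhenStopContinues(false));
    }
}
```

   With `join`, the stopping thread always observes a finished worker. Without it, the result depends on timing, and in this sketch the worker is usually still winding down.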



[GitHub] [spark] AmplabJenkins commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24335: [SPARK-27425][SQL] Add count_if 
function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499764312
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24335: [SPARK-27425][SQL] Add count_if 
function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499764318
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11515/
   Test PASSed.



[GitHub] [spark] gatorsmile commented on a change in pull request #24800: [SPARK-27947][SQL] ParsedStatement subclass toString may throw ClassCastException

2019-06-06 Thread GitBox

gatorsmile commented on a change in pull request #24800: [SPARK-27947][SQL] 
ParsedStatement subclass toString may throw ClassCastException
URL: https://github.com/apache/spark/pull/24800#discussion_r291457845
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/sql/ParsedStatement.scala
 ##
 @@ -36,8 +38,11 @@ import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 private[sql] abstract class ParsedStatement extends LogicalPlan {
   // Redact properties and options when parsed nodes are used by generic methods like toString
   override def productIterator: Iterator[Any] = super.productIterator.map {
-    case mapArg: Map[_, _] => conf.redactOptions(mapArg.asInstanceOf[Map[String, String]])
-    case other => other
+    case mapArg: Map[_, _] =>
+      // May match any Map type, e.g. Map[String, Int], due to type erasure
+      Try(conf.redactOptions(mapArg.asInstanceOf[Map[String, String]])).getOrElse(mapArg)
 
 Review comment:
   In Spark source code, we always try to avoid relying on exception handling 
if we can possibly avoid it. 
   
   Also, we try our best to avoid making an assumption in the utility class. 
   
   I think we can enhance these Utils redact methods in this PR.
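
   One way to avoid the exception-based fallback, sketched here in Java rather than Scala and under stated assumptions: `ErasureSketch`, `redact`, and `redactIfStringMap` are hypothetical names, and `redact` is only a stand-in, not Spark's Utils redact methods. Because generic type arguments are erased at runtime, the sketch inspects the entries explicitly and redacts only when every key and value is a String.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of Spark.
class ErasureSketch {
    // Stand-in redaction helper: masks values whose key mentions "password".
    static Map<String, String> redact(Map<String, String> options) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            boolean sensitive = e.getKey().toLowerCase(Locale.ROOT).contains("password");
            out.put(e.getKey(), sensitive ? "*********(redacted)" : e.getValue());
        }
        return out;
    }

    // Redact only maps whose entries are all String -> String. Anything
    // else is returned unchanged, with no exception involved.
    static Object redactIfStringMap(Object arg) {
        if (!(arg instanceof Map)) {
            return arg;
        }
        Map<?, ?> map = (Map<?, ?>) arg;
        for (Map.Entry<?, ?> e : map.entrySet()) {
            if (!(e.getKey() instanceof String) || !(e.getValue() instanceof String)) {
                return arg;
            }
        }
        // Safe here: every entry was checked above.
        @SuppressWarnings("unchecked")
        Map<String, String> stringMap = (Map<String, String>) map;
        return redact(stringMap);
    }
}
```

   The explicit check costs one pass over the map, and it keeps the assumption about the element types visible at the call site instead of hiding it behind a cast.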



[GitHub] [spark] gatorsmile commented on a change in pull request #24800: [SPARK-27947][SQL] ParsedStatement subclass toString may throw ClassCastException

2019-06-06 Thread GitBox

gatorsmile commented on a change in pull request #24800: [SPARK-27947][SQL] 
ParsedStatement subclass toString may throw ClassCastException
URL: https://github.com/apache/spark/pull/24800#discussion_r291457845
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/sql/ParsedStatement.scala
 ##
 @@ -36,8 +38,11 @@ import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 private[sql] abstract class ParsedStatement extends LogicalPlan {
   // Redact properties and options when parsed nodes are used by generic methods like toString
   override def productIterator: Iterator[Any] = super.productIterator.map {
-    case mapArg: Map[_, _] => conf.redactOptions(mapArg.asInstanceOf[Map[String, String]])
-    case other => other
+    case mapArg: Map[_, _] =>
+      // May match any Map type, e.g. Map[String, Int], due to type erasure
+      Try(conf.redactOptions(mapArg.asInstanceOf[Map[String, String]])).getOrElse(mapArg)
 
 Review comment:
   In Spark source code, we always try to avoid relying on exception handling 
if we can possibly avoid it. 
   
   Also, we try our best to avoid making a hidden assumption in the utility 
class. 
   
   I think we can enhance these Utils redact methods in this PR.



[GitHub] [spark] SparkQA commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

SparkQA commented on issue #24335: [SPARK-27425][SQL] Add count_if function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499763307
 
 
   **[Test build #106269 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106269/testReport)**
 for PR 24335 at commit 
[`81ab7e6`](https://github.com/apache/spark/commit/81ab7e662d13b8c18d8a02e05799d0a554d07bd2).



[GitHub] [spark] cryeo commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

cryeo commented on issue #24335: [SPARK-27425][SQL] Add count_if function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499763393
 
 
   @dongjoon-hyun Thanks for your review. I just updated the code and the PR 
description. Could you confirm it?



[GitHub] [spark] AmplabJenkins removed a comment on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24688: [SPARK-27970][SQL] Support 
Hive 3.0 metastore
URL: https://github.com/apache/spark/pull/24688#issuecomment-499762988
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11514/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24688: [SPARK-27970][SQL] Support 
Hive 3.0 metastore
URL: https://github.com/apache/spark/pull/24688#issuecomment-499762981
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24815: [SPARK-27961][SQL] 
DataSourceV2Relation should not have refresh method
URL: https://github.com/apache/spark/pull/24815#issuecomment-499762964
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24815: [SPARK-27961][SQL] 
DataSourceV2Relation should not have refresh method
URL: https://github.com/apache/spark/pull/24815#issuecomment-499762968
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11513/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 
metastore
URL: https://github.com/apache/spark/pull/24688#issuecomment-499762988
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11514/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 
metastore
URL: https://github.com/apache/spark/pull/24688#issuecomment-499762981
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24815: [SPARK-27961][SQL] 
DataSourceV2Relation should not have refresh method
URL: https://github.com/apache/spark/pull/24815#issuecomment-499762968
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11513/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24815: [SPARK-27961][SQL] 
DataSourceV2Relation should not have refresh method
URL: https://github.com/apache/spark/pull/24815#issuecomment-499762964
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on issue #24788: s390x specific changes

2019-06-06 Thread GitBox

HyukjinKwon commented on issue #24788: s390x specific changes
URL: https://github.com/apache/spark/pull/24788#issuecomment-499762117
 
 
   If that's the exact same problem, please link that JIRA to the PR title (see 
https://spark.apache.org/contributing.html).
   
   Also, please clarify why changing from little endian to big endian is safe 
in little endian OSes.



[GitHub] [spark] SparkQA commented on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

SparkQA commented on issue #24815: [SPARK-27961][SQL] DataSourceV2Relation 
should not have refresh method
URL: https://github.com/apache/spark/pull/24815#issuecomment-499762010
 
 
   **[Test build #106267 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106267/testReport)**
 for PR 24815 at commit 
[`98c6105`](https://github.com/apache/spark/commit/98c61053ab519cc0002b9372bbc93752cc507cef).



[GitHub] [spark] SparkQA commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

SparkQA commented on issue #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore
URL: https://github.com/apache/spark/pull/24688#issuecomment-499762009
 
 
   **[Test build #106268 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106268/testReport)**
 for PR 24688 at commit 
[`dba68a1`](https://github.com/apache/spark/commit/dba68a1ca0ca1f3538276cb06ff0972e2122fa98).



[GitHub] [spark] wangyum commented on a change in pull request #24688: [SPARK-27970][SQL] Support Hive 3.0 metastore

2019-06-06 Thread GitBox

wangyum commented on a change in pull request #24688: [SPARK-27970][SQL] 
Support Hive 3.0 metastore
URL: https://github.com/apache/spark/pull/24688#discussion_r291456525
 
 

 ##
 File path: docs/sql-data-sources-hive-tables.md
 ##
 @@ -130,7 +130,7 @@ The following options can be used to configure the version of Hive that is used
 1.2.1
 
   Version of the Hive metastore. Available
-  options are 0.12.0 through 2.3.5 and 3.1.0 through 3.1.1.
+  options are 0.12.0 through 3.1.1.
 
 Review comment:
   Done



[GitHub] [spark] AmplabJenkins removed a comment on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24819: [SPARK-27973][MINOR] 
[EXAMPLES]correct DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819#issuecomment-499757833
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct 
DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819#issuecomment-499758184
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins removed a comment on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24819: [SPARK-27973][MINOR] 
[EXAMPLES]correct DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819#issuecomment-499757749
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct 
DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819#issuecomment-499757833
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24819: [SPARK-27973][MINOR] [EXAMPLES]correct 
DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819#issuecomment-499757749
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499757129
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499757133
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106264/
   Test PASSed.



[GitHub] [spark] cnZach opened a new pull request #24819: [SPARK-27973][MINOR] [EXAMPLES]correct DirectKafkaWordCount usage text with groupId

2019-06-06 Thread GitBox

cnZach opened a new pull request #24819: [SPARK-27973][MINOR] [EXAMPLES]correct 
DirectKafkaWordCount usage text with groupId
URL: https://github.com/apache/spark/pull/24819
 
 
   ## What changes were proposed in this pull request?
   
   
   Usage: DirectKafkaWordCount <brokers> <groupId> <topics>
   --
     <brokers> is a list of one or more Kafka brokers
     <groupId> is a consumer group name to consume from topics
     <topics> is a list of one or more kafka topics to consume from
   
   
   ## How was this patch tested?
   N/A.
   
   Please review https://spark.apache.org/contributing.html before opening a 
pull request.
   



[GitHub] [spark] AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499757129
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499757133
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106264/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

SparkQA removed a comment on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732479
 
 
   **[Test build #106264 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106264/testReport)**
 for PR 24818 at commit 
[`d848354`](https://github.com/apache/spark/commit/d8483541ee161ac249c8a439343a66136d2f0079).



[GitHub] [spark] SparkQA commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

SparkQA commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499756844
 
 
   **[Test build #106264 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106264/testReport)**
 for PR 24818 at commit 
[`d848354`](https://github.com/apache/spark/commit/d8483541ee161ac249c8a439343a66136d2f0079).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] Udbhav30 commented on issue #24601: [SPARK-27702][K8S] Allow using some alternatives for service accounts

2019-06-06 Thread GitBox

Udbhav30 commented on issue #24601: [SPARK-27702][K8S] Allow using some 
alternatives for service accounts
URL: https://github.com/apache/spark/pull/24601#issuecomment-499756451
 
 
   Gentle ping, @dongjoon-hyun



[GitHub] [spark] ketank-new commented on issue #24788: s390x specific changes

2019-06-06 Thread GitBox

ketank-new commented on issue #24788: s390x specific changes
URL: https://github.com/apache/spark/pull/24788#issuecomment-499754058
 
 
   @HyukjinKwon: I do not mind raising a new JIRA for the above changes,
   but please note that we have already been discussing this on JIRA for 
some time.
   Please refer to SPARK-26985.



[GitHub] [spark] cryeo commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

cryeo commented on a change in pull request #24335: [SPARK-27425][SQL] Add 
count_if function
URL: https://github.com/apache/spark/pull/24335#discussion_r291444380
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountIf.scala
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.types._
+
+@ExpressionDescription(
+  usage = """
+_FUNC_(expr) - Returns the number of rows that the supplied expression is 
non-null and true.
+  """,
+  examples = """
+Examples:
+  > SELECT _FUNC_(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   2
+  > SELECT _FUNC_(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   1
+  """,
+  since = "3.0.0")
+case class CountIf(predicate: Expression) extends UnevaluableAggregate with 
ImplicitCastInputTypes {
+  override def prettyName: String = "count_if"
+
+  override def children: Seq[Expression] = predicate :: Nil
+
+  override def nullable: Boolean = false
+
+  override def dataType: DataType = LongType
+
+  override def inputTypes: Seq[AbstractDataType] = BooleanType :: Nil
 
 Review comment:
   Would it be better to change `children` in the same way?
   ```scala
   override def children: Seq[Expression] = Seq(predicate)
   ```



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24797: Detecting key in map type when value type is complex

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24797: Detecting key in map 
type when value type is complex
URL: https://github.com/apache/spark/pull/24797#discussion_r291443140
 
 

 ##
 File path: R/pkg/R/schema.R
 ##
 @@ -162,7 +162,7 @@ checkType <- function(type) {
 },
 m = {
   # Map type
-  m <- regexec("^map<(.+),(.+)>$", type)
+  m <- regexec("map<(string|character),(.+)>", type)
 
 Review comment:
   This is just a legacy sanity check, since R type parsing is now delegated 
to the SQL parser. We should actually remove this method entirely (see 
https://github.com/apache/spark/commit/70f1bcd7bcd42b30eabcf06a9639363f1ca4b449).
   
   Can you file a JIRA, review https://spark.apache.org/contributing.html 
closely, and update this PR to remove it, together with a set of tests?
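
   For reference, a small sketch of how the two patterns in the diff behave on a map type whose value type is itself complex. It compiles the patterns as Java regexes through `scala.util.matching.Regex`, which is an assumption: R's `regexec` uses a different engine, so this only approximates the R behaviour. `MapTypePatternSketch` and `split` are hypothetical names.

```scala
import scala.util.matching.Regex

object MapTypePatternSketch {
  // The two patterns from the diff, compiled as Java regexes.
  val legacy: Regex = "^map<(.+),(.+)>$".r
  val proposed: Regex = "map<(string|character),(.+)>".r

  // Returns (keyType, valueType) for the first match, if any.
  def split(pattern: Regex, tpe: String): Option[(String, String)] =
    pattern.findFirstMatchIn(tpe).map(m => (m.group(1), m.group(2)))
}
```

   On `map<string,map<string,int>>` the legacy pattern splits at the last comma, giving the key `string,map<string` and the value `int>`, while the proposed pattern gives `string` and `map<string,int>`. The proposed pattern is not anchored, so in this sketch it also matches inside a longer string such as `xmap<string,int>y`.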



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24795: [SPARK-27945][SQL] Minimal changes to support columnar processing

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24795: [SPARK-27945][SQL] 
Minimal changes to support columnar processing
URL: https://github.com/apache/spark/pull/24795#discussion_r291442978
 
 

 ##
 File path: NOTICE-binary
 ##
 @@ -73,6 +73,10 @@ Copyright 2005-2015 The Apache Software Foundation
 This product includes software developed at
 OW2 Consortium (http://asm.ow2.org/)
 
+This product includes software developed at
+NVIDIA (https://www.nvidia.com)
+* Copyright 2019 NVIDIA CORPORATION
+
 
 Review comment:
   Thank you for removing this, @revans2 .



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24800: [SPARK-27947][SQL] ParsedStatement subclass toString may throw ClassCastException

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24800: [SPARK-27947][SQL] 
ParsedStatement subclass toString may throw ClassCastException
URL: https://github.com/apache/spark/pull/24800#discussion_r291442243
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/sql/ParsedStatement.scala
 ##
 @@ -36,8 +38,11 @@ import 
org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 private[sql] abstract class ParsedStatement extends LogicalPlan {
   // Redact properties and options when parsed nodes are used by generic 
methods like toString
   override def productIterator: Iterator[Any] = super.productIterator.map {
 
 Review comment:
   Why don't we add something like:
   
   ```scala
   protected def options: Map[String, String] = { Map.empty }
   protected def properties: Map[String, String] = { Map.empty }
   ```
   
   and, 
   
   ```diff
   -options: Map[String, String],
   +override val options: Map[String, String],
   ```
   
   in the implementations of this class? It seems like we currently check every 
map, whatever it is.
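
   A self-contained sketch of the shape this suggestion could take, without any Spark dependency. `Statement`, `CreateTable`, `redactedOptions`, and the password-based redaction rule are hypothetical stand-ins, not the actual `ParsedStatement` code.

```scala
object RedactionSketch {
  // Hypothetical stand-in for ParsedStatement: subclasses expose their
  // sensitive maps through two overridable members.
  abstract class Statement extends Product {
    protected def options: Map[String, String] = Map.empty
    protected def properties: Map[String, String] = Map.empty

    // Hypothetical redaction rule, for illustration only.
    private def redact(m: Map[String, String]): Map[String, String] =
      m.map { case (k, v) =>
        if (k.toLowerCase.contains("password")) k -> "***" else k -> v
      }

    // Only the declared maps are redacted; other fields are never inspected.
    def redactedOptions: Map[String, String] = redact(options)
    def redactedProperties: Map[String, String] = redact(properties)
  }

  // A subclass opts in by overriding the member with its constructor field.
  final case class CreateTable(
      name: String,
      override val options: Map[String, String]) extends Statement
}
```

   With this layout the base class never has to pattern-match on arbitrary `Map` fields, which is the concern raised above.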



[GitHub] [spark] viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only allow type upcasting when inserting table

2019-06-06 Thread GitBox

viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only 
allow type upcasting when inserting table
URL: https://github.com/apache/spark/pull/24806#discussion_r291439699
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ##
 @@ -126,9 +126,12 @@ object Cast {
*/
   def canUpCast(from: DataType, to: DataType): Boolean = (from, to) match {
 case _ if from == to => true
+case (NullType, _) => false
 
 Review comment:
   Was this covered by the default case previously, or was it missing before?
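
   To make the question concrete, here is a sketch of how case order decides the result in a match of this shape. The tiny type hierarchy and the single widening rule are hypothetical; the real `canUpCast` has more cases that the diff does not show.

```scala
object UpCastOrderSketch {
  // Hypothetical miniature type hierarchy, not Spark's DataType.
  sealed trait Tpe
  case object NullT extends Tpe
  case object IntT extends Tpe
  case object LongT extends Tpe

  // Cases are tried top to bottom; the first match decides the result.
  def canUpCast(from: Tpe, to: Tpe): Boolean = (from, to) match {
    case _ if from == to => true // identical types, including (NullT, NullT)
    case (NullT, _) => false     // explicit rejection, as in the added line
    case (IntT, LongT) => true   // hypothetical widening rule
    case _ => false              // default case
  }
}
```

   In this sketch the explicit `NullT` case returns the same value as the default, so removing it changes nothing. Whether that holds for the real method depends on the cases that sit between the new line and the default.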



[GitHub] [spark] viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only allow type upcasting when inserting table

2019-06-06 Thread GitBox

viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only 
allow type upcasting when inserting table
URL: https://github.com/apache/spark/pull/24806#discussion_r291441249
 
 

 ##
 File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala
 ##
 @@ -81,6 +81,7 @@ class HiveSessionStateBuilder(session: SparkSession, 
parentState: Option[Session
 RelationConversions(conf, catalog) +:
 PreprocessTableCreation(session) +:
 PreprocessTableInsertion(conf) +:
+ResolveUpCast +:
 
 Review comment:
   Would it be good to add a comment here, as in `BaseSessionStateBuilder`?



[GitHub] [spark] viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only allow type upcasting when inserting table

2019-06-06 Thread GitBox

viirya commented on a change in pull request #24806: [SPARK-27856][SQL] Only 
allow type upcasting when inserting table
URL: https://github.com/apache/spark/pull/24806#discussion_r291440948
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
 ##
 @@ -356,8 +358,28 @@ case class PreprocessTableInsertion(conf: SQLConf) 
extends Rule[LogicalPlan] {
   s"including ${staticPartCols.size} partition column(s) having 
constant value(s).")
 }
 
-val newQuery = DDLPreprocessingUtils.castAndRenameQueryOutput(
 
 Review comment:
   I see there is another usage of `castAndRenameQueryOutput`, for the 
`CreateTable` case. Should it get rid of unsafe casts too?



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24805: [SPARK-27798][SQL] 
from_avro shouldn't produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#discussion_r291440164
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConvertToLocalRelationSuite.scala
 ##
 @@ -70,4 +72,36 @@ class ConvertToLocalRelationSuite extends PlanTest {
 
 comparePlans(optimized, correctAnswer)
   }
+
+  test("SPARK-27798: Expression reusing output shouldn't override values in 
local relation") {
 
 Review comment:
   Thank you for adding this, @viirya .



[GitHub] [spark] dongjoon-hyun edited a comment on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

dongjoon-hyun edited a comment on issue #24335: [SPARK-27425][SQL] Add count_if 
function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499741465
 
 
   I also support this feature, in agreement with @HyukjinKwon.
   
   cc @gatorsmile 



[GitHub] [spark] HyukjinKwon commented on issue #24788: s390x specific changes

2019-06-06 Thread GitBox

HyukjinKwon commented on issue #24788: s390x specific changes
URL: https://github.com/apache/spark/pull/24788#issuecomment-499741691
 
 
   @ketank-new, please file a JIRA with the error message and a problem 
analysis, and describe in the PR description how this PR fixes it. Otherwise, 
no one can tell what problem you faced.



[GitHub] [spark] dongjoon-hyun commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

dongjoon-hyun commented on issue #24335: [SPARK-27425][SQL] Add count_if 
function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499741465
 
 
   cc @gatorsmile 



[GitHub] [spark] dongjoon-hyun commented on issue #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

dongjoon-hyun commented on issue #24335: [SPARK-27425][SQL] Add count_if 
function
URL: https://github.com/apache/spark/pull/24335#issuecomment-499741306
 
 
   @cryeo, please update the PR description with more SQL references, such as 
the `Presto/BigQuery/Excel` references you already gave us. That will make 
this PR stronger.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24805: [SPARK-27798][SQL] 
from_avro shouldn't produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#discussion_r291439572
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ##
 @@ -1420,9 +1420,9 @@ object ConvertToLocalRelation extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
 case Project(projectList, LocalRelation(output, data, isStreaming))
 if !projectList.exists(hasUnevaluableExpr) =>
-  val projection = new InterpretedProjection(projectList, output)
+  val projection = new InterpretedMutableProjection(projectList, output)
   projection.initialize(0)
-  LocalRelation(projectList.map(_.toAttribute), data.map(projection), 
isStreaming)
+  LocalRelation(projectList.map(_.toAttribute), 
data.map(projection(_).copy()), isStreaming)
 
 Review comment:
   I agree with this take (Option 2).
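
   A sketch of why the `.copy()` in the diff matters, assuming the mutable projection returns one reused row object on every call. `ReusedRowSketch` and its `MutableProjection` are hypothetical stand-ins built on a plain array, not Spark's classes.

```scala
object ReusedRowSketch {
  // Hypothetical projection that writes every result into one shared buffer.
  final class MutableProjection {
    private val buffer = new Array[Int](1)
    def apply(input: Int): Array[Int] = {
      buffer(0) = input * 10
      buffer // the same array instance is returned on every call
    }
  }

  // Keeps references to the shared buffer: every row shows the last value.
  def withoutCopy(data: List[Int]): List[List[Int]] = {
    val project = new MutableProjection
    val rows = data.map(row => project(row))
    rows.map(_.toList)
  }

  // Clones the buffer after each call, so every row keeps its own value.
  def withCopy(data: List[Int]): List[List[Int]] = {
    val project = new MutableProjection
    val rows = data.map(row => project(row).clone())
    rows.map(_.toList)
  }
}
```

   For the input `List(1, 2, 3)`, `withoutCopy` yields three rows that all hold 30, while `withCopy` yields 10, 20, and 30.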



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if function

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if function
URL: https://github.com/apache/spark/pull/24335#discussion_r291439473
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountIf.scala
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.types._
+
+@ExpressionDescription(
+  usage = """
+_FUNC_(expr) - Returns the number of rows that the supplied expression is 
non-null and true.
 
 Review comment:
   I know this follows the description of `Count`, but `non-null and true` 
looks a little odd: `TRUE` is already non-null.
   
   Can we phrase it like Presto/BigQuery? We can also give the alternative for 
Spark 2.4 and older, like the following.
   ```
   Returns the number of TRUE values for the expression. This function is 
equivalent to count(CASE WHEN x THEN 1 END).
   ```
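
   The stated equivalence can be checked with a small model of SQL's three-valued booleans, where `None` stands for NULL. This is a plain-Scala sketch with hypothetical names (`CountIfSketch`, `countIf`, `countCaseWhen`, `isEven`), not the Catalyst implementation.

```scala
object CountIfSketch {
  // SQL booleans are three-valued; None stands for NULL here.
  type SqlBool = Option[Boolean]

  // count_if: counts rows whose predicate is TRUE (not FALSE, not NULL).
  def countIf(rows: Seq[SqlBool]): Long = rows.count(_.contains(true)).toLong

  // count(CASE WHEN x THEN 1 END): the CASE yields 1 for TRUE and NULL
  // otherwise, and count skips NULLs.
  def countCaseWhen(rows: Seq[SqlBool]): Long = {
    val caseResult: Seq[Option[Int]] = rows.map {
      case Some(true) => Some(1)
      case _ => None
    }
    caseResult.count(_.isDefined).toLong
  }

  // Models `col % 2 = 0` over a nullable column; NULL input gives NULL.
  def isEven(col: Option[Int]): SqlBool = col.map(_ % 2 == 0)
}
```

   With the column values from the examples in the diff, `(NULL), (0), (1), (2), (3)`, both functions return 2 for `col % 2 = 0` and 1 for `col IS NULL`.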
   
   



[GitHub] [spark] SparkQA commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

SparkQA commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't 
produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499740453
 
 
   **[Test build #106266 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106266/testReport)**
 for PR 24805 at commit 
[`69996a6`](https://github.com/apache/spark/commit/69996a61a8f1c8e0cba6a50f5f93f00e40d23c3b).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24805: [SPARK-27798][SQL] from_avro 
shouldn't produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499740154
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11512/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24805: [SPARK-27798][SQL] from_avro 
shouldn't produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499740151
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't 
produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499740154
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11512/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24805: [SPARK-27798][SQL] from_avro shouldn't 
produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#issuecomment-499740151
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] viirya commented on a change in pull request #24805: [SPARK-27798][SQL] from_avro shouldn't produces same value when converted to local relation

2019-06-06 Thread GitBox

viirya commented on a change in pull request #24805: [SPARK-27798][SQL] 
from_avro shouldn't produces same value when converted to local relation
URL: https://github.com/apache/spark/pull/24805#discussion_r291438298
 
 

 ##
 File path: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala
 ##
 @@ -1491,4 +1494,38 @@ class AvroSuite extends QueryTest with SharedSQLContext 
with SQLTestUtils {
   |}
 """.stripMargin)
   }
+
+  test("SPARK-27798: from_avro produces same value when converted to local 
relation") {
 
 Review comment:
   Moved. Thanks.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24807: [SPARK-27958] Stopping a SparkSession should not always stop Spark Context

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24807: [SPARK-27958] 
Stopping a SparkSession should not always stop Spark Context
URL: https://github.com/apache/spark/pull/24807#discussion_r291438274
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
 ##
 @@ -711,12 +711,15 @@ class SparkSession private(
   // scalastyle:on
 
   /**
-   * Stop the underlying `SparkContext`.
+   * Stop the underlying `SparkContext` if there are are no active sessions 
remaining.
*
* @since 2.0.0
*/
   def stop(): Unit = {
 
 Review comment:
   Hey, I think it was a design decision that stopping a session stops the 
Spark context too. Why don't you just avoid calling `stop()` on the session, 
since all it does is stop the session? The behaviour also seems to be 
properly documented.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] 
Propagate subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739443
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106265/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] 
Propagate subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739438
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

SparkQA removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739362
 
 
   **[Test build #106265 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106265/testReport)**
 for PR 24811 at commit 
[`e96ced9`](https://github.com/apache/spark/commit/e96ced93159734cd83420853b7ab89706ecf8f99).



[GitHub] [spark] AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739443
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106265/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739438
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

SparkQA commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess 
stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739434
 
 
   **[Test build #106265 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106265/testReport)**
 for PR 24811 at commit 
[`e96ced9`](https://github.com/apache/spark/commit/e96ced93159734cd83420853b7ab89706ecf8f99).
* This patch **fails RAT tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

SparkQA commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess 
stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739362
 
 
   **[Test build #106265 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106265/testReport)**
 for PR 24811 at commit 
[`e96ced9`](https://github.com/apache/spark/commit/e96ced93159734cd83420853b7ab89706ecf8f99).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] 
Propagate subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739047
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739056
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11511/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739047
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24811: 
[SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in 
exception
URL: https://github.com/apache/spark/pull/24811#discussion_r291437663
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/deploy/RRunner.scala
 ##
 @@ -100,15 +100,17 @@ object RRunner {
 builder.redirectErrorStream(true) // Ugly but needed for stdout and 
stderr to synchronize
 val process = builder.start()
 
-new RedirectThread(process.getInputStream, System.out, "redirect R 
output").start()
+val stdoutBuffer = new CircularBuffer(1024)
+val output = new TeeOutputStream(System.out, stdoutBuffer)
+new RedirectThread(process.getInputStream, output, "redirect R 
output").start()
 
-process.waitFor()
+val returnCode = process.waitFor()
+if (returnCode != 0) {
+  throw SparkUserAppException(returnCode, 
Option(stdoutBuffer.toString))
 
 Review comment:
   or do you mean it's an issue because the error message is not included in 
the exception message?
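
A minimal sketch of the idea in the hunk above, written without Spark: tee the child's output to the console and to a bounded buffer, then include the buffered tail in the exception when the exit code is non-zero. `TeeSketch`, `TailBuffer`, and `failOnNonZero` are hypothetical stand-ins, not the `TeeOutputStream`, `CircularBuffer`, or `SparkUserAppException` used in the PR.

```scala
import java.io.OutputStream
import java.nio.charset.StandardCharsets

object RRunnerOutputSketch {
  // Hypothetical tee: forwards every byte to both sinks.
  final class TeeSketch(first: OutputStream, second: OutputStream) extends OutputStream {
    override def write(b: Int): Unit = { first.write(b); second.write(b) }
    override def flush(): Unit = { first.flush(); second.flush() }
  }

  // Hypothetical bounded buffer: keeps only the last `capacity` bytes written.
  final class TailBuffer(capacity: Int) extends OutputStream {
    require(capacity > 0, "capacity must be positive")
    private val buf = new Array[Byte](capacity)
    private var pos = 0
    private var wrapped = false

    override def write(b: Int): Unit = {
      buf(pos) = b.toByte
      pos = (pos + 1) % capacity
      if (pos == 0) wrapped = true
    }

    // Oldest retained byte first, newest last.
    override def toString: String = {
      val bytes = if (wrapped) buf.drop(pos) ++ buf.take(pos) else buf.take(pos)
      new String(bytes, StandardCharsets.UTF_8)
    }
  }

  // Raises an error that carries the captured tail when the exit code is non-zero.
  def failOnNonZero(exitCode: Int, tail: TailBuffer): Unit =
    if (exitCode != 0) {
      throw new RuntimeException(s"process exited with code $exitCode; last output:\n$tail")
    }
}
```

With this shape, the console still sees the full output, while the exception message carries only the most recent bytes.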



[GitHub] [spark] AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] 
Propagate subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499739056
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11511/
   Test PASSed.



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24811: 
[SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in 
exception
URL: https://github.com/apache/spark/pull/24811#discussion_r291437528
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/deploy/RRunner.scala
 ##
 @@ -100,15 +100,17 @@ object RRunner {
 builder.redirectErrorStream(true) // Ugly but needed for stdout and 
stderr to synchronize
 val process = builder.start()
 
-new RedirectThread(process.getInputStream, System.out, "redirect R 
output").start()
+val stdoutBuffer = new CircularBuffer(1024)
+val output = new TeeOutputStream(System.out, stdoutBuffer)
+new RedirectThread(process.getInputStream, output, "redirect R 
output").start()
 
-process.waitFor()
+val returnCode = process.waitFor()
+if (returnCode != 0) {
+  throw SparkUserAppException(returnCode, 
Option(stdoutBuffer.toString))
+}
   } finally {
 sparkRBackend.close()
   }
-  if (returnCode != 0) {
 
 Review comment:
   @jeremyjliu, can you show before/after error messages? Seems like we 
redirect stderr. Doesn't that work?



[GitHub] [spark] AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24811: [SPARK-27962][R][CORE] 
Propagate subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499240743
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] HyukjinKwon commented on issue #24811: [SPARK-27962][R][CORE] Propagate subprocess stdout in deploy.RRunner in exception

2019-06-06 Thread GitBox

HyukjinKwon commented on issue #24811: [SPARK-27962][R][CORE] Propagate 
subprocess stdout in deploy.RRunner in exception
URL: https://github.com/apache/spark/pull/24811#issuecomment-499738236
 
 
   ok to test



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291436969
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountIf.scala
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.types._
+
+@ExpressionDescription(
+  usage = """
+_FUNC_(expr) - Returns the number of rows that the supplied expression is 
non-null and true.
+  """,
+  examples = """
+Examples:
+  > SELECT _FUNC_(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   2
+  > SELECT _FUNC_(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   1
+  """,
+  since = "3.0.0")
+case class CountIf(predicate: Expression) extends UnevaluableAggregate with 
ImplicitCastInputTypes {
+  override def prettyName: String = "count_if"
+
+  override def children: Seq[Expression] = predicate :: Nil
+
+  override def nullable: Boolean = false
+
+  override def dataType: DataType = LongType
+
+  override def inputTypes: Seq[AbstractDataType] = BooleanType :: Nil
 
 Review comment:
   nit.
   ```scala
   override def inputTypes: Seq[AbstractDataType] = Seq(BooleanType)
   ```
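
A small sketch of why this nit is purely stylistic: both spellings build an equal one-element sequence. Strings stand in for Spark's type objects here, which is an assumption made only to keep the example free of Spark dependencies.

```scala
object SingletonSeqSketch {
  // "BooleanType" is a plain string stand-in for Spark's BooleanType object.
  val viaCons: Seq[String] = "BooleanType" :: Nil
  val viaApply: Seq[String] = Seq("BooleanType")

  // Both forms hold the same single element and compare equal.
  val same: Boolean = viaCons == viaApply
}
```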



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24814: set Int MaxValue

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24814: set Int MaxValue
URL: https://github.com/apache/spark/pull/24814#discussion_r291436506
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
 ##
 @@ -233,7 +233,7 @@ class Word2Vec extends Serializable with Logging {
   a += 1
 }
 while (a < 2 * vocabSize) {
-  count(a) = 1e9.toInt
+  count(a) = Int.MaxValue
 
 Review comment:
   The two values are different. Why do we need this change? Can you file a JIRA, 
since the before and after values aren't virtually the same?
   
   ```scala
   scala> Int.MaxValue
   res0: Int = 2147483647
   
   scala> 1e9.toInt
   res1: Int = 1000000000
   ```
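
The same comparison as a compilable sketch, for readers who want to check the gap between the two constants. The object and value names are hypothetical; whether the larger value changes `Word2Vec` behaviour is not established in this thread.

```scala
object SentinelSketch {
  val oldSentinel: Int = 1e9.toInt     // value used before the change
  val newSentinel: Int = Int.MaxValue  // value proposed by the PR

  // The proposed value is more than twice the old one.
  val gap: Int = newSentinel - oldSentinel
}
```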



[GitHub] [spark] HyukjinKwon commented on issue #24814: set Int MaxValue

2019-06-06 Thread GitBox

HyukjinKwon commented on issue #24814: set Int MaxValue
URL: https://github.com/apache/spark/pull/24814#issuecomment-499737601
 
 
   Please review https://spark.apache.org/contributing.html before opening a 
pull request. 



[GitHub] [spark] HyukjinKwon closed pull request #24784: [SPARK-27938][SQL] Remove feature flag LEGACY_PASS_PARTITION_BY_AS_OPTIONS

2019-06-06 Thread GitBox

HyukjinKwon closed pull request #24784: [SPARK-27938][SQL] Remove feature flag 
LEGACY_PASS_PARTITION_BY_AS_OPTIONS
URL: https://github.com/apache/spark/pull/24784
 
 
   



[GitHub] [spark] HyukjinKwon commented on a change in pull request #24784: [SPARK-27938][SQL] Remove feature flag LEGACY_PASS_PARTITION_BY_AS_OPTIONS

2019-06-06 Thread GitBox

HyukjinKwon commented on a change in pull request #24784: [SPARK-27938][SQL] 
Remove feature flag LEGACY_PASS_PARTITION_BY_AS_OPTIONS
URL: https://github.com/apache/spark/pull/24784#discussion_r291435639
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
 ##
 @@ -225,21 +225,13 @@ class DataFrameReaderWriterSuite extends QueryTest with 
SharedSQLContext with Be
   }
 
   test("pass partitionBy as options") {
-Seq(true, false).foreach { flag =>
-  withSQLConf(SQLConf.LEGACY_PASS_PARTITION_BY_AS_OPTIONS.key -> s"$flag") 
{
-Seq(1).toDF.write
-  .format("org.apache.spark.sql.test")
-  .partitionBy("col1", "col2")
-  .save()
-
-if (flag) {
-  val partColumns = 
LastOptions.parameters(DataSourceUtils.PARTITIONING_COLUMNS_KEY)
-  assert(DataSourceUtils.decodePartitioningColumns(partColumns) === 
Seq("col1", "col2"))
-} else {
-  
assert(!LastOptions.parameters.contains(DataSourceUtils.PARTITIONING_COLUMNS_KEY))
-}
-  }
-}
+Seq(1).toDF.write
+  .format("org.apache.spark.sql.test")
+  .partitionBy("col1", "col2")
+  .save()
+
+val partColumns = 
LastOptions.parameters(DataSourceUtils.PARTITIONING_COLUMNS_KEY)
+assert(DataSourceUtils.decodePartitioningColumns(partColumns) === 
Seq("col1", "col2"))
 
 Review comment:
   `decodePartitioningColumns` is in the `execution` package, which isn't supposed 
to be exposed, so users shouldn't use this util directly.
   
   Did we document this option in any public datasource v1 API? We should also 
say that it is a JSON string.



[GitHub] [spark] HyukjinKwon commented on issue #24784: [SPARK-27938][SQL] Remove feature flag LEGACY_PASS_PARTITION_BY_AS_OPTIONS

2019-06-06 Thread GitBox

HyukjinKwon commented on issue #24784: [SPARK-27938][SQL] Remove feature flag 
LEGACY_PASS_PARTITION_BY_AS_OPTIONS
URL: https://github.com/apache/spark/pull/24784#issuecomment-499736700
 
 
   LGTM too. Strictly speaking, 
https://github.com/apache/spark/pull/24784#discussion_r291435639 can be done 
separately.
   
   Merged to master.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291433907
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
 ##
 @@ -894,4 +894,30 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
 error.message.contains("function min_by does not support ordering on 
type map"))
 }
   }
+
+  test("SPARK-27425: count_if function") {
+def checkError(df: => DataFrame): Unit = {
+  val thrownException = the [AnalysisException] thrownBy 
df.queryExecution.analyzed
+  assert(thrownException.message.contains("function count_if requires 
boolean type"))
+}
+
+withTempView("tempView") {
+  Seq(("a", None), ("a", Some(1)), ("a", Some(2)), ("a", Some(3)),
+("b", None), ("b", Some(4)), ("b", Some(5)), ("b", Some(6)))
+.toDF("x", "y")
+.createOrReplaceTempView("tempView")
+
+  checkAnswer(
+sql("SELECT COUNT_IF(NULL), COUNT_IF(y % 2 = 0), COUNT_IF(y % 2 <> 0), 
" +
+  "COUNT_IF(y IS NULL) FROM tempView"),
+Row(0L, 3L, 3L, 2L))
+
+  checkAnswer(
+sql("SELECT x, COUNT_IF(NULL), COUNT_IF(y % 2 = 0), COUNT_IF(y % 2 <> 
0), " +
+  "COUNT_IF(y IS NULL) FROM tempView GROUP BY x"),
+Row("a", 0L, 1L, 2L, 1L) :: Row("b", 0L, 2L, 1L, 1L) :: Nil)
+
+  checkError(sql("SELECT COUNT_IF(x) FROM tempView"))
 
 Review comment:
   We usually test like the following.
   ```scala
   val m = intercept[AnalysisException] {
     sql("SELECT COUNT_IF(x) FROM tempView")
   }.getMessage
   assert(m.contains("function count_if requires boolean type"))
   ```
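
For readers without a Spark checkout, the capture-then-assert shape of that pattern can be sketched in plain Scala. `interceptMessage` is a hypothetical stand-in for ScalaTest's `intercept`, and `countIfStandIn` is a hypothetical function that merely throws with the expected message; neither is part of Spark or ScalaTest.

```scala
import scala.reflect.ClassTag

object InterceptSketch {
  // Runs `body`, expects it to throw a T, and returns that exception's message.
  def interceptMessage[T <: Throwable](body: => Any)(implicit tag: ClassTag[T]): String = {
    val caught: Option[Throwable] =
      try { body; None }
      catch { case t: Throwable if tag.runtimeClass.isInstance(t) => Some(t) }
    caught match {
      case Some(t) => t.getMessage
      case None =>
        throw new AssertionError(s"expected ${tag.runtimeClass.getName}, but nothing was thrown")
    }
  }

  // Hypothetical stand-in for running the failing query.
  def countIfStandIn(argumentIsBoolean: Boolean): Long =
    if (argumentIsBoolean) 0L
    else throw new IllegalArgumentException("function count_if requires boolean type, not string")

  val m: String = interceptMessage[IllegalArgumentException] {
    countIfStandIn(argumentIsBoolean = false)
  }
}
```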



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291433432
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
 ##
 @@ -894,4 +894,30 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
 error.message.contains("function min_by does not support ordering on 
type map"))
 }
   }
+
+  test("SPARK-27425: count_if function") {
+def checkError(df: => DataFrame): Unit = {
+  val thrownException = the [AnalysisException] thrownBy 
df.queryExecution.analyzed
+  assert(thrownException.message.contains("function count_if requires 
boolean type"))
+}
 
 Review comment:
   Let's not declare a function which is used once.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291433282
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala
 ##
 @@ -894,4 +894,30 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
 error.message.contains("function min_by does not support ordering on 
type map"))
 }
   }
+
+  test("SPARK-27425: count_if function") {
 
 Review comment:
   In general, we don't use the SPARK JIRA ID in test case names for new features. 
Could you remove `SPARK-27425: `?
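
For context on what the `count_if` test exercises, here is a plain-Scala model of the documented semantics (count the rows where the predicate is non-null and true). `None` models SQL `NULL`; `CountIfModel` and `countIf` are hypothetical names, and this is a sketch of the behaviour described in the PR's usage text, not Spark's implementation.

```scala
object CountIfModel {
  // Counts the rows whose predicate evaluates to a non-null true.
  def countIf[A](rows: Seq[A])(predicate: A => Option[Boolean]): Long =
    rows.count(r => predicate(r).contains(true)).toLong

  // The `y` column from the test data in the PR, with None standing in for NULL.
  val ys: Seq[Option[Int]] =
    Seq(None, Some(1), Some(2), Some(3), None, Some(4), Some(5), Some(6))

  val alwaysNull: Long = countIf(ys)(_ => None)             // COUNT_IF(NULL)
  val evens: Long      = countIf(ys)(_.map(_ % 2 == 0))     // COUNT_IF(y % 2 = 0)
  val odds: Long       = countIf(ys)(_.map(_ % 2 != 0))     // COUNT_IF(y % 2 <> 0)
  val nulls: Long      = countIf(ys)(y => Some(y.isEmpty))  // COUNT_IF(y IS NULL)
}
```

The four values line up with the ungrouped row the test expects, `Row(0L, 3L, 3L, 2L)`.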



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291432774
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountIf.scala
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.types._
+
+@ExpressionDescription(
+  usage = """
+_FUNC_(expr) - Returns the number of rows that the supplied expression is 
non-null and true.
+  """,
+  examples = """
+Examples:
+  > SELECT _FUNC_(col % 2 = 0) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   2
+  > SELECT _FUNC_(col IS NULL) FROM VALUES (NULL), (0), (1), (2), (3) AS 
tab(col);
+   1
+  """,
+  since = "3.0.0")
+case class CountIf(predicate: Expression) extends UnevaluableAggregate with 
ImplicitCastInputTypes {
+  override def prettyName: String = "count_if"
+
+  override def children: Seq[Expression] = predicate :: Nil
+
+  override def nullable: Boolean = false
+
+  override def dataType: DataType = LongType
+
+  override def inputTypes: Seq[AbstractDataType] = BooleanType :: Nil
+
+  override def checkInputDataTypes(): TypeCheckResult = predicate.dataType 
match {
+case BooleanType =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"function ${prettyName} requires boolean type, not 
${predicate.dataType.catalogString}"
 
 Review comment:
   `${prettyName}` -> `$prettyName`?
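
A short sketch of the suggestion: for a bare identifier the braces are optional in an `s` interpolator, so both forms produce the same string, while an expression such as a member access still needs them. The object name and the local `prettyName` value are illustrative only.

```scala
object InterpolationSketch {
  val prettyName = "count_if"

  val braced: String = s"function ${prettyName} requires boolean type"
  val bare: String   = s"function $prettyName requires boolean type"

  // Braces remain necessary once the interpolated part is an expression.
  val withMember: String = s"name length: ${prettyName.length}"
}
```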



[GitHub] [spark] AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732132
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732137
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11510/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] Add count_if functions

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24335: [SPARK-27425][SQL] 
Add count_if functions
URL: https://github.com/apache/spark/pull/24335#discussion_r291431861
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountIf.scala
 ##
 @@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.types._
 
 Review comment:
   Shall we import explicitly?
   ```scala
   import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionDescription, ImplicitCastInputTypes, UnevaluableAggregate}
   import org.apache.spark.sql.types.{AbstractDataType, BooleanType, DataType, LongType}
   ```
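
For context on what the `CountIf` expression under review is meant to compute, its usage string describes counting the rows where the expression is non-null and true. The following is a rough sketch of that semantics over plain Scala collections, not the Catalyst implementation; `countIf` is a hypothetical helper and SQL `NULL` is modelled as `None`:

```scala
object CountIfSketch {
  // Hypothetical helper: counts rows whose predicate result is non-null (Some) and true.
  def countIf[A](rows: Seq[A])(predicate: A => Option[Boolean]): Long =
    rows.count(row => predicate(row).contains(true)).toLong

  // Models VALUES (NULL), (0), (1), (2), (3) AS tab(col); NULL becomes None.
  val col: Seq[Option[Int]] = Seq(None, Some(0), Some(1), Some(2), Some(3))

  // `col % 2 = 0` evaluates to NULL when col is NULL, so that row is not counted.
  val evens: Long = countIf(col)(_.map(_ % 2 == 0))

  // `col IS NULL` is never NULL itself; it is true only for the NULL row.
  val nulls: Long = countIf(col)(c => Some(c.isEmpty))

  def main(args: Array[String]): Unit = {
    assert(evens == 2L)
    assert(nulls == 1L)
  }
}
```

The two results match the values shown in the `examples` string of the patch (2 and 1).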



[GitHub] [spark] SparkQA commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

SparkQA commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732479
 
 
   **[Test build #106264 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106264/testReport)**
 for PR 24818 at commit 
[`d848354`](https://github.com/apache/spark/commit/d8483541ee161ac249c8a439343a66136d2f0079).



[GitHub] [spark] AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732132
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818#issuecomment-499732137
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11510/
   Test PASSed.



[GitHub] [spark] HyukjinKwon opened a new pull request #24818: [SPARK-27971[SQL][R] MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch

2019-06-06 Thread GitBox

HyukjinKwon opened a new pull request #24818: [SPARK-27971[SQL][R] 
MapPartitionsInRWithArrowExec.evaluate shouldn't eagerly read the first batch
URL: https://github.com/apache/spark/pull/24818
 
 
   ## What changes were proposed in this pull request?
   
   This PR applies the same fix as https://github.com/apache/spark/pull/24816, but to vectorized `dapply` in SparkR.
   
   ## How was this patch tested?
   
   Manually tested.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] 
Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499730842
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106262/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] 
Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499730835
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush 
batch timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499730842
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106262/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

AmplabJenkins commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush 
batch timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499730835
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

SparkQA removed a comment on issue #24734: [SPARK-27870][SQL][PySpark] Flush 
batch timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499697138
 
 
   **[Test build #106262 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106262/testReport)**
 for PR 24734 at commit 
[`ed7aee0`](https://github.com/apache/spark/commit/ed7aee06344fd75e6921fa38a0f24183285b1e12).



[GitHub] [spark] SparkQA commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch timely for pandas UDF (for improving pandas UDFs pipeline)

2019-06-06 Thread GitBox

SparkQA commented on issue #24734: [SPARK-27870][SQL][PySpark] Flush batch 
timely for pandas UDF (for improving pandas UDFs pipeline)
URL: https://github.com/apache/spark/pull/24734#issuecomment-499730555
 
 
   **[Test build #106262 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106262/testReport)**
 for PR 24734 at commit 
[`ed7aee0`](https://github.com/apache/spark/commit/ed7aee06344fd75e6921fa38a0f24183285b1e12).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24815: [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method

2019-06-06 Thread GitBox

dongjoon-hyun commented on a change in pull request #24815: [SPARK-27961][SQL] 
DataSourceV2Relation should not have refresh method
URL: https://github.com/apache/spark/pull/24815#discussion_r291424651
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/MetadataCacheSuite.scala
 ##
 @@ -57,13 +56,20 @@ abstract class MetadataCacheSuite extends QueryTest with SharedSQLContext {
 df.count()
   }
   assert(e.getMessage.contains("FileNotFoundException"))
-  assert(e.getMessage.contains("REFRESH"))
+  assert(e.getMessage.contains("recreating the Dataset/DataFrame involved"))
 }
   }
+}
+
+class MetadataCacheV1Suite extends MetadataCacheSuite {
+  override protected def sparkConf: SparkConf =
+super
+  .sparkConf
+  .set(SQLConf.USE_V1_SOURCE_READER_LIST, "orc")
 
   test("SPARK-16337,SPARK-27504 temporary view refresh") {
 
 Review comment:
   Should `SPARK-27504` be removed from this test case name, like [this](https://github.com/apache/spark/pull/24815/files#diff-0667b59236ca014a47b3fc20b6ea820eR41)?



[GitHub] [spark] AmplabJenkins removed a comment on issue #24817: [WIP][SPARK-27963][core] Allow dynamic allocation without a shuffle service.

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24817: [WIP][SPARK-27963][core] Allow 
dynamic allocation without a shuffle service.
URL: https://github.com/apache/spark/pull/24817#issuecomment-499728669
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106263/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24817: [WIP][SPARK-27963][core] Allow dynamic allocation without a shuffle service.

2019-06-06 Thread GitBox

AmplabJenkins removed a comment on issue #24817: [WIP][SPARK-27963][core] Allow 
dynamic allocation without a shuffle service.
URL: https://github.com/apache/spark/pull/24817#issuecomment-499728668
 
 
   Merged build finished. Test FAILed.


