[GitHub] [spark] stijndehaes commented on a change in pull request #28423: [SPARK-24266][k8s] Restart the watcher when we receive a version changed from k8s

2020-05-03 Thread GitBox


stijndehaes commented on a change in pull request #28423:
URL: https://github.com/apache/spark/pull/28423#discussion_r419224763



##
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala
##
@@ -127,25 +129,33 @@ private[spark] class Client(
 .endSpec()
   .build()
 val driverPodName = resolvedDriverPod.getMetadata.getName
-Utils.tryWithResource(
-  kubernetesClient
-.pods()
-.withName(driverPodName)
-.watch(watcher)) { _ =>
-  val createdDriverPod = kubernetesClient.pods().create(resolvedDriverPod)
-  try {
-val otherKubernetesResources =
-  resolvedDriverSpec.driverKubernetesResources ++ Seq(configMap)
-addDriverOwnerReference(createdDriverPod, otherKubernetesResources)
-kubernetesClient.resourceList(otherKubernetesResources: _*).createOrReplace()
-  } catch {
-case NonFatal(e) =>
-  kubernetesClient.pods().delete(createdDriverPod)
-  throw e
-  }
 
-  val sId = Seq(conf.namespace, driverPodName).mkString(":")
-  watcher.watchOrStop(sId)
+var watch: Watch = null
+val createdDriverPod = kubernetesClient.pods().create(resolvedDriverPod)
+try {
+  val otherKubernetesResources = resolvedDriverSpec.driverKubernetesResources ++ Seq(configMap)
+  addDriverOwnerReference(createdDriverPod, otherKubernetesResources)
+  kubernetesClient.resourceList(otherKubernetesResources: _*).createOrReplace()
+} catch {
+  case NonFatal(e) =>
+kubernetesClient.pods().delete(createdDriverPod)
+throw e
+}
+val sId = Seq(conf.namespace, driverPodName).mkString(":")
+breakable {
+  while (true) {
+try {
+watch = kubernetesClient
+  .pods()
+  .withName(driverPodName)
+  .watch(watcher)
+watcher.watchOrStop(sId)

Review comment:
   I think you get the latest version as first message without anything 
needing to change, but I will double check this.
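
   The restart loop in the hunk above can be illustrated with a small, self-contained Python sketch. It is not the PR's Scala code: `FakeWatch`, `FakeWatcher`, `watch_until_done` and `open_watch` are hypothetical names, and the sketch assumes the watcher returns a boolean saying whether the driver pod reached a final state.

```python
class FakeWatch:
    """Hypothetical closeable handle standing in for a Kubernetes watch."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeWatcher:
    """Hypothetical stand-in for the pod status watcher in the diff."""

    def __init__(self, outcomes):
        # Each entry simulates one watch session: False means the watch ended
        # early (for example, after a stale resource version), True means the
        # driver pod reached a final state.
        self._outcomes = list(outcomes)
        self.sessions = 0

    def watch_or_stop(self, submission_id):
        self.sessions += 1
        return self._outcomes.pop(0)


def watch_until_done(open_watch, watcher, submission_id):
    """Re-open the watch until the watcher reports completion."""
    while True:
        watch = open_watch()
        try:
            if watcher.watch_or_stop(submission_id):
                return watcher.sessions
        finally:
            # Always release the current watch before opening a new one.
            watch.close()


opened = []


def open_watch():
    watch = FakeWatch()
    opened.append(watch)
    return watch


sessions = watch_until_done(
    open_watch, FakeWatcher([False, False, True]), "default:driver-pod")
print(sessions)  # prints 3
```

   Closing each watch in a `finally` block keeps at most one watch open at a time, which is the property the old `Utils.tryWithResource` wrapper provided for the single-watch case.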





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on pull request #28445: [SPARK-31212][SQL][2.4] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623270846










[GitHub] [spark] HyukjinKwon commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


HyukjinKwon commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623270515


   Merged to master and branch-3.0.






[GitHub] [spark] SparkQA commented on pull request #28445: [SPARK-31212][SQL][2.4] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


SparkQA commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623270511


   **[Test build #122244 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122244/testReport)** for PR 28445 at commit [`ce2470a`](https://github.com/apache/spark/commit/ce2470a6951e059ede9e34b687b8d309b3cf688d).






[GitHub] [spark] HyukjinKwon commented on pull request #28445: [SPARK-31212][SQL][2.4] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


HyukjinKwon commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623269930










[GitHub] [spark] HyukjinKwon commented on pull request #28445: [SPARK-31212][SQL][2.4] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


HyukjinKwon commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623269894


   Looks like we should also check 
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L608-L610.
   Are these all the instances we should fix?
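
   For context on why `'1000-02-29'` is a tricky input: the year 1000 is a leap year under Julian rules but not under proleptic Gregorian rules. The sketch below only illustrates that difference; it assumes the failure is tied to this calendar mismatch and does not reproduce the logic in `DateTimeUtils`. The helper names are hypothetical.

```python
import datetime


def is_leap_gregorian(year):
    """Leap-year rule of the (proleptic) Gregorian calendar."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_julian(year):
    """Leap-year rule of the Julian calendar: every fourth year."""
    return year % 4 == 0


def is_valid_in_python(year, month, day):
    """Python's datetime uses the proleptic Gregorian calendar."""
    try:
        datetime.date(year, month, day)
        return True
    except ValueError:
        return False


print(is_leap_gregorian(1000), is_leap_julian(1000))  # prints False True
print(is_valid_in_python(1000, 2, 29))                # prints False
```

   A validity check written with the Gregorian rule therefore rejects a date that a calendar applying Julian rules before the 1582 cutover accepts.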






[GitHub] [spark] kiszk commented on pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


kiszk commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623269103


   Could you please add `[2.4]` into the title like 
[this](https://github.com/apache/spark/pull/26901)?






[GitHub] [spark] HyukjinKwon commented on a change in pull request #28447: [MINOR][Doc] Fix typo in documents

2020-05-03 Thread GitBox


HyukjinKwon commented on a change in pull request #28447:
URL: https://github.com/apache/spark/pull/28447#discussion_r419221781



##
File path: docs/ml-migration-guide.md
##
@@ -281,7 +281,7 @@ Several deprecated methods were removed in the `spark.mllib` and `spark.ml` pack
 * `weights` in `LinearRegression` and `LogisticRegression` in `spark.ml`
 * `setMaxNumIterations` in `mllib.optimization.LBFGS` (marked as `DeveloperApi`)
 * `treeReduce` and `treeAggregate` in `mllib.rdd.RDDFunctions` (these functions are available on `RDD`s directly, and were marked as `DeveloperApi`)
-* `defaultStategy` in `mllib.tree.configuration.Strategy`
+* `defaultStrategy` in `mllib.tree.configuration.Strategy`

Review comment:
   This one too. It was deprecated at SPARK-9609.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28447: [MINOR][Doc] Fix typo in documents

2020-05-03 Thread GitBox


HyukjinKwon commented on a change in pull request #28447:
URL: https://github.com/apache/spark/pull/28447#discussion_r419221722



##
File path: project/MimaExcludes.scala
##
@@ -1422,7 +1422,7 @@ object MimaExcludes {
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkEnv.getThreadLocal"),
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeReduce"),
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeAggregate"),
-  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStategy"),
+  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStrategy"),

Review comment:
   I think `defaultStategy` is correct here. It was deprecated at SPARK-9609 and
removed at SPARK-14089, so this exclusion should keep the old misspelled name.
`defaultStrategy` itself seems fine.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266501










[GitHub] [spark] AmplabJenkins commented on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266730










[GitHub] [spark] SparkQA removed a comment on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


SparkQA removed a comment on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266263


   **[Test build #122243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122243/testReport)** for PR 26585 at commit [`99d5026`](https://github.com/apache/spark/commit/99d50269dbd7d60c67b8f6ce43123d8023a8f307).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266495










[GitHub] [spark] SparkQA commented on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


SparkQA commented on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266485


   **[Test build #122243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122243/testReport)** for PR 26585 at commit [`99d5026`](https://github.com/apache/spark/commit/99d50269dbd7d60c67b8f6ce43123d8023a8f307).
* This patch **fails Python style tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266495










[GitHub] [spark] SparkQA commented on pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


SparkQA commented on pull request #26585:
URL: https://github.com/apache/spark/pull/26585#issuecomment-623266263


   **[Test build #122243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122243/testReport)** for PR 26585 at commit [`99d5026`](https://github.com/apache/spark/commit/99d50269dbd7d60c67b8f6ce43123d8023a8f307).






[GitHub] [spark] jalpan-randeri commented on a change in pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


jalpan-randeri commented on a change in pull request #26585:
URL: https://github.com/apache/spark/pull/26585#discussion_r419219239



##
File path: python/pyspark/sql/tests/test_arrow.py
##
@@ -415,6 +415,20 @@ def run_test(num_records, num_parts, max_records, use_delay=False):
         for case in cases:
             run_test(*case)
 
+    def test_createDateFrame_with_category_type(self):
+        pdf = pd.DataFrame({"A": [u"a", u"b", u"c", u"a"]})
+        pdf["B"] = pdf["A"].astype('category')
+
+        with self.sql_conf({"spark.sql.execution.arrow.pyspark.enabled": True}):
+            arrow_df = self.spark.createDataFrame(pdf)
+            result_arrow = arrow_df.collect()
+
+        with self.sql_conf({"spark.sql.execution.arrow.pyspark.enabled": False}):
+            df = self.spark.createDataFrame(pdf)
+            result_spark = df.collect()
+
+        assert result_arrow == result_spark

Review comment:
   Fixed








[GitHub] [spark] jalpan-randeri commented on a change in pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


jalpan-randeri commented on a change in pull request #26585:
URL: https://github.com/apache/spark/pull/26585#discussion_r419219194



##
File path: python/pyspark/sql/pandas/serializers.py
##
@@ -155,7 +156,11 @@ def create_array(s, t):
 if t is not None and pa.types.is_timestamp(t):
 s = _check_series_convert_timestamps_internal(s, self._timezone)
 try:
-array = pa.Array.from_pandas(s, mask=mask, type=t, safe=self._safecheck)
+if type(s.dtype) == CategoricalDtype:
+s = s.astype(s.dtypes.categories.dtype)
+array = pa.array(s)

Review comment:
   Fixed
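
   To make the conversion in this hunk concrete: a categorical column stores integer codes plus a list of categories, and casting it to the categories' dtype maps each code back to its value. The following is a pure-Python sketch of that idea, not the pandas or PyArrow implementation; `encode_categorical` and `decode_categorical` are hypothetical helpers.

```python
def encode_categorical(values):
    """Split values into integer codes and a sorted list of categories.

    Missing values (None) get the code -1, mirroring the convention pandas
    uses for categorical codes.
    """
    categories = sorted({v for v in values if v is not None})
    index = {v: i for i, v in enumerate(categories)}
    codes = [-1 if v is None else index[v] for v in values]
    return codes, categories


def decode_categorical(codes, categories):
    """Map integer codes back to plain values of the categories' type."""
    return [None if c == -1 else categories[c] for c in codes]


codes, categories = encode_categorical(["a", "b", "c", "a"])
print(codes, categories)                      # prints [0, 1, 2, 0] ['a', 'b', 'c']
print(decode_categorical(codes, categories))  # prints ['a', 'b', 'c', 'a']
```

   Decoding before serialization means the receiving side sees an ordinary column of the underlying type, which matches the test's expectation that the Arrow and non-Arrow paths return equal rows.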








[GitHub] [spark] jalpan-randeri commented on a change in pull request #26585: [WIP][SPARK-25351][SQL][Python] Handle Pandas category type when converting from Python with Arrow

2020-05-03 Thread GitBox


jalpan-randeri commented on a change in pull request #26585:
URL: https://github.com/apache/spark/pull/26585#discussion_r419219147



##
File path: python/pyspark/sql/pandas/serializers.py
##
@@ -155,7 +156,11 @@ def create_array(s, t):
 if t is not None and pa.types.is_timestamp(t):
 s = _check_series_convert_timestamps_internal(s, self._timezone)
 try:
-array = pa.Array.from_pandas(s, mask=mask, type=t, safe=self._safecheck)
+if type(s.dtype) == CategoricalDtype:

Review comment:
   Done.








[GitHub] [spark] dongjoon-hyun edited a comment on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


dongjoon-hyun edited a comment on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623258947


   Although I haven't tested this PR yet, the MDC feature is known to be broken
on JDK9+. I made a PR to upgrade SLF4J first. Let's see. I hope we can update
the related dependency together if possible.
   - https://github.com/apache/spark/pull/28446






[GitHub] [spark] AmplabJenkins commented on pull request #28447: [MINOR][Doc] Fix typo in documents

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28447:
URL: https://github.com/apache/spark/pull/28447#issuecomment-623259575










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28447: [MINOR][Doc] Fix typo in documents

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28447:
URL: https://github.com/apache/spark/pull/28447#issuecomment-623259575










[GitHub] [spark] dongjoon-hyun edited a comment on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


dongjoon-hyun edited a comment on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623258947


   Although I haven't tested this PR yet, the MDC feature is known to be broken
on JDK9+. I made a PR to upgrade SLF4J first. Let's see.
   - https://github.com/apache/spark/pull/28446






[GitHub] [spark] SparkQA commented on pull request #28447: [MINOR][Doc] Fix typo in documents

2020-05-03 Thread GitBox


SparkQA commented on pull request #28447:
URL: https://github.com/apache/spark/pull/28447#issuecomment-623259242


   **[Test build #122242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122242/testReport)** for PR 28447 at commit [`6af8b1d`](https://github.com/apache/spark/commit/6af8b1d6c0f819e993f32eaea366f3cda8ddfd4c).






[GitHub] [spark] dongjoon-hyun commented on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


dongjoon-hyun commented on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623258947


   The MDC feature is known to be broken on JDK9+. I made a PR to upgrade
SLF4J first.
   - https://github.com/apache/spark/pull/28446






[GitHub] [spark] kiszk opened a new pull request #28447: [MINOR][Doc] Fix typo in document and comments

2020-05-03 Thread GitBox


kiszk opened a new pull request #28447:
URL: https://github.com/apache/spark/pull/28447


   
   
   ### What changes were proposed in this pull request?
   Fixed typos in the `docs` directory and in `project/MimaExcludes.scala`
   
   
   ### Why are the changes needed?
   Better readability of documents
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   No test needed
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28446: [SPARK-31633][BUILD] Upgrade SLF4J from 1.7.16 to 1.7.30

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28446:
URL: https://github.com/apache/spark/pull/28446#issuecomment-623254789










[GitHub] [spark] SparkQA commented on pull request #28446: [SPARK-31633][BUILD] Upgrade SLF4J from 1.7.16 to 1.7.30

2020-05-03 Thread GitBox


SparkQA commented on pull request #28446:
URL: https://github.com/apache/spark/pull/28446#issuecomment-623255917


   **[Test build #122241 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122241/testReport)** for PR 28446 at commit [`a6c2b51`](https://github.com/apache/spark/commit/a6c2b511e1c0be3317b0aac76385f2ebf298793a).






[GitHub] [spark] AmplabJenkins commented on pull request #28446: [SPARK-31633][BUILD] Upgrade SLF4J from 1.7.16 to 1.7.30

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28446:
URL: https://github.com/apache/spark/pull/28446#issuecomment-623254789










[GitHub] [spark] dongjoon-hyun opened a new pull request #28446: [SPARK-31633][BUILD] Upgrade SLF4J from 1.7.16 to 1.7.30

2020-05-03 Thread GitBox


dongjoon-hyun opened a new pull request #28446:
URL: https://github.com/apache/spark/pull/28446


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28430: [SPARK-31372][SQL][TEST][FOLLOW-UP] Improve ExpressionsSchemaSuite so that easy to track the diff.

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28430:
URL: https://github.com/apache/spark/pull/28430#issuecomment-623251869










[GitHub] [spark] AmplabJenkins commented on pull request #28430: [SPARK-31372][SQL][TEST][FOLLOW-UP] Improve ExpressionsSchemaSuite so that easy to track the diff.

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28430:
URL: https://github.com/apache/spark/pull/28430#issuecomment-623251869










[GitHub] [spark] SparkQA commented on pull request #28430: [SPARK-31372][SQL][TEST][FOLLOW-UP] Improve ExpressionsSchemaSuite so that easy to track the diff.

2020-05-03 Thread GitBox


SparkQA commented on pull request #28430:
URL: https://github.com/apache/spark/pull/28430#issuecomment-623251556


   **[Test build #122240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122240/testReport)** for PR 28430 at commit [`951bf24`](https://github.com/apache/spark/commit/951bf24a0d52ce0b84410636b7caeba136c4bf75).






[GitHub] [spark] beliefer commented on a change in pull request #28430: [SPARK-31372][SQL][TEST][FOLLOW-UP] Improve ExpressionsSchemaSuite so that easy to track the diff.

2020-05-03 Thread GitBox


beliefer commented on a change in pull request #28430:
URL: https://github.com/apache/spark/pull/28430#discussion_r419209216



##
File path: sql/core/src/test/scala/org/apache/spark/sql/ExpressionsSchemaSuite.scala
##
@@ -156,14 +156,16 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession {
   stringToFile(resultFile, goldenOutput)
 }
 
+val outputSize = outputs.size
 val expectedOutputs: Seq[QueryOutput] = {
-  val goldenOutput = fileToString(resultFile)
-  val lines = goldenOutput.split("\n")
+  val expectedGoldenOutput = fileToString(resultFile)
+  val lines = expectedGoldenOutput.split("\n")
+  val expectedSize = lines.size
 
   // The header of golden file has one line, plus four lines of the summary and three
   // lines of the header of schema table.
-  assert(lines.size == outputs.size + 8,
-s"Expected ${outputs.size + 8} blocks in result file but got ${lines.size}. " +
+  assert(expectedSize == outputSize + 8,

Review comment:
   OK.
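
   The size check in this hunk follows from the comment in the code: one header line, four summary lines, and three schema-table header lines, plus one line per output. A small Python sketch of the same arithmetic is shown here; the constant and function names are hypothetical, and the real suite is written in Scala.

```python
HEADER_LINES = 1               # header of the golden file
SUMMARY_LINES = 4              # summary section
SCHEMA_TABLE_HEADER_LINES = 3  # header of the schema table
FIXED_LINES = HEADER_LINES + SUMMARY_LINES + SCHEMA_TABLE_HEADER_LINES  # 8


def check_golden_line_count(golden_text, num_outputs):
    """Raise AssertionError unless the file has one line per output plus 8."""
    actual = len(golden_text.split("\n"))
    expected = num_outputs + FIXED_LINES
    if actual != expected:
        raise AssertionError(
            f"Expected {expected} lines in result file but got {actual}.")
    return actual


# Ten lines in total: 8 fixed lines plus 2 output rows.
golden = "\n".join(["line"] * 10)
print(check_golden_line_count(golden, 2))  # prints 10
```

   Computing the expected and actual sizes once, as the new hunk does with `expectedSize` and `outputSize`, keeps the assertion and its error message in agreement.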








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific co

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623250310










[GitHub] [spark] AmplabJenkins commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623250310










[GitHub] [spark] SparkQA removed a comment on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


SparkQA removed a comment on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225479


   **[Test build #122238 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122238/testReport)**
 for PR 28438 at commit 
[`5276146`](https://github.com/apache/spark/commit/527614681b74dd591f4cef1b3471e6d00240e6a1).






[GitHub] [spark] SparkQA commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


SparkQA commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623249921


   **[Test build #122238 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122238/testReport)**
 for PR 28438 at commit 
[`5276146`](https://github.com/apache/spark/commit/527614681b74dd591f4cef1b3471e6d00240e6a1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623248915










[GitHub] [spark] AmplabJenkins commented on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623248915










[GitHub] [spark] SparkQA commented on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


SparkQA commented on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623248724


   **[Test build #122239 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122239/testReport)**
 for PR 26624 at commit 
[`559fe09`](https://github.com/apache/spark/commit/559fe098f666065f5411ec373dc466b260f6f87d).






[GitHub] [spark] dongjoon-hyun commented on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


dongjoon-hyun commented on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623248314


   Hi, All.
   It would be great if we can verify this in JDK11 together.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-621026766


   Can one of the admins verify this patch?






[GitHub] [spark] maropu edited a comment on pull request #27803: [SPARK-31049][SQL] Support nested adjacent generators, e.g., explode(explode(v))

2020-05-03 Thread GitBox


maropu edited a comment on pull request #27803:
URL: https://github.com/apache/spark/pull/27803#issuecomment-623237474


   Thanks for sharing, @dilipbiswal. Yeah, as you suggested above, I now think 
`explode+flatten` looks fine for most use cases, and this PR makes 
the rule a bit complicated. Is it okay to close it now? If we get user 
feedback on this, I think we can revisit it anytime. WDYT? @dongjoon-hyun 
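
As a rough illustration of why `explode+flatten` can stand in for nested generators, here is a minimal sketch using plain Scala collections rather than Spark. The names `NestedExplodeSketch`, `explode`, `nestedExplode` and `explodeFlatten` are hypothetical, and the sketch ignores nulls, empty arrays and other columns, which the real SQL functions have to handle.

```scala
object NestedExplodeSketch {
  // Each element of `rows` stands for one row's array-of-arrays column `v`.
  type Row = Seq[Seq[Int]]

  // Collections analogue of explode: one output row per array element.
  def explode[A](rows: Seq[Seq[A]]): Seq[A] = rows.flatten

  // Analogue of explode(explode(v)): unnest one level, then the next.
  def nestedExplode(rows: Seq[Row]): Seq[Int] = explode(explode(rows))

  // Analogue of explode(flatten(v)): merge the inner arrays first, then unnest once.
  def explodeFlatten(rows: Seq[Row]): Seq[Int] = explode(rows.map(_.flatten))

  def main(args: Array[String]): Unit = {
    val rows: Seq[Row] = Seq(Seq(Seq(1, 2), Seq(3)), Seq(Seq(4)))
    println(nestedExplode(rows))
    println(explodeFlatten(rows))
  }
}
```

Under these simplifying assumptions both forms yield the same sequence of elements, which is the intuition behind preferring the existing functions over a more complicated rule.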






[GitHub] [spark] dongjoon-hyun commented on pull request #26624: [SPARK-8981][core][test-hadoop3.2][test-java11] Add MDC support in Executor

2020-05-03 Thread GitBox


dongjoon-hyun commented on pull request #26624:
URL: https://github.com/apache/spark/pull/26624#issuecomment-623248068


   Retest this please.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623243685


   Can one of the admins verify this patch?






[GitHub] [spark] tianshizz commented on pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


tianshizz commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623243888


   Attempted to fix the bug in branch-2.4, as discussed in 
https://github.com/apache/spark/pull/28443
   
   cc @MaxGekk @HyukjinKwon 






[GitHub] [spark] AmplabJenkins commented on pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623243847


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28445:
URL: https://github.com/apache/spark/pull/28445#issuecomment-623243685


   Can one of the admins verify this patch?






[GitHub] [spark] tianshizz opened a new pull request #28445: [SPARK-31212][SQL] Fix Failure of casting the '1000-02-29' string to the date type in 2.4

2020-05-03 Thread GitBox


tianshizz opened a new pull request #28445:
URL: https://github.com/apache/spark/pull/28445


   
   
   ### What changes were proposed in this pull request?
   
   Use `GregorianCalendar.isLeapYear()` as suggested in SPARK-31212 to fix 
parsing of date strings such as `1000-02-29`
   
   ### Why are the changes needed?
   
   Fix a bug where leap years in the Julian calendar can't be parsed correctly in 
v2.4
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added a unit test that would fail in the current branch
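
For context, the following is a minimal sketch (not the patch itself) of why the year 1000 is a useful test case. It assumes only the standard JDK classes `java.util.GregorianCalendar` and `java.time.Year`; the object and method names are hypothetical. `GregorianCalendar` is a hybrid calendar that applies Julian rules before its default 1582 cutover, while `java.time` applies proleptic Gregorian rules to every year.

```scala
import java.time.Year
import java.util.GregorianCalendar

object LeapYearSketch {
  // Hybrid Julian/Gregorian rule: before the cutover, every year divisible
  // by 4 is a leap year, so February 29 exists in the year 1000.
  def isLeapHybrid(year: Int): Boolean =
    new GregorianCalendar().isLeapYear(year)

  // Proleptic Gregorian (ISO) rule: century years are leap years only when
  // divisible by 400, so 1000 is not a leap year here.
  def isLeapProleptic(year: Int): Boolean =
    Year.isLeap(year.toLong)

  def main(args: Array[String]): Unit = {
    println(s"hybrid(1000) = ${isLeapHybrid(1000)}")
    println(s"proleptic(1000) = ${isLeapProleptic(1000)}")
  }
}
```

The two rules disagree on 1000, which is presumably why `1000-02-29` is valid under one calendar and rejected under the other.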






[GitHub] [spark] maropu commented on pull request #27803: [SPARK-31049][SQL] Support nested adjacent generators, e.g., explode(explode(v))

2020-05-03 Thread GitBox


maropu commented on pull request #27803:
URL: https://github.com/apache/spark/pull/27803#issuecomment-623237474


   Thanks for sharing, @dilipbiswal. Yeah, as you suggested above, I now think 
`explode+flatten` looks fine for most use cases. Is it okay to 
close it now? If we get user feedback on this, I think we can revisit it 
anytime. WDYT? @dongjoon-hyun 






[GitHub] [spark] SparkQA removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


SparkQA removed a comment on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623224390


   **[Test build #122237 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122237/testReport)**
 for PR 28386 at commit 
[`473e6ba`](https://github.com/apache/spark/commit/473e6bad3c9e0d9f0e978ba655a94fde1bf6fc92).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623230179










[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623230179










[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


SparkQA commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623230116


   **[Test build #122237 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122237/testReport)**
 for PR 28386 at commit 
[`473e6ba`](https://github.com/apache/spark/commit/473e6bad3c9e0d9f0e978ba655a94fde1bf6fc92).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `public class JavaFValueTestExample `






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific co

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225680










[GitHub] [spark] AmplabJenkins commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225680










[GitHub] [spark] SparkQA commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


SparkQA commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225479


   **[Test build #122238 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122238/testReport)**
 for PR 28438 at commit 
[`5276146`](https://github.com/apache/spark/commit/527614681b74dd591f4cef1b3471e6d00240e6a1).






[GitHub] [spark] maropu commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


maropu commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225339


   Could you put the info ` It needs 1.5 - 2 minutes to finish that test` in 
the PR description, too?






[GitHub] [spark] maropu commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


maropu commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623225194


   retest this please






[GitHub] [spark] MichaelChirico commented on pull request #28379: [SPARK-28040][SPARK-28070][R] Write type object s3

2020-05-03 Thread GitBox


MichaelChirico commented on pull request #28379:
URL: https://github.com/apache/spark/pull/28379#issuecomment-623225215


   Thanks @felixcheung, indeed that was me filing that issue about `glue`.
   
   This PR is intended to be a much more general solution to that issue -- a 
`glue` object would be passed to `writeObject.character` under this PR.
   
   There are already a bunch of tests you could basically drop in from the gist 
above.
   
   My suspicion at the moment is that the usage in `worker.R` is the issue, but 
I'm having trouble debugging that. Any suggestions for debugging the executor 
process?
   






[GitHub] [spark] MichaelChirico commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


MichaelChirico commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623224648


   Thanks @felixcheung. Both backports are directly from `base` R:
   
   
https://github.com/wch/r-source/blob/08ebf253e44e10bfb445f27b53b2a43bc7e6740d/src/library/base/R/utilities.R#L26-L27
   
   
https://github.com/wch/r-source/blob/08ebf253e44e10bfb445f27b53b2a43bc7e6740d/src/library/base/R/strwrap.R#L219-L229
   
   Does that mean we need to use the R license header on the `backports.R` file 
specifically?
   
   As for the file ordering, that's [handled by the `Collate` 
field](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file);
 I've adjusted the order & pushed.
   
   > There was a different way to do method signature compatibility (you should 
be able to find it), maybe it will work better.
   
   I don't follow this, could you elaborate?
   
   Happy to add tests for `mutate`, could you please help point out where such 
tests should be added? I was a bit confused by the structure of tests.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623224566










[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623224566










[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


SparkQA commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623224390


   **[Test build #122237 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122237/testReport)**
 for PR 28386 at commit 
[`473e6ba`](https://github.com/apache/spark/commit/473e6bad3c9e0d9f0e978ba655a94fde1bf6fc92).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623220796










[GitHub] [spark] AmplabJenkins commented on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623220796










[GitHub] [spark] SparkQA removed a comment on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


SparkQA removed a comment on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623182874


   **[Test build #122236 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122236/testReport)**
 for PR 28434 at commit 
[`9828286`](https://github.com/apache/spark/commit/98282869d92e609ba6d74a8fa98dcbd0e51080b7).






[GitHub] [spark] SparkQA commented on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


SparkQA commented on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623220371


   **[Test build #122236 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122236/testReport)**
 for PR 28434 at commit 
[`9828286`](https://github.com/apache/spark/commit/98282869d92e609ba6d74a8fa98dcbd0e51080b7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] felixcheung commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate

2020-05-03 Thread GitBox


felixcheung commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-623220355


   Where is the code for `deparse1` etc. from? It might have license implications.
   
   I think backport.R is implicitly dependent on the alphabetical order of .R 
files. I’d suggest looking at that more closely and formalizing the naming 
convention.
   
   There was a different way to do method signature compatibility (you should 
be able to find it), maybe it will work better.
   
   Finally, I think we need more tests for mutate()
   
   
   
   
   
   






[GitHub] [spark] felixcheung commented on pull request #28379: [SPARK-28040][SPARK-28070][R] Write type object s3

2020-05-03 Thread GitBox


felixcheung commented on pull request #28379:
URL: https://github.com/apache/spark/pull/28379#issuecomment-623219437


   I didn’t review this, but I recall something similar to this, to support glue, 
from about 2 or 3 years ago? This serde is very sensitive - we probably need to add 
a lot more tests for it (the existing tests are not very comprehensive).
   
   
   






[GitHub] [spark] tianshizz commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


tianshizz commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623216905


   @maropu I'm not able to see the jenkins log, but I was able to reproduce it 
on my own laptop, which is quite slow... It needs 1.5 - 2 minutes to finish 
that test. After increasing the timeout to 3 minutes, it can pass.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #28430: [SPARK-31372][SQL][TEST][FOLLOW-UP] Improve ExpressionsSchemaSuite so that easy to track the diff.

2020-05-03 Thread GitBox


HyukjinKwon commented on a change in pull request #28430:
URL: https://github.com/apache/spark/pull/28430#discussion_r419183476



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/ExpressionsSchemaSuite.scala
##
@@ -156,14 +156,16 @@ class ExpressionsSchemaSuite extends QueryTest with 
SharedSparkSession {
   stringToFile(resultFile, goldenOutput)
 }
 
+val outputSize = outputs.size
 val expectedOutputs: Seq[QueryOutput] = {
-  val goldenOutput = fileToString(resultFile)
-  val lines = goldenOutput.split("\n")
+  val expectedGoldenOutput = fileToString(resultFile)
+  val lines = expectedGoldenOutput.split("\n")
+  val expectedSize = lines.size
 
   // The header of golden file has one line, plus four lines of the 
summary and three
   // lines of the header of schema table.
-  assert(lines.size == outputs.size + 8,
-s"Expected ${outputs.size + 8} blocks in result file but got 
${lines.size}. " +
+  assert(expectedSize == outputSize + 8,

Review comment:
   @beliefer, can we have one string variable to keep the summary and 
header lines, and calculate the number (`8`) of extra lines from the variable?
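
A minimal sketch of that suggestion, assuming the preamble can be kept as a single string. The placeholder lines and the names `GoldenFileLayout`, `preamble`, `numExtraLines` and `expectedLineCount` are hypothetical and are not taken from `ExpressionsSchemaSuite`:

```scala
object GoldenFileLayout {
  // Placeholder text only: one file-header line, four summary lines and
  // three schema-table header lines, matching the code comment in the diff.
  val preamble: String = Seq(
    "<file header>",
    "<summary line 1>",
    "<summary line 2>",
    "<summary line 3>",
    "<summary line 4>",
    "<table header 1>",
    "<table header 2>",
    "<table header 3>"
  ).mkString("\n")

  // Derived from the preamble itself, so no magic number is needed.
  val numExtraLines: Int = preamble.split("\n").length

  // Number of lines the golden file should contain for `numOutputs` rows.
  def expectedLineCount(numOutputs: Int): Int = numOutputs + numExtraLines
}
```

With this shape, the assertion could compare `lines.size` against `expectedLineCount(outputs.size)`, and a change to the preamble would update the count automatically.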








[GitHub] [spark] maropu edited a comment on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


maropu edited a comment on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623215783


   Thanks for the work, @tianshizz. By the way, have you checked whether the root 
cause of the flaky failure is just a timeout, e.g., from the Jenkins logs? If that's 
true, the fix looks fine.






[GitHub] [spark] maropu commented on pull request #28438: [SPARK-31267][SQL] Flaky test: WholeStageCodegenSparkSubmitSuite.Generated code on driver should not embed platform-specific constant

2020-05-03 Thread GitBox


maropu commented on pull request #28438:
URL: https://github.com/apache/spark/pull/28438#issuecomment-623215783


   Thanks for the work, @tianshizz. By the way, have you checked whether the root 
cause of the flaky failure is just a timeout, e.g., from the Jenkins logs?






[GitHub] [spark] HyukjinKwon commented on a change in pull request #28443: [SPARK-31212][SQL][WIP] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


HyukjinKwon commented on a change in pull request #28443:
URL: https://github.com/apache/spark/pull/28443#discussion_r419182567



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -205,6 +205,29 @@ object DateTimeUtils {
 (day.toInt, MICROSECONDS.toNanos(micros))
   }
 
+  /**
+   *Returns Gregorian days since epoch
+   */
+  def toGregorianDays(
+  year: Int,
+  month: Byte = 1,
+  day: Byte = 1,
+  hour: Byte = 0,
+  minute: Byte = 0,
+  sec: Byte = 0): SQLDate = {
+val calendar = new GregorianCalendar(year, month - 1, day, hour, minute, 
sec)

Review comment:
   If we can land a minimised fix, let's go ahead for branch-2.4. 
Otherwise, I think I would rather treat it as superseded by the calendar switching.








[GitHub] [spark] maropu commented on pull request #28440: [SPARK-31527][SQL][TESTS][FOLLOWUP] Fix the number of rows in `DateTimeBenchmark`

2020-05-03 Thread GitBox


maropu commented on pull request #28440:
URL: https://github.com/apache/spark/pull/28440#issuecomment-623212105


   Thanks! Merged to master/3.0.






[GitHub] [spark] dilipbiswal commented on pull request #27803: [SPARK-31049][SQL] Support nested adjacent generators, e.g., explode(explode(v))

2020-05-03 Thread GitBox


dilipbiswal commented on pull request #27803:
URL: https://github.com/apache/spark/pull/27803#issuecomment-623211354


   Hi @maropu, currently we are able to use flatten and explode to achieve 
the same result? Perhaps supporting nested generators will support more use 
cases, but a simple use case can work with flatten - FYI.
   ```
   scala> spark.sql("SELECT explode(flatten(array(array(4), array(5)))) v").show
   +---+
   |  v|
   +---+
   |  4|
   |  5|
   +---+
   ```






[GitHub] [spark] maropu commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-03 Thread GitBox


maropu commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r419180458



##
File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
##
@@ -131,11 +132,33 @@ class KafkaTestUtils(
   }
 
   private def setUpMiniKdc(): Unit = {
-val kdcDir = Utils.createTempDir()
 val kdcConf = MiniKdc.createConf()
 kdcConf.setProperty(MiniKdc.DEBUG, "true")
-kdc = new MiniKdc(kdcConf, kdcDir)
-kdc.start()
+var bindException = false
+var kdcDir: File = null
+var numRetries = 1
+do {

Review comment:
   Could you put some notes (e.g., the hadoop jira ID) here about why we 
need this logic for `MiniKdc`?








[GitHub] [spark] maropu commented on a change in pull request #28442: [SPARK-31631][TESTS] Fix test flakiness caused by MiniKdc which throws 'address in use' BindException with retry

2020-05-03 Thread GitBox


maropu commented on a change in pull request #28442:
URL: https://github.com/apache/spark/pull/28442#discussion_r419180108



##
File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
##
@@ -131,11 +132,33 @@ class KafkaTestUtils(
   }
 
   private def setUpMiniKdc(): Unit = {
-val kdcDir = Utils.createTempDir()
 val kdcConf = MiniKdc.createConf()
 kdcConf.setProperty(MiniKdc.DEBUG, "true")
-kdc = new MiniKdc(kdcConf, kdcDir)
-kdc.start()

Review comment:
   Do we have the same issue in the docker integration test? 
   
https://github.com/apache/spark/blob/master/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala#L50
   
   If so, could we make a helper func for that?
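
One possible shape for such a helper, as a sketch only. The name `retryOnBindException` and its signature are hypothetical; the PR's actual retry logic (the `do`/`while` loop in the diff) may differ:

```scala
import java.net.BindException

object RetryOnBind {
  /**
   * Runs `body`, retrying when it throws a BindException, up to
   * `maxAttempts` attempts in total. Any other exception, or a
   * BindException on the last attempt, is propagated to the caller.
   */
  def retryOnBindException[T](maxAttempts: Int)(body: => T): T = {
    require(maxAttempts >= 1, "maxAttempts must be at least 1")
    try {
      body
    } catch {
      case _: BindException if maxAttempts > 1 =>
        // The port was taken between selection and bind; try again.
        retryOnBindException(maxAttempts - 1)(body)
    }
  }
}
```

Both test utilities could then wrap the creation and start of the KDC in `retryOnBindException(n) { ... }`, with the body creating a fresh temp directory on every attempt.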








[GitHub] [spark] dongjoon-hyun commented on pull request #28390: [SPARK-27340][SS][TESTS][FOLLOW-UP] Rephrase API comments and simplify tests

2020-05-03 Thread GitBox


dongjoon-hyun commented on pull request #28390:
URL: https://github.com/apache/spark/pull/28390#issuecomment-623202015


   Thank you all.






[GitHub] [spark] tianshizz commented on a change in pull request #28443: [SPARK-31212][SQL][WIP] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


tianshizz commented on a change in pull request #28443:
URL: https://github.com/apache/spark/pull/28443#discussion_r419172503



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -205,6 +205,29 @@ object DateTimeUtils {
 (day.toInt, MICROSECONDS.toNanos(micros))
   }
 
+  /**
+   *Returns Gregorian days since epoch
+   */
+  def toGregorianDays(
+  year: Int,
+  month: Byte = 1,
+  day: Byte = 1,
+  hour: Byte = 0,
+  minute: Byte = 0,
+  sec: Byte = 0): SQLDate = {
+val calendar = new GregorianCalendar(year, month - 1, day, hour, minute, 
sec)

Review comment:
   I see. I didn't know that the semantics changed between these two versions. 
Thanks for the explanation. So if I want to fix 2.4, which branch should I 
create a PR for? branch-2.4?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28444: [SPARK-31632][CORE][WEBUI] Make the ApplicationInfo always available when accessed

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28444:
URL: https://github.com/apache/spark/pull/28444#issuecomment-623194303


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #28444: [SPARK-31632][CORE][WEBUI] Make the ApplicationInfo always available when accessed

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28444:
URL: https://github.com/apache/spark/pull/28444#issuecomment-623194443


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #28444: [SPARK-31632][CORE][WEBUI] Make the ApplicationInfo always available when accessed

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28444:
URL: https://github.com/apache/spark/pull/28444#issuecomment-623194303


   Can one of the admins verify this patch?






[GitHub] [spark] xccui opened a new pull request #28444: [SPARK-31632][CORE][WEBUI] Make the ApplicationInfo always available when accessed

2020-05-03 Thread GitBox


xccui opened a new pull request #28444:
URL: https://github.com/apache/spark/pull/28444


   
   
   ### What changes were proposed in this pull request?
   
   This PR adds a while loop check for the `ApplicationInfo` returned by 
`AppStatusStore.applicationInfo()`. It eliminates the `NoSuchElementException` 
that occasionally happens when a user accesses the Web UI during Spark 
startup.
   
   ### Why are the changes needed?
   During the initialization of `SparkContext`, it first starts the Web UI and 
then sets up the `LiveListenerBus` thread for dispatching the 
`SparkListenerApplicationStart` event (which triggers writing the requested 
`ApplicationInfo` to `InMemoryStore`). If the Web UI is accessed before this 
info is written to `InMemoryStore`, the following `NoSuchElementException` 
is thrown.
   ```
WARN org.eclipse.jetty.server.HttpChannel: /jobs/
java.util.NoSuchElementException
at java.util.Collections$EmptyIterator.next(Collections.java:4191)
at org.apache.spark.util.kvstore.InMemoryStore$InMemoryIterator.next(InMemoryStore.java:467)
at org.apache.spark.status.AppStatusStore.applicationInfo(AppStatusStore.scala:39)
at org.apache.spark.ui.jobs.AllJobsPage.render(AllJobsPage.scala:266)
at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:89)
at org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:873)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:753)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:505)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Thread.java:748)
   ```
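   
   The "while loop check" described above could look roughly like the following sketch. It is not the code in this PR: `awaitFirst`, its parameters and the timeout handling are assumptions made for illustration, and the store is modelled as a plain function returning an iterator.

```scala
object AwaitFirst {
  /**
   * Polls `lookup` until the returned iterator has an element or the
   * timeout elapses. Returns None instead of calling next() on an empty
   * iterator, which is what raises NoSuchElementException in the trace above.
   */
  def awaitFirst[T](timeoutMs: Long, pollMs: Long = 10L)(lookup: () => Iterator[T]): Option[T] = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var it = lookup()
    while (!it.hasNext && System.nanoTime() < deadline) {
      Thread.sleep(pollMs)   // give the listener thread time to write the entry
      it = lookup()
    }
    if (it.hasNext) Some(it.next()) else None
  }
}
```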
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually tested






[GitHub] [spark] MaxGekk commented on a change in pull request #28443: [SPARK-31212][SQL][WIP] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


MaxGekk commented on a change in pull request #28443:
URL: https://github.com/apache/spark/pull/28443#discussion_r419168264



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -205,6 +205,29 @@ object DateTimeUtils {
 (day.toInt, MICROSECONDS.toNanos(micros))
   }
 
+  /**
+   *Returns Gregorian days since epoch
+   */
+  def toGregorianDays(
+  year: Int,
+  month: Byte = 1,
+  day: Byte = 1,
+  hour: Byte = 0,
+  minute: Byte = 0,
+  sec: Byte = 0): SQLDate = {
+val calendar = new GregorianCalendar(year, month - 1, day, hour, minute, 
sec)

Review comment:
   The ticket was opened for Spark 2.4, where it is a correct date, but your PR 
is for master.








[GitHub] [spark] tianshizz commented on a change in pull request #28443: [SPARK-31212][SQL][WIP] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


tianshizz commented on a change in pull request #28443:
URL: https://github.com/apache/spark/pull/28443#discussion_r419164577



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -205,6 +205,29 @@ object DateTimeUtils {
 (day.toInt, MICROSECONDS.toNanos(micros))
   }
 
+  /**
+   *Returns Gregorian days since epoch
+   */
+  def toGregorianDays(
+  year: Int,
+  month: Byte = 1,
+  day: Byte = 1,
+  hour: Byte = 0,
+  minute: Byte = 0,
+  sec: Byte = 0): SQLDate = {
+val calendar = new GregorianCalendar(year, month - 1, day, hour, minute, 
sec)

Review comment:
   Ah, okay, should we close SPARK-31212 then? Is it the right behavior to 
not be able to cast `1000-02-29`?








[GitHub] [spark] MaxGekk commented on a change in pull request #28443: [SPARK-31212][SQL][WIP] Fix Failure of casting the '1000-02-29' string to the date type

2020-05-03 Thread GitBox


MaxGekk commented on a change in pull request #28443:
URL: https://github.com/apache/spark/pull/28443#discussion_r419162505



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -205,6 +205,29 @@ object DateTimeUtils {
 (day.toInt, MICROSECONDS.toNanos(micros))
   }
 
+  /**
+   *Returns Gregorian days since epoch
+   */
+  def toGregorianDays(
+  year: Int,
+  month: Byte = 1,
+  day: Byte = 1,
+  hour: Byte = 0,
+  minute: Byte = 0,
+  sec: Byte = 0): SQLDate = {
+val calendar = new GregorianCalendar(year, month - 1, day, hour, minute, 
sec)

Review comment:
   It is not a pure Gregorian calendar. For 1000-02-29, it is the Julian 
calendar.
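
The difference can be checked with the JDK alone. The sketch below assumes only standard `java.util` and `java.time` behavior; the object and method names are made up for illustration and nothing here is Spark code. `java.util.GregorianCalendar` is a hybrid calendar that applies Julian leap-year rules before its 1582 cutover, while `java.time` uses the proleptic Gregorian (ISO) rules throughout:

```scala
import java.time.{LocalDate, Year}
import java.util.GregorianCalendar
import scala.util.Try

object CalendarLeapYears {
  // Hybrid Julian/Gregorian calendar: every 4th year is a leap year before the cutover.
  def isLeapInHybrid(year: Int): Boolean =
    new GregorianCalendar().isLeapYear(year)

  // Proleptic Gregorian rules: century years are leap years only if divisible by 400.
  def isLeapInProleptic(year: Int): Boolean =
    Year.isLeap(year.toLong)

  // LocalDate.of rejects dates that do not exist in the proleptic Gregorian calendar.
  def isValidLocalDate(year: Int, month: Int, day: Int): Boolean =
    Try(LocalDate.of(year, month, day)).isSuccess
}
```

Under these rules, year 1000 is a leap year in the hybrid calendar but not in the proleptic one, so 1000-02-29 exists only in the former.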








[GitHub] [spark] SparkQA commented on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


SparkQA commented on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623182874


   **[Test build #122236 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122236/testReport)**
 for PR 28434 at commit 
[`9828286`](https://github.com/apache/spark/commit/98282869d92e609ba6d74a8fa98dcbd0e51080b7).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623182098










[GitHub] [spark] AmplabJenkins commented on pull request #28434: [SPARK-31624] Fix SHOW TBLPROPERTIES for V2 tables that leverage the session catalog

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28434:
URL: https://github.com/apache/spark/pull/28434#issuecomment-623182098










[GitHub] [spark] MaxGekk commented on a change in pull request #28310: [SPARK-31527][SQL] date add/subtract interval only allow those day precision in ansi mode

2020-05-03 Thread GitBox


MaxGekk commented on a change in pull request #28310:
URL: https://github.com/apache/spark/pull/28310#discussion_r419154657



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##
@@ -618,6 +618,22 @@ object DateTimeUtils {
 instantToMicros(resultTimestamp.toInstant)
   }
 
+  /**
+   * Add the date and the interval's months and days.
+   * Returns a date value, expressed in days since 1.1.1970.
+   *
+   * @throws DateTimeException if the result exceeds the supported date range
+   * @throws IllegalArgumentException if the interval has `microseconds` part
+   */
+  def dateAddInterval(
+ start: SQLDate,
+ interval: CalendarInterval): SQLDate = {
+require(interval.microseconds == 0,
+  "Cannot add hours, minutes or seconds, milliseconds, microseconds to a 
date")
+val ld = 
LocalDate.ofEpochDay(start).plusMonths(interval.months).plusDays(interval.days)

Review comment:
   I see, thanks. It would be nice to document this behavior of this 
function and of timestampAddInterval somewhere. It is not obvious that we add 
months, then days, and then micros. The order could be the opposite. 
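
A small sketch of why the order matters, using only `java.time.LocalDate`. The helper names are hypothetical; `monthsThenDays` mirrors the `plusMonths(...).plusDays(...)` chain in the diff above:

```scala
import java.time.LocalDate

object IntervalOrder {
  // Same order as the diff: months first, then days.
  def monthsThenDays(start: LocalDate, months: Int, days: Int): LocalDate =
    start.plusMonths(months.toLong).plusDays(days.toLong)

  // The opposite order, for comparison.
  def daysThenMonths(start: LocalDate, months: Int, days: Int): LocalDate =
    start.plusDays(days.toLong).plusMonths(months.toLong)
}
```

Starting from 2019-01-30 with one month and one day, months-first clamps to 2019-02-28 and then moves to 2019-03-01, while days-first reaches 2019-01-31 and then clamps to 2019-02-28.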








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623173494










[GitHub] [spark] SparkQA removed a comment on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


SparkQA removed a comment on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623170659


   **[Test build #122235 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122235/testReport)**
 for PR 28435 at commit 
[`079fcbc`](https://github.com/apache/spark/commit/079fcbc1665ff88c2ff31d498574ce766edf0ce1).






[GitHub] [spark] SparkQA commented on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


SparkQA commented on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623173446


   **[Test build #122235 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/122235/testReport)**
 for PR 28435 at commit 
[`079fcbc`](https://github.com/apache/spark/commit/079fcbc1665ff88c2ff31d498574ce766edf0ce1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623173494










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


AmplabJenkins removed a comment on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623170877










[GitHub] [spark] AmplabJenkins commented on pull request #28435: [SPARK-31625][YARN] Unregister application from YARN RM outside the shutdown hook if it succeeds

2020-05-03 Thread GitBox


AmplabJenkins commented on pull request #28435:
URL: https://github.com/apache/spark/pull/28435#issuecomment-623170877









