[spark] branch master updated: [SPARK-38423][K8S] Reuse driver pod's `priorityClassName` for `PodGroup`

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f36d1bf  [SPARK-38423][K8S] Reuse driver pod's `priorityClassName` for `PodGroup`
f36d1bf is described below

commit f36d1bfba47f6f6ff0f4375a1eb74bb606f8a0b7
Author: Yikun Jiang 
AuthorDate: Sun Mar 6 23:54:18 2022 -0800

[SPARK-38423][K8S] Reuse driver pod's `priorityClassName` for `PodGroup`

### What changes were proposed in this pull request?
This patch sets the PodGroup `priorityClassName` to `driver.pod.spec.priorityClassName`.

### Why are the changes needed?
Support priority scheduling with the Volcano implementation.
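
For illustration only, a minimal sketch of the idea — mirroring a driver pod's `priorityClassName` onto a Volcano `PodGroup`. This is not part of the patch; it assumes the fabric8 Volcano model's generated `PodGroupBuilder`, and the object and resource names are made up:

```scala
import io.fabric8.kubernetes.api.model.PodBuilder
import io.fabric8.volcano.scheduling.v1beta1.PodGroupBuilder

object PodGroupPrioritySketch extends App {
  // A driver pod whose spec carries a priority class.
  val driverPod = new PodBuilder()
    .withNewMetadata().withName("spark-driver").endMetadata()
    .withNewSpec().withPriorityClassName("high-priority").endSpec()
    .build()

  // Option(...) guards the null case: a pod without a priorityClassName
  // leaves the PodGroup spec untouched.
  val priorityClassName = Option(driverPod.getSpec.getPriorityClassName)

  val podGroup = new PodGroupBuilder()
    .withNewMetadata().withName("spark-app-podgroup").endMetadata()
  priorityClassName.foreach(podGroup.editOrNewSpec().withPriorityClassName(_).endSpec())

  println(podGroup.build().getSpec.getPriorityClassName) // prints: high-priority
}
```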

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- New UT to make sure the feature step sets the PodGroup priority as expected.
- Add two integration tests:
  - 1. Submit 3 jobs (spark pi) with different priorities to make sure the jobs complete as expected.
  - 2. Submit 3 jobs (driver submission) with different priorities to make sure the jobs are scheduled in the expected order.
- All existing UTs and ITs

Closes #35639 from Yikun/SPARK-38189.

Authored-by: Yikun Jiang 
Signed-off-by: Dongjoon Hyun 
---
 .../deploy/k8s/features/VolcanoFeatureStep.scala   |   6 +
 .../k8s/features/VolcanoFeatureStepSuite.scala |  30 
 .../src/test/resources/volcano/disable-queue.yml   |  24 +++
 .../src/test/resources/volcano/enable-queue.yml|  24 +++
 .../volcano/high-priority-driver-template.yml  |  26 
 .../volcano/low-priority-driver-template.yml   |  26 
 .../volcano/medium-priority-driver-template.yml|  26 
 .../src/test/resources/volcano/priorityClasses.yml |  33 +
 .../k8s/integrationtest/VolcanoTestsSuite.scala| 163 ++---
 9 files changed, 340 insertions(+), 18 deletions(-)

diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
index c6efe4d..48303c8 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStep.scala
@@ -32,6 +32,7 @@ private[spark] class VolcanoFeatureStep extends KubernetesDriverCustomFeatureCon
   private lazy val podGroupName = s"${kubernetesConf.appId}-podgroup"
   private lazy val namespace = kubernetesConf.namespace
   private lazy val queue = kubernetesConf.get(KUBERNETES_JOB_QUEUE)
+  private var priorityClassName: Option[String] = None
 
   override def init(config: KubernetesDriverConf): Unit = {
 kubernetesConf = config
@@ -50,10 +51,15 @@ private[spark] class VolcanoFeatureStep extends KubernetesDriverCustomFeatureCon
 
 queue.foreach(podGroup.editOrNewSpec().withQueue(_).endSpec())
 
+    priorityClassName.foreach(podGroup.editOrNewSpec().withPriorityClassName(_).endSpec())
+
 Seq(podGroup.build())
   }
 
   override def configurePod(pod: SparkPod): SparkPod = {
+
+    priorityClassName = Some(pod.pod.getSpec.getPriorityClassName)
+
 val k8sPodBuilder = new PodBuilder(pod.pod)
   .editMetadata()
 .addToAnnotations(POD_GROUP_ANNOTATION, podGroupName)
diff --git a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStepSuite.scala b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStepSuite.scala
index eda1ccc..350df77 100644
--- a/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStepSuite.scala
+++ b/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/VolcanoFeatureStepSuite.scala
@@ -16,6 +16,7 @@
  */
 package org.apache.spark.deploy.k8s.features
 
+import io.fabric8.kubernetes.api.model.{ContainerBuilder, PodBuilder}
 import io.fabric8.volcano.scheduling.v1beta1.PodGroup
 
 import org.apache.spark.{SparkConf, SparkFunSuite}
@@ -57,4 +58,33 @@ class VolcanoFeatureStepSuite extends SparkFunSuite {
 val annotations = configuredPod.pod.getMetadata.getAnnotations
    assert(annotations.get("scheduling.k8s.io/group-name") === s"${kubernetesConf.appId}-podgroup")
   }
+
+  test("SPARK-38423: Support priorityClassName") {
+    // test null priority
+    val podWithNullPriority = SparkPod.initialPod()
+    assert(podWithNullPriority.pod.getSpec.getPriorityClassName === null)
+    verifyPriority(SparkPod.initialPod())
+    // test normal priority
+    val podWithPriority = SparkPod(
+      new PodBuilder()
+        .withNewMetadata()
+        .endMetadata()
+        .withNewSpec()

[spark] branch branch-3.2 updated: [SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 7eafadb  [SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README
7eafadb is described below

commit 7eafadbbd962f28bac54cbc45eef9a37fc785966
Author: William Hyun 
AuthorDate: Sun Mar 6 22:04:20 2022 -0800

[SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README

### What changes were proposed in this pull request?
This PR aims to add SBT commands to K8s IT README.

### Why are the changes needed?
This introduces the new SBT commands to developers.

### Does this PR introduce _any_ user-facing change?
No, this is a dev-only change.

### How was this patch tested?
Manual.

Closes #35745 from williamhyun/sbtdoc.

Authored-by: William Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 3bbc43d662ccfff6bd93a351fcbf96179289f58f)
Signed-off-by: Dongjoon Hyun 
---
 .../kubernetes/integration-tests/README.md | 25 ++
 1 file changed, 25 insertions(+)

diff --git a/resource-managers/kubernetes/integration-tests/README.md 
b/resource-managers/kubernetes/integration-tests/README.md
index 3a81033..a7edcf4 100644
--- a/resource-managers/kubernetes/integration-tests/README.md
+++ b/resource-managers/kubernetes/integration-tests/README.md
@@ -255,3 +255,28 @@ to the wrapper scripts and using the wrapper scripts will simply set these appro
 
   
 
+
+# Running the Kubernetes Integration Tests with SBT
+
+You can use SBT in the same way to build the image and run all K8s integration tests except the Minikube-only ones.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dtest.exclude.tags=minikube \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/test'
+
+The following is an example of rerunning the tests with the pre-built image.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dtest.exclude.tags=minikube \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/runIts'
+
+In addition, you can run a single test selectively.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/testOnly -- -z "Run SparkPi with a very long application name"'




[spark] branch master updated: [SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3bbc43d  [SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README
3bbc43d is described below

commit 3bbc43d662ccfff6bd93a351fcbf96179289f58f
Author: William Hyun 
AuthorDate: Sun Mar 6 22:04:20 2022 -0800

[SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README

### What changes were proposed in this pull request?
This PR aims to add SBT commands to K8s IT README.

### Why are the changes needed?
This introduces the new SBT commands to developers.

### Does this PR introduce _any_ user-facing change?
No, this is a dev-only change.

### How was this patch tested?
Manual.

Closes #35745 from williamhyun/sbtdoc.

Authored-by: William Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../kubernetes/integration-tests/README.md | 25 ++
 1 file changed, 25 insertions(+)

diff --git a/resource-managers/kubernetes/integration-tests/README.md 
b/resource-managers/kubernetes/integration-tests/README.md
index edd3bf5..2151b7f 100644
--- a/resource-managers/kubernetes/integration-tests/README.md
+++ b/resource-managers/kubernetes/integration-tests/README.md
@@ -269,3 +269,28 @@ to the wrapper scripts and using the wrapper scripts will simply set these appro
 
   
 
+
+# Running the Kubernetes Integration Tests with SBT
+
+You can use SBT in the same way to build the image and run all K8s integration tests except the Minikube-only ones.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dtest.exclude.tags=minikube \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/test'
+
+The following is an example of rerunning the tests with the pre-built image.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dtest.exclude.tags=minikube \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/runIts'
+
+In addition, you can run a single test selectively.
+
+build/sbt -Psparkr -Pkubernetes -Pkubernetes-integration-tests \
+-Dspark.kubernetes.test.deployMode=docker-desktop \
+-Dspark.kubernetes.test.imageTag=2022-03-06 \
+'kubernetes-integration-tests/testOnly -- -z "Run SparkPi with a very long application name"'




[spark] branch master updated (d83ab94 -> fc6b5e5)

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d83ab94  [SPARK-38419][BUILD] Replace tabs that exist in the script with spaces
 add fc6b5e5  [SPARK-38188][K8S][TESTS][FOLLOWUP] Cleanup resources in `afterEach`

No new revisions were added by this update.

Summary of changes:
 .../k8s/integrationtest/VolcanoTestsSuite.scala| 61 ++
 1 file changed, 50 insertions(+), 11 deletions(-)




[spark] branch master updated (b99f58a -> d83ab94)

2022-03-06 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b99f58a  [SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expressions with conditional statements
 add d83ab94  [SPARK-38419][BUILD] Replace tabs that exist in the script with spaces

No new revisions were added by this update.

Summary of changes:
 .../docker/src/main/dockerfiles/spark/entrypoint.sh  |  4 ++--
 sbin/spark-daemon.sh | 12 ++--
 sbin/start-master.sh |  8 
 sbin/start-mesos-dispatcher.sh   |  8 
 4 files changed, 16 insertions(+), 16 deletions(-)




[spark] branch master updated: [SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expressions with conditional statements

2022-03-06 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new b99f58a  [SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expressions with conditional statements
b99f58a is described below

commit b99f58a57c880ed9cdec3d37ac8683c31daa4c10
Author: yangjie01 
AuthorDate: Sun Mar 6 19:26:45 2022 -0600

[SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expressions with conditional statements

### What changes were proposed in this pull request?
This PR uses conditional statements to simplify pattern matches on boolean expressions:

**Before**

```scala
val bool: Boolean
bool match {
  case true => // do something when bool is true
  case false => // do something when bool is false
}
```

**After**

```scala
val bool: Boolean
if (bool) {
  // do something when bool is true
} else {
  // do something when bool is false
}
```
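
As a runnable toy version of the same rewrite (the names below are made up for illustration):

```scala
// Before: a pattern match on a Boolean value.
def label(flag: Boolean): String = flag match {
  case true => "enabled"
  case false => "disabled"
}

// After: the equivalent conditional expression.
def labelRewritten(flag: Boolean): String = if (flag) "enabled" else "disabled"
```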

### Why are the changes needed?
Simplify unnecessary pattern matches.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Passes GA.

Closes #35589 from LuciferYang/trivial-match.

Authored-by: yangjie01 
Signed-off-by: Sean Owen 
---
 .../BlockManagerDecommissionIntegrationSuite.scala |  7 +--
 .../catalyst/expressions/datetimeExpressions.scala | 50 +++---
 .../spark/sql/catalyst/parser/AstBuilder.scala | 14 +++---
 .../sql/internal/ExecutorSideSQLConfSuite.scala|  7 +--
 .../streaming/FlatMapGroupsWithStateSuite.scala|  7 +--
 5 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala b/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala
index 8999a12..e004c33 100644
--- a/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala
+++ b/core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala
@@ -165,9 +165,10 @@ class BlockManagerDecommissionIntegrationSuite extends SparkFunSuite with LocalS
       }
       x.map(y => (y, y))
     }
-    val testRdd = shuffle match {
-      case true => baseRdd.reduceByKey(_ + _)
-      case false => baseRdd
+    val testRdd = if (shuffle) {
+      baseRdd.reduceByKey(_ + _)
+    } else {
+      baseRdd
     }
 
     // Listen for the job & block updates
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
index 8b5a387..d8cf474 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
@@ -2903,25 +2903,25 @@ case class SubtractTimestamps(
   @transient private lazy val zoneIdInEval: ZoneId = zoneIdForType(left.dataType)
 
   @transient
-  private lazy val evalFunc: (Long, Long) => Any = legacyInterval match {
-    case false => (leftMicros, rightMicros) =>
-      subtractTimestamps(leftMicros, rightMicros, zoneIdInEval)
-    case true => (leftMicros, rightMicros) =>
+  private lazy val evalFunc: (Long, Long) => Any = if (legacyInterval) {
+    (leftMicros, rightMicros) =>
       new CalendarInterval(0, 0, leftMicros - rightMicros)
+  } else {
+    (leftMicros, rightMicros) =>
+      subtractTimestamps(leftMicros, rightMicros, zoneIdInEval)
   }
 
   override def nullSafeEval(leftMicros: Any, rightMicros: Any): Any = {
     evalFunc(leftMicros.asInstanceOf[Long], rightMicros.asInstanceOf[Long])
   }
 
-  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = legacyInterval match {
-    case false =>
-      val zid = ctx.addReferenceObj("zoneId", zoneIdInEval, classOf[ZoneId].getName)
-      val dtu = DateTimeUtils.getClass.getName.stripSuffix("$")
-      defineCodeGen(ctx, ev, (l, r) => s"""$dtu.subtractTimestamps($l, $r, $zid)""")
-    case true =>
-      defineCodeGen(ctx, ev, (end, start) =>
-        s"new org.apache.spark.unsafe.types.CalendarInterval(0, 0, $end - $start)")
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = if (legacyInterval) {
+    defineCodeGen(ctx, ev, (end, start) =>
+      s"new org.apache.spark.unsafe.types.CalendarInterval(0, 0, $end - $start)")
+  } else {
+    val zid = ctx.addReferenceObj("zoneId", zoneIdInEval, classOf[ZoneId].getName)
+    val dtu = DateTimeUtils.getClass.getName.stripSuffix("$")
+    defineCodeGen(ctx, ev, (l, r) => s"""$dtu.subtractTimestamps($l, $r, $zid)""")
   }
 
   override def toString: String = s"($left - $right)"
@@ -2961,26 

[spark] branch master updated: [SPARK-38394][BUILD] Upgrade `scala-maven-plugin` to 4.4.0 for Hadoop 3 profile

2022-03-06 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3175d83  [SPARK-38394][BUILD] Upgrade `scala-maven-plugin` to 4.4.0 for Hadoop 3 profile
3175d83 is described below

commit 3175d830cb029d41909de8960aa790d4272aa188
Author: Steve Loughran 
AuthorDate: Sun Mar 6 19:23:31 2022 -0600

[SPARK-38394][BUILD] Upgrade `scala-maven-plugin` to 4.4.0 for Hadoop 3 profile

### What changes were proposed in this pull request?

This sets `scala-maven-plugin.version` to 4.4.0 except when the hadoop-2.7 profile is used, because SPARK-36547 shows that only 4.3.0 works there.

### Why are the changes needed?

1. If you try to build against a local snapshot of Hadoop trunk with `-Dhadoop.version=3.4.0-SNAPSHOT`, the build fails with the error shown in the JIRA.
2. Upgrading the scala-maven-plugin version fixes this. It is a plugin issue.
3. The version is made configurable so the hadoop-2.7 profile can switch back to the one which works there.

As to why this only surfaces when compiling Hadoop trunk, or why hadoop-2.7 only works with the older plugin, who knows. They both look certificate-related, which is interesting. Maybe something related to signed JARs?

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

By successfully building Spark against a local build of Hadoop 3.4.0-SNAPSHOT.

Closes #35725 from steveloughran/SPARK-38394-compiler-version.

Authored-by: Steve Loughran 
Signed-off-by: Sean Owen 
---
 pom.xml | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/pom.xml b/pom.xml
index 176d3af..8e03167 100644
--- a/pom.xml
+++ b/pom.xml
@@ -163,6 +163,10 @@
     <scala.version>2.12.15</scala.version>
     <scala.binary.version>2.12</scala.binary.version>
     <scalatest-maven-plugin.version>2.0.2</scalatest-maven-plugin.version>
+
+
+    <scala-maven-plugin.version>4.4.0</scala-maven-plugin.version>
     <scalafmt.parameters>--test</scalafmt.parameters>
 
     <scalafmt.skip>true</scalafmt.skip>
@@ -2775,8 +2779,7 @@
       <plugin>
         <groupId>net.alchim31.maven</groupId>
         <artifactId>scala-maven-plugin</artifactId>
-
-        <version>4.3.0</version>
+        <version>${scala-maven-plugin.version}</version>
         <executions>
           <execution>
             <id>eclipse-add-source</id>
@@ -3430,6 +3433,7 @@
         <hadoop-client-api.artifact>hadoop-client</hadoop-client-api.artifact>
         <hadoop-client-runtime.artifact>hadoop-yarn-api</hadoop-client-runtime.artifact>
         <hadoop-client-minicluster.artifact>hadoop-client</hadoop-client-minicluster.artifact>
+        <scala-maven-plugin.version>4.3.0</scala-maven-plugin.version>
       </properties>
 




[spark] branch branch-3.2 updated: [SPARK-38416][PYTHON][TESTS] Change day to month

2022-03-06 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 1406d0c  [SPARK-38416][PYTHON][TESTS] Change day to month
1406d0c is described below

commit 1406d0cc744ede2a2beb58f22040d0e05582e776
Author: bjornjorgensen 
AuthorDate: Mon Mar 7 09:00:06 2022 +0900

[SPARK-38416][PYTHON][TESTS] Change day to month

### What changes were proposed in this pull request?
Right now we have two functions that are testing the same thing.

### Why are the changes needed?
To test both day and month.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Got the green light.

Closes #35741 from bjornjorgensen/change-day-to-month.

Authored-by: bjornjorgensen 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit b6516174a84d849bd620417dca9e0a81e0d3b5dc)
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/pandas/tests/indexes/test_datetime.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/pandas/tests/indexes/test_datetime.py b/python/pyspark/pandas/tests/indexes/test_datetime.py
index e3bf14e..85a2b21 100644
--- a/python/pyspark/pandas/tests/indexes/test_datetime.py
+++ b/python/pyspark/pandas/tests/indexes/test_datetime.py
@@ -120,7 +120,7 @@ class DatetimeIndexTest(PandasOnSparkTestCase, TestUtils):
 
     def test_month_name(self):
         for psidx, pidx in self.idx_pairs:
-            self.assert_eq(psidx.day_name(), pidx.day_name())
+            self.assert_eq(psidx.month_name(), pidx.month_name())
 
     def test_normalize(self):
         for psidx, pidx in self.idx_pairs:




[spark] branch master updated (135841f -> b651617)

2022-03-06 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 135841f  [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs
 add b651617  [SPARK-38416][PYTHON][TESTS] Change day to month

No new revisions were added by this update.

Summary of changes:
 python/pyspark/pandas/tests/indexes/test_datetime.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)




[spark] branch branch-3.0 updated: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new e036de3  [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs
e036de3 is described below

commit e036de326bdc6bc828eee910861851d52c81f6d5
Author: Cheng Pan 
AuthorDate: Sun Mar 6 15:41:20 2022 -0800

[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

### What changes were proposed in this pull request?

Use UTF-8 instead of the system default encoding to read event logs.

### Why are the changes needed?

After SPARK-29160, we should always use UTF-8 to read event logs; otherwise, if the Spark History Server runs with a default charset other than UTF-8, it will encounter an error such as:

```
2022-03-04 12:16:00,143 [3752440] - INFO  [log-replay-executor-19:Logging57] - Parsing hdfs://hz-cluster11/spark2-history/application_1640597251469_2453817_1.lz4 for listing data...
2022-03-04 12:16:00,145 [3752442] - ERROR [log-replay-executor-18:Logging94] - Exception while merging application listings
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:74)
    at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:884)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:511)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:82)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4(FsHistoryProvider.scala:819)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4$adapted(FsHistoryProvider.scala:801)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2626)
    at org.apache.spark.deploy.history.FsHistoryProvider.doMergeApplicationListing(FsHistoryProvider.scala:801)
    at org.apache.spark.deploy.history.FsHistoryProvider.mergeApplicationListing(FsHistoryProvider.scala:715)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$15(FsHistoryProvider.scala:581)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
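
For illustration only, a minimal sketch of the failure mode and the fix — not Spark's code, names made up — mirroring the `scala.io.BufferedSource` path in the trace above:

```scala
import scala.io.{Codec, Source}

object Utf8ReadSketch {
  // Fragile: scala.io.Source picks up the JVM default charset through the
  // implicit Codec; under LC_ALL=POSIX a non-ASCII event log line makes the
  // decoder throw MalformedInputException.
  def linesDefault(path: String): Iterator[String] =
    Source.fromFile(path).getLines()

  // Robust: pin UTF-8 explicitly, matching the encoding event logs are
  // written with.
  def linesUtf8(path: String): Iterator[String] =
    Source.fromFile(path)(Codec.UTF8).getLines()
}
```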

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

Verification steps in ubuntu:20.04

1. build `spark-3.3.0-SNAPSHOT-bin-master.tgz` on commit `34618a7ef6` using `dev/make-distribution.sh --tgz --name master`
2. build `spark-3.3.0-SNAPSHOT-bin-SPARK-38411.tgz` on commit `2a8f56038b` using `dev/make-distribution.sh --tgz --name SPARK-38411`
3. switch to UTF-8 using `export LC_ALL=C.UTF-8 && bash`
4. generate an event log that contains non-ASCII chars:
```
bin/spark-submit \
    --master local[*] \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.eventLog.enabled=true \
    --conf spark.user.key='计算圆周率' \
    examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar
```
5. switch to POSIX using `export LC_ALL=POSIX && bash`
6. run `spark-3.3.0-SNAPSHOT-bin-master/sbin/start-history-server.sh` and watch the logs


```
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /spark-3.3.0-SNAPSHOT-bin-master/conf/:/spark-3.3.0-SNAPSHOT-bin-master/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
22/03/06 13:37:19 INFO HistoryServer: Started daemon with process name: 48729c3ffc10aa9
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for TERM
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for HUP
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for INT
22/03/06 13:37:21 WARN NativeCodeLoader: 

[spark] branch branch-3.1 updated: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 8d70d5d  [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs
8d70d5d is described below

commit 8d70d5da3d74ebdd612b2cdc38201e121b88b24f
Author: Cheng Pan 
AuthorDate: Sun Mar 6 15:41:20 2022 -0800

[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

### What changes were proposed in this pull request?

Use UTF-8 instead of the system default encoding to read event logs.

### Why are the changes needed?

After SPARK-29160, we should always use UTF-8 to read event logs; otherwise, if the Spark History Server runs with a default charset other than UTF-8, it will encounter an error such as:

```
2022-03-04 12:16:00,143 [3752440] - INFO  [log-replay-executor-19:Logging57] - Parsing hdfs://hz-cluster11/spark2-history/application_1640597251469_2453817_1.lz4 for listing data...
2022-03-04 12:16:00,145 [3752442] - ERROR [log-replay-executor-18:Logging94] - Exception while merging application listings
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:74)
    at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:884)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:511)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:82)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4(FsHistoryProvider.scala:819)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4$adapted(FsHistoryProvider.scala:801)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2626)
    at org.apache.spark.deploy.history.FsHistoryProvider.doMergeApplicationListing(FsHistoryProvider.scala:801)
    at org.apache.spark.deploy.history.FsHistoryProvider.mergeApplicationListing(FsHistoryProvider.scala:715)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$15(FsHistoryProvider.scala:581)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

Verification steps in ubuntu:20.04

1. build `spark-3.3.0-SNAPSHOT-bin-master.tgz` on commit `34618a7ef6` using `dev/make-distribution.sh --tgz --name master`
2. build `spark-3.3.0-SNAPSHOT-bin-SPARK-38411.tgz` on commit `2a8f56038b` using `dev/make-distribution.sh --tgz --name SPARK-38411`
3. switch to UTF-8 using `export LC_ALL=C.UTF-8 && bash`
4. generate an event log that contains non-ASCII chars:
```
bin/spark-submit \
    --master local[*] \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.eventLog.enabled=true \
    --conf spark.user.key='计算圆周率' \
    examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar
```
5. switch to POSIX using `export LC_ALL=POSIX && bash`
6. run `spark-3.3.0-SNAPSHOT-bin-master/sbin/start-history-server.sh` and watch the logs


```
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /spark-3.3.0-SNAPSHOT-bin-master/conf/:/spark-3.3.0-SNAPSHOT-bin-master/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
22/03/06 13:37:19 INFO HistoryServer: Started daemon with process name: 48729c3ffc10aa9
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for TERM
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for HUP
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for INT
22/03/06 13:37:21 WARN NativeCodeLoader: 

[spark] branch branch-3.2 updated: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 56ddf50  [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs
56ddf50 is described below

commit 56ddf50e20bd38f37d6d037b97c1b1d59100116b
Author: Cheng Pan 
AuthorDate: Sun Mar 6 15:41:20 2022 -0800

[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

### What changes were proposed in this pull request?

Use UTF-8 instead of the system default encoding to read event logs.

### Why are the changes needed?

After SPARK-29160, we should always use UTF-8 to read event logs; otherwise, if the Spark History Server runs with a default charset other than UTF-8, it will encounter an error such as:

```
2022-03-04 12:16:00,143 [3752440] - INFO  [log-replay-executor-19:Logging57] - Parsing hdfs://hz-cluster11/spark2-history/application_1640597251469_2453817_1.lz4 for listing data...
2022-03-04 12:16:00,145 [3752442] - ERROR [log-replay-executor-18:Logging94] - Exception while merging application listings
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:74)
    at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:884)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:511)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:82)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4(FsHistoryProvider.scala:819)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4$adapted(FsHistoryProvider.scala:801)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2626)
    at org.apache.spark.deploy.history.FsHistoryProvider.doMergeApplicationListing(FsHistoryProvider.scala:801)
    at org.apache.spark.deploy.history.FsHistoryProvider.mergeApplicationListing(FsHistoryProvider.scala:715)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$15(FsHistoryProvider.scala:581)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

Verification steps in ubuntu:20.04

1. build `spark-3.3.0-SNAPSHOT-bin-master.tgz` on commit `34618a7ef6` using `dev/make-distribution.sh --tgz --name master`
2. build `spark-3.3.0-SNAPSHOT-bin-SPARK-38411.tgz` on commit `2a8f56038b` using `dev/make-distribution.sh --tgz --name SPARK-38411`
3. switch to UTF-8 using `export LC_ALL=C.UTF-8 && bash`
4. generate an event log that contains non-ASCII chars:
```
bin/spark-submit \
    --master local[*] \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.eventLog.enabled=true \
    --conf spark.user.key='计算圆周率' \
    examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar
```
5. switch to POSIX using `export LC_ALL=POSIX && bash`
6. run `spark-3.3.0-SNAPSHOT-bin-master/sbin/start-history-server.sh` and watch the logs


```
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /spark-3.3.0-SNAPSHOT-bin-master/conf/:/spark-3.3.0-SNAPSHOT-bin-master/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
22/03/06 13:37:19 INFO HistoryServer: Started daemon with process name: 48729c3ffc10aa9
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for TERM
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for HUP
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for INT
22/03/06 13:37:21 WARN NativeCodeLoader: 

[spark] branch master updated: [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

2022-03-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 135841f  [SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs
135841f is described below

commit 135841f257fbb008aef211a5e38222940849cb26
Author: Cheng Pan 
AuthorDate: Sun Mar 6 15:41:20 2022 -0800

[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingInternal` reads event logs

### What changes were proposed in this pull request?

Use UTF-8 instead of the system default encoding to read event logs.

### Why are the changes needed?

After SPARK-29160, we should always use UTF-8 to read event logs; otherwise, if the Spark History Server runs with a default charset other than UTF-8, it will encounter an error such as:

```
2022-03-04 12:16:00,143 [3752440] - INFO  [log-replay-executor-19:Logging57] - Parsing hdfs://hz-cluster11/spark2-history/application_1640597251469_2453817_1.lz4 for listing data...
2022-03-04 12:16:00,145 [3752442] - ERROR [log-replay-executor-18:Logging94] - Exception while merging application listings
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:74)
    at scala.collection.Iterator$$anon$20.hasNext(Iterator.scala:884)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:511)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:82)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4(FsHistoryProvider.scala:819)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$doMergeApplicationListing$4$adapted(FsHistoryProvider.scala:801)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2626)
    at org.apache.spark.deploy.history.FsHistoryProvider.doMergeApplicationListing(FsHistoryProvider.scala:801)
    at org.apache.spark.deploy.history.FsHistoryProvider.mergeApplicationListing(FsHistoryProvider.scala:715)
    at org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$15(FsHistoryProvider.scala:581)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

### Does this PR introduce _any_ user-facing change?

Yes, bug fix.

### How was this patch tested?

Verification steps in ubuntu:20.04

1. build `spark-3.3.0-SNAPSHOT-bin-master.tgz` on commit `34618a7ef6` using `dev/make-distribution.sh --tgz --name master`
2. build `spark-3.3.0-SNAPSHOT-bin-SPARK-38411.tgz` on commit `2a8f56038b` using `dev/make-distribution.sh --tgz --name SPARK-38411`
3. switch to UTF-8 using `export LC_ALL=C.UTF-8 && bash`
4. generate an event log that contains non-ASCII chars:
```
bin/spark-submit \
    --master local[*] \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.eventLog.enabled=true \
    --conf spark.user.key='计算圆周率' \
    examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar
```
5. switch to POSIX using `export LC_ALL=POSIX && bash`
6. run `spark-3.3.0-SNAPSHOT-bin-master/sbin/start-history-server.sh` and watch the logs


```
Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /spark-3.3.0-SNAPSHOT-bin-master/conf/:/spark-3.3.0-SNAPSHOT-bin-master/jars/* -Xmx1g org.apache.spark.deploy.history.HistoryServer

Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
22/03/06 13:37:19 INFO HistoryServer: Started daemon with process name: 48729c3ffc10aa9
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for TERM
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for HUP
22/03/06 13:37:19 INFO SignalUtils: Registering signal handler for INT
22/03/06 13:37:21 WARN NativeCodeLoader: Unable to 

[spark] branch master updated (18219d4 -> 69bc9d1)

2022-03-06 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 18219d4  [SPARK-37400][SPARK-37426][PYTHON][MLLIB] Inline type hints for pyspark.mllib classification and regression
 add 69bc9d1  [SPARK-38239][PYTHON][MLLIB] Fix pyspark.mllib.LogisticRegressionModel.__repr__

No new revisions were added by this update.

Summary of changes:
 python/pyspark/mllib/classification.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org