[spark] branch master updated (78ed4cc -> 7630787)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 78ed4cc [SPARK-38575][INFRA] Deduplicate branch specification in GitHub Actions workflow add 7630787 [SPARK-38575][INFRA][FOLLOW-UP] Fix ** to '**' in ansi_sql_mode_test.yml No new revisions were added by this update. Summary of changes: .github/workflows/ansi_sql_mode_test.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7d1ff01 -> 78ed4cc)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7d1ff01 [SPARK-38556][PYTHON] Disable Pandas usage logging for method calls inside @contextmanager functions add 78ed4cc [SPARK-38575][INFRA] Deduplicate branch specification in GitHub Actions workflow No new revisions were added by this update. Summary of changes: .github/workflows/ansi_sql_mode_test.yml | 2 +- .github/workflows/build_and_test.yml | 21 +++-- 2 files changed, 12 insertions(+), 11 deletions(-)
[spark] branch branch-3.3 updated: [SPARK-38556][PYTHON] Disable Pandas usage logging for method calls inside @contextmanager functions
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new c284faa [SPARK-38556][PYTHON] Disable Pandas usage logging for method calls inside @contextmanager functions c284faa is described below commit c284faad2d7d3b813c1c94c612b814c129b6dad3 Author: Yihong He AuthorDate: Thu Mar 17 10:03:42 2022 +0900 [SPARK-38556][PYTHON] Disable Pandas usage logging for method calls inside @contextmanager functions ### What changes were proposed in this pull request? Wrap the AbstractContextManager returned by @contextmanager-decorated functions when they are called. The comment in the code change explains why it uses a wrapper class instead of wrapping the functions of AbstractContextManager directly. ### Why are the changes needed? Currently, method calls inside contextmanager functions are treated as external for **with** statements. For example, the code below records config.set_option calls inside ps.option_context(...) ```python with ps.option_context("compute.ops_on_diff_frames", True): pass ``` We should disable usage logging for calls inside contextmanager functions to improve the accuracy of the usage data. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Existing tests - Manual test by running `./bin/pyspark` and verified the output: ``` >>> sc.setLogLevel("info") >>> import pyspark.pandas as ps 22/03/15 17:10:50 INFO Log4jUsageLogger: pandasOnSparkImported=1.0, tags=List(), blob= >>> with ps.option_context("compute.ops_on_diff_frames", True): ... pass ... 
22/03/15 17:11:17 INFO Log4jUsageLogger: pandasOnSparkFunctionCalled=1.0, tags=List(pandasOnSparkFunction=option_context(*args: Any) -> Iterator[NoneType], className=config, status=success), blob={"duration": 0.161525994123} 22/03/15 17:11:18 INFO Log4jUsageLogger: initialConfigLogging=1.0, tags=List(sparkApplicationId=local-1647360645198, sparkExecutionId=null, sparkJobGroupId=null), blob={"spark.sql.warehouse.dir":"file:/Users/yihong.he/spark/spark-warehouse","spark.executor.extraJavaOptions":"-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL [...] 22/03/15 17:11:19 INFO Log4jUsageLogger: pandasOnSparkFunctionCalled=1.0, tags=List(pandasOnSparkFunction=option_context.__enter__(), className=config, status=success), blob={"duration": 1594.156939978} 22/03/15 17:11:19 INFO Log4jUsageLogger: pandasOnSparkFunctionCalled=1.0, tags=List(pandasOnSparkFunction=option_context.__exit__(type, value, traceback), className=config, status=success), blob={"duration": 12.61017002086} ``` Closes #35861 from heyihong/SPARK-38556. 
Authored-by: Yihong He Signed-off-by: Hyukjin Kwon (cherry picked from commit 7d1ff01299c88a1aadfac032ea0b3ef87f4ae50d) Signed-off-by: Hyukjin Kwon --- python/pyspark/instrumentation_utils.py | 30 ++ 1 file changed, 30 insertions(+) diff --git a/python/pyspark/instrumentation_utils.py b/python/pyspark/instrumentation_utils.py index 908f5cb..b9aacf6 100644 --- a/python/pyspark/instrumentation_utils.py +++ b/python/pyspark/instrumentation_utils.py @@ -21,6 +21,7 @@ import inspect import threading import importlib import time +from contextlib import AbstractContextManager from types import ModuleType from typing import Tuple, Union, List, Callable, Any, Type @@ -30,6 +31,24 @@ __all__: List[str] = [] _local = threading.local() +class _WrappedAbstractContextManager(AbstractContextManager): +def __init__( +self, acm: AbstractContextManager, class_name: str, function_name: str, logger: Any +): +self._enter_func = _wrap_function( +class_name, "{}.__enter__".format(function_name), acm.__enter__, logger +) +self._exit_func = _wrap_function( +class_name, "{}.__exit__".format(function_name), acm.__exit__, logger +) + +def __enter__(self): # type: ignore[no-untyped-def] +return self._enter_func() + +def __exit__(self, exc_type, exc_val, exc_tb): # type: ignore[no-untyped-def] +return self._exit_func(exc_type, exc_val, exc_tb) + + def _wrap_function(class_name: str, function_name: str, func: Callable, logger: Any) -> Callable: signature = inspect.signature(func) @@ -44,6 +63,17 @@ def _wrap_function(class_name: str, function_name: str, func: Callable, logger: start = time.perf_counter() try: res = func(*args, **kwargs) +if isinst
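The wrapper class added by this patch can be illustrated with a self-contained sketch. Here `_wrap_function` is a toy stand-in for the real helper in `pyspark/instrumentation_utils.py` (which reports to the usage logger); the point is that only `__enter__` and `__exit__` are instrumented, so calls made inside the `@contextmanager` body no longer show up as if they were user calls:

```python
import time
from contextlib import AbstractContextManager, contextmanager


def _wrap_function(class_name, function_name, func, logger):
    # Toy stand-in for pyspark.instrumentation_utils._wrap_function:
    # times the call and records (class, function, duration) in `logger`.
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        res = func(*args, **kwargs)
        logger.append((class_name, function_name, time.perf_counter() - start))
        return res
    return wrapper


class _WrappedAbstractContextManager(AbstractContextManager):
    # Wraps an existing context manager so that only its __enter__ and
    # __exit__ are logged, not the work done inside the @contextmanager body.
    def __init__(self, acm, class_name, function_name, logger):
        self._enter_func = _wrap_function(
            class_name, f"{function_name}.__enter__", acm.__enter__, logger)
        self._exit_func = _wrap_function(
            class_name, f"{function_name}.__exit__", acm.__exit__, logger)

    def __enter__(self):
        return self._enter_func()

    def __exit__(self, exc_type, exc_val, exc_tb):
        return self._exit_func(exc_type, exc_val, exc_tb)


@contextmanager
def option_context():
    # Hypothetical stand-in for ps.option_context: set_option calls made
    # here would previously have been logged as external calls.
    yield


log = []
with _WrappedAbstractContextManager(option_context(), "config", "option_context", log):
    pass
print([entry[1] for entry in log])
```

This mirrors the log lines shown in the manual test above, where only `option_context.__enter__()` and `option_context.__exit__(...)` are recorded.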
[spark] branch master updated (b348acd -> 7d1ff01)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b348acd [SPARK-38441][PYTHON] Support string and bool `regex` in `Series.replace` add 7d1ff01 [SPARK-38556][PYTHON] Disable Pandas usage logging for method calls inside @contextmanager functions No new revisions were added by this update. Summary of changes: python/pyspark/instrumentation_utils.py | 30 ++ 1 file changed, 30 insertions(+)
[spark] branch master updated (b16a9e9 -> b348acd)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b16a9e9 [SPARK-38572][BUILD] Setting version to 3.4.0-SNAPSHOT add b348acd [SPARK-38441][PYTHON] Support string and bool `regex` in `Series.replace` No new revisions were added by this update. Summary of changes: python/pyspark/pandas/series.py| 72 ++ python/pyspark/pandas/tests/test_series.py | 29 ++-- 2 files changed, 89 insertions(+), 12 deletions(-)
[spark] branch master updated (6d3e8eb -> b16a9e9)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d3e8eb [SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create clientPools quickly in the TransportClientFactory add b16a9e9 [SPARK-38572][BUILD] Setting version to 3.4.0-SNAPSHOT No new revisions were added by this update. Summary of changes: R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml | 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml | 2 +- common/network-yarn/pom.xml| 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml| 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/avro/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml| 2 +- external/kafka-0-10-token-provider/pom.xml | 2 +- external/kafka-0-10/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml| 2 +- graphx/pom.xml | 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml| 2 +- mllib/pom.xml | 2 +- pom.xml| 2 +- project/MimaExcludes.scala | 5 + python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/kubernetes/integration-tests/pom.xml | 2 +- resource-managers/mesos/pom.xml| 2 +- resource-managers/yarn/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 40 files changed, 45 insertions(+), 40 deletions(-)
[spark] branch master updated: [SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create clientPools quickly in the TransportClientFactory
This is an automated email from the ASF dual-hosted git repository. mridulm80 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6d3e8eb [SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create clientPools quickly in the TransportClientFactory 6d3e8eb is described below commit 6d3e8eba055bb2809f17d74aa3442b18bf7beb16 Author: weixiuli AuthorDate: Wed Mar 16 17:01:33 2022 -0500 [SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create clientPools quickly in the TransportClientFactory ### What changes were proposed in this pull request? Avoid contention and get or create clientPools quickly in the TransportClientFactory. ### Why are the changes needed? Avoid contention for getting or creating clientPools, and clean up the code. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing unittests. Closes #35860 from weixiuli/SPARK-38555-NETWORK. Authored-by: weixiuli Signed-off-by: Mridul Muralidharan --- .../org/apache/spark/network/client/TransportClientFactory.java | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java b/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java index 43408d4..6fb9923 100644 --- a/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java +++ b/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java @@ -155,12 +155,8 @@ public class TransportClientFactory implements Closeable { InetSocketAddress.createUnresolved(remoteHost, remotePort); // Create the ClientPool if we don't have it yet. 
-ClientPool clientPool = connectionPool.get(unresolvedAddress); -if (clientPool == null) { - connectionPool.putIfAbsent(unresolvedAddress, new ClientPool(numConnectionsPerPeer)); - clientPool = connectionPool.get(unresolvedAddress); -} - +ClientPool clientPool = connectionPool.computeIfAbsent(unresolvedAddress, +key -> new ClientPool(numConnectionsPerPeer)); int clientIndex = rand.nextInt(numConnectionsPerPeer); TransportClient cachedClient = clientPool.clients[clientIndex];
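The diff above replaces a check-then-act sequence (get, putIfAbsent, get again) with a single map operation. A hypothetical Python stand-in for the two shapes — the real code uses Java's `ConcurrentHashMap`, whose `computeIfAbsent` locks only the affected bin rather than a global lock as this sketch does:

```python
import threading


class ClientPool:
    def __init__(self, num_connections):
        self.clients = [None] * num_connections


class ConnectionPool:
    # Hypothetical model of the factory's connectionPool map.
    def __init__(self):
        self._pools = {}
        self._lock = threading.Lock()

    def get_or_create_old(self, addr, n):
        # Old shape: correct, but touches the map up to three times and may
        # construct a ClientPool that is immediately thrown away when another
        # thread won the putIfAbsent race.
        pool = self._pools.get(addr)
        if pool is None:
            self._pools.setdefault(addr, ClientPool(n))  # putIfAbsent analogue
            pool = self._pools.get(addr)
        return pool

    def get_or_create(self, addr, n):
        # New shape: one computeIfAbsent-style operation; the pool is only
        # constructed when the key is genuinely missing.
        with self._lock:
            if addr not in self._pools:
                self._pools[addr] = ClientPool(n)
            return self._pools[addr]


pool = ConnectionPool()
a = pool.get_or_create(("shuffle-host", 7337), 2)
b = pool.get_or_create(("shuffle-host", 7337), 2)
assert a is b  # exactly one pool per remote address
```

Either shape returns the same pool for repeated lookups; the win is fewer map operations and no wasted `ClientPool` construction on the hot path.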
[spark] branch master updated (4ff40c1 -> 5967f29)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4ff40c1 [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers` add 5967f29 [SPARK-38545][BUILD] Upgrade scala-maven-plugin from 4.4.0 to 4.5.6 No new revisions were added by this update. Summary of changes: pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated: [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers`
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4ff40c1 [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers` 4ff40c1 is described below commit 4ff40c10f02f6e0735ce6554f7338489d8555bce Author: Yikun Jiang AuthorDate: Wed Mar 16 11:12:54 2022 -0700 [SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers` ### What changes were proposed in this pull request? This PR adds documentation for the basic framework capability for customized Kubernetes schedulers. ### Why are the changes needed? Guide users on how to use a custom scheduler with Spark on Kubernetes. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Closes #35869 from Yikun/SPARK-38561. Authored-by: Yikun Jiang Signed-off-by: Holden Karau --- docs/running-on-kubernetes.md | 19 +++ 1 file changed, 19 insertions(+) diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index de37e22..d1b2fcd 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -1713,6 +1713,25 @@ spec: image: will-be-overwritten ``` + Customized Kubernetes Schedulers for Spark on Kubernetes + +Spark allows users to specify a custom Kubernetes scheduler. + +1. Specify scheduler name. + + Users can specify a custom scheduler using spark.kubernetes.scheduler.name or + spark.kubernetes.{driver/executor}.scheduler.name configuration. + +2. Specify scheduler related configurations. + + To configure the custom scheduler the user can use [Pod templates](#pod-template), add labels (spark.kubernetes.{driver,executor}.label.*) and/or annotations (spark.kubernetes.{driver/executor}.annotation.*). + +3. Specify scheduler feature step. 
+ + Users may also consider using spark.kubernetes.{driver/executor}.pod.featureSteps to support more complex requirements, including but not limited to: + - Create additional Kubernetes custom resources for driver/executor scheduling. + - Set scheduler hints according to configuration or existing Pod info dynamically. + ### Stage Level Scheduling Overview Stage level scheduling is supported on Kubernetes when dynamic allocation is enabled. This also requires spark.dynamicAllocation.shuffleTracking.enabled to be enabled since Kubernetes doesn't support an external shuffle service at this time. The order in which containers for different profiles are requested from Kubernetes is not guaranteed. Note that since dynamic allocation on Kubernetes requires the shuffle tracking feature, this means that executors from previous stages t [...]
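The three documented steps amount to a handful of `spark.kubernetes.*` settings. A sketch of assembling them into a `spark-submit` invocation — the scheduler name `volcano`, the label/annotation names, and the feature-step class are illustrative placeholders, not values from this commit:

```python
# Placeholder values throughout: swap in your scheduler's actual name,
# labels/annotations, and feature-step class.
conf = {
    # 1. Scheduler name (also settable per role via
    #    spark.kubernetes.{driver,executor}.scheduler.name).
    "spark.kubernetes.scheduler.name": "volcano",
    # 2. Scheduler-related labels and/or annotations.
    "spark.kubernetes.driver.label.queue": "batch",
    "spark.kubernetes.driver.annotation.example.com/priority": "high",
    # 3. Feature steps for more complex requirements (custom resources,
    #    dynamic scheduler hints, ...).
    "spark.kubernetes.driver.pod.featureSteps": "org.example.MySchedulerFeatureStep",
}

# Render the settings as --conf flags for spark-submit.
cmd = ["spark-submit"] + [
    arg for key, value in sorted(conf.items())
    for arg in ("--conf", f"{key}={value}")
]
print(" ".join(cmd))
```

Pod templates remain an alternative for scheduler-specific pod fields that have no dedicated Spark configuration key.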
[spark] branch branch-3.3 updated: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable
This is an automated email from the ASF dual-hosted git repository. tgraves pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 8405ec3 [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable 8405ec3 is described below commit 8405ec352dbed6a3199fc2af3c60fae7186d15b5 Author: Adam Binford AuthorDate: Wed Mar 16 10:54:18 2022 -0500 [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable ### What changes were proposed in this pull request? Add a new config to set the memory overhead factor for drivers and executors. Currently the memory overhead is hard coded to 10% (except in Kubernetes), and the only way to set it higher is to set it to a specific memory amount. ### Why are the changes needed? In dynamic environments where different people or use cases need different memory requirements, it would be helpful to set a higher memory overhead factor instead of having to set a higher specific memory overhead value. The kubernetes resource manager already makes this configurable. This makes it configurable across the board. ### Does this PR introduce _any_ user-facing change? No change to default behavior, just adds a new config users can change. ### How was this patch tested? New UT to check the memory calculation. Closes #35504 from Kimahriman/yarn-configurable-memory-overhead-factor. 
Authored-by: Adam Binford Signed-off-by: Thomas Graves (cherry picked from commit 71e2110b799220adc107c9ac5ce737281f2b65cc) Signed-off-by: Thomas Graves --- .../main/scala/org/apache/spark/SparkConf.scala| 4 +- .../org/apache/spark/internal/config/package.scala | 28 ++ docs/configuration.md | 30 ++- docs/running-on-kubernetes.md | 9 .../k8s/features/BasicDriverFeatureStep.scala | 13 +++-- .../k8s/features/BasicExecutorFeatureStep.scala| 7 ++- .../k8s/features/BasicDriverFeatureStepSuite.scala | 63 -- .../features/BasicExecutorFeatureStepSuite.scala | 54 +++ .../spark/deploy/rest/mesos/MesosRestServer.scala | 5 +- .../cluster/mesos/MesosSchedulerUtils.scala| 9 ++-- .../deploy/rest/mesos/MesosRestServerSuite.scala | 8 ++- .../org/apache/spark/deploy/yarn/Client.scala | 14 +++-- .../apache/spark/deploy/yarn/YarnAllocator.scala | 5 +- .../spark/deploy/yarn/YarnSparkHadoopUtil.scala| 5 +- .../spark/deploy/yarn/YarnAllocatorSuite.scala | 29 ++ 15 files changed, 248 insertions(+), 35 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/SparkConf.scala b/core/src/main/scala/org/apache/spark/SparkConf.scala index 5f37a1a..cf12174 100644 --- a/core/src/main/scala/org/apache/spark/SparkConf.scala +++ b/core/src/main/scala/org/apache/spark/SparkConf.scala @@ -636,7 +636,9 @@ private[spark] object SparkConf extends Logging { DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0", "Please use spark.excludeOnFailure.killExcludedExecutors"), DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0", -"Please use spark.yarn.executor.launch.excludeOnFailure.enabled") +"Please use spark.yarn.executor.launch.excludeOnFailure.enabled"), + DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0", +"Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor") ) Map(configs.map { cfg => (cfg.key -> cfg) } : _*) diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala 
b/core/src/main/scala/org/apache/spark/internal/config/package.scala index dbec61a..ffe4501 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -105,6 +105,22 @@ package object config { .bytesConf(ByteUnit.MiB) .createOptional + private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR = +ConfigBuilder("spark.driver.memoryOverheadFactor") + .doc("Fraction of driver memory to be allocated as additional non-heap memory per driver " + +"process in cluster mode. This is memory that accounts for things like VM overheads, " + +"interned strings, other native overheads, etc. This tends to grow with the container " + +"size. This value defaults to 0.10 except for Kubernetes non-JVM jobs, which defaults to " + +"0.40. This is done as non-JVM tasks need more non-JVM heap space and such tasks " + +"commonly fail with \"Memory Overhead Exceeded\" errors. This preempts this error " + +"with a higher default. This value is ignored if spark.driver.memoryOverhead is set " + +"directly.") + .ve
[spark] branch master updated: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable
This is an automated email from the ASF dual-hosted git repository. tgraves pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 71e2110 [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable 71e2110 is described below commit 71e2110b799220adc107c9ac5ce737281f2b65cc Author: Adam Binford AuthorDate: Wed Mar 16 10:54:18 2022 -0500 [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable ### What changes were proposed in this pull request? Add a new config to set the memory overhead factor for drivers and executors. Currently the memory overhead is hard coded to 10% (except in Kubernetes), and the only way to set it higher is to set it to a specific memory amount. ### Why are the changes needed? In dynamic environments where different people or use cases need different memory requirements, it would be helpful to set a higher memory overhead factor instead of having to set a higher specific memory overhead value. The kubernetes resource manager already makes this configurable. This makes it configurable across the board. ### Does this PR introduce _any_ user-facing change? No change to default behavior, just adds a new config users can change. ### How was this patch tested? New UT to check the memory calculation. Closes #35504 from Kimahriman/yarn-configurable-memory-overhead-factor. 
Authored-by: Adam Binford Signed-off-by: Thomas Graves --- .../main/scala/org/apache/spark/SparkConf.scala| 4 +- .../org/apache/spark/internal/config/package.scala | 28 ++ docs/configuration.md | 30 ++- docs/running-on-kubernetes.md | 9 .../k8s/features/BasicDriverFeatureStep.scala | 13 +++-- .../k8s/features/BasicExecutorFeatureStep.scala| 7 ++- .../k8s/features/BasicDriverFeatureStepSuite.scala | 63 -- .../features/BasicExecutorFeatureStepSuite.scala | 54 +++ .../spark/deploy/rest/mesos/MesosRestServer.scala | 5 +- .../cluster/mesos/MesosSchedulerUtils.scala| 9 ++-- .../deploy/rest/mesos/MesosRestServerSuite.scala | 8 ++- .../org/apache/spark/deploy/yarn/Client.scala | 14 +++-- .../apache/spark/deploy/yarn/YarnAllocator.scala | 5 +- .../spark/deploy/yarn/YarnSparkHadoopUtil.scala| 5 +- .../spark/deploy/yarn/YarnAllocatorSuite.scala | 29 ++ 15 files changed, 248 insertions(+), 35 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/SparkConf.scala b/core/src/main/scala/org/apache/spark/SparkConf.scala index 5f37a1a..cf12174 100644 --- a/core/src/main/scala/org/apache/spark/SparkConf.scala +++ b/core/src/main/scala/org/apache/spark/SparkConf.scala @@ -636,7 +636,9 @@ private[spark] object SparkConf extends Logging { DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0", "Please use spark.excludeOnFailure.killExcludedExecutors"), DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0", -"Please use spark.yarn.executor.launch.excludeOnFailure.enabled") +"Please use spark.yarn.executor.launch.excludeOnFailure.enabled"), + DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0", +"Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor") ) Map(configs.map { cfg => (cfg.key -> cfg) } : _*) diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index dbec61a..ffe4501 100644 --- 
a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -105,6 +105,22 @@ package object config { .bytesConf(ByteUnit.MiB) .createOptional + private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR = +ConfigBuilder("spark.driver.memoryOverheadFactor") + .doc("Fraction of driver memory to be allocated as additional non-heap memory per driver " + +"process in cluster mode. This is memory that accounts for things like VM overheads, " + +"interned strings, other native overheads, etc. This tends to grow with the container " + +"size. This value defaults to 0.10 except for Kubernetes non-JVM jobs, which defaults to " + +"0.40. This is done as non-JVM tasks need more non-JVM heap space and such tasks " + +"commonly fail with \"Memory Overhead Exceeded\" errors. This preempts this error " + +"with a higher default. This value is ignored if spark.driver.memoryOverhead is set " + +"directly.") + .version("3.3.0") + .doubleConf + .checkValue(factor => factor > 0, +"Ensure that memory overhead is
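The interplay between the explicit overhead setting and the new factor can be sketched as simple sizing arithmetic. This sketch assumes Spark's long-standing 384 MiB minimum overhead and truncating multiplication; the exact code paths differ slightly per resource manager, so treat it as an approximation rather than the implementation:

```python
MIN_MEMORY_OVERHEAD_MIB = 384  # Spark's floor for computed overhead (assumption)


def container_memory_mib(memory_mib, overhead_factor=0.10, memory_overhead_mib=None):
    """Approximate requested container memory for a driver/executor.

    If spark.{driver,executor}.memoryOverhead is set it wins outright;
    otherwise the overhead is max(factor * memory, 384 MiB)."""
    if memory_overhead_mib is None:
        memory_overhead_mib = max(
            int(memory_mib * overhead_factor), MIN_MEMORY_OVERHEAD_MIB)
    return memory_mib + memory_overhead_mib


print(container_memory_mib(4096))        # 4 GiB heap + default 10% factor
print(container_memory_mib(4096, 0.40))  # the non-JVM-on-K8s default factor
print(container_memory_mib(1024))        # small heap: the 384 MiB floor applies
```

This also shows why the deprecated `spark.kubernetes.memoryOverheadFactor` was split into per-role `spark.driver.memoryOverheadFactor` and `spark.executor.memoryOverheadFactor` settings: the factor applies in the same formula for both roles.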
[spark] branch branch-3.3 updated: [SPARK-38567][INFRA][3.3] Enable GitHub Action build_and_test on branch-3.3
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 1ec220f [SPARK-38567][INFRA][3.3] Enable GitHub Action build_and_test on branch-3.3 1ec220f is described below commit 1ec220f029f90a6ab109ef87f7c17337038d91d3 Author: Max Gekk AuthorDate: Wed Mar 16 20:50:14 2022 +0900 [SPARK-38567][INFRA][3.3] Enable GitHub Action build_and_test on branch-3.3 ### What changes were proposed in this pull request? Like branch-3.2, this PR aims to update GitHub Action `build_and_test` in branch-3.3. ### Why are the changes needed? Currently, GitHub Action on branch-3.3 is not working. - https://github.com/apache/spark/commits/branch-3.3 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? N/A Closes #35876 from MaxGekk/fix-github-actions-3.3. Authored-by: Max Gekk Signed-off-by: Hyukjin Kwon --- .github/workflows/ansi_sql_mode_test.yml | 2 +- .github/workflows/build_and_test.yml | 32 +--- 2 files changed, 10 insertions(+), 24 deletions(-) diff --git a/.github/workflows/ansi_sql_mode_test.yml b/.github/workflows/ansi_sql_mode_test.yml index e68b04b..cc4ac57 100644 --- a/.github/workflows/ansi_sql_mode_test.yml +++ b/.github/workflows/ansi_sql_mode_test.yml @@ -22,7 +22,7 @@ name: ANSI SQL mode test on: push: branches: - - master + - branch-3.3 jobs: ansi_sql_test: diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml index ebe17b5..7baabc7 100644 --- a/.github/workflows/build_and_test.yml +++ b/.github/workflows/build_and_test.yml @@ -23,20 +23,6 @@ on: push: branches: - '**' -- '!branch-*.*' - schedule: -# master, Hadoop 2 -- cron: '0 1 * * *' -# master -- cron: '0 4 * * *' -# branch-3.2 -- cron: '0 7 * * *' -# PySpark coverage for master branch -- cron: '0 10 * * *' -# Java 11 -- cron: '0 13 * * *' -# Java 17 -- 
cron: '0 16 * * *' workflow_call: inputs: ansi_enabled: @@ -96,7 +82,7 @@ jobs: echo '::set-output name=hadoop::hadoop3' else echo '::set-output name=java::8' - echo '::set-output name=branch::master' # Default branch to run on. CHANGE here when a branch is cut out. + echo '::set-output name=branch::branch-3.3' # Default branch to run on. CHANGE here when a branch is cut out. echo '::set-output name=type::regular' echo '::set-output name=envs::{"SPARK_ANSI_SQL_MODE": "${{ inputs.ansi_enabled }}"}' echo '::set-output name=hadoop::hadoop3' @@ -115,7 +101,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -325,7 +311,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -413,7 +399,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -477,7 +463,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -590,7 +576,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -639,7 +625,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: github.repository != 'apache/spark' run: | @@ -687,7 +673,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Apache Spark if: 
github.repository != 'apache/spark' run: | @@ -786,7 +772,7 @@ jobs: with: fetch-depth: 0 repository: apache/spark -ref: master +ref: branch-3.3 - name: Sync the current branch with the latest in Ap
[spark] branch master updated (8193b40 -> 1b41416)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8193b40 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 add 1b41416 [SPARK-38106][SQL] Use error classes in the parsing errors of functions No new revisions were added by this update. Summary of changes: .../spark/sql/errors/QueryParsingErrors.scala | 27 ++-- .../spark/sql/errors/QueryParsingErrorsSuite.scala | 172 + .../spark/sql/execution/command/DDLSuite.scala | 51 -- 3 files changed, 189 insertions(+), 61 deletions(-)
[spark] branch branch-3.2 updated: [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 6990320 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 6990320 is described below commit 69903200845b68a0474ecb0a3317dc744490c521 Author: Hyukjin Kwon AuthorDate: Wed Mar 16 18:20:50 2022 +0900 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 This PR upgrades to Py4J 0.10.9.4, with relevant documentation changes. Py4J 0.10.9.3 has a resource leak issue when pinned thread mode is enabled - it's enabled by default in PySpark at https://github.com/apache/spark/commit/41af409b7bcfe1b3960274c0b3085bcc1f9d1c98. We worked around this by requiring users to use `InheritableThread` or `inheritable_thread_target` as a workaround. After upgrading, we don't need to require this anymore because Py4J cleans up automatically, see also https://github.com/py4j/py4j/pull/471 Yes, users no longer have to use `InheritableThread` or `inheritable_thread_target` to avoid the resource leak problem. CI in this PR should test it out. Closes #35871 from HyukjinKwon/SPARK-38563. 
Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon (cherry picked from commit 8193b405f02f867439dd2d2017bf7b3c814b5cc8) Signed-off-by: Hyukjin Kwon --- bin/pyspark| 2 +- bin/pyspark2.cmd | 2 +- core/pom.xml | 2 +- .../org/apache/spark/api/python/PythonUtils.scala | 2 +- dev/deps/spark-deps-hadoop-2.7-hive-2.3| 2 +- dev/deps/spark-deps-hadoop-3.2-hive-2.3| 2 +- docs/job-scheduling.md | 2 +- python/docs/Makefile | 2 +- python/docs/make2.bat | 2 +- python/docs/source/getting_started/install.rst | 2 +- python/lib/py4j-0.10.9.3-src.zip | Bin 42021 -> 0 bytes python/lib/py4j-0.10.9.4-src.zip | Bin 0 -> 42404 bytes python/pyspark/context.py | 6 ++-- python/pyspark/util.py | 33 - python/setup.py| 2 +- sbin/spark-config.sh | 2 +- 16 files changed, 20 insertions(+), 43 deletions(-) diff --git a/bin/pyspark b/bin/pyspark index 4840589..1e16c56 100755 --- a/bin/pyspark +++ b/bin/pyspark @@ -50,7 +50,7 @@ export PYSPARK_DRIVER_PYTHON_OPTS # Add the PySpark classes to the Python path: export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH" -export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.3-src.zip:$PYTHONPATH" +export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.4-src.zip:$PYTHONPATH" # Load the PySpark shell.py script when ./pyspark is used interactively: export OLD_PYTHONSTARTUP="$PYTHONSTARTUP" diff --git a/bin/pyspark2.cmd b/bin/pyspark2.cmd index a19627a..f20c320 100644 --- a/bin/pyspark2.cmd +++ b/bin/pyspark2.cmd @@ -30,7 +30,7 @@ if "x%PYSPARK_DRIVER_PYTHON%"=="x" ( ) set PYTHONPATH=%SPARK_HOME%\python;%PYTHONPATH% -set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.3-src.zip;%PYTHONPATH% +set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.4-src.zip;%PYTHONPATH% set OLD_PYTHONSTARTUP=%PYTHONSTARTUP% set PYTHONSTARTUP=%SPARK_HOME%\python\pyspark\shell.py diff --git a/core/pom.xml b/core/pom.xml index 3833794..94b3e58 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -433,7 +433,7 @@ net.sf.py4j py4j - 0.10.9.3 + 0.10.9.4 org.apache.spark diff --git 
a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala index 8daba86..a9c35369 100644 --- a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala +++ b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala @@ -27,7 +27,7 @@ import org.apache.spark.SparkContext import org.apache.spark.api.java.{JavaRDD, JavaSparkContext} private[spark] object PythonUtils { - val PY4J_ZIP_NAME = "py4j-0.10.9.3-src.zip" + val PY4J_ZIP_NAME = "py4j-0.10.9.4-src.zip" /** Get the PYTHONPATH for PySpark, either from SPARK_HOME, if it is set, or from our JAR */ def sparkPythonPath: String = { diff --git a/dev/deps/spark-deps-hadoop-2.7-hive-2.3 b/dev/deps/spark-deps-hadoop-2.7-hive-2.3 index c2882bd..742710e 100644 --- a/dev/deps/spark-deps-hadoop-2.7-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-2.7-hive-2.3 @@ -208,7 +208,7 @@ parquet-format-structures/1.12.2//parquet-format-structures-1.12.2.jar parquet-hadoop/1.12.2//parquet-hadoop-1.12.2.jar parquet-jackson/1.12.2//parquet-jackson-1.12.2.jar protobuf-java/2.5.0//protobuf-java-2.5.0.jar -py4j/0.10.9.3//py4j-0.10.9.3.j
[spark] branch master updated (8476c8b -> 8193b40)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8476c8b [SPARK-38542][SQL] UnsafeHashedRelation should serialize numKeys out add 8193b40 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 No new revisions were added by this update. Summary of changes: bin/pyspark| 2 +- bin/pyspark2.cmd | 2 +- core/pom.xml | 2 +- .../org/apache/spark/api/python/PythonUtils.scala | 2 +- dev/deps/spark-deps-hadoop-2-hive-2.3 | 2 +- dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +- docs/job-scheduling.md | 2 +- python/docs/Makefile | 2 +- python/docs/make2.bat | 2 +- python/docs/source/getting_started/install.rst | 2 +- python/lib/py4j-0.10.9.3-src.zip | Bin 42021 -> 0 bytes python/lib/py4j-0.10.9.4-src.zip | Bin 0 -> 42404 bytes python/pyspark/context.py | 6 ++-- python/pyspark/util.py | 35 +++-- python/setup.py| 2 +- sbin/spark-config.sh | 2 +- 16 files changed, 20 insertions(+), 45 deletions(-) delete mode 100644 python/lib/py4j-0.10.9.3-src.zip create mode 100644 python/lib/py4j-0.10.9.4-src.zip - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
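Several of the files in the summary above (bin/pyspark, spark-config.sh) only bump the bundled Py4J source zip on PYTHONPATH. A minimal sketch of that wiring, with SPARK_HOME set to a placeholder path for illustration:

```shell
# Illustrative only: /opt/spark is a placeholder install location.
SPARK_HOME="/opt/spark"
# Same two lines bin/pyspark exports, with the zip name this commit updates:
export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH"
export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.4-src.zip:$PYTHONPATH"
echo "$PYTHONPATH"
```

A stale py4j-0.10.9.3 zip left on PYTHONPATH would shadow the upgrade, which is why every launcher script in the diffstat is touched.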
[spark] branch branch-3.3 updated: [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 3bbf346 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 3bbf346 is described below commit 3bbf346d9ca984faa0c3e67cd1387a13b2bd1e37 Author: Hyukjin Kwon AuthorDate: Wed Mar 16 18:20:50 2022 +0900 [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 ### What changes were proposed in this pull request? This PR upgrades to Py4J 0.10.9.4, with relevant documentation changes. ### Why are the changes needed? Py4J 0.10.9.3 has a resource leak issue when pinned thread mode is enabled - it's enabled by default in PySpark at https://github.com/apache/spark/commit/41af409b7bcfe1b3960274c0b3085bcc1f9d1c98. We worked around this by requiring users to use `InheritableThread` or `inheritable_thread_target`. After upgrading, users no longer need to do so because Py4J automatically cleans up; see also https://github.com/py4j/py4j/pull/471 ### Does this PR introduce _any_ user-facing change? Yes, users no longer have to use `InheritableThread` or `inheritable_thread_target` to avoid the resource leak. ### How was this patch tested? CI in this PR should test it out. Closes #35871 from HyukjinKwon/SPARK-38563. 
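The workaround the commit message refers to wraps a thread target so the child thread inherits state from the parent. The snippet below is a conceptual sketch of that idea in pure Python, not PySpark's implementation: the `_local` store, `inheritable_thread_target_sketch`, and the `spark.scheduler.pool` value are illustrative assumptions standing in for the real `pyspark.inheritable_thread_target`, which propagates the parent's local job properties to the spawned thread.

```python
import threading

_local = threading.local()  # stand-in for per-thread job properties

def inheritable_thread_target_sketch(f):
    # Capture the parent thread's properties when the wrapper is made ...
    parent_props = dict(getattr(_local, "props", {}))

    def wrapped(*args, **kwargs):
        # ... and install a copy in the child thread before running f.
        _local.props = dict(parent_props)
        return f(*args, **kwargs)

    return wrapped

# Parent (main) thread sets a property, then spawns a worker.
_local.props = {"spark.scheduler.pool": "etl"}
result = {}

def job():
    # The child sees the parent's property via the installed copy.
    result["pool"] = _local.props.get("spark.scheduler.pool")

t = threading.Thread(target=inheritable_thread_target_sketch(job))
t.start()
t.join()
print(result["pool"])  # etl
```

After this upgrade the wrapper is no longer required just to avoid the connection leak; it remains useful when a child thread should inherit the parent's Spark job properties.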
Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon (cherry picked from commit 8193b405f02f867439dd2d2017bf7b3c814b5cc8) Signed-off-by: Hyukjin Kwon --- bin/pyspark| 2 +- bin/pyspark2.cmd | 2 +- core/pom.xml | 2 +- .../org/apache/spark/api/python/PythonUtils.scala | 2 +- dev/deps/spark-deps-hadoop-2-hive-2.3 | 2 +- dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +- docs/job-scheduling.md | 2 +- python/docs/Makefile | 2 +- python/docs/make2.bat | 2 +- python/docs/source/getting_started/install.rst | 2 +- python/lib/py4j-0.10.9.3-src.zip | Bin 42021 -> 0 bytes python/lib/py4j-0.10.9.4-src.zip | Bin 0 -> 42404 bytes python/pyspark/context.py | 6 ++-- python/pyspark/util.py | 35 +++-- python/setup.py| 2 +- sbin/spark-config.sh | 2 +- 16 files changed, 20 insertions(+), 45 deletions(-) diff --git a/bin/pyspark b/bin/pyspark index 4840589..1e16c56 100755 --- a/bin/pyspark +++ b/bin/pyspark @@ -50,7 +50,7 @@ export PYSPARK_DRIVER_PYTHON_OPTS # Add the PySpark classes to the Python path: export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH" -export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.3-src.zip:$PYTHONPATH" +export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.4-src.zip:$PYTHONPATH" # Load the PySpark shell.py script when ./pyspark is used interactively: export OLD_PYTHONSTARTUP="$PYTHONSTARTUP" diff --git a/bin/pyspark2.cmd b/bin/pyspark2.cmd index a19627a..f20c320 100644 --- a/bin/pyspark2.cmd +++ b/bin/pyspark2.cmd @@ -30,7 +30,7 @@ if "x%PYSPARK_DRIVER_PYTHON%"=="x" ( ) set PYTHONPATH=%SPARK_HOME%\python;%PYTHONPATH% -set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.3-src.zip;%PYTHONPATH% +set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.4-src.zip;%PYTHONPATH% set OLD_PYTHONSTARTUP=%PYTHONSTARTUP% set PYTHONSTARTUP=%SPARK_HOME%\python\pyspark\shell.py diff --git a/core/pom.xml b/core/pom.xml index 9d3b170..953c76b 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -423,7 +423,7 @@ net.sf.py4j py4j - 0.10.9.3 + 0.10.9.4 org.apache.spark diff --git 
a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala index 8daba86..a9c35369 100644 --- a/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala +++ b/core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala @@ -27,7 +27,7 @@ import org.apache.spark.SparkContext import org.apache.spark.api.java.{JavaRDD, JavaSparkContext} private[spark] object PythonUtils { - val PY4J_ZIP_NAME = "py4j-0.10.9.3-src.zip" + val PY4J_ZIP_NAME = "py4j-0.10.9.4-src.zip" /** Get the PYTHONPATH for PySpark, either from SPARK_HOME, if it is set, or from our JAR */ def sparkPythonPath: String = { diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3 index bcbf8b9..f2db663 100644 --- a/dev/deps/spark-deps-hadoop-2-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-2-hive-2.3 @@ -233,7 +233,7 @@ parquet-hadoop/1.12.2//parquet-hadoop-1.12.2.jar parq