(spark) branch master updated: [SPARK-46905][SQL] Add dedicated class to keep column definition instead of StructField in Create/ReplaceTable command

2024-01-29 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 4da7c3d316e3 [SPARK-46905][SQL] Add dedicated class to keep column 
definition instead of StructField in Create/ReplaceTable command
4da7c3d316e3 is described below

commit 4da7c3d316e3d1340258698e841be370bd16d6fa
Author: Wenchen Fan 
AuthorDate: Tue Jan 30 15:09:15 2024 +0800

[SPARK-46905][SQL] Add dedicated class to keep column definition instead of 
StructField in Create/ReplaceTable command

### What changes were proposed in this pull request?

This is a follow-up of https://github.com/apache/spark/pull/44876 to refactor the code and make it cleaner. The idea is to add a dedicated class for column definitions, instead of `StructField`, in the Create/ReplaceTable commands. This is more flexible and cleaner than adding an additional default-column expression to the `CreateTable` command.
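
A minimal, illustrative sketch of the idea (not the actual `ColumnDefinition` class added by this commit, whose fields and placement may differ): a dedicated column-definition node can carry the default value as a plain catalyst `Expression` that is analyzed with the rest of the plan, instead of encoding it into `StructField` metadata.

```scala
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.types.DataType

// Illustrative only; the field names and types here are assumptions, not the committed API.
// A Create/ReplaceTable plan could hold a Seq of these instead of a StructType,
// converting to v2 catalog columns only when the command is executed.
case class ColumnDefinitionSketch(
    name: String,
    dataType: DataType,
    nullable: Boolean = true,
    comment: Option[String] = None,
    defaultValue: Option[Expression] = None)
```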

### Why are the changes needed?

Code refactoring; it also makes it easier to fully eliminate the need for a fake analyzer for default-value handling in the future. We should make similar code changes to the ALTER TABLE command and the v1 CREATE TABLE command.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #44935 from cloud-fan/refactor.

Lead-authored-by: Wenchen Fan 
Co-authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
---
 .../src/main/resources/error/error-classes.json|   5 +
 ...conditions-invalid-default-value-error-class.md |   4 +
 .../sql/catalyst/analysis/CheckAnalysis.scala  |   3 +
 .../spark/sql/catalyst/parser/AstBuilder.scala | 106 +
 .../catalyst/plans/logical/ColumnDefinition.scala  | 164 +
 .../sql/catalyst/plans/logical/v2Commands.scala|  55 +++
 .../catalyst/util/ResolveDefaultColumnsUtil.scala  |  30 +++-
 .../sql/connector/catalog/CatalogV2Util.scala  |  26 +---
 .../spark/sql/errors/QueryCompilationErrors.scala  |  13 ++
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 148 ++-
 .../catalyst/analysis/ReplaceCharWithVarchar.scala |  12 +-
 .../catalyst/analysis/ResolveSessionCatalog.scala  |   5 +-
 .../datasources/v2/DataSourceV2Strategy.scala  |  37 +++--
 .../apache/spark/sql/internal/CatalogImpl.scala|   4 +-
 .../spark/sql/streaming/DataStreamWriter.scala |   9 +-
 .../org/apache/spark/sql/sources/InsertSuite.scala |   2 +-
 16 files changed, 393 insertions(+), 230 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 64d65fd4beed..8e47490f5a61 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -1766,6 +1766,11 @@
   "which requires  type, but the statement provided a 
value of incompatible  type."
 ]
   },
+  "NOT_CONSTANT" : {
+"message" : [
+  "which is not a constant expression whose equivalent value is known 
at query planning time."
+]
+  },
   "SUBQUERY_EXPRESSION" : {
 "message" : [
   "which contains subquery expressions."
diff --git a/docs/sql-error-conditions-invalid-default-value-error-class.md 
b/docs/sql-error-conditions-invalid-default-value-error-class.md
index c73d9d5ccbbb..72a5b0db8da0 100644
--- a/docs/sql-error-conditions-invalid-default-value-error-class.md
+++ b/docs/sql-error-conditions-invalid-default-value-error-class.md
@@ -34,6 +34,10 @@ This error class has the following derived error classes:
 
 which requires `` type, but the statement provided a value of 
incompatible `` type.
 
+## NOT_CONSTANT
+
+which is not a constant expression whose equivalent value is known at query 
planning time.
+
 ## SUBQUERY_EXPRESSION
 
 which contains subquery expressions.
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 1b69e933815b..3b1663b4c54c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -302,6 +302,9 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog with QueryErrorsB
 // general unresolved check below to throw a more tailored error 
message.
 new 
ResolveReferencesInAggregate(catalogManager).checkUnresolvedGroupByAll(operator)
 
+// Early checks for column definitions, to produce better error 
messages
+ColumnDefinition.ch

(spark) branch branch-3.4 updated: [SPARK-46893][UI] Remove inline scripts from UI descriptions

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new edaa0fd8d096 [SPARK-46893][UI] Remove inline scripts from UI 
descriptions
edaa0fd8d096 is described below

commit edaa0fd8d096a3e57918e4b6e437337fcfdc8276
Author: Willi Raschkowski 
AuthorDate: Mon Jan 29 22:43:21 2024 -0800

[SPARK-46893][UI] Remove inline scripts from UI descriptions

### What changes were proposed in this pull request?
This PR prevents malicious users from injecting inline scripts via job and 
stage descriptions.

Spark's Web UI [already checks the security of job and stage 
descriptions](https://github.com/apache/spark/blob/a368280708dd3c6eb90bd3b09a36a68bdd096222/core/src/main/scala/org/apache/spark/ui/UIUtils.scala#L528-L545)
 before rendering them as HTML (or treating them as plain text). The UI already 
disallows `
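
As a rough illustration of this kind of check (a sketch under assumptions, not the code touched by this commit): a description can be rejected as HTML and rendered as escaped plain text whenever it contains a `<script>` element or an inline event-handler attribute.

```scala
import scala.xml.Utility

// Sketch only: the real UIUtils logic differs; the patterns below are assumptions.
object DescriptionSanitizerSketch {
  private val scriptPattern = "(?i)<\\s*script".r
  private val inlineHandlerPattern = "(?i)\\son\\w+\\s*=".r

  // Fall back to escaped plain text if the description looks unsafe as HTML.
  def render(desc: String): String = {
    val unsafe = scriptPattern.findFirstIn(desc).isDefined ||
      inlineHandlerPattern.findFirstIn(desc).isDefined
    if (unsafe) Utility.escape(desc) else desc
  }
}
```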

(spark) branch branch-3.5 updated: [SPARK-46893][UI] Remove inline scripts from UI descriptions

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 343ae8226161 [SPARK-46893][UI] Remove inline scripts from UI 
descriptions
343ae8226161 is described below

commit 343ae822616185022570f1c14b151e54ff54e265
Author: Willi Raschkowski 
AuthorDate: Mon Jan 29 22:43:21 2024 -0800

[SPARK-46893][UI] Remove inline scripts from UI descriptions

### What changes were proposed in this pull request?
This PR prevents malicious users from injecting inline scripts via job and 
stage descriptions.

Spark's Web UI [already checks the security of job and stage 
descriptions](https://github.com/apache/spark/blob/a368280708dd3c6eb90bd3b09a36a68bdd096222/core/src/main/scala/org/apache/spark/ui/UIUtils.scala#L528-L545)
 before rendering them as HTML (or treating them as plain text). The UI already 
disallows `

(spark) branch master updated (41a1426e9ee3 -> abd9d27e87b9)

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 41a1426e9ee3 [SPARK-46914][UI] Shorten app name in the summary table 
on the History Page
 add abd9d27e87b9 [SPARK-46893][UI] Remove inline scripts from UI 
descriptions

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/ui/UIUtils.scala  | 12 +---
 core/src/test/scala/org/apache/spark/ui/UIUtilsSuite.scala | 14 ++
 2 files changed, 23 insertions(+), 3 deletions(-)





(spark) branch master updated: [SPARK-46914][UI] Shorten app name in the summary table on the History Page

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 41a1426e9ee3 [SPARK-46914][UI] Shorten app name in the summary table 
on the History Page
41a1426e9ee3 is described below

commit 41a1426e9ee318a9421fad11776eb6894bb1f04b
Author: Kent Yao 
AuthorDate: Mon Jan 29 22:07:19 2024 -0800

[SPARK-46914][UI] Shorten app name in the summary table on the History Page

### What changes were proposed in this pull request?

This Pull Request shortens long app names to prevent overflow in the app 
table.

### Why are the changes needed?

better UX

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

new js tests and built and tested locally:

![image](https://github.com/apache/spark/assets/8326978/f78bd580-74b1-4fe5-9d8b-f2d49ce85ed9)

![image](https://github.com/apache/spark/assets/8326978/10bca509-00e5-4d8f-bf11-324c1080190b)

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #44944 from yaooqinn/SPARK-46914.

Authored-by: Kent Yao 
Signed-off-by: Dongjoon Hyun 
---
 .../resources/org/apache/spark/ui/static/historypage.js   | 12 
 .../main/resources/org/apache/spark/ui/static/utils.js| 15 ++-
 ui-test/tests/utils.test.js   |  7 +++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage.js 
b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
index 85cd5a554750..8961140a4019 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/historypage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -17,7 +17,7 @@
 
 /* global $, Mustache, jQuery, uiRoot */
 
-import {formatDuration, formatTimeMillis} from "./utils.js";
+import {formatDuration, formatTimeMillis, stringAbbreviate} from "./utils.js";
 
 export {setAppLimit};
 
@@ -186,9 +186,13 @@ $(document).ready(function() {
 name: 'appId',
 type: "appid-numeric",
 data: 'id',
-render:  (id, type, row) => `${id}`
+render: (id, type, row) => `${id}`
+  },
+  {
+name: 'appName',
+data: 'name',
+render: (name) => stringAbbreviate(name, 60)
   },
-  {name: 'appName', data: 'name' },
   {
 name: attemptIdColumnName,
 data: 'attemptId',
@@ -200,7 +204,7 @@ $(document).ready(function() {
 name: durationColumnName,
 type: "title-numeric",
 data: 'duration',
-render:  (id, type, row) => `${row.duration}`
+render: (id, type, row) => `${row.duration}`
   },
   {name: 'user', data: 'sparkUser' },
   {name: 'lastUpdated', data: 'lastUpdated' },
diff --git a/core/src/main/resources/org/apache/spark/ui/static/utils.js 
b/core/src/main/resources/org/apache/spark/ui/static/utils.js
index 960640791fe5..2d4123bc75ab 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/utils.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/utils.js
@@ -20,7 +20,7 @@ export {
   errorMessageCell, errorSummary,
   formatBytes, formatDate, formatDuration, formatLogsCells, formatTimeMillis,
   getBaseURI, getStandAloneAppId, getTimeZone,
-  setDataTableDefaults
+  setDataTableDefaults, stringAbbreviate
 };
 
 /* global $, uiRoot */
@@ -272,3 +272,16 @@ function errorMessageCell(errorMessage) {
   const details = detailsUINode(isMultiline, errorMessage);
   return summary + details;
 }
+
+function stringAbbreviate(content, limit) {
+  if (content && content.length > limit) {
+const summary = content.substring(0, limit) + '...';
+// TODO: Reused stacktrace-details* style for convenience, but it's not 
really a stacktrace
+// Consider creating a new style for this case if stacktrace-details is 
not appropriate in
+// the future.
+const details = detailsUINode(true, content);
+return summary + details;
+  } else {
+return content;
+  }
+}
diff --git a/ui-test/tests/utils.test.js b/ui-test/tests/utils.test.js
index ad3e87b76641..a6815577bd82 100644
--- a/ui-test/tests/utils.test.js
+++ b/ui-test/tests/utils.test.js
@@ -67,3 +67,10 @@ test('errorSummary', function () {
   const e2 = "java.lang.RuntimeException: random text";
   
expect(utils.errorSummary(e2).toString()).toBe('java.lang.RuntimeException,true');
 });
+
+test('stringAbbreviate', function () {
+  expect(utils.stringAbbreviate(null, 10)).toBe(null);
+  expect(utils.stringAbbreviate('1234567890', 10)).toBe('1234567890');
+  expect(utils.stringAbbreviate('12345678901', 10)).toContain('1234567890...');
+  expect(utils

(spark) branch master updated: [SPARK-46916][PS][TESTS] Clean up `pyspark.pandas.tests.indexes.*`

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e3143c4c2806 [SPARK-46916][PS][TESTS] Clean up 
`pyspark.pandas.tests.indexes.*`
e3143c4c2806 is described below

commit e3143c4c28068b80865c4ed9780a5a4beec0a7e8
Author: Ruifeng Zheng 
AuthorDate: Mon Jan 29 22:05:12 2024 -0800

[SPARK-46916][PS][TESTS] Clean up `pyspark.pandas.tests.indexes.*`

### What changes were proposed in this pull request?
Clean up `pyspark.pandas.tests.indexes.*`:
1. delete unused imports and variables;
2. avoid duplicate definitions of the testing datasets.

### Why are the changes needed?
code clean up

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
ci

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #44945 from zhengruifeng/ps_test_index_cleanup.

Authored-by: Ruifeng Zheng 
Signed-off-by: Dongjoon Hyun 
---
 .../pandas/tests/connect/indexes/test_parity_align.py| 11 ++-
 .../pandas/tests/connect/indexes/test_parity_indexing.py | 11 ++-
 .../pandas/tests/connect/indexes/test_parity_reindex.py  | 11 ++-
 .../pandas/tests/connect/indexes/test_parity_rename.py   | 11 ++-
 .../pandas/tests/connect/indexes/test_parity_reset_index.py  |  9 -
 python/pyspark/pandas/tests/indexes/test_align.py|  8 ++--
 python/pyspark/pandas/tests/indexes/test_asof.py |  4 ++--
 python/pyspark/pandas/tests/indexes/test_astype.py   |  4 ++--
 python/pyspark/pandas/tests/indexes/test_datetime.py |  6 +-
 python/pyspark/pandas/tests/indexes/test_delete.py   |  4 ++--
 python/pyspark/pandas/tests/indexes/test_diff.py |  4 ++--
 python/pyspark/pandas/tests/indexes/test_drop.py |  4 ++--
 python/pyspark/pandas/tests/indexes/test_indexing.py | 12 ++--
 .../pandas/tests/indexes/test_indexing_loc_multi_idx.py  |  1 -
 python/pyspark/pandas/tests/indexes/test_insert.py   | 11 ++-
 python/pyspark/pandas/tests/indexes/test_map.py  |  4 ++--
 python/pyspark/pandas/tests/indexes/test_reindex.py  |  8 ++--
 python/pyspark/pandas/tests/indexes/test_rename.py   |  8 ++--
 python/pyspark/pandas/tests/indexes/test_reset_index.py  |  8 ++--
 python/pyspark/pandas/tests/indexes/test_sort.py |  4 ++--
 python/pyspark/pandas/tests/indexes/test_symmetric_diff.py   |  4 ++--
 python/pyspark/pandas/tests/indexes/test_take.py |  4 ++--
 python/pyspark/pandas/tests/indexes/test_timedelta.py|  6 +-
 23 files changed, 92 insertions(+), 65 deletions(-)

diff --git a/python/pyspark/pandas/tests/connect/indexes/test_parity_align.py 
b/python/pyspark/pandas/tests/connect/indexes/test_parity_align.py
index 0bf84e6421f2..2bb56242ba34 100644
--- a/python/pyspark/pandas/tests/connect/indexes/test_parity_align.py
+++ b/python/pyspark/pandas/tests/connect/indexes/test_parity_align.py
@@ -16,16 +16,17 @@
 #
 import unittest
 
-from pyspark import pandas as ps
 from pyspark.pandas.tests.indexes.test_align import FrameAlignMixin
 from pyspark.testing.connectutils import ReusedConnectTestCase
 from pyspark.testing.pandasutils import PandasOnSparkTestUtils
 
 
-class FrameParityAlignTests(FrameAlignMixin, PandasOnSparkTestUtils, 
ReusedConnectTestCase):
-@property
-def psdf(self):
-return ps.from_pandas(self.pdf)
+class FrameParityAlignTests(
+FrameAlignMixin,
+PandasOnSparkTestUtils,
+ReusedConnectTestCase,
+):
+pass
 
 
 if __name__ == "__main__":
diff --git 
a/python/pyspark/pandas/tests/connect/indexes/test_parity_indexing.py 
b/python/pyspark/pandas/tests/connect/indexes/test_parity_indexing.py
index a76489314d25..5e52dd91474a 100644
--- a/python/pyspark/pandas/tests/connect/indexes/test_parity_indexing.py
+++ b/python/pyspark/pandas/tests/connect/indexes/test_parity_indexing.py
@@ -16,16 +16,17 @@
 #
 import unittest
 
-from pyspark import pandas as ps
 from pyspark.pandas.tests.indexes.test_indexing import FrameIndexingMixin
 from pyspark.testing.connectutils import ReusedConnectTestCase
 from pyspark.testing.pandasutils import PandasOnSparkTestUtils
 
 
-class FrameParityIndexingTests(FrameIndexingMixin, PandasOnSparkTestUtils, 
ReusedConnectTestCase):
-@property
-def psdf(self):
-return ps.from_pandas(self.pdf)
+class FrameParityIndexingTests(
+FrameIndexingMixin,
+PandasOnSparkTestUtils,
+ReusedConnectTestCase,
+):
+pass
 
 
 if __name__ == "__main__":
diff --git a/python/pyspark/pandas/tests/connect/indexes/test_parity_reindex.py 
b/python/pyspark/pandas/tests/connect/indexes/test_parity_reindex.py
index 7e9c5356d686.

(spark) branch master updated: [SPARK-46736][PROTOBUF] retain empty message field in protobuf connector

2024-01-29 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3f7994217d5a [SPARK-46736][PROTOBUF] retain empty message field in 
protobuf connector
3f7994217d5a is described below

commit 3f7994217d5a8d2816165459c1ce10d9b31bc7fd
Author: Chaoqin Li 
AuthorDate: Tue Jan 30 11:02:35 2024 +0900

[SPARK-46736][PROTOBUF] retain empty message field in protobuf connector

### What changes were proposed in this pull request?
Since Spark doesn't allow an empty StructType, an empty proto message type used as a field is dropped by default. This PR introduces an option that retains an empty message field by inserting a dummy column.

### Why are the changes needed?
In protobuf, it is common to have an empty message type without any fields as a placeholder; in some cases, people may not want to drop these empty message fields.

### Does this PR introduce _any_ user-facing change?
Yes. The default behavior is still to drop an empty message field. The new option enables users to keep the empty message field, though they will observe a dummy column.
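
A hypothetical usage sketch of the new option when parsing protobuf records (the input DataFrame, the message name "B", and the descriptor path below are placeholders, not values taken from this commit):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.protobuf.functions.from_protobuf

object RetainEmptyMessageExample {
  // "events" holds a binary protobuf column named "value".
  def parse(events: DataFrame): DataFrame = {
    val options = new java.util.HashMap[String, String]()
    options.put("retain.empty.message.types", "true")
    events.select(
      from_protobuf(col("value"), "B", "/path/to/descriptor.desc", options).as("event"))
  }
}
```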

### How was this patch tested?
Unit test and integration test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44643 from chaoqin-li1123/empty_proto.

Authored-by: Chaoqin Li 
Signed-off-by: Jungtaek Lim 
---
 .../spark/sql/protobuf/utils/ProtobufOptions.scala |  17 +++
 .../sql/protobuf/utils/SchemaConverters.scala  |  34 --
 .../test/resources/protobuf/functions_suite.proto  |   9 ++
 .../sql/protobuf/ProtobufFunctionsSuite.scala  | 123 -
 4 files changed, 171 insertions(+), 12 deletions(-)

diff --git 
a/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala
 
b/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala
index 5f8c42df365a..6644bce98293 100644
--- 
a/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala
+++ 
b/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala
@@ -207,6 +207,23 @@ private[sql] class ProtobufOptions(
   //nil => nil, Int32Value(0) => 0, Int32Value(100) => 100.
   val unwrapWellKnownTypes: Boolean =
 parameters.getOrElse("unwrap.primitive.wrapper.types", 
false.toString).toBoolean
+
+  // Since Spark doesn't allow writing empty StructType, empty proto message 
type will be
+  // dropped by default. Setting this option to true will insert a dummy 
column to empty proto
+  // message so that the empty message will be retained.
+  // For example, an empty message is used as field in another message:
+  //
+  // ```
+  // message A {}
+  // message B {A a = 1, string name = 2}
+  // ```
+  //
+  // By default, in the spark schema field a will be dropped, which result in 
schema
+  // b struct
+  // If retain.empty.message.types=true, field a will be retained by inserting 
a dummy column.
+  // b struct, name: string>
+  val retainEmptyMessage: Boolean =
+parameters.getOrElse("retain.empty.message.types", 
false.toString).toBoolean
 }
 
 private[sql] object ProtobufOptions {
diff --git 
a/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala
 
b/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala
index b35aa153aaa1..feb5aed03451 100644
--- 
a/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala
+++ 
b/connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala
@@ -51,12 +51,13 @@ object SchemaConverters extends Logging {
   def toSqlTypeHelper(
   descriptor: Descriptor,
   protobufOptions: ProtobufOptions): SchemaType = {
-SchemaType(
-  StructType(descriptor.getFields.asScala.flatMap(
-structFieldFor(_,
-  Map(descriptor.getFullName -> 1),
-  protobufOptions: ProtobufOptions)).toArray),
-  nullable = true)
+val fields = descriptor.getFields.asScala.flatMap(
+  structFieldFor(_,
+Map(descriptor.getFullName -> 1),
+protobufOptions: ProtobufOptions)).toSeq
+if (fields.isEmpty && protobufOptions.retainEmptyMessage) {
+  
SchemaType(convertEmptyProtoToStructWithDummyField(descriptor.getFullName), 
nullable = true)
+} else SchemaType(StructType(fields), nullable = true)
   }
 
   // existingRecordNames: Map[String, Int] used to track the depth of 
recursive fields and to
@@ -212,11 +213,15 @@ object SchemaConverters extends Logging {
   ).toSeq
   fields match {
 case Nil =>
-  log.info(
-s"Dropping ${fd.getFullName} as it does not have any fields 
left " +
-"li

(spark) branch master updated: [SPARK-46910][PYTHON] Eliminate JDK Requirement in PySpark Installation

2024-01-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 83fad32f1c68 [SPARK-46910][PYTHON] Eliminate JDK Requirement in 
PySpark Installation
83fad32f1c68 is described below

commit 83fad32f1c68c991cdeaead5e14052cdac89f3b7
Author: Amanda Liu 
AuthorDate: Tue Jan 30 09:47:56 2024 +0900

[SPARK-46910][PYTHON] Eliminate JDK Requirement in PySpark Installation

### What changes were proposed in this pull request?
Modifies the PySpark installation script to ask users to allow installation 
of the necessary JDK, if not already installed.

### Why are the changes needed?
Simplifying the PySpark installation process is a critical part of 
improving the new user onboarding experience. Many new PySpark users get 
blocked in the installation process, due to confusing errors from not having 
Java installed. This change simplifies the PySpark user onboarding process.

### Does this PR introduce _any_ user-facing change?
Yes, modifies the PySpark installation script.

### How was this patch tested?
Installing PySpark in virtual environments

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44940 from asl3/jdk-install.

Lead-authored-by: Amanda Liu 
Co-authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 bin/pyspark | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/bin/pyspark b/bin/pyspark
index 1ae28b1f507c..2f08f7836915 100755
--- a/bin/pyspark
+++ b/bin/pyspark
@@ -48,6 +48,33 @@ export PYSPARK_PYTHON
 export PYSPARK_DRIVER_PYTHON
 export PYSPARK_DRIVER_PYTHON_OPTS
 
+# Attempt to find JAVA_HOME.
+# If JAVA_HOME not set, install JDK 17 and set JAVA_HOME using a temp dir, and 
adding the
+# temp dir to the PYTHONPATH.
+if [ -n "${JAVA_HOME}" ]; then
+  RUNNER="${JAVA_HOME}/bin/java"
+else
+  if [ "$(command -v java)" ]; then
+RUNNER="java"
+  else
+echo -n "JAVA_HOME is not set. Would you like to install JDK 17 and set 
JAVA_HOME? (Y/N) " >&2
+
+read -r input
+
+if [[ "${input,,}" == "y" ]]; then
+TEMP_DIR=$(mktemp -d)
+$PYSPARK_DRIVER_PYTHON -m pip install --target="$TEMP_DIR" install-jdk
+export JAVA_HOME=$(PYTHONPATH="$TEMP_DIR" $PYSPARK_DRIVER_PYTHON -c 
'import jdk; print(jdk.install("17"))')
+RUNNER="${JAVA_HOME}/bin/java"
+echo "JDK was installed to the path \"$JAVA_HOME\""
+echo "You can avoid needing to re-install JDK by setting your 
JAVA_HOME environment variable to \"$JAVA_HOME\""
+else
+echo "JDK installation skipped. You can manually install JDK (17 or 
later) and set JAVA_HOME in your environment."
+exit 1
+fi
+  fi
+fi
+
 # Add the PySpark classes to the Python path:
 export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH"
 export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.7-src.zip:$PYTHONPATH"





(spark) branch master updated: [SPARK-46907][CORE] Show driver log location in Spark History Server

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 29355c07580e [SPARK-46907][CORE] Show driver log location in Spark 
History Server
29355c07580e is described below

commit 29355c07580e68d48546e9e210c876b69a8c10a2
Author: Dongjoon Hyun 
AuthorDate: Mon Jan 29 14:04:07 2024 -0800

[SPARK-46907][CORE] Show driver log location in Spark History Server

### What changes were proposed in this pull request?

This PR aims to show `Driver Log Location` in Spark History Server UI if 
`spark.driver.log.dfsDir` is configured.

### Why are the changes needed?

**BEFORE (or `spark.driver.log.dfsDir` is absent)**
![Screenshot 2024-01-29 at 10 11 06 
AM](https://github.com/apache/spark/assets/9700541/6d709b4b-d002-422b-a1df-bb5e1b50b539)

**AFTER**
![Screenshot 2024-01-29 at 10 10 25 
AM](https://github.com/apache/spark/assets/9700541/83b35a7d-5fc9-443a-a6e5-7b6bd98dbdc6)

### Does this PR introduce _any_ user-facing change?

No, this only adds new UI information for users who set the `spark.driver.log.dfsDir` configuration.

### How was this patch tested?

Manual.

```
$ mkdir /tmp/history
$ mkdir /tmp/driver-logs
$ SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=/tmp/history 
-Dspark.driver.log.dfsDir=/tmp/driver-logs"  sbin/start-history-server.sh
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44936 from dongjoon-hyun/SPARK-46907.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala  | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index 8f64de0847ec..7c888a07263a 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -381,7 +381,12 @@ private[history] class FsHistoryProvider(conf: SparkConf, 
clock: Clock)
 } else {
   Map()
 }
-Map("Event log directory" -> logDir) ++ safeMode
+val driverLog = if (conf.contains(DRIVER_LOG_DFS_DIR)) {
+  Map("Driver log directory" -> conf.get(DRIVER_LOG_DFS_DIR).get)
+} else {
+  Map()
+}
+Map("Event log directory" -> logDir) ++ safeMode ++ driverLog
   }
 
   override def start(): Unit = {





(spark) branch master updated: [SPARK-46687][PYTHON][CONNECT] Basic support of SparkSession-based memory profiler

2024-01-29 Thread ueshin
This is an automated email from the ASF dual-hosted git repository.

ueshin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 528ac8b3e854 [SPARK-46687][PYTHON][CONNECT] Basic support of 
SparkSession-based memory profiler
528ac8b3e854 is described below

commit 528ac8b3e8548a53d931007c36db3427c610f4da
Author: Xinrong Meng 
AuthorDate: Mon Jan 29 13:08:17 2024 -0800

[SPARK-46687][PYTHON][CONNECT] Basic support of SparkSession-based memory 
profiler

### What changes were proposed in this pull request?

Basic support of SparkSession-based memory profiler in both Spark Connect 
and non-Spark-Connect.

### Why are the changes needed?

We need to make the memory profiler SparkSession-based to support memory 
profiling in Spark Connect.

### Does this PR introduce _any_ user-facing change?

Yes, the SparkSession-based memory profiler is available.

An example is as shown below
```py
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.taskcontext import TaskContext

spark.conf.set("spark.sql.pyspark.udf.profiler", "memory")

udf("string")
def f(x):
  if TaskContext.get().partitionId() % 2 == 0:
return str(x)
  else:
return None

spark.range(10).select(f(col("id"))).show()

spark.showMemoryProfiles()
```
shows profile result:
```

Profile of UDF

Filename: 
/var/folders/h_/60n1p_5s7751jx1st4_sk078gp/T/ipykernel_72839/2848225169.py

Line #Mem usageIncrement  Occurrences   Line Contents
=
 7113.2 MiB113.2 MiB  10   udf("string")
 8 def f(x):
 9114.4 MiB  1.3 MiB  10 if 
TaskContext.get().partitionId() % 2 == 0:
10 31.8 MiB  0.1 MiB   4   return str(x)
11   else:
12 82.8 MiB  0.1 MiB   6   return None
```

### How was this patch tested?

New and existing unit tests:
- pyspark.tests.test_memory_profiler
- pyspark.sql.tests.connect.test_parity_memory_profiler

And manual tests on Jupyter notebook.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44775 from xinrong-meng/memory_profiler_v2.

Authored-by: Xinrong Meng 
Signed-off-by: Takuya UESHIN 
---
 dev/sparktestsupport/modules.py|   1 +
 python/pyspark/profiler.py |  75 ++--
 python/pyspark/sql/connect/session.py  |   5 +
 python/pyspark/sql/session.py  |   5 +
 .../tests/connect/test_parity_memory_profiler.py   |  59 ++
 python/pyspark/tests/test_memory_profiler.py   | 212 -
 python/pyspark/worker.py   |  36 +++-
 7 files changed, 368 insertions(+), 25 deletions(-)

diff --git a/dev/sparktestsupport/modules.py b/dev/sparktestsupport/modules.py
index b9541c4be9b3..508cf56b9c87 100644
--- a/dev/sparktestsupport/modules.py
+++ b/dev/sparktestsupport/modules.py
@@ -1002,6 +1002,7 @@ pyspark_connect = Module(
 "pyspark.sql.tests.connect.test_parity_readwriter",
 "pyspark.sql.tests.connect.test_parity_udf",
 "pyspark.sql.tests.connect.test_parity_udf_profiler",
+"pyspark.sql.tests.connect.test_parity_memory_profiler",
 "pyspark.sql.tests.connect.test_parity_udtf",
 "pyspark.sql.tests.connect.test_parity_pandas_udf",
 "pyspark.sql.tests.connect.test_parity_pandas_map",
diff --git a/python/pyspark/profiler.py b/python/pyspark/profiler.py
index b5f1bc4d714d..aa2288b36a02 100644
--- a/python/pyspark/profiler.py
+++ b/python/pyspark/profiler.py
@@ -19,6 +19,7 @@ from typing import (
 Any,
 Callable,
 Dict,
+Iterator,
 List,
 Optional,
 Tuple,
@@ -37,7 +38,7 @@ import sys
 import warnings
 
 try:
-from memory_profiler import choose_backend, CodeMap, LineProfiler  # type: 
ignore[import]
+from memory_profiler import CodeMap, LineProfiler  # type: ignore[import]
 
 has_memory_profiler = True
 except Exception:
@@ -196,16 +197,40 @@ if has_memory_profiler:
 for subcode in filter(inspect.iscode, code.co_consts):
 self.add(subcode, toplevel_code=toplevel_code)
 
+class CodeMapForUDFV2(CodeMap):
+def add(
+self,
+code: Any,
+toplevel_code: Optional[Any] = None,
+) -> None:
+if code in self:
+return
+
+if toplevel

(spark) branch master updated (e211dbdee42c -> c468c3d5c685)

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from e211dbdee42c [SPARK-46831][SQL] Collations - Extending StringType and 
PhysicalStringType with collationId field
 add c468c3d5c685 [SPARK-46904][UI] Fix display issue of History UI summary

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/deploy/history/HistoryPage.scala  | 121 +++--
 1 file changed, 64 insertions(+), 57 deletions(-)





(spark) branch master updated: [SPARK-46831][SQL] Collations - Extending StringType and PhysicalStringType with collationId field

2024-01-29 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e211dbdee42c [SPARK-46831][SQL] Collations - Extending StringType and 
PhysicalStringType with collationId field
e211dbdee42c is described below

commit e211dbdee42c887c99635623a0312857a240ebaa
Author: Aleksandar Tomic 
AuthorDate: Mon Jan 29 17:15:29 2024 +0300

[SPARK-46831][SQL] Collations - Extending StringType and PhysicalStringType 
with collationId field

### What changes were proposed in this pull request?

This PR represents the initial change for introducing the collation concept into the Spark engine. For a higher-level overview, please take a look at the umbrella [JIRA](https://issues.apache.org/jira/browse/SPARK-46830).

This PR extends both `StringType` and `PhysicalStringType` with a collationId field. At this point this is just a no-op field. In following PRs this field will be used to fetch the right UTF8String comparison rules from a global collation table.

The goal is to keep backwards compatibility; this is ensured by keeping the singleton `object StringType`, which inherits from `StringType(DEFAULT_COLLATION_ID)`. DEFAULT_COLLATION_ID represents the UTF8 binary collation rules (i.e. byte-for-byte comparison, which Spark already uses). Hence, any code that relies on `StringType` will stay binary compatible with this version.
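
A small illustration of the compatibility story above, based on the `StringType` diff included further down in this commit (collation id `1` is just a placeholder, since this change does not yet wire up any non-default collation):

```scala
import org.apache.spark.sql.types.StringType

object StringTypeCollationSketch {
  // The singleton keeps the pre-existing behavior (UTF8 binary collation).
  val default: StringType = StringType
  assert(default.collationId == StringType.DEFAULT_COLLATION_ID)

  // New constructor path; id 1 is a placeholder until collations are wired in.
  val collated: StringType = StringType(1)
  assert(collated.collationId == 1)
}
```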

It may be hard to see the end state from just this initial PR. For reviewers who want to see how this will fit into the end state, please take a look at this draft [PR](https://github.com/apache/spark/pull/44537).

### Why are the changes needed?

Please refer to umbrella JIRA ticket for collation effort.

### Does this PR introduce _any_ user-facing change?

At this point No.

### How was this patch tested?

This initial PR doesn't introduce any surface level changes.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #44901 from dbatomic/string_with_collation_type.

Authored-by: Aleksandar Tomic 
Signed-off-by: Max Gekk 
---
 project/MimaExcludes.scala   |  2 ++
 .../main/scala/org/apache/spark/sql/types/StringType.scala   |  9 ++---
 .../scala/org/apache/spark/sql/catalyst/InternalRow.scala|  2 +-
 .../catalyst/expressions/InterpretedUnsafeProjection.scala   |  2 +-
 .../sql/catalyst/expressions/codegen/CodeGenerator.scala |  4 ++--
 .../org/apache/spark/sql/catalyst/expressions/literals.scala |  2 +-
 .../spark/sql/catalyst/expressions/namedExpressions.scala|  2 +-
 .../apache/spark/sql/catalyst/types/PhysicalDataType.scala   | 12 +++-
 .../org/apache/spark/sql/execution/columnar/ColumnType.scala |  3 ++-
 .../spark/sql/execution/columnar/ColumnarDataTypeUtils.scala |  2 +-
 10 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index eb4c130cc6a9..43723742be97 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -107,6 +107,8 @@ object MimaExcludes {
 
 // SPARK-46410: Assign error classes/subclasses to 
JdbcUtils.classifyException
 
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.jdbc.JdbcDialect.classifyException"),
+// [SPARK-464878][CORE][SQL] (false alert). Invalid rule for StringType 
extension.
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.StringType.this"),
 
 (problem: Problem) => problem match {
   case MissingClassProblem(cls) => 
!cls.fullName.startsWith("org.sparkproject.jpmml") &&
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala 
b/sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala
index 5985238a863e..bd2ff8475741 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala
@@ -23,9 +23,10 @@ import org.apache.spark.annotation.Stable
  * The data type representing `String` values. Please use the singleton 
`DataTypes.StringType`.
  *
  * @since 1.3.0
+ * @param collationId The id of collation for this StringType.
  */
 @Stable
-class StringType private() extends AtomicType {
+class StringType private(val collationId: Int) extends AtomicType {
   /**
* The default size of a value of the StringType is 20 bytes.
*/
@@ -38,5 +39,7 @@ class StringType private() extends AtomicType {
  * @since 1.3.0
  */
 @Stable
-case object StringType extends StringType
-
+case object StringType extends StringType(0) {
+  val DEFAULT_COLLATION_ID = 0
+  def apply(collationId: Int): StringType = new StringType(collationId)
+}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/

(spark) branch master updated: [SPARK-46903][CORE] Support Spark History Server Log UI

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 8dd395b2eabd [SPARK-46903][CORE] Support Spark History Server Log UI
8dd395b2eabd is described below

commit 8dd395b2eabd2815982022b38a5287dae7af8b82
Author: Dongjoon Hyun 
AuthorDate: Mon Jan 29 01:32:45 2024 -0800

[SPARK-46903][CORE] Support Spark History Server Log UI

### What changes were proposed in this pull request?

This PR aims to make the `Spark History Server` provide a link to, and a page for, viewing its own server log.

### Why are the changes needed?

To improve UX.

- `Show server log` link is added at the bottom of page.
![Screenshot 2024-01-29 at 12 54 41 
AM](https://github.com/apache/spark/assets/9700541/7e5cea9f-8ac8-4a60-a249-d1bb31f6e269)

- The link opens the following log view page.
![Screenshot 2024-01-29 at 12 55 41 
AM](https://github.com/apache/spark/assets/9700541/70cf0c77-fc67-4ad8-97db-b061fdd1ffd0)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44932 from dongjoon-hyun/SPARK-46903.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/deploy/history/HistoryPage.scala  |   2 +
 .../spark/deploy/history/HistoryServer.scala   |   1 +
 .../org/apache/spark/deploy/history/LogPage.scala  | 126 +
 3 files changed, 129 insertions(+)

diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
index 7ba9b2c54937..03d880f47306 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
@@ -94,6 +94,8 @@ private[history] class HistoryPage(parent: HistoryServer) 
extends WebUIPage("")
   }
   }
 
+
+  Show server log
   
   
 UIUtils.basicSparkPage(request, content, "History Server", true)
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
index 321f76923411..8ba610e0a13d 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
@@ -148,6 +148,7 @@ class HistoryServer(
*/
   def initialize(): Unit = {
 attachPage(new HistoryPage(this))
+attachPage(new LogPage(conf))
 
 attachHandler(ApiRootResource.getServletHandler(this))
 
diff --git a/core/src/main/scala/org/apache/spark/deploy/history/LogPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/LogPage.scala
new file mode 100644
index ..72d88e14a122
--- /dev/null
+++ b/core/src/main/scala/org/apache/spark/deploy/history/LogPage.scala
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.history
+
+import java.io.File
+import javax.servlet.http.HttpServletRequest
+
+import scala.xml.{Node, Unparsed}
+
+import org.apache.spark.SparkConf
+import org.apache.spark.internal.Logging
+import org.apache.spark.ui.{UIUtils, WebUIPage}
+import org.apache.spark.util.Utils
+import org.apache.spark.util.logging.RollingFileAppender
+
+private[history] class LogPage(conf: SparkConf) extends WebUIPage("logPage") 
with Logging {
+  private val defaultBytes = 100 * 1024
+
+  def render(request: HttpServletRequest): Seq[Node] = {
+val logDir = sys.env.getOrElse("SPARK_LOG_DIR", "logs/")
+val logType = request.getParameter("logType")
+val offset = Option(request.getParameter("offset")).map(_.toLong)
+val byteLength = Option(request.getParameter("byteLength")).map(_.toInt)
+  .getOrElse(defaultBytes)
+val (logText, startByte, endByte, logLength) = getLog(logDir, logType, 
o

(spark) branch master updated: [SPARK-46902][UI] Fix Spark History Server UI for using un-exported setAppLimit

2024-01-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1386b52f3eb6 [SPARK-46902][UI] Fix Spark History Server UI for using 
un-exported setAppLimit
1386b52f3eb6 is described below

commit 1386b52f3eb624331345611ef1f6ecc44047f80f
Author: Kent Yao 
AuthorDate: Mon Jan 29 01:26:57 2024 -0800

[SPARK-46902][UI] Fix Spark History Server UI for using un-exported 
setAppLimit

### What changes were proposed in this pull request?

Fix the Spark History Server UI, which was using the un-exported `setAppLimit` to render the dataTables of the app list.

close #44930

### Why are the changes needed?

bugfix

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

Locally built and tested


![image](https://github.com/apache/spark/assets/8326978/6899b1a2-0232-4f85-9389-e5c18db8d9d3)

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #44931 from yaooqinn/SPARK-46902.

Authored-by: Kent Yao 
Signed-off-by: Dongjoon Hyun 
---
 .../main/resources/org/apache/spark/ui/static/historypage.js  |  2 ++
 .../scala/org/apache/spark/deploy/history/HistoryPage.scala   | 11 +--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage.js 
b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
index 08438e6eda61..85cd5a554750 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/historypage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -19,6 +19,8 @@
 
 import {formatDuration, formatTimeMillis} from "./utils.js";
 
+export {setAppLimit};
+
 var appLimit = -1;
 
 /* eslint-disable no-unused-vars */
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
index b8f064c68cdd..7ba9b2c54937 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala
@@ -19,10 +19,11 @@ package org.apache.spark.deploy.history
 
 import javax.servlet.http.HttpServletRequest
 
-import scala.xml.Node
+import scala.xml.{Node, Unparsed}
 
 import org.apache.spark.status.api.v1.ApplicationInfo
 import org.apache.spark.ui.{UIUtils, WebUIPage}
+import org.apache.spark.ui.UIUtils.formatImportJavaScript
 
 private[history] class HistoryPage(parent: HistoryServer) extends 
WebUIPage("") {
 
@@ -63,12 +64,18 @@ private[history] class HistoryPage(parent: HistoryServer) 
extends WebUIPage("")
 
 {
 if (displayApplications) {
+  val js =
+s"""
+   |${formatImportJavaScript(request, 
"/static/historypage.js", "setAppLimit")}
+   |
+   |setAppLimit(${parent.maxApplications});
+   |""".stripMargin
++
  ++
  ++
-setAppLimit({parent.maxApplications})
+{Unparsed(js)}
 } else if (requestedIncomplete) {
   No incomplete applications found!
 } else if (eventLogsUnderProcessCount > 0) {

