[spark] branch branch-3.2 updated: [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

2022-08-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 0e5812c49d2 [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
0e5812c49d2 is described below

commit 0e5812c49d2552d8779f94fbaad2fc1b69d8a9e8
Author: Yuming Wang 
AuthorDate: Fri Aug 5 11:25:51 2022 +0800

[SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

### What changes were proposed in this pull request?

This PR disables validation of default values when parsing Avro schemas.

### Why are the changes needed?

Spark will throw an exception when upgrading to Spark 3.2. We fixed the same issue for Hive serde tables before: SPARK-34512.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.
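For context, a minimal standalone sketch (an editor's illustration, not part of the patch): Avro 1.9+ validates field defaults while parsing, so a schema with an invalid default (`null` for a non-nullable `long`, as in the new unit test below) fails to parse unless validation is disabled:

```scala
import org.apache.avro.{AvroTypeException, Schema}

object ValidateDefaultsDemo {
  // Same schema as the unit test: "null" is not a valid default for a long field.
  val schemaJson: String =
    """{
      |  "type": "record",
      |  "name": "struct",
      |  "fields": [
      |    {"name": "id", "type": "long", "default": null}
      |  ]
      |}""".stripMargin

  def main(args: Array[String]): Unit = {
    // Default parser behavior on Avro 1.9+: defaults are validated, parsing fails.
    try new Schema.Parser().parse(schemaJson)
    catch { case e: AvroTypeException => println(s"validation enabled: ${e.getMessage}") }

    // The approach this patch applies throughout Spark: skip default validation.
    val schema = new Schema.Parser().setValidateDefaults(false).parse(schemaJson)
    println(s"validation disabled, parsed: ${schema.getFullName}")
  }
}
```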

Closes #37191 from wangyum/SPARK-39775.

Authored-by: Yuming Wang 
Signed-off-by: Wenchen Fan 
(cherry picked from commit 5c1b99f441ec5e178290637a9a9e7902aaa116e1)
Signed-off-by: Wenchen Fan 
---
 .../spark/serializer/GenericAvroSerializer.scala   |  4 +--
 .../serializer/GenericAvroSerializerSuite.scala| 16 +++
 .../apache/spark/sql/avro/AvroDataToCatalyst.scala |  3 +-
 .../org/apache/spark/sql/avro/AvroOptions.scala|  4 +--
 .../apache/spark/sql/avro/CatalystDataToAvro.scala |  2 +-
 .../apache/spark/sql/avro/AvroFunctionsSuite.scala | 32 ++
 6 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala b/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
index c1ef3ee769a..7d2923fdf37 100644
--- a/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
+++ b/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
@@ -97,7 +97,7 @@ private[serializer] class GenericAvroSerializer[D <: GenericContainer]
 } {
   in.close()
 }
-    new Schema.Parser().parse(new String(bytes, StandardCharsets.UTF_8))
+    new Schema.Parser().setValidateDefaults(false).parse(new String(bytes, StandardCharsets.UTF_8))
   })
 
   /**
@@ -137,7 +137,7 @@ private[serializer] class GenericAvroSerializer[D <: GenericContainer]
 val fingerprint = input.readLong()
 schemaCache.getOrElseUpdate(fingerprint, {
   schemas.get(fingerprint) match {
-        case Some(s) => new Schema.Parser().parse(s)
+        case Some(s) => new Schema.Parser().setValidateDefaults(false).parse(s)
 case None =>
          throw new SparkException(
            "Error reading attempting to read avro data -- encountered an unknown " +
diff --git a/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala b/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
index 54e4aebe544..98493c12f59 100644
--- a/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
@@ -110,4 +110,20 @@ class GenericAvroSerializerSuite extends SparkFunSuite with SharedSparkContext {
   assert(rdd.collect() sameElements Array.fill(10)(datum))
 }
   }
+
+  test("SPARK-39775: Disable validate default values when parsing Avro schemas") {
+val avroTypeStruct = s"""
+  |{
+  |  "type": "record",
+  |  "name": "struct",
+  |  "fields": [
+  |{"name": "id", "type": "long", "default": null}
+  |  ]
+  |}
+""".stripMargin
+    val schema = new Schema.Parser().setValidateDefaults(false).parse(avroTypeStruct)
+
+val genericSer = new GenericAvroSerializer(conf.getAvroSchema)
+    assert(schema === genericSer.decompress(ByteBuffer.wrap(genericSer.compress(schema))))
+  }
 }
diff --git a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
index b4965003ba3..c4a4b16b052 100644
--- a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
+++ b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
@@ -53,7 +53,8 @@ private[avro] case class AvroDataToCatalyst(
 
   private lazy val avroOptions = AvroOptions(options)
 
-  @transient private lazy val actualSchema = new Schema.Parser().parse(jsonFormatSchema)
+  @transient private lazy val actualSchema =
+new Schema.Parser().setValidateDefaults(false).parse(jsonFormatSchema)
 
   @transient private lazy val expectedSchema = avroOptions.schema.getOrElse(actualSchema)
 
diff --git 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala 

[spark] branch branch-3.3 updated: [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

2022-08-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new c358ee67615 [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas
c358ee67615 is described below

commit c358ee6761539b4a4d12dbe36a4dd1a632a0efeb
Author: Yuming Wang 
AuthorDate: Fri Aug 5 11:25:51 2022 +0800

[SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

### What changes were proposed in this pull request?

This PR disables validation of default values when parsing Avro schemas.

### Why are the changes needed?

Spark will throw an exception when upgrading to Spark 3.2. We fixed the same issue for Hive serde tables before: SPARK-34512.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.
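As a usage-level sketch (an editor's illustration, not from the patch), the `AvroDataToCatalyst` change in the diff below means `from_avro` can now accept a reader schema containing such an invalid default. This assumes a local Spark session with the external `spark-avro` module on the classpath:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.avro.functions.{from_avro, to_avro}
import org.apache.spark.sql.functions.struct

object FromAvroDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
    import spark.implicits._

    // Reader schema with an invalid default; previously the internal
    // Schema.Parser in AvroDataToCatalyst rejected it before reading any data.
    val jsonFormatSchema =
      """{
        |  "type": "record",
        |  "name": "struct",
        |  "fields": [{"name": "id", "type": "long", "default": null}]
        |}""".stripMargin

    // Round-trip a long column through Avro binary and back.
    val avroDF = Seq(1L, 2L).toDF("id").select(to_avro(struct($"id")).as("avro"))
    avroDF.select(from_avro($"avro", jsonFormatSchema).as("value")).show()
    spark.stop()
  }
}
```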

Closes #37191 from wangyum/SPARK-39775.

Authored-by: Yuming Wang 
Signed-off-by: Wenchen Fan 
(cherry picked from commit 5c1b99f441ec5e178290637a9a9e7902aaa116e1)
Signed-off-by: Wenchen Fan 
---
 .../spark/serializer/GenericAvroSerializer.scala   |  4 +--
 .../serializer/GenericAvroSerializerSuite.scala| 16 +++
 .../apache/spark/sql/avro/AvroDataToCatalyst.scala |  3 +-
 .../org/apache/spark/sql/avro/AvroOptions.scala|  4 +--
 .../apache/spark/sql/avro/CatalystDataToAvro.scala |  2 +-
 .../apache/spark/sql/avro/AvroFunctionsSuite.scala | 32 ++
 6 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala b/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
index c1ef3ee769a..7d2923fdf37 100644
--- a/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
+++ b/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala
@@ -97,7 +97,7 @@ private[serializer] class GenericAvroSerializer[D <: GenericContainer]
 } {
   in.close()
 }
-    new Schema.Parser().parse(new String(bytes, StandardCharsets.UTF_8))
+    new Schema.Parser().setValidateDefaults(false).parse(new String(bytes, StandardCharsets.UTF_8))
   })
 
   /**
@@ -137,7 +137,7 @@ private[serializer] class GenericAvroSerializer[D <: GenericContainer]
 val fingerprint = input.readLong()
 schemaCache.getOrElseUpdate(fingerprint, {
   schemas.get(fingerprint) match {
-        case Some(s) => new Schema.Parser().parse(s)
+        case Some(s) => new Schema.Parser().setValidateDefaults(false).parse(s)
 case None =>
          throw new SparkException(
            "Error reading attempting to read avro data -- encountered an unknown " +
diff --git a/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala b/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
index 54e4aebe544..98493c12f59 100644
--- a/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/serializer/GenericAvroSerializerSuite.scala
@@ -110,4 +110,20 @@ class GenericAvroSerializerSuite extends SparkFunSuite with SharedSparkContext {
   assert(rdd.collect() sameElements Array.fill(10)(datum))
 }
   }
+
+  test("SPARK-39775: Disable validate default values when parsing Avro schemas") {
+val avroTypeStruct = s"""
+  |{
+  |  "type": "record",
+  |  "name": "struct",
+  |  "fields": [
+  |{"name": "id", "type": "long", "default": null}
+  |  ]
+  |}
+""".stripMargin
+    val schema = new Schema.Parser().setValidateDefaults(false).parse(avroTypeStruct)
+
+val genericSer = new GenericAvroSerializer(conf.getAvroSchema)
+    assert(schema === genericSer.decompress(ByteBuffer.wrap(genericSer.compress(schema))))
+  }
 }
diff --git a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
index b4965003ba3..c4a4b16b052 100644
--- a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
+++ b/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDataToCatalyst.scala
@@ -53,7 +53,8 @@ private[avro] case class AvroDataToCatalyst(
 
   private lazy val avroOptions = AvroOptions(options)
 
-  @transient private lazy val actualSchema = new Schema.Parser().parse(jsonFormatSchema)
+  @transient private lazy val actualSchema =
+new Schema.Parser().setValidateDefaults(false).parse(jsonFormatSchema)
 
   @transient private lazy val expectedSchema = avroOptions.schema.getOrElse(actualSchema)
 
diff --git 
a/external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala 

[spark] branch master updated (82dc17cdf7a -> 5c1b99f441e)

2022-08-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 82dc17cdf7a [SPARK-39986][PS][DOC] Better example for Co-grouped Map
 add 5c1b99f441e [SPARK-39775][CORE][AVRO] Disable validate default values when parsing Avro schemas

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/avro/AvroDataToCatalyst.scala |  3 +-
 .../org/apache/spark/sql/avro/AvroOptions.scala|  4 +--
 .../apache/spark/sql/avro/CatalystDataToAvro.scala |  2 +-
 .../apache/spark/sql/avro/AvroFunctionsSuite.scala | 32 ++
 .../spark/serializer/GenericAvroSerializer.scala   |  4 +--
 .../serializer/GenericAvroSerializerSuite.scala| 16 +++
 6 files changed, 55 insertions(+), 6 deletions(-)





[spark] branch master updated (e9e8dcb25fe -> 82dc17cdf7a)

2022-08-04 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from e9e8dcb25fe [SPARK-39974][INFRA] Create separate static image tag for infra cache
 add 82dc17cdf7a [SPARK-39986][PS][DOC] Better example for Co-grouped Map

No new revisions were added by this update.

Summary of changes:
 examples/src/main/python/sql/arrow.py  | 22 +++---
 .../source/getting_started/quickstart_df.ipynb |  8 
 2 files changed, 15 insertions(+), 15 deletions(-)





[spark] branch master updated: [SPARK-39974][INFRA] Create separate static image tag for infra cache

2022-08-04 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e9e8dcb25fe [SPARK-39974][INFRA] Create separate static image tag for infra cache
e9e8dcb25fe is described below

commit e9e8dcb25fe1f5c0d925852c8af5e06ce0935684
Author: Yikun Jiang 
AuthorDate: Fri Aug 5 08:57:12 2022 +0900

[SPARK-39974][INFRA] Create separate static image tag for infra cache

### What changes were proposed in this pull request?
Create a separate static image tag for the infra static image.

### Why are the changes needed?
Currently, we put the **static image** and the **cache** together in the same tag, like [`ghcr.io/apache/spark/apache-spark-github-action-image-cache:master`](https://github.com/apache/spark/pkgs/container/spark%2Fapache-spark-github-action-image-cache/versions).

The cache and the static image occupy different image hashes under the same image tag. This brings some problems in the cases below:
- **Debugging jobs with static Docker images**: users have to find the image hash. Pulling the cache tag directly will raise something like:
```
yikun-x86:~# docker run -ti ghcr.io/yikun/apache-spark-github-action-image-cache:master
Unable to find image 'ghcr.io/yikun/apache-spark-github-action-image-cache:master' locally
master: Pulling from yikun/apache-spark-github-action-image-cache
docker: no matching manifest for linux/amd64 in the manifest list entries.
```

- **Using the static image in CI**, e.g. when we want to switch to the static image temporarily for some reason.
- **Seeing the history of the last cache easily**, such as system deps/libs.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Local test: https://github.com/Yikun/spark/pull/144, and the static image tag push [passed](https://github.com/Yikun/spark/runs/7664266955?check_suite_focus=true#step:6:212)
- Run static image:
```
root@yikun-x86:~# docker run -ti ghcr.io/yikun/apache-spark-github-action-image-cache:master-static
Unable to find image 'ghcr.io/yikun/apache-spark-github-action-image-cache:master-static' locally
master-static: Pulling from yikun/apache-spark-github-action-image-cache
Digest: sha256:5198fd8111c925b7c92d04427268bcb0e5574bb72cef09808076595f3372bf7b
Status: Downloaded newer image for ghcr.io/yikun/apache-spark-github-action-image-cache:master-static
root@3550e09e0e93:/# exit
```

Closes #37402 from Yikun/patch-32.

Authored-by: Yikun Jiang 
Signed-off-by: Hyukjin Kwon 
---
 .github/workflows/build_infra_images_cache.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/build_infra_images_cache.yml b/.github/workflows/build_infra_images_cache.yml
index c35b34e201e..bd5685f69b0 100644
--- a/.github/workflows/build_infra_images_cache.yml
+++ b/.github/workflows/build_infra_images_cache.yml
@@ -49,7 +49,7 @@ jobs:
 with:
   context: ./dev/infra/
   push: true
-      tags: ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ github.ref_name }}
+      tags: ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ github.ref_name }}-static
       cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ github.ref_name }}
       cache-to: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-cache:${{ github.ref_name }},mode=max
   - name: Image digest





[spark] branch master updated: [SPARK-39961][SQL] DS V2 push-down translate Cast if the cast is safe

2022-08-04 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new bf8a4c47ac3 [SPARK-39961][SQL] DS V2 push-down translate Cast if the cast is safe
bf8a4c47ac3 is described below

commit bf8a4c47ac3752edf86f1e14e4050e8d202b34a4
Author: Jiaan Geng 
AuthorDate: Thu Aug 4 09:59:09 2022 -0700

[SPARK-39961][SQL] DS V2 push-down translate Cast if the cast is safe

### What changes were proposed in this pull request?
Currently, DS V2 push-down translates `Cast` only if ANSI mode is enabled.
In fact, if the cast is safe (e.g. casting a number to a string, or an int to a long), we can translate it too.

This PR calls `Cast.canUpCast` so that we can safely translate `Cast` to the V2 `Cast`.

Note: the rule `SimplifyCasts` optimizes away some safe casts, e.g. int to long, so we may not always see the `Cast`.
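To make the new guard concrete, a small sketch (an editor's addition, using Spark's internal Catalyst API, so subject to change) of what `Cast.canUpCast` returns for the cases mentioned above:

```scala
import org.apache.spark.sql.catalyst.expressions.Cast
import org.apache.spark.sql.types._

object CanUpCastDemo {
  def main(args: Array[String]): Unit = {
    // Widening numeric cast: cannot lose data, safe to push down without ANSI mode.
    println(Cast.canUpCast(IntegerType, LongType))    // true
    // Narrowing cast: may overflow, so it is translated only under ANSI mode.
    println(Cast.canUpCast(LongType, IntegerType))    // false
    // Number to string: always representable, hence safe.
    println(Cast.canUpCast(IntegerType, StringType))  // true
    // String to number: may fail at runtime, hence unsafe.
    println(Cast.canUpCast(StringType, IntegerType))  // false
  }
}
```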

### Why are the changes needed?
Broaden the range of `Cast` expressions that DS V2 can push down.

### Does this PR introduce _any_ user-facing change?
'Yes'.
`Cast` could be pushed down to data source in more cases.

### How was this patch tested?
Test cases updated.

Closes #37388 from beliefer/SPARK-39961.

Authored-by: Jiaan Geng 
Signed-off-by: Dongjoon Hyun 
---
 .../sql/catalyst/util/V2ExpressionBuilder.scala|  3 +-
 .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala| 45 +-
 2 files changed, 20 insertions(+), 28 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
index 41415553729..d451c73b39d 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
@@ -88,7 +88,8 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
   } else {
 None
   }
-case Cast(child, dataType, _, true) =>
+case Cast(child, dataType, _, ansiEnabled)
+if ansiEnabled || Cast.canUpCast(child.dataType, dataType) =>
   generateExpression(child).map(v => new V2Cast(v, dataType))
 case Abs(child, true) => generateExpressionWithName("ABS", Seq(child))
 case Coalesce(children) => generateExpressionWithName("COALESCE", children)
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
index 3b226d60643..da4f9175cd5 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
@@ -1109,7 +1109,7 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
 "CAST(BONUS AS string) LIKE '%30%', CAST(DEPT AS byte) > 1, " +
 "CAST(DEPT AS short) > 1, CAST(BONUS AS decimal(20,2)) > 1200.00]"
     } else {
-      "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL],"
+      "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, CAST(BONUS AS string) LIKE '%30%']"
 }
 checkPushedInfo(df6, expectedPlanFragment6)
 checkAnswer(df6, Seq(Row(2, "david", 1, 1300, true)))
@@ -1199,18 +1199,16 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
   checkPushedInfo(df1, "PushedFilters: [CHAR_LENGTH(NAME) > 2],")
   checkAnswer(df1, Seq(Row("fred", 1), Row("mary", 2)))
 
-  withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
-val df2 = sql(
-  """
-|SELECT *
-|FROM h2.test.people
-|WHERE h2.my_strlen(CASE WHEN NAME = 'fred' THEN NAME ELSE "abc" END) > 2
+  val df2 = sql(
+"""
+  |SELECT *
+  |FROM h2.test.people
+  |WHERE h2.my_strlen(CASE WHEN NAME = 'fred' THEN NAME ELSE "abc" END) > 2
   """.stripMargin)
-checkFiltersRemoved(df2)
-    checkPushedInfo(df2,
-      "PushedFilters: [CHAR_LENGTH(CASE WHEN NAME = 'fred' THEN NAME ELSE 'abc' END) > 2],")
-checkAnswer(df2, Seq(Row("fred", 1), Row("mary", 2)))
-  }
+  checkFiltersRemoved(df2)
+  checkPushedInfo(df2,
+    "PushedFilters: [CHAR_LENGTH(CASE WHEN NAME = 'fred' THEN NAME ELSE 'abc' END) > 2],")
+  checkAnswer(df2, Seq(Row("fred", 1), Row("mary", 2)))
 } finally {
   JdbcDialects.unregisterDialect(testH2Dialect)
   JdbcDialects.registerDialect(H2Dialect)
@@ -2262,24 +2260,17 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHel
   }
 
   test("scan with aggregate push-down: partial push-down AVG with overflow") {
-def createDataFrame: DataFrame = spark.read
-  .option("partitionColumn", "id")
-  

[GitHub] [spark-website] srowen commented on pull request #410: Change some tags in last commit to .

2022-08-04 Thread GitBox


srowen commented on PR #410:
URL: https://github.com/apache/spark-website/pull/410#issuecomment-1205369622

   Oh, BTW, have you fixed the source markdown in the Spark distro? I forgot to ask. Normally we don't edit old releases' generated docs unless it's just not working.
   





[GitHub] [spark-website] MacrothT opened a new pull request, #410: Change some tags in last commit to .

2022-08-04 Thread GitBox


MacrothT opened a new pull request, #410:
URL: https://github.com/apache/spark-website/pull/410

   Change level of HTML tags to match original Markdown headings that were ###.
   
   
   





[spark-website] branch asf-site updated: Correct some tags/headings and add missing TOC.

2022-08-04 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 36b5a3d4f Correct some tags/headings and add missing TOC.
36b5a3d4f is described below

commit 36b5a3d4f29e88ffb3edfddfa52d8fe1c4d7f915
Author: MacrothT <109898529+macro...@users.noreply.github.com>
AuthorDate: Thu Aug 4 08:02:50 2022 -0500

Correct some tags/headings and add missing TOC.

Correct mis-encoded tags that caused a malformed HTML doc.
Replace Markdown headings with HTML tags to show proper heading format.
Add missing TOC.



Author: MacrothT <109898529+macro...@users.noreply.github.com>

Closes #409 from MacrothT/patch-1.
---
 site/docs/3.2.1/running-on-kubernetes.html | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/site/docs/3.2.1/running-on-kubernetes.html b/site/docs/3.2.1/running-on-kubernetes.html
index aa43ebbef..039a3acb2 100644
--- a/site/docs/3.2.1/running-on-kubernetes.html
+++ b/site/docs/3.2.1/running-on-kubernetes.html
@@ -183,8 +183,15 @@
   Future Work
 
   
-  Configuration
+  Configuration
+
   Spark Properties
+  Pod Template Properties
+  Pod Metadata
+  Pod Spec
+  Container spec
+  Resource Allocation and Configuration Overview
+  Stage Level Scheduling Overview
 
   
 
@@ -1446,13 +1453,13 @@ using --conf as means
   3.0.0
 
 
-  spark.kubernetes.executor.scheduler.name/td
+  spark.kubernetes.executor.scheduler.name
   (none)
   
Specify the scheduler name for each executor pod.
   
   3.0.0
-/tr
+
 
   spark.kubernetes.configMap.maxSize
   1572864
@@ -1571,13 +1578,13 @@ using --conf as means
   
   3.1.3
 
-/table
+
 
- Pod template properties
+Pod Template Properties
 
 See the below table for the full list of pod specifications that will be overwritten by spark.
 
-### Pod Metadata
+Pod Metadata
 
 
 Pod metadata keyModified valueDescription
@@ -1613,7 +1620,7 @@ See the below table for the full list of pod specifications that will be overwri
 
 
 
-### Pod Spec
+Pod Spec
 
 
 Pod spec keyModified valueDescription
@@ -1664,7 +1671,7 @@ See the below table for the full list of pod specifications that will be overwri
 
 
 
-### Container spec
+Container Spec
 
 The following affect the driver and executor containers. All other containers in the pod spec will be unaffected.
 
@@ -1721,7 +1728,7 @@ The following affect the driver and executor containers. All other containers in
 
 
 
-### Resource Allocation and Configuration Overview
+Resource Allocation and Configuration Overview
 
 Please make sure to have read the Custom Resource Scheduling and Configuration Overview section on the [configuration page](configuration.html). This section only talks about the Kubernetes specific aspects of resource scheduling.
 
@@ -1731,7 +1738,7 @@ Spark automatically handles translating the Spark configs spark.{driver/ex
 
 Kubernetes does not tell Spark the addresses of the resources allocated to each container. For that reason, the user must specify a discovery script that gets run by the executor on startup to discover what resources are available to that executor. You can find an example scripts in `examples/src/main/scripts/getGpusResources.sh`. The script must have execute permissions set and the user should setup permissions to not allow malicious users to modify it. The script should write to STDOUT [...]
 
-### Stage Level Scheduling Overview
+Stage Level Scheduling Overview
 
 Stage level scheduling is supported on Kubernetes when dynamic allocation is enabled. This also requires spark.dynamicAllocation.shuffleTracking.enabled to be enabled since Kubernetes doesn't support an external shuffle service at this time. The order in which containers for different profiles is requested from Kubernetes is not guaranteed. Note that since dynamic allocation on Kubernetes requires the shuffle tracking feature, this means that executors from previous stages t [...]
 Note, there is a difference in the way pod template resources are handled between the base default profile and custom ResourceProfiles. Any resources specified in the pod template file will only be used with the base default profile. If you create custom ResourceProfiles be sure to include all necessary resources there since the resources from the template file will not be propagated to custom ResourceProfiles.





[GitHub] [spark-website] srowen closed pull request #409: Correct some tags/headings and add missing TOC.

2022-08-04 Thread GitBox


srowen closed pull request #409: Correct some tags/headings and add missing TOC.
URL: https://github.com/apache/spark-website/pull/409





[GitHub] [spark-website] MacrothT opened a new pull request, #409: Correct some tags/headings and add missing TOC.

2022-08-04 Thread GitBox


MacrothT opened a new pull request, #409:
URL: https://github.com/apache/spark-website/pull/409

   Correct mis-encoded tags that caused a malformed HTML doc.
   Replace Markdown headings with HTML tags to show proper heading format.
   Add missing TOC.
   
   
   





[spark] branch master updated (432b667dea7 -> a7cded5dae0)

2022-08-04 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 432b667dea7 [SPARK-39959][BUILD][INFRA] Pin roxygen2 version to 7.2.0 in infra
 add a7cded5dae0 [SPARK-39913][BUILD] Upgrade to Arrow 9.0.0

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 8 
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 8 
 pom.xml   | 2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

