This is an automated email from the ASF dual-hosted git repository.
gengliang pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.5 by this push:
new 64242bf6a64 [SPARK-43380][SQL][FOLLOW-UP] Fix slowdown in Avro read
64242bf6a64 is described below
commit 64242bf6a6425274b83bc1191230437c2d3fbc71
Author: zeruibao <[email protected]>
AuthorDate: Tue Oct 31 16:46:40 2023 -0700
[SPARK-43380][SQL][FOLLOW-UP] Fix slowdown in Avro read
### What changes were proposed in this pull request?
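Fix a slowdown in Avro reads. The change in
https://github.com/apache/spark/pull/42503 introduced a performance
regression: `SQLConf.get.getConf(confKey)` appears to be very costly, and it
was called on every invocation of the `newWriter` function. This patch moves
the lookup out of `newWriter` into a `private lazy val`, so it is evaluated at
most once per `AvroDeserializer` instance.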
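
As a rough illustration of why moving the lookup out of `newWriter` helps, here is a minimal, self-contained Scala sketch. It does not use Spark: `FakeConf`, `PerCallDeserializer`, and `HoistedDeserializer` are hypothetical stand-ins rather than names from the patch, and the counter only models how often the lookup runs, not its real cost.

```scala
object ConfLookupSketch {
  // Hypothetical stand-in for a costly configuration lookup.
  // It counts how many times the flag is read.
  object FakeConf {
    var lookups: Int = 0
    def allowIncompatibleSchema: Boolean = { lookups += 1; false }
  }

  // Before: the flag is looked up on every newWriter call.
  class PerCallDeserializer {
    def newWriter(fieldIndex: Int): String = {
      val preventReadingIncorrectType = !FakeConf.allowIncompatibleSchema
      if (preventReadingIncorrectType) s"strict-$fieldIndex" else s"lenient-$fieldIndex"
    }
  }

  // After: the flag is looked up on first use and cached for the instance.
  class HoistedDeserializer {
    private lazy val preventReadingIncorrectType = !FakeConf.allowIncompatibleSchema

    def newWriter(fieldIndex: Int): String =
      if (preventReadingIncorrectType) s"strict-$fieldIndex" else s"lenient-$fieldIndex"
  }

  def main(args: Array[String]): Unit = {
    val perCall = new PerCallDeserializer
    (1 to 1000).foreach(perCall.newWriter)
    println(s"per-call lookups: ${FakeConf.lookups}") // prints "per-call lookups: 1000"

    FakeConf.lookups = 0
    val hoisted = new HoistedDeserializer
    (1 to 1000).foreach(hoisted.newWriter)
    println(s"hoisted lookups: ${FakeConf.lookups}") // prints "hoisted lookups: 1"
  }
}
```

One consequence of a `lazy val` is that the value is fixed for a given instance after first access, so a later change to the configuration would not be seen by an already-initialized instance in this sketch.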
### Why are the changes needed?
To fix the performance regression in Avro reads.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing unit tests.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #43606 from zeruibao/SPARK-43380-FIX-SLOWDOWN.
Authored-by: zeruibao <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 45f73bc69655a236323be1bcb2988341d2aa5203)
Signed-off-by: Gengliang Wang <[email protected]>
---
.../src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala b/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
index fe0bd7392b6..ec34d10a5ff 100644
--- a/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
+++ b/connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
@@ -105,6 +105,9 @@ private[sql] class AvroDeserializer(
s"Cannot convert Avro type $rootAvroType to SQL type ${rootCatalystType.sql}.", ise)
}
+ private lazy val preventReadingIncorrectType = !SQLConf.get
+ .getConf(SQLConf.LEGACY_AVRO_ALLOW_INCOMPATIBLE_SCHEMA)
+
def deserialize(data: Any): Option[Any] = converter(data)
/**
@@ -122,8 +125,6 @@ private[sql] class AvroDeserializer(
s"schema is incompatible (avroType = $avroType, sqlType = ${catalystType.sql})"
val realDataType = SchemaConverters.toSqlType(avroType, useStableIdForUnionType).dataType
- val confKey = SQLConf.LEGACY_AVRO_ALLOW_INCOMPATIBLE_SCHEMA
- val preventReadingIncorrectType = !SQLConf.get.getConf(confKey)
(avroType.getType, catalystType) match {
case (NULL, NullType) => (updater, ordinal, _) =>