[
https://issues.apache.org/jira/browse/SPARK-56893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Max Gekk resolved SPARK-56893.
------------------------------
Fix Version/s: 4.3.0
Resolution: Fixed
Issue resolved by pull request 55920
[https://github.com/apache/spark/pull/55920]
> Optimize Parquet dictionary decoding with hasNull fast path and per-class
> updater overrides
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-56893
> URL: https://issues.apache.org/jira/browse/SPARK-56893
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Ismaël Mejía
> Assignee: Ismaël Mejía
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.3.0
>
>
> Optimize Parquet dictionary decoding with hasNull fast path and per-class
> updater overrides.
> Two optimizations to ParquetVectorUpdater.decodeDictionaryIds:
> 1. *hasNull() fast path*: A static decodeBatch helper splits decoding into
> two loops -- when values.hasNull() is false, the per-element isNullAt(i)
> check is skipped entirely.
> 2. *Per-class decodeDictionaryIds overrides* in six hot-path updaters
> (IntegerUpdater, IntegerToLongUpdater, LongUpdater, FloatUpdater,
> FloatToDoubleUpdater, DoubleUpdater): each override gives C2 a monomorphic
> call site for decodeSingleDictionaryId, enabling full inlining.
> Benchmark speedups (optimized over baseline, same CPU) on AMD EPYC 9V74:
> ||Scenario||JDK 17||JDK 21||JDK 25||
> |No nulls|1.21-1.22x|*1.56-1.62x*|1.24-1.25x|
> |10% nulls|~1.0x|1.24-1.29x|~1.0x|
> |50% nulls|~1.0x|1.25-1.26x|~1.0x|
> JDK 21 speeds up at every null fraction (1.24x to 1.62x), attributed to
> monomorphic devirtualization. JDK 17 and JDK 25 gain only on the no-nulls
> fast path (1.21x to 1.25x) and are unchanged (~1.0x) when nulls are present.
> PR: https://github.com/apache/spark/pull/55920
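
For readers who have not opened the PR, here is a minimal sketch of the two ideas in the description above. It is not the code from the pull request: DictionaryDecodeSketch, IntDictionary, IntColumn, and the simplified method signatures are hypothetical stand-ins for Spark's vector and dictionary types. Only the names hasNull, isNullAt, decodeDictionaryIds, decodeSingleDictionaryId, and IntegerUpdater are taken from the description. The sketch shows the structure of the change only; whether the JIT actually inlines the call depends on the JVM and is not demonstrated here.

```java
import java.util.Arrays;

// Illustrative only: none of these types are Spark's real classes.
final class DictionaryDecodeSketch {

    /** Hypothetical dictionary: maps a dictionary id to an int value. */
    interface IntDictionary {
        int decodeToInt(int id);
    }

    /** Hypothetical column: dictionary ids, a null mask, and decoded output. */
    static final class IntColumn {
        final int[] ids;
        final boolean[] nulls;
        final int[] values;
        private final boolean anyNull;

        IntColumn(int[] ids, boolean[] nulls) {
            this.ids = ids;
            this.nulls = nulls;
            this.values = new int[ids.length];
            boolean any = false;
            for (boolean n : nulls) {
                any |= n;
            }
            this.anyNull = any;
        }

        boolean hasNull() { return anyNull; }
        boolean isNullAt(int i) { return nulls[i]; }
    }

    /** Generic updater: the shared loop calls a virtual method per element. */
    abstract static class Updater {
        abstract void decodeSingleDictionaryId(int i, IntColumn col, IntDictionary dict);

        void decodeDictionaryIds(IntColumn col, IntDictionary dict) {
            for (int i = 0; i < col.ids.length; i++) {
                if (!col.isNullAt(i)) {
                    decodeSingleDictionaryId(i, col, dict);
                }
            }
        }
    }

    /** Per-class override: the loop lives in the final class that owns the callee. */
    static final class IntegerUpdater extends Updater {
        @Override
        void decodeSingleDictionaryId(int i, IntColumn col, IntDictionary dict) {
            col.values[i] = dict.decodeToInt(col.ids[i]);
        }

        @Override
        void decodeDictionaryIds(IntColumn col, IntDictionary dict) {
            int n = col.ids.length;
            if (!col.hasNull()) {
                // Fast path: no nulls anywhere, so skip the per-element check.
                for (int i = 0; i < n; i++) {
                    decodeSingleDictionaryId(i, col, dict);
                }
            } else {
                // Slow path: null slots are left untouched.
                for (int i = 0; i < n; i++) {
                    if (!col.isNullAt(i)) {
                        decodeSingleDictionaryId(i, col, dict);
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        IntDictionary dict = id -> (id + 1) * 10;
        int[] ids = {2, 0, 1, 2};

        IntColumn noNulls = new IntColumn(ids, new boolean[4]);
        new IntegerUpdater().decodeDictionaryIds(noNulls, dict);
        System.out.println(Arrays.toString(noNulls.values)); // [30, 10, 20, 30]

        IntColumn withNull = new IntColumn(ids, new boolean[] {false, true, false, false});
        new IntegerUpdater().decodeDictionaryIds(withNull, dict);
        System.out.println(Arrays.toString(withNull.values)); // [30, 0, 20, 30]
    }
}
```

Because IntegerUpdater is final and its loop calls its own decodeSingleDictionaryId, the call site in the override has a single possible target, which is the property the description relies on for inlining.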