[ https://issues.apache.org/jira/browse/AVRO-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859603#comment-17859603 ]
ASF subversion and git services commented on AVRO-4007:
-------------------------------------------------------
Commit 4eda118a42f930bdc6f463621e2a6450098cbfe7 in avro's branch
refs/heads/main from John Emhoff
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4eda118a4 ]
AVRO-4007: [rust] Faster `is_nullable` for UnionSchema (#2961)
* Faster `is_nullable` for UnionSchema
I'm writing several gigabytes of Avro and noticed that it seemed
oddly slow. I ran a profiler and found that about 25% of my total
run time was being spent in `UnionSchema::is_nullable`.
It looks like the test `x == Schema::Null` is slow because the
equality test involves a schema canonicalization.
I've updated the check to pattern-match against `Schema::Null`
instead and see a significant performance increase.
* Fix formatting
* Apply clippy suggestion
Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
---------
Co-authored-by: Martin Grigorov <[email protected]>
Co-authored-by: Martin Tzvetanov Grigorov <[email protected]>
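The difference described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration, not the apache_avro source: the `Schema` and `UnionSchema` types here are simplified stand-ins, `canonical_form` is a hypothetical placeholder for real schema canonicalization, and the method names `is_nullable_eq` and `is_nullable_match` are invented for comparison. The only assumption taken from the message is that `==` on a schema goes through canonicalization, while a pattern match does not.

```rust
// Hypothetical, simplified stand-in for an Avro schema type (not the real one).
#[derive(Debug, Clone)]
enum Schema {
    Null,
    Int,
    String,
    Union(UnionSchema),
}

#[derive(Debug, Clone)]
struct UnionSchema {
    schemas: Vec<Schema>,
}

impl Schema {
    // Placeholder for canonicalization: walks the schema and allocates a string.
    fn canonical_form(&self) -> String {
        match self {
            Schema::Null => "\"null\"".to_string(),
            Schema::Int => "\"int\"".to_string(),
            Schema::String => "\"string\"".to_string(),
            Schema::Union(u) => {
                let parts: Vec<String> =
                    u.schemas.iter().map(Schema::canonical_form).collect();
                format!("[{}]", parts.join(","))
            }
        }
    }
}

// Assumption: equality is defined by comparing canonical forms,
// so every `==` pays for two canonicalizations.
impl PartialEq for Schema {
    fn eq(&self, other: &Self) -> bool {
        self.canonical_form() == other.canonical_form()
    }
}

impl UnionSchema {
    // Slow variant: each comparison canonicalizes both operands.
    fn is_nullable_eq(&self) -> bool {
        self.schemas.iter().any(|x| *x == Schema::Null)
    }

    // Fast variant: checks only the enum variant, with no allocation.
    fn is_nullable_match(&self) -> bool {
        self.schemas.iter().any(|x| matches!(x, Schema::Null))
    }
}
```

Both variants return the same answer for any union. The second one avoids calling the `PartialEq` implementation entirely, which matters when the check runs once per written value.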
> [Rust] Faster is_nullable for UnionSchema
> -----------------------------------------
>
> Key: AVRO-4007
> URL: https://issues.apache.org/jira/browse/AVRO-4007
> Project: Apache Avro
> Issue Type: Improvement
> Components: rust
> Reporter: Martin Tzvetanov Grigorov
> Assignee: Martin Tzvetanov Grigorov
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> https://github.com/apache/avro/pull/2961
> {code}
> Writing large amounts of Avro data in Rust is slow because (in my case) ~40%
> of total run time is spent in the function UnionSchema::is_nullable. The
> issue is that the comparison x == Schema::Null invokes schema
> canonicalization, which is apparently somewhat slow. I've modified the
> method to use match instead and see a considerable performance improvement.
> {code}