[ 
https://issues.apache.org/jira/browse/AVRO-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863023#comment-17863023
 ] 

ASF subversion and git services commented on AVRO-4007:
-------------------------------------------------------

Commit 4eda118a42f930bdc6f463621e2a6450098cbfe7 in avro's branch 
refs/heads/dependabot/cargo/lang/rust/env_logger-0.11.3 from John Emhoff
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=4eda118a4 ]

AVRO-4007: [rust] Faster `is_nullable` for UnionSchema (#2961)

* Faster `is_nullable` for UnionSchema

I'm writing several gigabytes of Avro and noticed that it seemed
oddly slow. I profiled the run and found that about 25% of my total
run time was being spent in `UnionSchema::is_nullable`.

It looks like what's happening is that the test `x == Schema::Null`
is slow because the equality test involves a schema canonicalization.

I've updated the check to pattern-match against `Schema::Null` instead
and see a significant performance increase.
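
The difference between the two checks can be illustrated with a minimal,
self-contained sketch. It does not use the real `apache_avro` types: the
`Schema` enum, `canonical_form`, and `is_nullable_slow` below are
hypothetical stand-ins, and the `PartialEq` impl only assumes (as described
above) that equality goes through a canonicalization step.

```rust
/// Simplified stand-in for an Avro schema (NOT the real `apache_avro::Schema`).
#[derive(Debug, Clone)]
pub enum Schema {
    Null,
    Boolean,
    Int,
    String,
    Array(Box<Schema>),
}

impl Schema {
    /// Hypothetical canonical form. It allocates a fresh `String` on every
    /// call, standing in for the costly canonicalization step.
    fn canonical_form(&self) -> String {
        match self {
            Schema::Null => "\"null\"".to_string(),
            Schema::Boolean => "\"boolean\"".to_string(),
            Schema::Int => "\"int\"".to_string(),
            Schema::String => "\"string\"".to_string(),
            Schema::Array(items) => {
                format!("{{\"type\":\"array\",\"items\":{}}}", items.canonical_form())
            }
        }
    }
}

/// Assumption for this sketch: equality compares canonical forms, so every
/// `==` canonicalizes both operands.
impl PartialEq for Schema {
    fn eq(&self, other: &Self) -> bool {
        self.canonical_form() == other.canonical_form()
    }
}

/// Simplified stand-in for a union of schemas.
pub struct UnionSchema {
    schemas: Vec<Schema>,
}

impl UnionSchema {
    pub fn new(schemas: Vec<Schema>) -> Self {
        Self { schemas }
    }

    /// Slow variant: each `==` runs the `PartialEq` impl above, which
    /// canonicalizes both sides of the comparison.
    pub fn is_nullable_slow(&self) -> bool {
        self.schemas.iter().any(|x| *x == Schema::Null)
    }

    /// Fast variant: `matches!` only inspects the enum variant and never
    /// calls `PartialEq`, so no canonicalization happens.
    pub fn is_nullable(&self) -> bool {
        self.schemas.iter().any(|x| matches!(x, Schema::Null))
    }
}
```

Both methods return the same answer for any union; the pattern-matching
variant simply avoids the per-element allocation and string comparison.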

* Fix formatting

* Apply clippy suggestion

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

---------

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
Co-authored-by: Martin Grigorov <[email protected]>
Co-authored-by: Martin Tzvetanov Grigorov <[email protected]>

> [Rust] Faster is_nullable for UnionSchema
> -----------------------------------------
>
>                 Key: AVRO-4007
>                 URL: https://issues.apache.org/jira/browse/AVRO-4007
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: rust
>            Reporter: Martin Tzvetanov Grigorov
>            Assignee: Martin Tzvetanov Grigorov
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.12.0, 1.11.4
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://github.com/apache/avro/pull/2961
> {code}
> Writing large amounts of Avro data in Rust is slow because (in my case) ~40% 
> of total run time is spent in the function UnionSchema::is_nullable. The 
> issue is that the comparison x == Schema::Null invokes schema
> canonicalization, which is apparently somewhat slow. I've modified the
> method to use match instead and see a considerable performance improvement.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
