I believe feature flags are not the right choice here. The problem is that
feature flags are toggled globally, so the behavior of your crate can be
affected by the behavior of some other crate that toggles the feature.

A better approach is to provide two methods, one safe and the other unsafe,
see for example:
https://doc.rust-lang.org/stable/std/primitive.slice.html#method.get
https://doc.rust-lang.org/stable/std/primitive.slice.html#method.get_unchecked

Speaking of O(1) vs O(n), a correct program would perform these checks at
least once anyway, and so the complexity is unavoidable. However, you can
avoid re-checking bounds for the same buffer over and over if the state of
the check is encoded in the type system - for example, the `&str` standard
library type is just a slice that's already been checked and found to be
valid UTF-8.

Although I struggle to imagine bounds checks being a performance bottleneck
when done once on object creation, given that removing them from hot loops
usually wins you only a couple % in performance.

чт, 30 сент. 2021 г. в 22:28, Andrew Lamb <al...@influxdata.com>:

> I understand the need to avoid sacrificing performance as much as
> possible. I have begun looking into adding validation into ArrayData::new
> as you suggest.
>
> I am making progress, but haven't fully figured out the nested types yet.
> Hope to have a PR up in the next day or two.
>
> Should we come up with something that adds unacceptable performance
> overhead, we can also feature flag it (so that the check is done by
> default, but can be disabled if needed)
>
> Andrew
>
> On Thu, Sep 30, 2021 at 5:33 AM Jörn Horstmann <
> joern.horstm...@signavio.com> wrote:
>
>> Most of these issues seem to originate from creating arrays from
>> ArrayData.
>> While we could validate buffers in the XArray::from implementations, that
>> would have some performance overhead and also some edge cases for nested
>> data. Thinking about List<List<Utf8>>>, we recursively would need to
>> validate and know which buffers are used for offsets or utf8 data. Since
>> the root cause is always ArrayData::new (and possibly
>> ArrayDataBuilder::finish) I would suggest marking those functions as
>> unsafe.
>>
>> Adding checks for the top-level buffer length would still be useful, I'm
>> just a bit opposed to having to iterate over whole buffers for validation,
>> turning these operations from O(1) to O(n).
>>
>> Slightly related, GenericListArray::try_new_from_array_data currently
>> validates that the first offset is 0. I think this check could actually be
>> relaxed and the offset only needs to be in bounds for the child array.
>>
>> Jörn
>>
>>
>> On Wed, Sep 29, 2021 at 9:12 PM Andrew Lamb <al...@influxdata.com> wrote:
>>
>> > Dear Rust Developers,
>> >
>> > As a heads up, several pre-existing security tickets filed against
>> arrow-rs
>> > have been added[1][2] to the RUSTSEC database[1][2] a few hours ago. The
>> > author says he plans to file additional RUSTSEC entries for the other
>> > security tickets [3].
>> >
>> > The criteria used for adding the arrow issues to the RUSTSEC database is
>> > not clear to me,  but given widely used tools such as `cargo audit`
>> report
>> > such issues, it is likely that this will become an visible issue for our
>> > users soon.
>> >
>> > Given this, I will likely start looking into existing security issues
>> [4]
>> > reported against arrow-rs and any help would be appreciated.
>> >
>> > Andrew
>> >
>> > [1] https://github.com/rustsec/advisory-db/pull/1057
>> > [2] https://github.com/rustsec/advisory-db/pull/1059
>> > [3]
>> > https://github.com/rustsec/advisory-db/pull/1057#issuecomment-930455127
>> > [4] https://github.com/apache/arrow-rs/labels/security
>> >
>>
>>
>> --
>> *Jörn Horstmann* | Senior Backend Engineer
>>
>> www.signavio.com
>> Kurfürstenstraße 111, 10787 Berlin, Germany
>>
>> Work with us! <https://hubs.ly/H0wwzcr0>
>>
>>
>>
>> <https://www.linkedin.com/company/signavio/>
>> <https://www.twitter.com/signavio>   <https://www.facebook.com/signavio>
>> <https://www.youtube.com/user/signavio>
>> <https://www.xing.com/companies/signaviogmbh>
>>
>> [image: BPI Forum] <https://t-eu.xink.io/Tracking/Index/sSMAALBtAAAnu0MA0
>> >
>>
>> HRB 121584 B Charlottenburg District Court, VAT ID: DE265675123
>> Managing Directors: Dr. Gero Decker, Rouven Morato Adam
>>
>

Reply via email to