[ 
https://issues.apache.org/jira/browse/ARROW-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179491#comment-16179491
 ] 

Uwe L. Korn commented on ARROW-1589:
------------------------------------

(citing different posts)

> Currently it is not clearly stated that the message stream is trusted, 
> therefore the opposite will be assumed by developers. 

Arrow is an in-memory specification and library. In this context you are likely 
to hand out access to critical resources such as shared memory anyway. Libraries 
in this space are not built for high security; they make many assumptions in 
order to deliver results as fast as possible. You will see the same trade-off 
in certain file readers and network protocols in the data analytics space. 
Security always comes at a (performance) cost, and internal analytics is normally 
an area where you do not want to trade performance for it.

> UntrustedMessageReader

This name might indeed be confusing; I would have expected it to be called 
{{SafeMessageReader}}. Maybe we should look at how other implementations name 
this kind of reader.

> The fact that there are not more tests for the cases you're describing is 
> definitely not due to a failure on my part to think outside the box

This is actually where I hope the fuzzing code will help all of us Arrow 
developers: it spares us some of the time spent thinking about edge cases that 
we need to test. We should still think about the tests we need to write, but 
simply running a fuzzer may already reveal the simple edge cases we forgot, 
before we actually release the library.
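
As a rough illustration of what such a fuzz target could look like, here is a 
minimal sketch. It is not Arrow code: the toy length-prefixed format, 
{{ParseStatus}} and {{ParseToyMessage}} are made up for this example. Only the 
{{LLVMFuzzerTestOneInput}} entry point is the real libFuzzer interface. The 
point is that every read is bounds-checked, so malformed input ends in an error 
code rather than a crash.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch; not part of Arrow.
enum class ParseStatus { kOk, kTruncated, kBadLength };

// Hypothetical toy format: a 4-byte little-endian payload length followed by
// the payload itself. All checks happen before any byte is read, so malformed
// input is reported as an error instead of causing an out-of-bounds access.
ParseStatus ParseToyMessage(const uint8_t* data, size_t size) {
  constexpr size_t kHeaderSize = 4;
  if (size < kHeaderSize) {
    return ParseStatus::kTruncated;  // not even a complete header
  }
  const uint32_t length = static_cast<uint32_t>(data[0]) |
                          (static_cast<uint32_t>(data[1]) << 8) |
                          (static_cast<uint32_t>(data[2]) << 16) |
                          (static_cast<uint32_t>(data[3]) << 24);
  // Compare against the bytes that are really available after the header.
  if (length > size - kHeaderSize) {
    return ParseStatus::kBadLength;  // header claims more data than exists
  }
  return ParseStatus::kOk;
}

// libFuzzer entry point: called repeatedly with generated inputs. The parse
// result is ignored on purpose; the fuzzer and the sanitizer only care about
// crashes and memory errors.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  (void)ParseToyMessage(data, size);
  return 0;
}
```

With clang, a target like this is typically built with 
{{-fsanitize=fuzzer,address}}, which links in libFuzzer and enables the address 
sanitizer. A real Arrow target would call the actual stream reader in place of 
the toy parser.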

> [C++] Fuzzing for certain input formats
> ---------------------------------------
>
>                 Key: ARROW-1589
>                 URL: https://issues.apache.org/jira/browse/ARROW-1589
>             Project: Apache Arrow
>          Issue Type: Test
>            Reporter: Marco Neumann
>            Assignee: Marco Neumann
>
> The arrow lib should have fuzzing tests for certain input formats, e.g. for 
> reading record batches from streams. Ideally, malformed input must not crash 
> the system but must report a proper error. This could easily be implemented 
> e.g. w/ [libfuzzer|https://llvm.org/docs/LibFuzzer.html] in combination with 
> address sanitizer (that's already implemented by Arrow's build system).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
