I think that Rivest’s question may be a good reason to rethink the
initialization of structs and offer the explicit guarantee that all
unassigned elements will be initialized to 0 (and not just the jl_value_t
pointers). I would argue that the current behavior resulted more from a
desire to avoid clearing the array twice (if the user is about to call
fill, zeros, ones, +, etc.) than an intentional, casual exposure of
uninitialized memory.

A random array of integers is also a security concern if an attacker can
extract some other information (with some probability) about the state of
the program. Julia is not hardened by design, so you can’t safely run an
unknown code fragment, but you still might have an unintended memory
exposure in a client-facing app. While zeroing memory doesn’t prevent the
user from simply reusing a memory buffer in a security-unaware fashion
(rather than consistently allocating a new one for each use), it’s not
clear to me that the performance penalty would be all that noticeable if
we were to map Array(X) to zero(X) and provide only an internal constructor
for grabbing uninitialized memory (perhaps Base.Unchecked.Array(X) from
#8227).

On Mon Nov 24 2014 at 12:57:22 PM Stefan Karpinski <[email protected]> wrote:

> There are two rather different issues to consider:
>
> 1. Preventing problems due to inadvertent programmer errors.
> 2. Preventing malicious security attacks.
>
> When we initially made this choice, it wasn't clear if 1 would be a big
> issue but we decided to see how it played out. It hasn't been a problem in
> practice: once people grok that the Array(T, dims...) constructor gives
> uninitialized memory and that the standard usage pattern is to call it and
> then immediately initialize the memory, everything is ok. I can't recall
> a single situation where someone has had some terrible bug due to
> uninitialized int/float arrays.
>
> Regarding 2, Julia is not intended to be a hardened language for writing
> highly secure software. It allows all sorts of unsafe actions: pointer
> arithmetic, direct memory access, calling arbitrary C functions, etc. The
> future of really secure software seems to be small formally verified
> kernels written in statically typed languages that communicate with larger
> unverified systems over restricted channels. Julia might be appropriate for
> the larger unverified system but certainly not for the trusted kernel.
> Adding enough verification to Julia to write secure kernels is not
> inconceivable, but would be a major research effort. The implementation
> would have to check lots of things, including, of course, ensuring that all
> arrays are initialized.
>
> A couple of other points:
>
> Modern OSes protect against data leaking between processes by zeroing
> pages before a process first accesses them. Thus any data exposed by
> Array(T, dims...) comes from the same process and is not a security leak.
>
> An uninitialized array of, say, integers is not in itself a security
> concern – the issue is what you do with those integers. The classic
> security hole is to use a "random" value from uninitialized memory to
> access other memory by using it to index into an array or otherwise convert
> it to a pointer. In the presence of bounds checking, however, this isn't
> actually a big concern since you will still either get a bounds error or a
> valid array value – not a meaningful one, of course, but still just a value.
>
> Writing programs that are secure against malicious attacks is a hard,
> unsolved problem. So is doing efficient, productive high-level numerical
> programming. Trying to solve both problems at the same time seems like a
> recipe for failing at both.
>
> On Nov 24, 2014, at 11:43 AM, David Smith <[email protected]> wrote:
>
> Some ideas:
>
> Is there a way to return an error for accesses before at least one
> assignment in bits types?  I.e. when the object is created uninitialized it
> is marked "dirty" and only after assignment of some user values can it be
> "cleanly" accessed?
>
> Can Julia provide a thin memory management layer that grabs memory from
> the OS first, zeroes it, and then gives it to the user upon initial
> allocation?  After gc+reallocation it doesn't need to be zeroed again,
> unless the next allocation is larger than anything previous, at which time
> Julia grabs more memory, sanitizes it, and hands it off.
>
> On Monday, November 24, 2014 2:48:05 AM UTC-6, Mauro wrote:
>>
>> Pointer types will initialise to undef and any operation on them fails:
>> julia> a = Array(ASCIIString, 5);
>>
>> julia> a[1]
>> ERROR: access to undefined reference
>>  in getindex at array.jl:246
>>
>> But you're right, for bits-types this is not an error and will just
>> return whatever was there before.  I think the reason this will stay
>> that way is that Julia is a numerics-oriented language.  Thus you may
>> want to create a 1GB array of Float64 and then fill it with something as
>> opposed to first filling it with zeros and then filling it with something.
>> See:
>>
>> julia> @time b = Array(Float64, 10^9);
>> elapsed time: 0.029523638 seconds (8000000144 bytes allocated)
>>
>> julia> @time c = zeros(Float64, 10^9);
>> elapsed time: 0.835062841 seconds (8000000168 bytes allocated)
>>
>> You can argue that the time gain isn't worth the risk but I suspect that
>> others may feel differently.
>>
>> On Mon, 2014-11-24 at 09:28, Ronald L. Rivest <[email protected]>
>> wrote:
>> > I am just learning Julia...
>> >
>> > I was quite shocked today to learn that Julia does *not*
>> > initialize allocated storage (e.g. to 0 or some default value).
>> > E.g. the code
>> >      A = Array(Int64,5)
>> >      println(A[1])
>> > has unpredictable behavior, may disclose information from
>> > other modules, etc.
>> >
>> > This is really quite unacceptable in a modern programming
>> > language; it is as bad as not checking array reads for out-of-bounds
>> > indices.
>> >
>> > Google for "uninitialized security" to find numerous instances
>> > of security violations and unreliability problems caused by the
>> > use of uninitialized variables, and numerous security advisories
>> > warning of problems caused by the (perhaps inadvertent) use
>> > of uninitialized variables.
>> >
>> > You can't design a programming language today under the naive
>> > assumption that code in that language won't be used in highly
>> > critical applications or won't be under adversarial attack.
>> >
>> > You can't reasonably ask all programmers to properly initialize
>> > their allocated storage manually any more than you can ask them
>> > to test all indices before accessing an array manually; these are
>> > things that a high-level language should do for you.
>> >
>> > The default non-initialization of allocated storage is a
>> > mis-feature that should absolutely be fixed.
>> >
>> > There is no efficiency argument here in favor of uninitialized storage
>> > that can outweigh the security and reliability disadvantages...
>> >
>> > Cheers,
>> > Ron Rivest
>>