There are two rather different issues to consider:

1. Preventing problems due to inadvertent programmer errors.
2. Preventing malicious security attacks.

When we initially made this choice, it wasn't clear if 1 would be a big issue but we decided to see how it played out. It hasn't been a problem in practice: once people grok that the Array(T, dims...) constructor gives uninitialized memory and that the standard usage pattern is to call it and then immediately initialize the memory, everything is ok. I can't recall a single situation where someone has had some terrible bug due to uninitialized int/float arrays.

Regarding 2, Julia is not intended to be a hardened language for writing highly secure software. It allows all sorts of unsafe actions: pointer arithmetic, direct memory access, calling arbitrary C functions, etc. The future of really secure software seems to be small formally verified kernels written in statically typed languages that communicate with larger unverified systems over restricted channels. Julia might be appropriate for the larger unverified system but certainly not for the trusted kernel. Adding enough verification to Julia to write secure kernels is not inconceivable, but would be a major research effort. The implementation would have to check lots of things, including, of course, ensuring that all arrays are initialized.

A couple of other points:

Modern OSes protect against data leaking between processes by zeroing pages before a process first accesses them. Thus any data exposed by Array(T, dims...) comes from the same process and is not a security leak.

An uninitialized array of, say, integers is not in itself a security concern – the issue is what you do with those integers. The classic security hole is to use a "random" value from uninitialized memory to access other memory by using it to index into an array or otherwise convert it to a pointer. In the presence of bounds checking, however, this isn't actually a big concern since you will still either get a bounds error or a valid array value – not a meaningful one, of course, but still just a value.

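
For illustration, the allocate-then-initialize pattern and the bounds-checking point can be sketched as follows. This is a minimal sketch, not code from the thread: it assumes current Julia syntax, in which the Array(T, dims...) constructor is written Array{T}(undef, dims...), and the names squares and checked_read are made up for the example.

```julia
# Sketch in current Julia syntax: the thread's Array(T, dims...) is now
# spelled Array{T}(undef, dims...). `squares` is a hypothetical helper.
function squares(n::Integer)
    a = Vector{Int}(undef, n)   # allocated, contents not initialized
    for i in eachindex(a)
        a[i] = i^2              # every element is written before any read
    end
    return a
end

# Bounds checking: an arbitrary index either hits a valid element or
# raises BoundsError. `checked_read` is a hypothetical helper.
function checked_read(a::Vector{Int}, i::Integer)
    try
        return a[i]             # valid when 1 <= i <= length(a)
    catch err
        err isa BoundsError || rethrow()
        return nothing          # out-of-range index was rejected
    end
end

a = squares(5)
checked_read(a, 3)      # reads a valid element
checked_read(a, 10^6)   # index is out of range, so `nothing` is returned
```

The point of the second function is only that an index taken from garbage memory cannot reach outside the array while bounds checks are active; it does not make the garbage value itself meaningful.
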
Writing programs that are secure against malicious attacks is a hard, unsolved problem. So is doing efficient, productive high-level numerical programming. Trying to solve both problems at the same time seems like a recipe for failing at both.

> On Nov 24, 2014, at 11:43 AM, David Smith <[email protected]> wrote:
>
> Some ideas:
>
> Is there a way to return an error for accesses before at least one assignment
> in bits types? I.e. when the object is created uninitialized it is marked
> "dirty" and only after assignment of some user values can it be "cleanly"
> accessed?
>
> Can Julia provide a thin memory management layer that grabs memory from the
> OS first, zeroes it, and then gives it to the user upon initial allocation?
> After gc+reallocation it doesn't need to be zeroed again, unless the next
> allocation is larger than anything previous, at which time Julia grabs more
> memory, sanitizes it, and hands it off.
>
>> On Monday, November 24, 2014 2:48:05 AM UTC-6, Mauro wrote:
>> Pointer types will initialise to undef and any operation on them fails:
>> julia> a = Array(ASCIIString, 5);
>>
>> julia> a[1]
>> ERROR: access to undefined reference
>>  in getindex at array.jl:246
>>
>> But you're right, for bits-types this is not an error and will just
>> return whatever was there before. I think the reason this will stay
>> that way is that Julia is a numerics oriented language. Thus you may
>> wanna create a 1GB array of Float64 and then fill it with something as
>> opposed to first fill it with zeros and then fill it with something.
>> See:
>>
>> julia> @time b = Array(Float64, 10^9);
>> elapsed time: 0.029523638 seconds (8000000144 bytes allocated)
>>
>> julia> @time c = zeros(Float64, 10^9);
>> elapsed time: 0.835062841 seconds (8000000168 bytes allocated)
>>
>> You can argue that the time gain isn't worth the risk but I suspect that
>> others may feel different.
>>
>> On Mon, 2014-11-24 at 09:28, Ronald L. Rivest <[email protected]> wrote:
>> > I am just learning Julia...
>> >
>> > I was quite shocked today to learn that Julia does *not*
>> > initialize allocated storage (e.g. to 0 or some default value).
>> > E.g. the code
>> >     A = Array(Int64,5)
>> >     println(A[1])
>> > has unpredictable behavior, may disclose information from
>> > other modules, etc.
>> >
>> > This is really quite unacceptable in a modern programming
>> > language; it is as bad as not checking array reads for out-of-bounds
>> > indices.
>> >
>> > Google for "uninitialized security" to find numerous instances
>> > of security violations and unreliability problems caused by the
>> > use of uninitialized variables, and numerous security advisories
>> > warning of problems caused by the (perhaps inadvertent) use
>> > of uninitialized variables.
>> >
>> > You can't design a programming language today under the naive
>> > assumption that code in that language won't be used in highly
>> > critical applications or won't be under adversarial attack.
>> >
>> > You can't reasonably ask all programmers to properly initialize
>> > their allocated storage manually any more than you can ask them
>> > to test all indices before accessing an array manually; these are
>> > things that a high-level language should do for you.
>> >
>> > The default non-initialization of allocated storage is a
>> > mis-feature that should absolutely be fixed.
>> >
>> > There is no efficiency argument here in favor of uninitialized storage
>> > that can outweigh the security and reliability disadvantages...
>> >
>> > Cheers,
>> > Ron Rivest
