Re: [rust-dev] Scoped numeric literal type directives
On 9/20/10 11:09 AM, Graydon Hoare wrote:
>> Is it possible to infer the type from the expression, e.g. in 2 + x the 2 matches the declared type of x?
>
> Plausible. Go takes a similar angle on this; their numeric literals are untyped and acquire a type via inference from the context. I'm not sure I'm really keen on that -- every additional complication to inference is a bit of a penalty to implementers, tools and future readers -- but it could work. Any feelings from others? Particularly .. those who have worked on the type inference module :)

I'm also somewhat opposed to the proposal. To me, the strict typing of numeric literals is a great feature of Rust: it lets the programmer look at the code and immediately tell what's going to happen in the generated instructions and memory layout. C's integer promotion rules are confusing; by looking at C code I can never really tell precisely which instruction is going to be emitted.

Adding Go-style untyped numeric literals would also be difficult to reconcile with our type inference and polymorphic binary operators: what's the type of x in auto x = 2 + 3;?

Patrick
___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev
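For comparison, here is a minimal sketch of how the "what's the type of x" question is answered in present-day Rust, which is not the 2010 language discussed in this thread. An unsuffixed integer literal takes its type from context and falls back to i32 when nothing constrains it. The function names are illustrative only.

```rust
use std::mem::size_of_val;

/// The annotation constrains both literals, so the sum is a u8 (1 byte).
fn constrained_sum() -> usize {
    let x: u8 = 2 + 3;
    size_of_val(&x)
}

/// Nothing constrains the literals, so the compiler falls back to i32 (4 bytes).
fn unconstrained_sum() -> usize {
    let x = 2 + 3;
    size_of_val(&x)
}

/// A suffix pins the type explicitly, as with strictly typed literals (8 bytes).
fn suffixed_sum() -> usize {
    let x = 2u64 + 3;
    size_of_val(&x)
}
```

The suffixed form is the closest analogue of the strict literal typing defended above: the width is visible at the point of use.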
[rust-dev] Statically linking rustrt
I'm looking at the generated assembly code for std.rc (which is now compiling, although it fails to link due to a strange mangling LLVM is performing on duplicate native symbols). Even with all of LLVM's optimizations, our hash insertion code has 4x the instruction count of that of glib.

One major reason for this is that we have enormous overhead when calling upcalls like get_type_desc() and size_of(). These calls are completely opaque to LLVM. Even if we fixed the crate-relative encoding issues, these calls would still be opaque to LLVM.

Most upcalls are trivial (get_type_desc() is an exception; I don't know why it needs to exist, actually). For those, it would be great to inline them. To do that, we need LTO, which basically means that we compile rustrt with clang and link the resulting .bc together with the .bc that rustc yields before doing LLVM's optimization passes.

I think this would be a huge win; we would remove all the upcall glue and make these low-level calls, of which there are quite a lot, no longer opaque to LLVM.

Thoughts?

Patrick
[rust-dev] Metadata encoding format
The current blocker for rustc self-hosting is writing out and reading the crate metadata. Marijn wrote up a proposal on the wiki outlining the data that needs to be encoded, which looks good to me. The code that inserts the data blob into and reads the data blob from the files is working, modulo the Franken-LLVM issue.

What's missing is the actual encoding format. AFAICT the design criteria are that the format needs to be seekable and extensible. Ideally it should be compact and simple as well.

I've floated the idea of using EBML [1], which is a dead simple format used by Matroska (including WebM). It's more or less just tag ID + size + contents, where the contents can recursively include other tags.

I had good results with this format for my Android profiler. When I was writing that I did a quick survey of the options and went with EBML over BSON, because BSON, while more mainstream, is not at all compact (it's usually as large as the corresponding JSON, its only advantages being that it's seekable and has more data types than JSON).

Any opinions? I started sketching out a tiny EBML library for Rust, but I thought I'd ask the mailing list before going further.

Patrick

[1]: http://matroska.org/technical/specs/rfc/index.html
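The "tag ID + size + contents" idea can be sketched as below, in present-day Rust. This is a deliberately simplified tag/length/value layout, not real EBML: actual EBML encodes IDs and sizes as variable-length integers, whereas this sketch assumes a fixed 1-byte tag and a 4-byte big-endian size. All function names and tag values are hypothetical.

```rust
/// Append one element: 1-byte tag, 4-byte big-endian size, then the contents.
/// (Assumption: fixed-width header; real EBML uses variable-length integers.)
fn write_element(out: &mut Vec<u8>, tag: u8, contents: &[u8]) {
    out.push(tag);
    out.extend_from_slice(&(contents.len() as u32).to_be_bytes());
    out.extend_from_slice(contents);
}

/// Read the element starting at `pos`.
/// Returns (tag, contents, position of the next element), or None if the
/// buffer is truncated.
fn read_element(buf: &[u8], pos: usize) -> Option<(u8, &[u8], usize)> {
    let header_end = pos.checked_add(5)?;
    let header = buf.get(pos..header_end)?;
    let tag = header[0];
    let size = u32::from_be_bytes([header[1], header[2], header[3], header[4]]) as usize;
    let end = header_end.checked_add(size)?;
    let contents = buf.get(header_end..end)?;
    Some((tag, contents, end))
}

/// Find the first child with the wanted tag. Siblings are skipped using their
/// size field without decoding their contents, which is what makes the
/// format seekable.
fn find_child(buf: &[u8], wanted: u8) -> Option<&[u8]> {
    let mut pos = 0;
    while pos < buf.len() {
        let (tag, contents, next) = read_element(buf, pos)?;
        if tag == wanted {
            return Some(contents);
        }
        pos = next;
    }
    None
}
```

Because the contents of an element are themselves a sequence of elements, nesting needs no extra machinery: write the children into a buffer, then write that buffer as the contents of the parent. Unknown tags are skipped by readers that do not understand them, which is where the extensibility comes from.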
[rust-dev] Integer overflow checking
Hi everyone,

I've been wondering for a while whether it's feasible from a performance standpoint for a systems language to detect and abort on potentially-dangerous integer overflows by default. Overflow is an insidious problem for several reasons:

(1) It can happen practically anywhere; anytime the basic arithmetic operators are used, an overflow or underflow could occur.

(2) To reason about overflow, the programmer has to solve a global data flow problem. An expression as simple as a + b necessitates an answer to the question "could the program input have influenced a or b such that the operation could overflow?"

(3) Overflow checking is rarely used in practice due to the performance costs associated with it. ISAs aren't that well-suited for overflow checking. For example, on the x86 one has to test for the overflow and/or carry flag after every integer operation that could possibly set it. Contrast this with the floating-point situation, in which a SIGFPE is raised on overflow without having to explicitly test after each instruction.

(4) It can be catastrophic from a memory safety and security standpoint when overflow errors creep in, especially when unsafe operations such as memory allocation and unchecked array copying are performed. We do permit unsafe operations in Rust (although we certainly hope they're going to be rare!)

I did a quick survey of the available literature and there isn't too much out there*, but there is a recent gem of a paper from CERT:

http://www.cert.org/archive/pdf/09tn023.pdf

They managed to get quite impressive numbers: under 6% slowdown using their As-If-Infinitely-Ranged model on GCC -O3. The trick is to delay overflow checking to observation points, which roughly correspond to state being updated or I/O being performed (there's an interesting connection between this and the operations that made a function impure in the previous effect system).

This area seems promising enough that I was wondering if there was interest in something like this for Rust. There's no harm in letting the programmer explicitly turn off the checking at the block or item level; some algorithms, such as hashing algorithms, rely on the overflow semantics, after all. But it seems in the spirit of Rust (at the risk of relying on a nebulous term) to be as safe as possible by default, and so I'd like to propose exploring opt-out overflow checking for integers at some point in the future.

Thoughts?

Patrick

* That said, Microsoft seems to have put more effort than most into detecting integer overflows in its huge C++ codebases, both through fairly sophisticated static analysis [1] and through dynamic checks with the SafeInt library [2]. Choice quote: "SafeInt is currently used extensively throughout Microsoft, with substantial adoption within Office and Windows."

[1]: http://research.microsoft.com/pubs/80722/z3prefix.pdf
[2]: http://safeint.codeplex.com/
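The three situations described in this message (a checked operation feeding an allocation, an algorithm that relies on wraparound, and a check delayed until a result is observed) can be sketched with the explicit integer methods of present-day Rust. This is an illustration under stated assumptions, not the CERT algorithm and not a description of any Rust design of the time; the function names are hypothetical, and the deferred check is a conservative simplification that reports any intermediate overflow.

```rust
/// Checked: an allocation size that yields None instead of a wrapped value.
fn alloc_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

/// Opt-out: a hash step that relies on wraparound and says so explicitly.
fn hash_step(hash: u32, byte: u8) -> u32 {
    hash.wrapping_mul(31).wrapping_add(byte as u32)
}

/// Deferred: record overflow while computing, and test the flag once, at the
/// point where the result is about to be observed.
fn sum_checked_at_end(values: &[i32]) -> Option<i32> {
    let mut total = 0i32;
    let mut overflowed = false;
    for &v in values {
        let (next, flag) = total.overflowing_add(v);
        total = next;
        overflowed |= flag;
    }
    if overflowed { None } else { Some(total) }
}
```

The deferred variant keeps the test out of the loop body, which is the general shape of checking at an observation point rather than after every instruction.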
Re: [rust-dev] status
On 04/24/2011 01:19 PM, Graydon Hoare wrote:
> Thought everyone following along might want an update: boot/rustboot builds a stage0/rustc that builds a functional stage1/rustc that can, itself, build and pass about 60% of the testsuite (174 tests); we cannot yet build stage1/libstd, nor stage2/rustc (which will be a candidate bootstrapped image, likely but not necessarily a fixpoint; we'll have to build stage3 to check that).

Wonderful news! That's much better than I had anticipated.

Patrick
[rust-dev] -O0 on Mac
Currently rustrt is built with -O0 on the Mac due to a GCC bug. This is starting to cause performance problems; in particular, very suboptimal assembly is generated for next_power_of_two, which is the third-highest function in the profiles.

Apple's GCC is way outdated anyhow. Will anyone object if I make clang a dependency on that platform?

Patrick
[rust-dev] [PATCH] compiler-rt: Sanity check architectures
(rust-dev: This is an LLVM patch you might want to apply if you're trying to build with clang on the Mac.)

Hi everyone,

I've got a quick patch to compiler-rt that makes it do a simple sanity check on the toolchain before trying to compile for each architecture. This makes clang able to be built again on Darwin without having to install the iOS SDK.

Thanks!
Patrick

Index: compiler-rt/make/platform/clang_darwin.mk
===
--- compiler-rt.orig/make/platform/clang_darwin.mk 2011-05-08 19:44:40.0 -0700
+++ compiler-rt/make/platform/clang_darwin.mk 2011-05-08 20:25:06.0 -0700
@@ -6,6 +6,19 @@
 
 Description := Static runtime libraries for clang/Darwin.
 
+# A function that ensures we don't try to build for architectures that we
+# don't have working toolchains for.
+CheckArches = \
+  $(shell \
+    result=""; \
+    for arch in $(1); do \
+      gcc -arch $$arch; \
+      if test $$? == 1; then result="$$result$$arch "; fi; \
+    done; \
+    echo $$result)
+
+###
+
 Configs :=
 UniversalArchs :=
 
@@ -13,23 +26,23 @@
 # still be referenced from Darwin system headers. This symbol is only ever
 # needed on i386.
 Configs += eprintf
-UniversalArchs.eprintf := i386
+UniversalArchs.eprintf := $(call CheckArches,i386)
 
 # Configuration for targetting 10.4. We need a few functions missing from
 # libgcc_s.10.4.dylib. We only build x86 slices since clang doesn't really
 # support targetting PowerPC.
 Configs += 10.4
-UniversalArchs.10.4 := i386 x86_64
+UniversalArchs.10.4 := $(call CheckArches,i386 x86_64)
 
 # Configuration for targetting iOS, for some ARMv6 functions, which must be
 # in the same linkage unit, and for a couple of other functions that didn't
 # make it into libSystem.
 Configs += ios
-UniversalArchs.ios := i386 x86_64 armv6 armv7
+UniversalArchs.ios := $(call CheckArches,i386 x86_64 armv6 armv7)
 
 # Configuration for use with kernel/kexts.
 Configs += cc_kext
-UniversalArchs.cc_kext := armv6 armv7 i386 x86_64
+UniversalArchs.cc_kext := $(call CheckArches,armv6 armv7 i386 x86_64)
 
 ###
 
Re: [rust-dev] The module naming situation
On 5/12/11 9:44 AM, Marijn Haverbeke wrote:
> I went ahead and implemented a large part of this--using a single colon as a module separator, and downcasing the module names again. A separate module namespace isn't done yet. Look around https://github.com/marijnh/rust/tree/modulesep if you're curious what it looks like.

I'd prefer ::, if for no other reason than that it's consistent with C++, Ruby, PHP, and Perl. Also fewer special cases in the grammar are always nice. Graydon can cast the deciding vote here :)

I always figured that import foo::bar or from foo import bar would import both the bar item and the bar module if both exist. I don't see much harm in that off the top of my head.

Patrick
Re: [rust-dev] Hello
Hi John,

There is nothing preventing you from writing functions that take one tuple parameter instead of multiple parameters. If you're faced with library routines with signatures that aren't to your liking, it's simple to write macros that tuple, untuple, curry, or uncurry parameters.

For example, take the function map, which has the type fn[T,U](fn(T) -> U, vec[mutable? T]) -> vec[mutable? U]. If you would prefer that it had the type fn[T,U](tup(fn(T) -> U, vec[mutable? T])) -> vec[mutable? U] instead, then you could define a macro #tuple so that you could say #tuple(map) to get the version of map with the signature you prefer.

Incidentally, I would be careful with the OCaml comparison; OCaml has functions that take multiple, untupled parameters via the named and optional parameters feature.

Finally, there's no need for the hyperbolic "[Rust's type system] isn't a type system at all" kind of language. Of course Rust has a type system. Unless you have found a way in which Rust's type system is unsound (which we would definitely like to hear about if so!), that is beyond debate.

Patrick
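The tupling and untupling adapters mentioned in this message can be sketched in present-day Rust as ordinary generic functions rather than the #tuple macro proposed here. The names tuple, untuple, map and square are illustrative, and map is a simplified stand-in for the library routine under discussion.

```rust
/// Turn a two-argument function into one that takes a single tuple.
fn tuple<A, B, R>(f: impl Fn(A, B) -> R) -> impl Fn((A, B)) -> R {
    move |(a, b)| f(a, b)
}

/// The inverse: turn a tuple-taking function back into a two-argument one.
fn untuple<A, B, R>(f: impl Fn((A, B)) -> R) -> impl Fn(A, B) -> R {
    move |a, b| f((a, b))
}

/// Simplified stand-in for `map`: applies `f` to every element.
fn map<T, U>(f: fn(T) -> U, v: Vec<T>) -> Vec<U> {
    v.into_iter().map(f).collect()
}

/// A sample function to pass to `map`.
fn square(x: i32) -> i32 {
    x * x
}
```

With these definitions, tuple(map::<i32, i32>) yields a function that is called with one tuple argument, for example (square as fn(i32) -> i32, vec![1, 2, 3]), and untuple converts it back.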
[rust-dev] Statically linked library crates
Hi everyone,

It occurred to me that our insistence on dynamic linking for the standard library is likely to cause performance problems. This has been lingering at the back of my mind for a while, although my fears have been allayed to some degree so far by the fact that trivial standard library routines rarely show up in profiles. But it occurred to me that there's one small class of functions that we really can't get away with not inlining: simple iterators.

Consider uint::range(). This is the preferred way to do a C-style for loop in Rust. Up to this point we've been preferring to write C-style for loops as while loops, as near as I can tell, due to some early flakiness in rustboot, and for performance reasons. While loops work fine. But while loops aren't something we want programmers to have to write to get good performance out of their loops. They're a quite low-level construct. I've been burned many, many times while writing Rust code by forgetting to increment the loop counter.

At the moment, *each iteration* of uint::range() requires not one, but two indirect function calls, both across boundaries that are opaque to LLVM (indeed, each boundary is a DLL boundary). This is likely not going to work; we're inhibiting all loop optimizations and forcing new stack frames to be created for every trip around the loop.

I'm pretty sure that LLVM can inline away the overhead of range(), by first inlining the iterator itself, then inlining the for-each body. But, in order to do that, it needs to statically know the iterator definition. So range() can't be dynamically linked anymore. The simplest solution I can think of is to allow both static and dynamic library crates.

There's been a movement lately in some circles to forbid dynamic linking entirely; Go for example explicitly does not support it (in keeping with the Plan 9 tradition). I tend to disagree; I think that supporting dynamic linking is useful to enable memory savings and to make security problems in the wild easy to patch. (I'm told there was a zlib security bug that was much easier to fix on Linux than Windows because Windows apps tend to statically link against everything but the system libraries.) But exclusively relying on it is not going to be tenable; some routines are so simple, yet so critical to performance, that static linking is probably going to be the only viable option.

Thoughts?

Patrick
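The two loop styles contrasted in this message look roughly like this in present-day Rust, where 0..n plays the role of uint::range(). This is only a sketch of the source-level difference; it makes no claim about the code generated by the 2011 compiler. The function names are illustrative.

```rust
/// Hand-written counter loop: works, but it is easy to forget `i += 1`.
fn sum_while(n: u32) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < n {
        total += i;
        i += 1;
    }
    total
}

/// Iterator form: the counter is managed by the range itself.
fn sum_range(n: u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}
```

For the iterator form to cost no more than the counter loop, the optimizer has to see the iterator's definition at the call site, which is the requirement the message argues dynamic linking cannot meet.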
Re: [rust-dev] Statically linked library crates
On 5/16/11 12:21 AM, Graydon Hoare wrote:
> Two? It should only be one. Range calls block, block runs, range does += 1, range calls block again. It's also not exactly a surprising indirect function; it keeps calling the same one, from the same place in the code + stack. IOW it'll *probably* be well predicted.

That's true, sorry.

> It is definitely going to inhibit some optimizations, it's a question of asking which ones and how high the costs are. Measure?

Well, I'm not sure what kind of measurements we can usefully do to demonstrate this. I could certainly write a loop microbenchmark, but that won't tell us much. This doesn't hurt us much in rustc because (a) rustc is mostly tree-traversal, not hot inner loops; (b) we're using while loops in rustc anyway. I suspect this will be the kind of problem we will see in things like video decoders... but that's a ways off :)

> I think you mean more than static linking, since LLVM is not a linker; you mean static linking where the .a file actually contains LLVM bitcode, a la the LLVM-LTO approach. That's plausible. It's *different again* from actual static linking. And it's different from pickled ASTs, and different from macros. So there are like ... 4 points on the spectrum here in addition to our existing compilation strategy:

Yes, I should have been more clear when I referred to static linking. I think traditional static linking is mostly obsolete with the advent of LLVM bitcode; the only reason one would want it is if the object files aren't in LLVM bitcode format (not a problem we have in this case), or for compilation speed (which is a fair point).

> - We want a macro system anyways. So let's build that soon.

Agreed :)

> - You only have an inlining problem *when* you're dealing with separate crates. Granted, uint::range is standard library fodder, but if you have some custom data structure X with iters on X implemented in the same crate as the iter-callers, LLVM may well inline the iter already. And uint::range is one of those cases that will fall into "supported by a scalar-range special case loop", so let's not over-generalize the problem too quickly.

Right, that's one of the approaches that I was considering (but left out for the sake of email brevity). The issue is that I suspect a fair number of folks will end up writing Rust in a functional style and use functions like map, filter, and possibly reduce/fold liberally. I worry about them getting burned. We could of course add language support for those constructs too a la Python's list comprehensions, but that increases the complexity budget.

> The reason is simply this: if standard wisdom is to use LLVM-LTO against the standard library to get adequate performance, you get two big follow-on problems:

Sorry, I should have been more clear. What I'm proposing is to segment the standard library into two parts: a small, performance-critical part that benefits from LTO and a larger, dynamically-linked part. This is what I was getting at by defending dynamic linking; there are parts of the standard library that we would definitely like to be able to share among processes, to be able to patch for security issues, and to avoid recompiling. But there would be a small core that would be shipped as a .bc file and linked at LLVM optimization time into Rust binaries. This allows us to alter the dividing line between static and dynamic over time.

Patrick
Re: [rust-dev] alias analysis
On 6/3/11 2:07 PM, Graydon Hoare wrote:
> On 11-06-03 01:51 PM, Patrick Walton wrote:
>> Thoughts?
>
> I like the line of reasoning; let me try phrasing in a slightly more terse/pithy fashion:
>
>   Alias-formation must preserve unique ownership of the referent

Right, that's a good way to put it.

Patrick
Re: [rust-dev] alias analysis
On 6/7/11 12:59 AM, Marijn Haverbeke wrote:
> Unfortunately, there's this hole I mentioned before. What this analysis guarantees is that the location pointed to by an alias will always hold a value of type X -- if you reassign to it, the alias will still be valid. Except when going through a tag type. If you have an alias inside a tag (as alt creates them) and you reassign its parent, your alias might not point at memory of the right type anymore.

Sorry if I'm misunderstanding, but I thought I covered that in my previous email--we have to forbid access to the expression in an alt statement inside the case blocks. This means that the expression obeys normal alias rules, if it's an lval: there can only be one reference to it so that forbidding access to that one lval actually forbids access to the value.

There's actually another problem I just thought of, but it's orthogonal to this one. I have to get ready this morning, will elaborate in detail later.

Patrick
Re: [rust-dev] Move-by-default for temporaries?
On 7/7/11 7:03 AM, Marijn Haverbeke wrote:
> Shall we specify that temporary values, when put into a data structure, are moved, rather than copied? For non-temporary values, this is usually not what you want, so we should provide an explicit operator to specify that we want to move those.

I was going to implement copy constructor elision for this purpose. There is a bug on this IIRC. The right thing to do is to neither move nor copy, but actually write *directly* into the slot where the temporary is going. The generated assembly code for typeck::check_expr, for example, sorely needs this optimization.

Note that this is needed for resources to be at all usable: auto x = my_resource(); would generate an error without some sort of copy constructor elision, because the resource would be copied into a temporary for the return value and then copied into x, violating the noncopyability restriction.

Patrick
[rust-dev] Function types
Hi everyone,

Dave and I just whiteboarded ideas for function types, in light of the issues that Andrew was seeing. We came to these tentative conclusions:

* Functions being noncopyable doesn't seem to work, because it breaks bind. bind must copy the environment of the function bound to.

* Blocks (unnamed lambdas) and named functions have different properties. The former can capture stack locals, while the latter has dynamic lifetime and can capture only shared state. Also, blocks have extra control flow options (break, continue, outer return) while named functions don't have this ability.

* Interior semantics aren't enough to guarantee that unnamed lambdas can't escape their defined scope. It'd be possible to swap an interior fn with another interior fn (in some data structure, say) and have it escape its scope.

* We would like to be able to declare functions that take as one (or more) of their arguments either a named function or a block, without having to duplicate the function. For example, we should be able to say map([ 1, 2, 3 ], { |x| x * x }) or map([ 1, 2, 3 ], square).

To deal with these issues, here's a conservative proposal as a starting point:

(1) The general fn type represents a block. It can only appear as the type of an alias parameter to a function.

(2) All named functions have type @fn. They are reference counted. They can close over outer variables, but only those on the heap (i.e. only over boxes).

(3) As a corollary to point (1), blocks can only be constructed with the { |x| ... } syntax when calling functions. They are able to use break, continue, and return with the semantics that we discussed previously.

(4) bind stays around, but only works on values of type @fn.

(5) Named functions (@fn) can be converted to type fn when calling a function using the dereference operator *. In the example above, we could call map using map([ 1, 2, 3 ], *square). Under the hood, an extra function will be generated; this is required because blocks need a different signature than named functions in order to handle break and return.

Drawbacks:

* fn and @fn are now different types; there is no compositionality, although the dereference operator lessens the pain of this somewhat.

* Functions can't be sent over channels.

We could potentially extend this later to remove some of the limitations, but I think this works as a starting point. Andrew, I'm particularly curious as to whether this helps solve some of the issues you were encountering. To the rest of the team, I'm interested to hear your feedback.

Patrick
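The goal stated in this message, one map that accepts either a block or a named function without duplication, can be illustrated in present-day Rust, where a generic Fn bound covers both. This is not the fn/@fn design proposed here, only a sketch of the calling pattern it aims for. The names map, square and squares_plus are illustrative.

```rust
/// One definition that accepts both closures ("blocks") and named functions.
fn map<T, U>(items: &[T], f: impl Fn(&T) -> U) -> Vec<U> {
    items.iter().map(f).collect()
}

/// A named function: it cannot capture local variables.
fn square(x: &i32) -> i32 {
    x * x
}

/// A closure that captures the stack local `offset`, as a block would.
fn squares_plus(offset: i32, items: &[i32]) -> Vec<i32> {
    map(items, |x| x * x + offset)
}
```

Both map(&[1, 2, 3], square) and map(&[1, 2, 3], |x| x * x) go through the same definition, so no conversion operator like the proposed * is needed at the call site in this formulation.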
Re: [rust-dev] We *could* allow locals to be aliases
On 7/14/11 3:08 AM, Marijn Haverbeke wrote:
> First, a relatively non-controversial case:
>
>     auto &ccx = cx.fcx.lcx.ccx; // Look ma, no refcounting bumps!
>
> This case is very similar to alt/for blocks, and the alias checker could check it with a relatively simple extension.
>
> Next, of course, I'm going to argue that 'accessing things through blocks', for example the proposed hash table accessor approach (where you pass a block to the accessor in which you'll have access to the value) is an abomination, and we should allow functions to return aliases to the content of their arguments. To the alias checker, this isn't any more complicated than the alias-in-a-block case, and it is certainly more pleasant on the programmer.

After the conversation among Dave, Dan Grossman, and me yesterday, I actually think that we don't want "accessing things through blocks" at all. I believe it's impossible to make memory-safe. Consider:

    let h1 = @hashmap::mk();
    let h2 = id(h1); // identity fn; compiler can't see through this
    hashmap::insert(*h1, "foo", "bar");
    hashmap::get(*h1, "foo", { |val|
        hashmap::delete(h2, "foo");
        print val; // crash
    });

I think we *have* to copy the values. (Note that we are already copying the values to please the alias checker in the hashmap implementation, so this adds no more overhead than what we have!) This gives get() a more natural return value, making the "accessing things through blocks" pattern pointless. So the problem goes away.

Patrick
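The "copy the value out of get()" approach argued for in this message can be sketched with the present-day standard HashMap. The helper names get_copy and delete_then_use are hypothetical; the sketch only shows that once the caller owns a copy, deleting the entry afterwards cannot invalidate it.

```rust
use std::collections::HashMap;

/// `get` hands back an owned copy of the value instead of running a block
/// against a pointer into the table.
fn get_copy(map: &HashMap<String, String>, key: &str) -> Option<String> {
    map.get(key).cloned()
}

/// Mirrors the example above: look up, delete, then use the value.
fn delete_then_use() -> Option<String> {
    let mut h = HashMap::new();
    h.insert("foo".to_string(), "bar".to_string());
    let val = get_copy(&h, "foo");
    h.remove("foo"); // safe: `val` owns its own copy of the string
    val
}
```

The cost is one copy per lookup, which the message notes the hashmap implementation was already paying.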
Re: [rust-dev] Proposal: Eliminate let hoisting
On 7/31/11 9:11 AM, Brendan Eich wrote:
> JS already has function hoisting, which wins for programming in top-down style, maintaining source without having to topologically sort functions, etc. I made functions hoist to top of program or outer function body to mimic letrec, way back in the day. Given this precedent, we believe function-declaration-in-block, which binds a block-local (as let does), should hoist to top of block and be initialized on entry to block.

Rust always hoists function items (named functions), even nested ones. I think that's a great feature for the reasons you describe -- it gets rid of having to think about the function dependency DAG, which is a big pain in C and C++, and even worse in Standard ML and OCaml, where there aren't even prototypes. Hoisting for named functions is fine because they're always completely defined before execution begins -- in particular, they can't close over any values, so there's no question as to which bindings they capture (uncertainty over this is why I assume ML functions don't hoist).

So I should have been more clear -- in this scheme local variables would be the only non-hoisted bindings. It's rare that local variables need to be mutually recursive; the only time is when you want mutually recursive capturing lambdas, and in that case I don't think manually hoisting is too bad. Absent mutual recursion, I don't see any benefit to hoisting local variables, other than (a) consistency between items and locals and (b) simplifying the compiler implementation a bit (but note that we actually get this wrong at the moment -- we initialize local variables at the time we see the let, which can cause segfaults).

Patrick
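The distinction drawn in this message, hoisted function items versus non-hoisted locals, still holds in present-day Rust and can be sketched as follows. The names is_even, even and odd are illustrative.

```rust
fn is_even(n: u32) -> bool {
    // `even` is a nested function item: it is in scope for the whole block,
    // so it can be called before its textual definition.
    let result = even(n);

    // The two items are mutually recursive without any forward declaration.
    // Being items, they cannot capture `n` or `result` from the enclosing body.
    fn even(n: u32) -> bool {
        if n == 0 { true } else { odd(n - 1) }
    }
    fn odd(n: u32) -> bool {
        if n == 0 { false } else { even(n - 1) }
    }

    result
}
```

A let binding, by contrast, is only visible after the statement that introduces it, which matches the "local variables would be the only non-hoisted bindings" scheme described above.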
[rust-dev] Can we have our tuples back?
The lack of them makes destructuring assignment a lot less convenient...

Patrick
Re: [rust-dev] Can we have our tuples back?
On 8/11/11 4:10 PM, Marijn Haverbeke wrote:
> As for destructuring, records seem to work very well there. In most cases, you'll use the field names for your variables, so you get simply
>
>     let {key, val} = someexpression();

Also, what I'd like to do is this:

    let x, y = 1, 2;

I have to write:

    let { x, y } = { x: 1, y: 2 };

Which violates DRY, or:

    let x = 1;
    let y = 2;

Which is wordy.

Patrick
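Both styles discussed in this exchange can be written in present-day Rust, which has tuples as well as field-name shorthand for structs. This is a sketch for comparison only; Entry and some_expression are hypothetical stand-ins for the record and someexpression() in the quoted text.

```rust
/// A record with named fields.
struct Entry {
    key: u32,
    val: u32,
}

/// Stand-in for an expression that produces a record.
fn some_expression() -> Entry {
    Entry { key: 1, val: 2 }
}

/// Tuple destructuring: two bindings from one `let`, no field names repeated.
fn tuple_form() -> u32 {
    let (x, y) = (1, 2);
    x + y
}

/// Record destructuring: the variables take the field names.
fn record_form() -> u32 {
    let Entry { key, val } = some_expression();
    key + val
}
```

The tuple form avoids naming fields on both sides of the assignment, which is the DRY concern raised above; the record form is convenient when the field names are the names you wanted anyway.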
[rust-dev] Dynamically sized frames and GC
So it turns out that dynamically-sized frames are quite tricky to get right with GC. Essentially, when crawling the stack, the GC needs to rerun all of the dynamic size and alignment calculations in order to determine the layout of the frame so that it can find the roots. LLVM has no support for this. Note that this problem is similar to the issue with dynamically-sized frames and stack growth.

There are different ways to handle this problem (monomorphization being an especially attractive one, I think), but it seems to me that the simplest solution for a 0.1 release is to just have GC bail out when it discovers a dynamically-sized frame on the stack. Is this okay with others?

Patrick
Re: [rust-dev] Dynamically sized frames and GC
On 08/13/2011 12:47 PM, Marijn Haverbeke wrote:
> Where bailing out means it simply allocates more memory and doesn't collect anything? I can imagine a long-running computation inside some generic function 'leaking' until it runs out of memory.

Yup, it can definitely leak. We need to solve this problem for real at some point.

> Does every generic have a dynamically sized frame or only those who create locals of parameterized variables?

Only the latter. We'd need to add an analysis pass that determines whether the frame is dynamically-sized.

Patrick
Re: [rust-dev] pointers and values in rust
On 08/28/2011 07:10 PM, Graydon Hoare wrote:
> Not on shared boxes. They might be cyclic, so it might not terminate. We're no longer statically differentiating the cyclic from the acyclic.

Well, this prevents people from using the built-in = as the hash table key comparison function in many cases (most commonly, if the key is @str). I think that may violate POLS pretty severely.

Patrick
Re: [rust-dev] pointers and values in rust
On 8/29/11 8:34 AM, Graydon Hoare wrote:
> (Others: feel free to chime in, we've been back-and-forth on this issue in conversation since ... years now?)

Dave suggested pointer equality only on mutable boxes, but deep equality on immutable boxes. This might be a nice sweet spot.

Patrick
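The two notions of equality being weighed in this exchange can be sketched with present-day Rc, used here as a rough stand-in for the shared boxes (@T) of the time. This is an assumption made for illustration, not a description of how boxes were compared in 2011. The function names are hypothetical.

```rust
use std::rc::Rc;

/// Deep equality: compares the boxed contents (what a hash table keyed on
/// boxed strings would want).
fn deep_equal(a: &Rc<String>, b: &Rc<String>) -> bool {
    a == b
}

/// Pointer equality: true only if both handles refer to the same allocation.
fn same_box(a: &Rc<String>, b: &Rc<String>) -> bool {
    Rc::ptr_eq(a, b)
}
```

Two separately allocated boxes holding "key" are deep-equal but not pointer-equal, which is why pointer-only equality would be surprising for hash table keys.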
Re: [rust-dev] We can do without (immutable) by-alias parameters
On 09/07/2011 07:18 AM, Marijn Haverbeke wrote:
>> While this is attractive from the perspective of having the right defaults, it also makes the semantics of the code at a glance more subtle.
>
> The semantics are not affected at all -- both with 'structurally immutable' and with immediate values, it is not observable whether the parameter was passed by value or by reference.

Yeah, I'm more concerned about mutable structural values (e.g. records with one or more mutable fields). In that case it does matter semantically whether the parameter is passed by value or by alias.

There aren't many languages I know of that have mutable value types -- I only know of C, C++, and C# (though I wouldn't be surprised if Fortran, Pascal, COBOL, or Ada had them too). To my knowledge, in all of them, passing a value type as a parameter results in a copy. In C++, where mutable structural value types are commonest, this often gets criticized for leading to subtle bugs, so I agree we need a solution there. It may well be a question of having the default mode be alias.

To be clear, I'm not opposed to your proposal. I just have two concerns: (a) that we are getting dangerously close to violating programmers' assumptions regarding what a mutable structural value type means, and (b) that we are making the ABI more difficult to predict. These concerns are why I advanced the suggestion earlier: make no sigil mean immutable alias semantics (the compiler may choose to promote to value if the value is immutable and small), and have an explicit copy sigil mean "always copy".

Follow-up thought: If we had an explicit copy sigil, that might mean we could use this function:

    fn copy<T>(*x : T) -> T { x }

as our copy operator...

Patrick
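Why the passing mode is observable for a value with mutable fields can be shown with a small sketch in present-day Rust, where by-value and by-alias are spelled differently at the call site. Counter, bump_copy, bump_alias and demo are hypothetical names.

```rust
/// A small record with a mutable field, copied when passed by value.
#[derive(Clone, Copy)]
struct Counter {
    hits: i32,
}

/// By value: the callee mutates its own copy.
fn bump_copy(mut c: Counter) -> i32 {
    c.hits += 1;
    c.hits
}

/// By alias: the callee mutates the caller's record.
fn bump_alias(c: &mut Counter) {
    c.hits += 1;
}

/// Returns (value seen inside bump_copy, caller's value after it,
/// caller's value after bump_alias).
fn demo() -> (i32, i32, i32) {
    let mut c = Counter { hits: 0 };
    let inside_copy = bump_copy(c);
    let after_copy = c.hits;
    bump_alias(&mut c);
    let after_alias = c.hits;
    (inside_copy, after_copy, after_alias)
}
```

The caller's counter is unchanged after the by-value call and changed after the by-alias call, so a compiler that silently chose between the two modes for mutable records would change program behaviour. For immutable values the choice is unobservable, which is the point made in the quoted reply.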
Re: [rust-dev] We can do without (immutable) by-alias parameters
On 9/9/11 9:14 AM, Marijn Haverbeke wrote:
> Update: The approach sketched earlier does not work out because it is possible for a function to take a parameterized type without the caller knowing that it does. (For example, a fn<T>(x: fn(T)), when given a function fn(int), will call it with the argument given by reference, since it sees it as type T, whereas the function will expect it by value.)

Oh, another thing: Monomorphizing fixes this problem, so it's worth keeping this idea in mind for the future, even if it can't be implemented in trans as it currently stands.

Patrick
Re: [rust-dev] Two small syntax change proposals.
On 9/14/11 10:55 AM, Marijn Haverbeke wrote: Thoughts? So do you intend to make bracey-if a statement? What about expression-alt? That'd still have the current awkwardness when followed by '(' or '['. Oh right, alt too. Maybe ML-style alt as well: let x = alt y of some(z) { z } | none. { 0 }; Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Two small syntax change proposals.
On 9/14/11 11:00 AM, Graydon Hoare wrote: General responses: - The if/then/else form of if-exprs will not satisfy ternary users, I imagine, and requires a bunch of lookahead to find the 'then'; I don't particularly like the look of it. - Doesn't solve 'alt', does it? - Still requires the do { ... } block, no? - I believe a large measure of the purpose of expr-izing everything was to achieve greater compositionality in the grammar, i.e. for macro uses and whatnot. I think that's still a valid concern, so am hesitant to stmt-ize a grab-bag of exprs as suggested. Fair enough; if others don't like separate expression and statement forms, I'd vote for |val| for block-expression, without the |res| (it's an interesting idea, but I'm not sure it's necessary -- maybe something to think about for future versions?) I suspect |val| will be rare. - If we ignore trailing semis, don't we force people to write final-nil with some frequency, as in ML? I thought we had a pretty hard set of competing forces that pushed us into our current rules. Only if we expect people to ignore return values a lot. I'd prefer that people didn't, but I guess this is orthogonal to the issue at hand. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Two small syntax change proposals.
On 9/14/11 11:10 AM, Patrick Walton wrote: Fair enough; if others don't like separate expression and statement forms, I'd vote for |val| for block-expression, without the |res| (it's an interesting idea, but I'm not sure it's necessary -- maybe something to think about for future versions?) I suspect |val| will be rare. Come to think of it, this actually solves an ugliness in C++: the artificial block pattern used when you want to run a destructor at some specified time. You see stuff like this in C++: ... { auto_ptr<some_big_data_structure> ptr = make_big_data_structure(); cout << ptr.to_string(); // I want the pointer immediately destroyed here } ... But the artificial blocks look pretty ugly. With |val| (or |do|) we could make it nicer: ... do { let ptr = ~make_big_data_structure(); log ptr; // I want the pointer immediately destroyed here } ... Patrick
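For readers following the scoped-destructor idea in the message above, here is a minimal sketch in present-day Rust syntax (not the 2011 syntax used in this thread). `BigData` and `run` are hypothetical names invented for the illustration; the `log` vector only exists to make the order of events visible.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a large data structure. It records the moment
// its destructor runs in a shared log.
struct BigData<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for BigData<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("dropped");
    }
}

// Returns the sequence of events so the destruction point can be checked.
fn run() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _data = Box::new(BigData { log: &log });
        log.borrow_mut().push("using");
        // `_data` is destroyed here, at the closing brace of the block.
    }
    log.borrow_mut().push("after block");
    log.into_inner()
}

fn main() {
    println!("{:?}", run());
}
```

The block plays the role of the artificial C++ scope: the owning pointer is freed at the brace rather than at the end of the enclosing function.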
Re: [rust-dev] The lowdown on return-by-reference
On 9/16/11 2:10 PM, Marijn Haverbeke wrote: The problem: Accessor functions always have to copy their return value, so you can't efficiently get at the content of data structures (except by duplicating the logic needed to access them). The original solution proposed was to pass the accessor a block and pass the value to that block by reference. This would cause blocks to spring up everywhere (with all the indentation and noise that comes with it) and be extremely un-composable. A while ago I wrote a message to this list proposing a system whereby functions could return references. This week, I've finally implemented that. The exact syntax and semantics are not yet set in stone, so if you see room for improvement, reply. How about having container types in which accessors swap the original with none? fn get_swap<T>(mutable option<T> : opt) -> T { let opt2 = none; opt :=: opt2; ret alt opt2 { none. { fail } some(x) { x } } } Rationale: For boxes (option<@T>) get() is fine; it just returns a pointer (maybe bumping an RC; in GC this is free). For interior types (option<large_rec>)... well, programmers are paying a huge performance/memory penalty for these anyway, because |none| is so large, so I'm not sure it's worth optimizing. For unique pointers (vectors, strings too), this is where we get big wins for performance, and it seems to me that we can easily optimize the return statement above to move the pointer. I've been quite concerned about the complexity of incorporating type-based alias analysis into the language semantics for some time. TBAA is traditionally an optimization, not something necessary for soundness. I'd like to see how far we can go with alias analysis that isn't type-based, but rather layer- (interior/unique/box) and mutability-based. Patrick
Re: [rust-dev] The lowdown on return-by-reference
On 9/16/11 2:40 PM, Patrick Walton wrote: fn get_swap<T>(mutable option<T> : opt) -> T { let opt2 = none; opt :=: opt2; ret alt opt2 { none. { fail } some(x) { x } } } Sorry, this should read: fn get_swap<T>(mutable opt : option<T>) -> T { let opt2 = none; opt :=: opt2; ret alt opt2 { none. { fail } some(x) { x } } }
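As a rough illustration of the corrected `get_swap` above, the same swap-with-none accessor can be written in present-day Rust. This is a sketch in modern syntax, not the 2011 dialect; `std::mem::swap` stands in for the `:=:` operator, and the panic message is an invented detail.

```rust
use std::mem;

// Swap `None` into the slot and hand back whatever was there before,
// failing if the slot was already empty.
fn get_swap<T>(opt: &mut Option<T>) -> T {
    let mut taken = None;
    mem::swap(opt, &mut taken); // plays the role of `opt :=: opt2`
    match taken {
        Some(x) => x, // the value is moved out, not copied
        None => panic!("get_swap called on none"),
    }
}

fn main() {
    let mut slot = Some(String::from("payload"));
    let s = get_swap(&mut slot);
    println!("{s}, slot now empty: {}", slot.is_none());
}
```

For an owned buffer such as a `String` or `Vec`, only the pointer moves; the heap contents are never duplicated, which is the performance win the message is after. The standard library's `Option::take` packages the same swap.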
Re: [rust-dev] Move as a unary operator, or, alternatively, as an implicit optimization
On 9/17/11 11:59 AM, Marijn Haverbeke wrote: - The implicit, clever approach: Notice that the only situation where you want to do this is when using a local variable (you can't move out of data structures) for the last time, and simply make the compiler optimize the last use of a variable into a move. We'll probably want to do this anyway, but the question is: Is it enough, or do people want to 'see' their move operations in front of them? If an explicit copy operator is needed for uniques, then this isn't even ambiguous, is it? IOW the only thing that |x = y| with str/vec/unique pointer |y| can possibly mean is move y, since if you wanted to copy it you'd have to say |x = copy y|. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
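The reading argued for in the reply above (plain assignment of a unique value can only mean move, because a copy must be spelled out) matches how present-day Rust treats owned buffers. A small sketch in modern syntax follows; `move_vs_copy` is a hypothetical helper, and comparing buffer addresses is just one way to make the difference observable.

```rust
// Returns (moved value reuses the original buffer, copied value reuses it).
fn move_vs_copy() -> (bool, bool) {
    let y = String::from("hello");
    let original_buffer = y.as_ptr();

    let copied = y.clone(); // explicit copy: allocates a second buffer
    let moved = y;          // plain assignment: moves, `y` is unusable afterwards

    (
        moved.as_ptr() == original_buffer,
        copied.as_ptr() == original_buffer,
    )
}

fn main() {
    let (moved_same, copied_same) = move_vs_copy();
    println!("moved reuses buffer: {moved_same}, copy reuses buffer: {copied_same}");
}
```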
[rust-dev] Purpose of |put;|?
Minor thing: What is the purpose of |put;| (with no arguments)? Just curious. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] Proposal: Java-style iterators
It would be nice if we could figure out what to do about iterators for 0.1. I was thinking that we could make them Java-style iterators -- that is, objects with has_next() : bool and next() : T methods. |for each| would simply be syntactic sugar. This form: for each (x in iter()) { ... } Would desugar (in trans) into: let _i = iter(); while (_i.has_next()) { let x = _i.next(); ... } This has the following advantages: * Easy to implement. * Easy for LLVM to optimize. Simple SROA and inlining should make this as efficient as a for loop. * Simple performance model. * Familiar to programmers. * Allows us to support |break|, |cont|, and early |ret| easily. * Allows |break| and |ret| to be efficient. They simply throw away the iterator object. No special stack discipline necessary. * Makes upvars (outer variables referenced from the loop body) free in terms of performance. * Generator-like patterns can be achieved by making the iterator implementation use tasks internally. * Allows us to eliminate |put|. But it has these disadvantages: * Tasks can be more syntactically heavyweight than sequential-style iteration using |put|. * Data sharing between tasks is restrictive, which can make generator-like patterns awkward to use. Thoughts? Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
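The desugaring proposed above can be sketched in present-day Rust (modern syntax, not the 2011 dialect). The trait name `JavaIter`, the `Range` type, and `sum_until` are all hypothetical; the point is only that `break` falls out of an ordinary `while` loop with no special stack discipline.

```rust
// Hypothetical two-method iterator interface, as in the proposal.
trait JavaIter<T> {
    fn has_next(&self) -> bool;
    fn next(&mut self) -> T;
}

// A simple counting iterator over cur..end.
struct Range {
    cur: u32,
    end: u32,
}

impl JavaIter<u32> for Range {
    fn has_next(&self) -> bool {
        self.cur < self.end
    }
    fn next(&mut self) -> u32 {
        let v = self.cur;
        self.cur += 1;
        v
    }
}

// Hand-written form of `for each (x in iter()) { ... }` after desugaring.
fn sum_until(limit: u32, stop_at: u32) -> u32 {
    let mut total = 0;
    let mut it = Range { cur: 0, end: limit };
    while it.has_next() {
        let x = it.next();
        if x == stop_at {
            break; // early exit simply abandons the iterator object
        }
        total += x;
    }
    total
}

fn main() {
    println!("{}", sum_until(10, 4));
}
```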
Re: [rust-dev] Proposal: Java-style iterators
On 9/28/11 5:27 PM, Brendan Eich wrote: On principle I do not want us to go down this path, even if we change later. It adds risk that we won't change. It imposes a stateful model on iterators where has_next and next must be coherent, and you have to write two methods (not one as in Python or JS.next). And, Java. I'd be fine with a single-method solution too: iterators could just be a closure that returns option::t<T>, with none used to indicate the end of iteration. Patrick
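The single-method alternative mentioned in the reply above, a closure that returns an option with none marking the end, might look like this in present-day Rust. This is a sketch in modern syntax; `counter` and `collect_all` are hypothetical names.

```rust
// An iterator as a bare closure: each call yields the next item, or None
// once the sequence is exhausted.
fn counter(limit: u32) -> impl FnMut() -> Option<u32> {
    let mut cur = 0;
    move || {
        if cur < limit {
            let v = cur;
            cur += 1;
            Some(v)
        } else {
            None
        }
    }
}

// Drives any such closure to completion.
fn collect_all(mut next: impl FnMut() -> Option<u32>) -> Vec<u32> {
    let mut out = Vec::new();
    while let Some(x) = next() {
        out.push(x);
    }
    out
}

fn main() {
    println!("{:?}", collect_all(counter(3)));
}
```

With one method there is no `has_next`/`next` pair to keep coherent, which addresses the objection quoted in the message.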
Re: [rust-dev] Proposal: Java-style iterators
On 9/29/11 5:20 AM, Peter Hull wrote: On Thu, Sep 29, 2011 at 5:17 AM, Graydon Hoare <gray...@mozilla.com> wrote: - I prefer the closure-passing form: With this form, would it be possible to extract more than one value per loop - for example if I had a sequence of numbers that I wanted to pair up to make a sequence of (x, y) coordinates, such as 0, 0, 0, 1, 1, 1, 1, 0 -> (0,0), (0,1), (1,1), (1,0)? Use a tuple. Patrick
Re: [rust-dev] Proposal: Java-style iterators
On 9/28/11 9:17 PM, Graydon Hoare wrote: - Expresses the "iteratee is aliased during iteration" fact to the alias-checker, so you don't have to worry about invalidating externalized iterators. This is important; particularly if you want to exploit next part ... I don't understand this, sorry... could you explain? - Affords its own optimization opportunities, just elsewhere; consider: for (elt in vec::each(v)) { ... } iter each<T>([T] v) -> T { unsafe { let p = ptr::addr_of(v[0]); let e = p + vec::len(v); while (p != e) { put *p; } } } We can't do this with exterior iters. What about: fn each<T>([T] v) -> fn() -> option<T> { let p, e; unsafe { p = @mutable ptr::addr_of(v[0]); e = p + vec::len(v); } ret lambda() { unsafe { if p < e { let rv = *p; *p = ptr::next(*p); ret some(rv); } ret none; } } } I agree it should be mopped up before 0.1, but I prefer the path of just finishing up loopctl and removing the 'for' / 'for each' distinction. The complexity of loopctl is what I'm worried about, I guess. It requires a bunch of dynamic "did this loop break?", "did this loop early-return?" checks. I'm not sure what the precedent for this is. The thing that makes my proposal possibly problematic is that it ties option::t into the language at a much closer level than it's ever been tied before. I'm quite possibly ok with that, given that there are optimization opportunities for option::t that pretty much no other tags get (specifically, optimizing option::t<~T> and option::t<@T> into null pointers instead of two words). But it does mean that it's probably a post 0.1 feature. Patrick
Re: [rust-dev] Proposal: Java-style iterators
On 9/29/11 5:25 AM, Marijn Haverbeke wrote: I guess we could use return-by-alias here, yes. I was kind of assuming everybody hated that and wanted it to go away. But if we use that we can no longer return a tag to indicate end of sequence (you can't currently wrap a reference in a data structure -- supporting that would get way too hairy). True, it's a bummer. At some point you just need to start using swap if you want to avoid copies. Or you iterate by index only (probably the least painful solution). This sort of thing is why I think @ is going to end up being pretty common in real-world code. Not too bad IMHO; optimizing heap allocation is well-studied, after all. We're still leagues better than Java on the memory management front because we support stack allocation. A bigger issue is when you want to iterate over stuff on the exchange heap, of course. For that we'll just have to see how swap turns out. I foresee having a borrowable pointer type (basically just a sugary mutable option::t) in the standard library. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Proposal: Java-style iterators
On 9/29/11 5:46 AM, Marijn Haverbeke wrote: I still consider using swap to return things an absolute non-solution. Try writing some code in that style. Unless I'm missing some part of the way you want to approach this, it's absolutely dreadful to work with. If swap is unacceptable, then, as I've said before, we need to completely rethink the way we do concurrency. Even if we come up with some reference-based solution for single-threaded iteration, we're still back to swap for parallel iteration. If parallel iteration is too painful, people won't use it, defeating many use cases of Rust. My point here is that we need to make swap palatable. Whether we do that through integrating borrowable pointers into the language at a deeper level, or through standard library magic, doesn't matter to me. But just saying swap is a non-solution is sweeping a problem under the rug. Sure, but being a heap-centric language is completely opposed to what we've been doing so far. A language shouldn't straddle the fence about the style it prefers, or you get a mess. We should either go modern-style (allocate everything except immediates, garbage-collect, use escape analysis to put some allocations on the stack), or C-style (stack is default, go out of our way to provide safe references). Doing both seems bad taste. I disagree with this dichotomy. There's nothing wrong with having the programmer be explicit about where allocations live; that's part of what makes Rust a systems language. C# has both value types that live on the stack and reference types that live on the heap and are garbage collected. Expand the notion of garbage collection to include reference counting and C++ does too (via shared_ptr and the numerous other reference counted smart pointer templates). Patrick
Re: [rust-dev] Parameter passing and native calls
On 9/29/11 4:51 PM, Graydon Hoare wrote: I think this is probably less of a worry than you're seeing; the new rust abi is probably by reference, but by value in some cases. It only has the freedom to choose when it's unobservable to safe code anyways (when it's an immutable value). So the values are effectively the same as passed-by-value as far as callee is concerned. That's why marijn could make the change without breaking everything, after all :) It doesn't have the freedom to choose; it always goes by reference, due to type passing. This is the problem. We can only fix this by monomorphizing. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] FYI: C stacks
C stacks are starting to make their way into rustc. This is a prerequisite for stack growth -- C code can't in general be expected to perform stack growth checks, so we need to reserve a large stack for it (one per thread, in the current implementation). Right now C stack usage is implemented as a native ABI, so it's opt-in per function. One major challenge with the implementation of C stacks in rustboot was getting gdb to understand backtraces across C functions. With the current assembler trampoline stub (src/rt/arch/i386/ccall.S), I tried hard to make the stack frame look as normal as possible from gdb's perspective. Backtraces do seem to work on the Mac GDB, once it's appropriately patched to avoid the (corrupt stack?) warning (see [1] if you don't mind applying a binary patch, for Snow Leopard). I'm planning to start moving our native calls over to using the C stack, and I definitely don't want to break backtraces for everyone else. They're really important for productivity :) So, please don't hesitate to speak up if you're finding broken backtraces due to C stacks over the next few days. I'll do my best to sort out any issues that people have. We all have slightly different build environments and slightly different GDBs. GDB tends to be notoriously fussy about the stack layout, and I want to make sure people aren't left out in the cold. Thanks! Patrick [1]: https://mail.mozilla.org/pipermail/rust-dev/2010-December/000139.html ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Proposal: Java-style iterators
On 9/29/11 11:00 AM, Graydon Hoare wrote: - Write >1 iter interfaces on most collections, with varying strategies, to reduce unwanted boilerplate in cases you don't need it. Cold code is cold, harmless. vec::each(v) == takes fn(e) -> (), returns () vec::scan(v) == takes fn(e) -> bool, returns () vec::find(v) == takes fn(e) -> bool, returns option::t<e> I really, really think the exterior-iterator style is wrong. It puts logic in the wrong place, encourages hand-recoding loop logic incorrectly rather than collecting a rich library of correct enumerators. You can write a whole lot of cold enumerators that are safe and do-interesting-things to cover, I think, any of the needs of an external-iterator. This was less true when we had no closures, but now that we have block closures, I don't think there's any excuse. Given that we clearly need more discussion here before we reach consensus, I'd like to propose doing the above for 0.1, and removing iters as they currently stand. I already added vec::eachi and it's pretty nice to use -- the added sugar in |for each| doesn't seem to buy us much. I whipped up a small testcase using clang and it seems that LLVM indeed can optimize this style of iteration into a loop. Is everyone ok with this? Patrick
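The three closure-taking enumerators listed in the message above can be sketched in present-day Rust (modern syntax, not the 2011 dialect). The signatures follow the quoted list; one assumption is added for the illustration: in `scan`, the callback returns `false` to stop early, which the thread does not spell out.

```rust
// each: visit every element.
fn each<T>(v: &[T], mut f: impl FnMut(&T)) {
    for e in v {
        f(e);
    }
}

// scan: visit elements until the callback returns false.
// (The meaning of the bool is an assumption made for this sketch.)
fn scan<T>(v: &[T], mut f: impl FnMut(&T) -> bool) {
    for e in v {
        if !f(e) {
            return;
        }
    }
}

// find: return the first element the predicate accepts, if any.
fn find<'a, T>(v: &'a [T], mut f: impl FnMut(&T) -> bool) -> Option<&'a T> {
    for e in v {
        if f(e) {
            return Some(e);
        }
    }
    None
}

fn main() {
    let v = [1, 2, 3, 4];
    let mut sum = 0;
    each(&v, |e| sum += *e);
    println!("sum = {sum}, first > 2 = {:?}", find(&v, |e| *e > 2));
    scan(&v, |e| {
        println!("saw {e}");
        *e < 2
    });
}
```

Each enumerator keeps the loop logic inside the collection's own module, so callers never write index arithmetic themselves.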
Re: [rust-dev] Parameter passing and native calls
On 9/30/11 12:14 AM, Marijn Haverbeke wrote: Now that every parameter is passed by reference, our ABI is no longer compatible with C. Are you talking about calling C functions from rust, or calling rust from C? In the first case, we are generating a wrapper anyway, which currently makes sure things are passed by value. Not anymore, with the C decl. I agree the current everything-by-reference situation is awful. I'm not keen on bringing back references though. One relatively simple solution would be partial monomorphizing: For each type parameter that a function takes, we generate one version for immediate (by value) types, and one version for structural types. You'd get 2^N (where N is the number of type params, rarely more than 2) versions of each generic function. They can be generated in the crate that defines the function, so no involved cross-crate magic is needed yet. This'd make it predictable again whether an argument is passed by value or by reference, so the everything-by-reference hack can go again. It'd also make generic functions much more efficient when called on immediate values. What do you think? How about what I proposed earlier: the '+' sigil means by-value, the '&' sigil means by-immutable-reference, and leaving it off has the compiler choose a sensible default based on the type? (Word-sized immutable stuff would get +, and others would get &.) You might still have to have & in a few corner cases involving generics, although I can't think of any concrete instances off the top of my head in which this would be the case. I have a feeling this would avoid all the noise we had before while maintaining ABI predictability. Patrick
Re: [rust-dev] Parameter passing and native calls
On 9/30/11 8:37 AM, Patrick Walton wrote: On 9/30/11 12:14 AM, Marijn Haverbeke wrote: Are you talking about calling C functions from rust, or calling rust from C? In the first case, we are generating a wrapper anyway, which currently makes sure things are passed by value. Not anymore, with the C decl. Make that C stack. Ow. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] Removing ty_native
I think native types might have outlived their usefulness at this point. We can represent them as ints, and their type safety can be achieved via tags. So native mod ... { type ModuleRef; } becomes tag ModuleRef = int; This has the nice benefit of being able to use sizes other than words; e.g. u64. (This was the use case that motivated this post.) Thoughts on removing them? Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
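The suggestion above, replacing opaque native types with a tag wrapped around an integer, corresponds to what is now usually called a newtype. A minimal sketch in present-day Rust follows; `ModuleRef` comes from the message, while `ValueRef`, `module_id`, and the choice of `u64` are illustrative assumptions.

```rust
use std::mem::size_of;

// Distinct wrapper types over the same integer representation. They cannot
// be mixed up at compile time even though both are just a u64 underneath.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct ModuleRef(u64);

#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct ValueRef(u64);

// Accepts only a ModuleRef; passing a ValueRef here is a type error.
fn module_id(m: ModuleRef) -> u64 {
    m.0
}

fn main() {
    let m = ModuleRef(7);
    let _v = ValueRef(7);
    println!("id = {}, size = {} bytes", module_id(m), size_of::<ModuleRef>());
}
```

The wrapper costs nothing in size, and the underlying integer width can be chosen per type, which is the non-word-sized use case the message mentions.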
Re: [rust-dev] Renaming tag and log_err
On 10/29/2011 05:06 AM, David Rajchenbach-Teller wrote: I disagree. I would expect print to be something that writes to stdout. As I understand it, log is a specialised debug/trace facility which is built-in and configurable. For example if you write log hello world it won't print anything unless RUST_LOG is set. I don't think it would be too mind blowing to introduce std::io::stdout() in the Hello World program. Why not a print that (unconditionally) prints to stdout? In my experience log_err is for quick and dirty printf debugging, so stderr seems appropriate to me. It's not intended to be the main way for command-line programs to get stuff on the screen. It's not ideal for that purpose anyway, since it's polymorphic and you typically don't want to show the equivalent of toSource() output to end users. Perhaps show would be a better name, to make it clear that it's a debugging tool? Given what I just said above, maybe our hello world shouldn't encourage poor practices and should import the print function from the standard library. I'm a little concerned from a developer ergonomics/marketing perspective that we require an import statement for hello world (note that not even Java requires this), but I suppose it can't be helped unless we really want to have a std::pervasives module that's imported by default. Between enum and union, I tend to favor enum, for a simple reason: - attempting to use a variant as a C-style or Java-style enum will work flawlessly; - by opposition, attempting to use a variant as a C-style union will fail for reasons that will be very unclear for C programmers. I'd prefer to not use either, because variants are really a combination of enums and unions in the C/C++ sense. If we use enum, the union aspect of variants will look weird to C folks; if we use union, the enum aspect will likewise look weird. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Renaming tag and log_err
As a newbie, I do not mind either way between importing std::io or having the function baked in a Pervasives/Prelude module. However, I concur that log is probably not the right tool for Hello world. Looks like it's decided. Filed a bug to get us a Pervasives module: https://github.com/graydon/rust/issues/1096 I also concur that names log/log_err do not accurately represent their behavior. What about renaming log_err to dump and log to dump_log or something such? Looks like we have echo, show, and dump as the three contenders, all equally short. Anyone have strong preferences? I'm inclined toward show if there's no consensus. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Read-only strings?
On 10/30/2011 01:53 PM, David Rajchenbach-Teller wrote: If type `str` is indeed (at least partly) mutable, each of these functions must copy the `str`, which is rather costly. I wonder if there is a type-based mechanism that I could use to guarantee that a `str` is never mutated, hopefully some trivial typestate-based pattern that currently escapes my grasp. For now, @str is what you want. Vector uniqueness is causing all sorts of problems, and it's possible/probable that we'll change this in the future. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] Naming convention for libraries
We recently renamed libstd.so to libruststd.so to avoid stomping on a libstd that might exist in /usr/lib. Perhaps we should attack this in a more holistic way: either (a) all Rust libraries should start with rust* or (b) Rust libraries should install themselves into /usr/lib/rust. The latter seems to be more common among language runtimes, but I'm not sure if there are going to be library path issues on some systems if we go down that route. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Renaming tag and log_err
I agree. log_err was a kludge and I'd prefer to un-kludge it before shipping rather than adding another keyword. Multiple log-levels is the way to go. Macro if there's something relatively easy, or just keep 'log' as compiler-supported, but extend the syntax and include a bunch of log-level numeric constant names in the prelude (err, warn, info, debug, say?) By the way, `alert` was suggested in a bug as a replacement for log_err (and as a synonym for log(err, ...)), which Marijn and I like. Has a cute JavaScripty feel to it. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Renaming tag and log_err
On 11/8/11 9:41 AM, Graydon Hoare wrote: Likely yes. Though we should offer a compiler flag / crate attribute to disable the auto-import of it. In fact, we'll have to, in order to bootstrap. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Object system redesign
On 11/8/11 9:12 AM, David Rajchenbach-Teller wrote: Does this mean that we can only have one constructor? If we wish to have several constructors – and if we accept that they must not have the same name – the class could yield a full module, in which each constructor is a function. Yeah, we may want multiple constructors. I left it out due to simplicity, but I don't have any opposition. If we do have them, they should be named though, as you say. What is the rationale behind **@class**? Because you may want to e.g. register the instance with an observer, and if all you have is an alias to self you can't do that. So a destructor cannot trigger cleanup of any heap-allocated data? Right, but keep in mind that if a class instance contains the last reference to heap-allocated data, the destructor on that heap-allocated data will be called as usual. Class instances are copyable if and only if the class has no destructor and all of its fields are copyable. Class instances are sendable if and only if all of its fields are sendable and the class was not declared with **@class**. No destructor? I would rather have guessed no constructor, what am I missing? Destructors are for classes that encapsulate things like OS file descriptors. If you copy their contents, you'd end up closing the file twice. Classes may be type-parametric. Methods may not be type-parametric. Hmmm... What is the rationale? Mostly because it's what we did before. I wouldn't be opposed to making methods type-parametric either, honestly. I like it. Be warned that someone is bound to ask for RTTI to pattern-match on interfaces and/or on interface fields. Rust has always been slated to have reflection -- RTTI is baked into the system already, it's just a matter of implementing it. One more question, though: what about class-less objects? I feel that they could be quite useful to provide defer-style cleanup. I'm not opposed to that, if we can make it work. 
Keep in mind that you can define nominal classes (and nominal items generally) inside functions already, so this is mostly a question of avoiding giving a class a name (and maybe closing over surrounding lexical items; the same memory management issues we have with closures apply, of course). Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
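The copyability rule discussed above (a class with a destructor must not be copyable, or a resource such as a file descriptor would be closed twice) can be illustrated in present-day Rust, where a type with a destructor can be moved but not implicitly copied. `Handle` and `close_count` are hypothetical names; a counter stands in for the OS-level close.

```rust
use std::cell::Cell;

// Hypothetical resource wrapper: the destructor "closes" the resource by
// bumping a counter, so double closes would be visible.
struct Handle<'a> {
    closes: &'a Cell<u32>,
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        self.closes.set(self.closes.get() + 1);
    }
}

// Passes the handle around and reports how many times it was closed.
fn close_count() -> u32 {
    let closes = Cell::new(0);
    {
        let first = Handle { closes: &closes };
        let _second = first; // a move, not a copy: only one owner remains
    }
    closes.get()
}

fn main() {
    println!("closed {} time(s)", close_count());
}
```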
Re: [rust-dev] Object system redesign
On 11/11/2011 07:29 AM, Niko Matsakis wrote: In principle, it might be nice to allow something like bounded polymorphism: fn call_foo<T:has_foo>(x: T) { x.foo(); } Without subtyping, it would make less sense. Perhaps it corresponds to passing the vtable that converts a `T` into a `has_foo`, so when you invoke x.foo() it compiles down to T_vtable.foo(x) (which I think is how Haskell type classes work at runtime, but that's more from me guessing, perhaps people who know Haskell better can correct me). It does. Patrick
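The bounded-polymorphism signature quoted above, and the vtable-passing reading of it, can both be written down in present-day Rust. This is a sketch in modern syntax; `HasFoo`, `A`, `B`, and `call_foo_dyn` are hypothetical names.

```rust
// The interface playing the role of `has_foo`.
trait HasFoo {
    fn foo(&self) -> String;
}

struct A;
struct B;

impl HasFoo for A {
    fn foo(&self) -> String {
        "A::foo".to_string()
    }
}

impl HasFoo for B {
    fn foo(&self) -> String {
        "B::foo".to_string()
    }
}

// Bounded polymorphism: T is constrained to types that implement HasFoo.
fn call_foo<T: HasFoo>(x: T) -> String {
    x.foo()
}

// The explicit vtable-passing form: `&dyn HasFoo` is a pointer to the value
// paired with a pointer to its method table.
fn call_foo_dyn(x: &dyn HasFoo) -> String {
    x.foo()
}

fn main() {
    println!("{} {}", call_foo(A), call_foo_dyn(&B));
}
```

The second function is the closer match to the dictionary-passing model described in the message; the first leaves the dispatch strategy to the compiler.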
Re: [rust-dev] criteria for core lib
On 12/04/2011 02:02 PM, Graydon Hoare wrote: Cross-crate inlining (when and if we do it) is a mixed blessing anyways. It hurts data and procedural abstraction -- both virtues of proper software design -- in order to help compile-time (but not run-time) modularity. I'm happy to experiment with it, but I don't think it should be seen as a panacea either. The tradeoffs are numerous. Strongly disagree. If we cannot inline stuff like map, we cannot create a performant browser engine. There is no way around this. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Instance Chains: Type Class Programming Without Overlapping Instances and type classes in Rust
On 1/9/12 5:36 PM, Tim Chevalier wrote: The problem the paper addresses is in Haskell, where having multiple instances in scope for the same class and type can cause unpredictable behavior. (The paper explains the basic problem in more detail pretty well.) It seems like there's an analogous issue in Rust when you import multiple sets of methods for the same type class, which I understand is handled with a compile failure. Our discussion led to the ability to disambiguate at the call site. However, it's not entirely satisfactory: it does mean that there are some subtle traps involving things you would like to be typeclasses (like Hashable) and consistency of data types (like what happens if you create a hash table and add some keys using one type class instance for Hashable and later on try to add keys using another type class instance for it). Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] Zero-variant tags?
I broke zero-variant tags with my syntax change. Is this something we want to support? Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] tail-call performance issue?
On 1/25/12 10:42 AM, Matthew O'Connor wrote: Hi, I was reading https://github.com/mozilla/rust/wiki/Bikeshed-tailcall and wondered about the statement "Tail calls cannot be implemented in general without a serious performance hit for all calls." I've never heard this before. tjc speculated it had to do with decrementing refcounts on normal function returns, but our discussion didn't reveal any obvious reasons. What is the reason for this performance hit on all calls? Pascal calling conventions versus C ones, basically. To handle tail calls in the case in which the callee has more arguments than the caller, you have to make sure that callees pop all their arguments (the Pascal convention). This prevents callers from reusing one set of outgoing argument space for all calls (the C convention). I don't know what the performance hit is in practice; I suspect it's fairly small and not serious. I'm sure that one could make a microbenchmark that performs significantly worse under Pascal calling conventions than C ones, though. Patrick
Re: [rust-dev] Proposal: rename sequence concatenation operator to ++
On 1/25/12 11:46 AM, Marijn Haverbeke wrote: Currently it is simply '+'. The thing that prompted this is issue #1520 -- operator overloading. Delegating + on non-builtin-numeric types to a `num` interface that implements add/sub/mult/div/rem/neg methods seems elegant, and similar to Haskell's approach. Vector-concatenation + messes everything up though, since vectors can't meaningfully implement the full num interface. What about an add interface? Patrick
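The "add interface" idea raised in the reply above, a separate interface for `+` alone so that sequences need not implement a full numeric interface, can be sketched in present-day Rust, whose standard library exposes `std::ops::Add` as a standalone trait. `Seq` is a hypothetical wrapper type used only for this illustration.

```rust
use std::ops::Add;

// A sequence type that supports `+` as concatenation without implementing
// sub, mul, div, rem, or neg.
#[derive(Debug, PartialEq, Clone)]
struct Seq(Vec<i32>);

impl Add for Seq {
    type Output = Seq;

    fn add(mut self, rhs: Seq) -> Seq {
        self.0.extend(rhs.0); // append the right-hand elements
        self
    }
}

fn main() {
    let joined = Seq(vec![1, 2]) + Seq(vec![3]);
    println!("{:?}", joined);
}
```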
Re: [rust-dev] A couple of tweaks to make typeclasses easier?
On 1/26/12 6:35 PM, Kevin Atkinson wrote: As a potential user of the language, I have to agree with Graydon. In particular I do not like the idea of having to use a different symbol for what I see as method access and field access. Ok, let's not do it then. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Suggestions
On 02/04/2012 06:21 AM, Arne Döring wrote: The second suggestion concerns the #fmt macro. #fmt works like printf, but its string is parsed at compile time, so that errors might be thrown when the string is incorrect. So when you unwind this format string at compile time, you also know all variable names and their types from scope. So it might be more natural to directly import variable names from the scope inside of the string, like it is done in many scripting languages. This is also done in the programming language Nemerle. Well, #fmt is a macro, and I would generally be opposed to macros implicitly capturing the local variables in scope -- that's unhygienic. Patrick
Re: [rust-dev] RFC: Addressing Dan's bug through immutable memory
On 2/7/12 4:23 PM, Graydon Hoare wrote: Hm. I am confused at the description of the hole in the type system. I was under the impression that this was the distinction between immutable values and immutably-rooted values (those contained within a path-of-immutable-references). Am I misunderstanding? It's not a hole per se, but our concept of immutability is pretty unique among languages I know of. (C++ doesn't work this way, in particular.) Moreover, our notion of immutability makes it more difficult than it could be to prevent Dan's bug. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Addressing Dan's bug through immutable memory
Ah yes, I remember now. Can this also be used to break refinement types (i.e. break typestate)? We have subtyping via refinements. Patrick -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. Niko Matsakis n...@alum.mit.edu wrote: On 2/7/12 4:27 PM, Patrick Walton wrote: It's not a hole per se As I pointed out elsewhere, I think this is a genuine hole. It allows you to create an immutable box that can later be mutated, as I showed. Niko ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Addressing Dan's bug through immutable memory
On 02/08/2012 06:54 AM, Marijn Haverbeke wrote: Yes. What would be legal would be: let x = @{a: 10}; x = @{a: 20}; That seems a rather heavy-handed restriction. C++ has the same restriction. In practice I've found that it rarely comes up. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] RFC: Addressing Dan's bug through immutable memory
On 02/08/2012 08:48 AM, Marijn Haverbeke wrote: C++ has the same restriction. In practice I've found that it rarely comes up. Obviously, since const field are rare in C. In Rust, one rarely uses a record type that doesn't have immutable fields. Fair enough. Still: (1) This is a type unsoundness. I think we have to fix it, because as it stands typestate means nothing -- it can be broken in the safe language. (2) The type unsoundness notwithstanding, I think it's more logical for immutable to mean truly immutable. Right now it's possible to mutate all immutable fields. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] datetime module and some questions
On 02/20/2012 11:44 AM, Ted Horst wrote: Is there a better way to do this in rust? I think this is the sincerest form of a feature request for const vectors. :) There's a bug on it here: https://github.com/mozilla/rust/issues/571 Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] How suitable is Rust for a Distributed Datastore?
On 03/03/2012 10:34 AM, tav wrote: My language of choice for systems development in recent times has been Go. However, whilst it has excellent networking support, Go's stop-the-world garbage collector gets in the way of the needs of an in-memory datastore. I understand that Rust would give me better control over the memory use and layout without interruption by a garbage collector. Is this the case? Sorry, I've only managed to find http://doc.rust-lang.org/doc/rust.html#memory-and-concurrency-models as far as documentation. Is there a page on the wiki I should be reading? Your tasks can still get interrupted by a garbage collector (right now, a cycle collector), but it's never a *global* garbage collector. Other tasks can still run while one task is collecting garbage. This was one of the most important design goals of Rust. I would definitely like to make the GC incremental in the future. This is nontrivial (mostly because of LLVM's poor support for GC at the moment), but it's a whole lot easier than writing an incremental concurrent garbage collector. Patrick
Re: [rust-dev] relax type checking of ints and uints?
On 3/19/12 6:56 PM, Tim Chevalier wrote: I don't think we have any plans to add implicit casts as implied by your other 4 examples. It seems too complex -- if any of the variables in your example were mutated after being initialized, the pass that would insert these casts would get pretty complicated. Actually, I don't think Chris was suggesting having a pass that computes the possible set of values for each local. I think he was getting at more of a must-be-wider-than relation among numeric types. For example, you know that every u8 value can be losslessly converted to an i16 based on the types alone. Java has something similar. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
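The "must-be-wider-than" relation described above can be sketched in present-day Rust (an assumption; the thread predates these APIs). The standard library provides `From` only for integer conversions that cannot lose data, so the relation is decided from the types alone. The helper names `widen` and `narrow` are made up for this example.

```rust
use std::convert::TryFrom;

/// Widen a u8 to i16. `From` exists for this pair because every u8
/// value (0..=255) fits in an i16, which is known from the types alone.
fn widen(x: u8) -> i16 {
    i16::from(x)
}

/// Narrowing is not covered by `From`; `TryFrom` checks the value at
/// run time and reports failure instead of silently truncating.
fn narrow(x: i16) -> Option<u8> {
    u8::try_from(x).ok()
}
```

Note that this is still an explicit conversion; the thread is about whether such widenings could be inserted implicitly.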
Re: [rust-dev] Arrays, vectors...
I like all of this. On 3/23/12 1:42 PM, Graydon Hoare wrote: Are the semantics over-complex? I think we really have a large number of use-cases and there's no way we can hit them all with a single abstraction. Shared-heap != unique-heap != constant != alloca != borrowed region, and fixed-size != variable-size. Yeah, that's my only concern, but I don't know how to make it simpler, really. Is the trailing slash absolutely hideous? There are a few other unused ASCII symbols in the type grammar but this was the nicest-looking I could see (that didn't collide with something else). It's also possible to write them as str/10, str@ and str~, say, and similarly [int/10], [int]~ and [int]@. That's fewer slashes but a bit more visual ambiguity if you have a leading ~ or @ as well. LLVM uses x; [int x 10], [int x 30]. I think it's kind of cute. There's precedent for x as an operator, in Perl, and I don't believe it requires making x a keyword. It does require whitespace around the x, though. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Arrays, vectors...
On 3/23/12 6:54 PM, Graydon Hoare wrote: On 12-03-23 06:47 PM, Sebastian Sylvan wrote: On Fri, Mar 23, 2012 at 1:42 PM, Graydon Hoare <gray...@mozilla.com> wrote:

- [1,2,3,4,5] -- constant memory, type [int]
- [1,2,3,4,5]/5 -- constant memory, type [int/5]

Hmm, why couldn't literals always be fixed-size? They get auto-promoted to slices when needed, right? That would remove the need for the /X part of the literal syntax at least.

Compare:

    let x = [1,2,3,4,5];     // x:[int], 5 words storage, 2 words for x
    let y = x;               // y:[int], 2 words for y
                             // total: 9 words, 1 array

vs.

    let x = [1,2,3,4,5]/5;   // x:[int/5], 5 words for x
    let y = x;               // y:[int/5], 5 words for y
                             // total: 10 words, 2 arrays

But if literals promoted to slices you could take advantage of that to achieve the same effect:

    let x: [int] = [1,2,3,4,5];  // x:[int], 5 words storage, 2 words for x
    let y = x;                   // y:[int], 2 words for y
                                 // total: 9 words, 1 array

vs.

    let x = [1,2,3,4,5];     // x:[int/5], 5 words for x
    let y = x;               // y:[int/5], 5 words for y
                             // total: 10 words, 2 arrays

Patrick
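The word counts in the comparison above can be checked against present-day Rust, under the assumption that its `&[T]` and `[T; N]` correspond roughly to the `[int]` slice and `[int/5]` fixed-size array discussed here. This is only a sketch; the function names are hypothetical and `usize` stands in for a word-sized `int`.

```rust
use std::mem::size_of;

/// Size of a fixed-size array of five word-sized elements, in words.
fn array_words() -> usize {
    size_of::<[usize; 5]>() / size_of::<usize>()
}

/// Size of a slice reference (pointer plus length), in words.
fn slice_ref_words() -> usize {
    size_of::<&[usize]>() / size_of::<usize>()
}
```

Two slice variables over one backing array cost 5 + 2 + 2 = 9 words, while two copies of the fixed-size array cost 5 + 5 = 10 words, matching the totals quoted in the message.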
Re: [rust-dev] Arrays, vectors...
On 03/25/2012 02:16 AM, Gareth Smith wrote: On 23/03/12 20:42, Graydon Hoare wrote: - Existing vecs are always unique. Sometimes you want shared, but boxing them as @[] causes double-indirection, feels awkward. Apologies if I am missing the point here, but how about using some sort of copy-on-write mechanism instead of unique pointers? So: For sending purposes. We need to make sure that data is uniquely owned to avoid atomic reference counting and data races when sending data between tasks. Also, we used to rely on that optimization in earlier versions of Rust, and it turned out to be hard to predict when there were outstanding references. It was a bit of a performance footgun... Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Performance optimization
On 04/07/2012 04:08 AM, Grahame Bowland wrote: For the case of one big data structure multiple workers want to read from, couldn't we write a module to do this within the language as it stands? The module could take a unique reference (which can't contain anything mutable), then issue (via unsafe code) immutable pointers to the structure on request. Obviously it's a broken thing to do (and afaik the language doesn't guarantee that the address won't change unexpectedly, although I don't think it will in the current implementation), but it might be an interesting experiment. Relevant here are Niko's thoughts on parallel blocks: http://smallcultfollowing.com/babysteps/blog/2011/12/09/pure-blocks/ I'm very much in favor of experimentation in the unsafe module to find what works in advance of baking stuff into the language. I feel that's the right way to do programming language work -- pave the cow paths, as Dave Herman likes to say :) Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] console?
On 04/08/2012 11:15 AM, Kobi Lurie wrote: hi rust list, is there something like Console.ReadLine in rust? I want to experiment, get a feel for the language by writing a little hangman game. Check out the io module: http://doc.rust-lang.org/doc/core/io.html In particular use stdin() to get a handle to standard input, and then use read_line(). You will need to import reader and reader_util. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
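For readers on present-day Rust, where the `io` module described above has since been replaced, a rough equivalent of Console.ReadLine might look like the sketch below. It assumes the modern `std::io::BufRead` trait; `read_line_from` is a hypothetical helper name, written against any buffered reader so that it can be exercised without a terminal.

```rust
use std::io::{self, BufRead};

/// Read one line from any buffered reader, without the line terminator.
/// Returns Ok(None) once the input is exhausted.
fn read_line_from<R: BufRead>(reader: &mut R) -> io::Result<Option<String>> {
    let mut line = String::new();
    let bytes_read = reader.read_line(&mut line)?;
    if bytes_read == 0 {
        return Ok(None); // end of input
    }
    // `read_line` keeps the trailing "\n" (or "\r\n"); strip it.
    let trimmed = line
        .trim_end_matches(|c: char| c == '\n' || c == '\r')
        .to_string();
    Ok(Some(trimmed))
}
```

For a hangman game, the call would be `read_line_from(&mut io::stdin().lock())`, once per guess.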
[rust-dev] Brace-free if and alt
Here's a total bikeshed. Apologies in advance: There's been some criticism of Rust's syntax for being too brace-heavy. I've been thinking this for a while. Here's a minimal delta on the current syntax to address this:

Examples:

    // before:
    if foo() == bar { 10 } else { 20 }
    // after:
    if foo() == bar then 10 else 20
    // or:
    if foo() == bar { 10 } else { 20 }

    // before:
    alt foo() { bar { 10 } baz { 20 } boo { 30 } }
    // after:
    alt foo() { bar => 10, baz => 20, boo => 30 }
    // or:
    alt foo() { bar { 10 } baz { 20 } boo { 30 } }

BNF:

    if ::== if expr (then expr | block) (else expr)?
    alt ::== alt expr { (arm* last-arm) }
    arm ::== block-arm | pat => expr ,
    last-arm ::== block-arm | pat => expr ,?
    block-arm ::== pat block

You can think of it this way: We insert a then before the then-expression of each if; however, you can omit it if you use a block. We also insert a => before each expression in an alt arm and a , to separate expressions from subsequent patterns; however, both can be omitted if the arm expression is a block.

This does, unfortunately, create the dangling else ambiguity. I'm not sure this is much of a problem in practice, but it might be an issue.

The pretty printer would always omit the then and the =>/, when the then body or the alt arm is a block. That way, we aren't introducing multiple preferred syntactic forms of the same Rust code (which I agree is generally undesirable); the blessed style is to never over-annotate when a then body or an alt expression is a block.

Here's an example piece of code (Jonanin's emulator) written before-and-after:

Before: https://github.com/Jonanin/rust-dcpu16/blob/master/asm.rs
After: https://gist.github.com/2360838

Thoughts? Patrick
Re: [rust-dev] Fall-through in alt, break/continue by label
On 04/15/2012 06:17 PM, Sebastian Sylvan wrote: Could tail calls work? I.e. each label would equal a separate function (any state would have to be passed through), and then you'd just keep tail-calling from state to state. Without really knowing exactly what kind of code you're trying to generate, this seems like it might be workable. Only if LLVM's optimizer is smart enough to turn that code into a goto-based state machine. I'm not sure if it is. (Of course, if it's not, that's possibly fixable...) Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Fall-through in alt, break/continue by label
On 4/16/12 11:46 AM, Graydon Hoare wrote: They're already present (were from the beginning) but they broke when we shifted from rustboot (hand-rolled code generator) to rustc (LLVM). It turns out that you have to adopt a somewhat pessimistic ABI in all cases if your functions are to be tail-callable. There's a bug open on this[1] that discusses in some more detail, but I think the feature is drifting towards a decision to remove the feature altogether. I actually disagree with this; I think that we should measure. I'm not sure that the Pascal calling convention is worse than the C calling convention in practice. In any case, I believe we're doing sibling call optimization already. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] bikeshed on closure syntax
Here's a little before-and-after with some of the syntax and semantic changes discussed (snippet from Sebastian Sylvan's raytracer [1] and modified slightly):

--- Before ---

    #[inline(always)]
    fn get_rand_env() -> rand_env {
        let rng = rand::rng();
        let disk_samples = vec::from_fn(513u) { |_x|
            // compute random position on light disk
            let r_sqrt = f32::sqrt(rng.next_float() as f32);
            let theta = rng.next_float() as f32 * 2f32 * f32::consts::pi;
            (r_sqrt * theta.cos(), r_sqrt * theta.sin())
        }
        let mut hemicos_samples = [];
        for uint::range(0u, NUM_GI_SAMPLES_SQRT) { |x|
            for uint::range(0u, NUM_GI_SAMPLES_SQRT) { |y|
                let (u, v) = ((x as f32 + rng.next_float() as f32) / NUM_GI_SAMPLES_SQRT as f32,
                              (y as f32 + rng.next_float() as f32) / NUM_GI_SAMPLES_SQRT as f32);
                hemicos_samples.push(cosine_hemisphere_sample(u, v));
            }
        }
        {
            rng: rng,
            floats: vec::from_fn(513u, { |_x| rng.next_float() as f32 }),
            disk_samples: disk_samples,
            hemicos_samples: hemicos_samples
        }
    }

--- After ---

    #[inline(always)]
    fn get_rand_env() -> rand_env {
        let rng = rand::rng();
        let disk_samples = vec::from_fn(513): x {
            // compute random position on light disk
            let r_sqrt = rng.next_float().(f32).sqrt();
            let theta = rng.next_float().(f32) * 2.0 * f32::consts::pi;
            (r_sqrt * theta.cos(), r_sqrt * theta.sin());
        }
        let mut hemicos_samples = []/~;
        for uint::range(0, NUM_GI_SAMPLES_SQRT): x {
            for uint::range(0, NUM_GI_SAMPLES_SQRT): y {
                let (u, v) = ((x.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32),
                              (y.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32));
                hemicos_samples.push(cosine_hemisphere_sample(u, v));
            }
        }
        {
            rng: rng,
            floats: vec::from_fn(513, _ -> rng.next_float().(f32)),
            disk_samples: disk_samples,
            hemicos_samples: hemicos_samples
        };
    }

---

Patrick

[1]: https://github.com/brson/rustray/blob/master/raytracer.rs
Re: [rust-dev] bikeshed on closure syntax
On 04/17/2012 10:03 PM, David Piepgrass wrote: This requires arbitrary lookahead to disambiguate from tuples. This bit in particular. Really really don't want to cross the bridge to arbitrary lookahead in the grammar. Pardon me, but I'm not convinced that there is a problem in lambdas like (x, y) -> (x + y). By analogy, you can realize that ((x * y) + z, q) is a tuple instead of a simple parenthesized expression when you reach the comma -- you don't need to look ahead for a comma in advance. So why not treat (x, y) as a tuple until you reach the -> and then reinterpret the contents at that point? This works as long as the syntax of a lambda argument list is a subset of the tuple syntax, anyway. If that's not the case, parsing gets messier, though I'm sure arbitrary lookahead is not the only possible implementation. Patterns are not a subset of the expression grammar. For example, : has meaning in a pattern (type test), but not in an expression. Patrick
Re: [rust-dev] bikeshed on closure syntax
On 4/18/12 5:07 PM, Jeff Schultz wrote: Any reason we can't just have an empty '||' instead of the '->'? It's easier to type and makes it easier to find all closures. || looks a little like line noise to me, although I'm not wedded to the thin arrow. spawn(): -> { log("Hi!"); } vs. spawn(): || { log("Hi!"); } 3. In any function call, trailing closure arguments can be pulled out and placed after a colon. If this is done, the semicolon statement separator after the call can be omitted. Why is the ':' needed? To disambiguate a block from bitwise or. Patrick
Re: [rust-dev] 2 possible simplifications: reverse application, records as arguments
On 04/21/2012 10:28 AM, gasche wrote: I've been wondering about a problem tightly related to named Re. function types: if you consider those parameter-passing structures as first class (which does not necessarily mean that they are convenient to use, for example if they're not addressable they will be less flexible), the natural choice is to have a family of types for them. Those types could come with restrictions and an unspoken kinding discipline, so that for example they cannot be used to instantiate type variables, maybe cannot be nested, etc. That's the main reason why I think one should think of such structures as real structures rather than syntactic sugar; it forces you to have a proper design for types and other aspects.

There are several issues with going to tupled arguments:

* We'd still need formal parameters for C interoperability. At the ABI level, a single-argument function applied to a 3-ary tuple is very different from a function with 3 arguments.
* It prohibits us from having optional parameters in the future (at least, not without some very hairy typechecking).
* I don't know how to make the block loop syntax work.

Patrick
Re: [rust-dev] In favor of types of unknown size
On 04/28/2012 03:17 AM, Matthieu Monrocq wrote: I would also like to point out that if it's an implementation detail, the actual representation might vary from known size to unknown size without impact for the user, so starting without for the moment because it's easier and refining it later is an option. Another option is to have a fixed size with an alternative representation using something similar to SSO (Short String Optimization); that is small vectors/strings allocate their storage in place while larger ones push their storage to the heap to avoid trashing the stack. We tried this once. It was a disaster in terms of code size; you really don't want all strings and vectors doing this. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Interesting paper on RC vs GC
On 05/01/2012 12:53 AM, Florian Weimer wrote: * Sebastian Sylvan: R. Shahriyar, S. M. Blackburn, and D. Frampton, Down for the Count? Getting Reference Counting Back in the Ring, in Proceedings of the Eleventh ACM SIGPLAN International Symposium on Memory Management, ISMM ‘12, Beijing, China, June 15-16, 2012. http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2012.pdf I think they give up deterministic finalization, which would make this approach not suitable to Rust. Do you mean destroying an object the moment the last reference to it drops? If so, that's not a hard requirement in Rust's case. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Interesting paper on RC vs GC
On 5/1/12 8:59 AM, Matthieu Monrocq wrote: I agree that the techniques outlined, especially with the details on their advantages/drawbacks, are a very interesting read. As for the predictable timing, anyway it seems hard to have something predictable when you take cycles of references into account: I do not know any inexpensive algorithm to realize that by removing a link you are suddenly creating a self-sustaining group of objects that should be collected. Therefore I would venture that anyway such groups would be collected in a deferred fashion (using some tracing algorithm). Agreed. The problem with reference counting in the long term isn't the reference counting itself; it's the cycle collection. Patrick
Re: [rust-dev] method calls vs closure calls
On 5/4/12 2:57 PM, Niko Matsakis wrote: I had a crazy thought for how to make method call syntax unambiguously distinguishable from field access without making it ugly. In short: make `a.b(c, d)` *always* a method call, rather than parsing it as a call to a value stored in a field. If you actually wanted to call a closure in a field, you would make that explicit by writing `(a.b)(c, d)`. Thoughts? I like this. It addresses the concern I had when proposing ->, without the ugliness of ->. Patrick
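The rule proposed above is, as far as I can tell, the one present-day Rust follows, so it can be sketched directly. `Widget`, its `scale` field, and `call_both` are hypothetical names chosen for illustration; the point is only that a field and a method may share a name and the call syntax picks between them.

```rust
struct Widget {
    // A closure stored in a field named `scale`.
    scale: Box<dyn Fn(i32) -> i32>,
}

impl Widget {
    // A method that happens to share the field's name.
    fn scale(&self, x: i32) -> i32 {
        x * 2
    }
}

/// Returns (result of the method call, result of the field call).
fn call_both() -> (i32, i32) {
    let w = Widget { scale: Box::new(|x| x * 10) };
    // `w.scale(3)` is always the method; `(w.scale)(3)` calls the
    // closure stored in the field.
    (w.scale(3), (w.scale)(3))
}
```

Because the parenthesized form is required for the field, a reader can tell the two kinds of call apart without knowing the type's definition.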
Re: [rust-dev] HasField
On 05/20/2012 05:23 AM, Bennie Kloosteman wrote: New member to the list; I'm no genius but am looking for a replacement for C for system programming. 1) Great work on UTF-8 strings and UCS-4 chars (I do some work with Chinese chars and UCS-2 just doesn't work here). BTW why null-terminated strings if you have arrays? Any C interop will require conversion to Unicode chars anyway for a generic lib (and for most IO the perf cost of a copy is pretty irrelevant).

At the moment we aren't willing to risk that the cost of a copy for C interoperability is irrelevant. Might be worth measuring once we have more of Servo up and running though. (The operative constraint here is basically that it's fast enough to build a competitive browser with.)

Speaking of which, it's very useful in strings to keep the char count as well as the byte length, mainly so you can go if (str->length == str->count) to avoid any Unicode parsing when you convert to ASCII (you could use a flag or the high bits of an int64), or use a subtype stating it's ASCII.

Interesting idea. Patches welcome!

2) Does Rust have anything like BitC's HasField? BitC introduced a has-field constraint and this was very useful; the field could be a function or property, but you can then put on a method that the type HasField x, and like an interface it hides the implementation; you could use any type which met the constraint.

I don't see any reason why we couldn't have HasField, although none of us have needed it yet.

It also meant encapsulated types and the associated this pointer / vcall issues could possibly be avoided, since the method only specified what field it used; the rest would be private (for that method). To me this greatly reduced the need of having private fields/members, which existing C programmers will just follow as they find it hard to pick up type classes initially. You may be able to use HasField for a facade to further effect to hide internals.
Since you guys are now considering this and objects I would read the below comments very carefully. I have ideas as to how we could implement inheritance if we need it (and I think we might well need it, as I'm running into issues with not being able to exploit the prefix property for subclassing with Servo). It's not totally sketched out, so I don't have a concrete proposal. That notwithstanding, basically it involves unifying enums and classes so that (a) enum variants are actual types, subtypes of the enclosing enum to be exact; (b) enums are classes and can have fields and methods; (c) enum variants can access fields and methods of the enclosing enum; (d) enum variants can contain other sub-variants. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Back to errors, failures and exceptions
On 5/25/12 10:16 AM, David Rajchenbach-Teller wrote: What's the difference between |scope| and Rust's resources, exactly? scope would be executed unconditionally at the end of the current block, while the rules for standard Rust RAII are somewhat more complex and depend on the initedness of the variable, whether it was moved, etc. There's a hidden dynamic flag created by the compiler and set I've been toying with the idea of changing standard Rust RAII to execute this when the variable goes dead (in the classic compiler liveness sense) and introduce a scope keyword for something more like C++ or D RAII in which the liveness of the variable is restricted to be exactly the extent of the block, making it safe to unconditionally run the destructor once the variable goes out of scope. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Back to errors, failures and exceptions
On 5/25/12 10:28 AM, Patrick Walton wrote: There's a hidden dynamic flag created by the compiler Hit send too early. There's a hidden dynamic flag created by the compiler to track initedness for each variable. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
Re: [rust-dev] Doc-comment syntax
On 06/02/2012 06:30 AM, Gareth Smith wrote: Does this proposal have any hope? I agree completely and have thought the exact same thing in the past. I will bring this up at the next meeting. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
[rust-dev] RFC: Block lambda syntax tweak
Hi everyone, Here's a revised lambda syntax tweak proposal. It's gotten feedback from several, so I think it's time to present it more generally. I don't think we should do this now; we should remain focused on bugs for 0.3. I'm just interested in getting feedback.

Executive summary of the proposal:

1. Change the block lambda syntax from { |x| x + 3 } to |x| x + 3.

2. In general, remove the special case of pulling the last argument out and turning it into a block from the language. However, it is still supported, and in fact required, as part of the for construct and the new do construct.

3. Change this:

       for [ 1, 2, 3 ].map { |x| ... }

   To this:

       for [ 1, 2, 3 ].map |x| { ... }

4. Add a new form, `do`, in order to support block syntax for constructs that aren't strictly loops. Also remove the mandatory || for zero-argument lambda blocks. Thus instead of:

       spawn { || ... }

   We have:

       do spawn { ... }

Rationale:

* Removing the ability for the last argument of a function call to be pulled out in the general case significantly decreases the complexity of the language grammar. This feature simply becomes a mandatory part of the for or do expressions.

* Removing the pipes around zero-argument functions in do or for constructs improves readability. In particular do spawn reads naturally.

* Having the opening brace be at the end of the line increases familiarity for programmers accustomed to the C language family.

* Removing the mandatory braces around lambdas decreases visual noise for small lambda expressions. Consider the closing parentheses on this line:

       log([ 1, 2, 3 ].map(|x| x + 1))

  Versus the sequence '}))' on this line:

       log([ 1, 2, 3 ].map({ |x| x + 1 }))

* Having only one way to get the block syntax iteration (for or do) provides an incentive for library authors to ensure that their blocks follow the iteration protocol. At the same time, it minimizes surprises when people try to put break/continue/ret in blocks intended for iteration and discover it doesn't work. Now it always works for iterators.

* The do block allows continue for early returns, which is something that cannot be done at the moment with block lambdas.

Here are the technical details:

Pseudo-BNF:

    PrimaryExpr ::== ... | BlockLambda | ForExpr | DoExpr
    BlockLambda ::== AbbreviatedArgs Expr
    AbbreviatedArgs ::== '||' | '|' (ModeSigil Identifier)* '|'
    ForExpr ::== 'for' (CallExpr | PrimaryExpr) AbbreviatedArgs? Block
    DoExpr ::== 'do' (CallExpr | PrimaryExpr) AbbreviatedArgs? Block

If the for expression or do expression head is a primary expression that is not a call expression, then it's treated as a call expression with an empty argument list.

The block lambda in a for expression is treated as returning bool, while the block lambda in a do expression is treated as returning unit. As such, break/continue/ret all work in for loops, while only continue works in do expressions (which effects an early return).

Thoughts? Patrick
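The iteration protocol mentioned above, in which the block passed to a for loop is treated as returning bool, can be sketched in present-day Rust as an internal iterator. This is an illustration under assumptions, not the 2012 implementation: `each` and `first_even` are hypothetical names, and the true/false convention (true to continue, false to break) is my reading of the proposal.

```rust
/// Internal iterator: calls `f` on each element until `f` returns false.
/// Returns true if iteration ran to completion.
fn each<T>(items: &[T], mut f: impl FnMut(&T) -> bool) -> bool {
    for item in items {
        if !f(item) {
            return false; // the body asked to stop, i.e. "break"
        }
    }
    true
}

/// Finds the first even number, stopping the iteration early.
fn first_even(xs: &[i32]) -> Option<i32> {
    let mut found = None;
    each(xs, |&x| {
        if x % 2 == 0 {
            found = Some(x);
            false // what `break` would desugar to
        } else {
            true // what `continue` (or falling off the end) would desugar to
        }
    });
    found
}
```

Under this reading, the for construct is sugar that rewrites break and continue in the block into the two return values, which is why only functions following the protocol make sense as a for head.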
Re: [rust-dev] RFC: Block lambda syntax tweak
On 6/6/12 10:23 AM, Graydon Hoare wrote: I'd also possibly prefer break rather than continue to get an early-exit from a 'do'. But then, we're still debating what to do for the word continue in the grammar anyway (#2229, I still prefer loop; there!) Sure, either one works for me. I don't actually mind whether it's break or continue, as long as there's some way to do it :) (Does slightly make me think again to the no-scope-end fat-arrow lambdas that were proposed last time we discussed this -- (x) => expr -- but I'm not sure if the symmetry with pattern fat-arrow syntax is worth the chattiness cost nor the need for the parser to suspend judgment on tuple-of-ident expressions while parsing. Thoughts?) Yeah, I think the parsing will become really hairy there, especially if the block lambda arguments grow into patterns. Not to shoot it down, of course, just mentioning that it's an (LA)LR hazard. Patrick
Re: [rust-dev] RFC: Block lambda syntax tweak
On 6/6/12 10:52 AM, Sebastian Sylvan wrote: Just a quick question: Can I pass in a multi-statement lambda to a function without using do or for, and if so what does it look like? I'm guessing something like this, but I didn't see it spelled out: foo( |x| { let y = x+1; y+1 }); Yep, that's exactly how it works. It falls out of blocks being expressions. Patrick ___ Rust-dev mailing list Rust-dev@mozilla.org https://mail.mozilla.org/listinfo/rust-dev
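The form confirmed above carries over essentially unchanged to present-day Rust, so it can be shown as a small sketch. The helper `foo` here is hypothetical (the thread does not define it); it simply applies its argument to 1.

```rust
/// Applies a caller-supplied function to 1 (hypothetical helper).
fn foo(f: impl Fn(i32) -> i32) -> i32 {
    f(1)
}

/// Passes a multi-statement lambda without any special call form.
fn call_with_block_lambda() -> i32 {
    // The lambda body is an ordinary block expression; the value of
    // its last expression is the lambda's result.
    foo(|x| {
        let y = x + 1;
        y + 1
    })
}
```

No dedicated multi-statement lambda syntax is needed because a block is itself an expression.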
Re: [rust-dev] RFC: Block lambda syntax tweak
On 6/7/12 2:20 PM, Gareth Smith wrote: I think that allowing an early exit with a break/continue from lambdas that use this special form is confusing, because breaking may or may not actually resume the code that follows the do-call. The lambda might be put into a data structure for later execution, or, like in your spawn example, executed in a new task. I would prefer that instead of having a special `do` form, all `|| expr` lambdas could have early returns using `ret`.

If you allow all block lambdas to have early returns with ret, then this:

    fn f() { for int::range(1, 10) |i| { ret i; } }

has a very different meaning from:

    fn f() { int::range(1, 10) |i| { ret i; } }

IMHO this is likely to be pretty confusing. Also, eliminating do makes this ambiguous with the binary || (logical-or) operator:

    spawn() || { ... }

This is also ambiguous:

    spawn || { ... }

In any case, I think this looks nicer:

    do spawn { ... }

Patrick
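The "very different meaning" of an early return inside a loop body versus inside a plain lambda can be seen in present-day Rust, where the same distinction exists. This is only an analogy to the 2012 `ret` discussion; the function names are made up for the example.

```rust
/// `return` inside a `for` loop body returns from the enclosing function,
/// so the loop stops at its first iteration.
fn return_in_loop() -> i32 {
    for i in 1..10 {
        return i;
    }
    0
}

/// `return` inside a closure only returns from the closure itself,
/// so the iteration carries on through every element.
fn return_in_closure() -> i32 {
    let mut last = 0;
    (1..10).for_each(|i| {
        last = i;
        return; // ends this one call of the closure, nothing more
    });
    last
}
```

The two functions look alike but the first yields the first element and the second the last one, which is the kind of surprise the message is warning about.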
Re: [rust-dev] I am confused regarding implicit copies
On 6/7/12 1:41 PM, Gareth Smith wrote: Hi Rust-Dev, I have recently (using the latest rust from github) encountered some new warnings about implicitly copying a non-implicitly-copyable value. I believe this is due to the fix for https://github.com/mozilla/rust/issues/2450. This warning seems to pop up all over the place because, as the bug points out, vecs/strs are in fact copied rather a lot (at least in the rust I have written). I think that in some places I can restructure the code to avoid copies, but that still leaves many places where I guess I need to either add a `copy` or just ignore the warning (or turn it off). I am worried that I will have to litter my code with `copy`.

Use @[int] instead of [int] or the new dvec to avoid copies. You can also use region pointers (&). In practice copies hurt Rust performance a *lot*. Copies of vectors and strings are extremely expensive, and often make our performance drop way below even dynamically typed, interpreted languages.

The bug above mentions that the warning has been disabled for [Mozilla's] existing projects. So my question is: what is the long-term plan here? Will Mozilla's projects be restructured to avoid copying strs/vecs, or to add `copy` where it is not possible? Will the new vecs/strs make this a non-issue somehow (are vecs/strs still unique in the new scheme?)?

The long term plan is:

* Introduce the new vectors and strings and make stack vectors the default, drastically reducing the number of unique vectors and strings.
* Only use unique vectors for vectors that are designed to cross from task to task.
* Clean up all copies from the existing Rust projects.

Patrick
Re: [rust-dev] Syntax tweak: Alt without pattern
On 6/7/12 10:27 PM, David Rajchenbach-Teller wrote: I believe that this snippet has a more immediately visible structure than the original and is easier to read, while the syntax tweak is trivial to compile. What do you think? How about a Scheme-like cond macro? Patrick
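As a rough idea of what a Scheme-like cond could look like, here is a minimal sketch using present-day macro_rules!, which did not exist in this form in 2012. The macro name cond, the `test => value` arm syntax and the `else` fallback are all assumptions made for illustration, not anything proposed in the thread.

```rust
// Hypothetical `cond!` macro: tries each `test => value` arm in order and
// falls back to the mandatory `else => value` arm.
macro_rules! cond {
    // Base case: only the fallback arm is left.
    (else => $default:expr $(,)?) => { $default };
    // Recursive case: test the first arm, otherwise expand the rest.
    ($test:expr => $body:expr, $($rest:tt)+) => {
        if $test { $body } else { cond!($($rest)+) }
    };
}

fn classify(n: i32) -> &'static str {
    cond!(
        n < 0 => "negative",
        n == 0 => "zero",
        else => "positive",
    )
}

fn main() {
    println!("{} {} {}", classify(-5), classify(0), classify(7));
}
```

The macro expands to an ordinary if/else chain, so it adds no new control-flow semantics to the language.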
Re: [rust-dev] RFC: unifying patterns in alt/destructuring assignment
On 06/10/2012 08:08 AM, Niko Matsakis wrote: This was also posted on my blog, but I wanted to make sure people saw it, as I'd like to discuss this on Tuesday, because one of the logical next steps for the regions work is to begin deciding precisely what to do about the types of identifiers in alts. I like this. The only concern, as a comment pointed out, is that * might be slightly confusing; maybe ref is better. Patrick
Re: [rust-dev] RFC: unifying patterns in alt/destructuring assignment
On 06/10/2012 12:14 PM, Graydon Hoare wrote: On 10/06/2012 11:30 AM, Patrick Walton wrote: I like this. The only concern, as a comment pointed out, is that * might be slightly confusing; maybe ref is better. I like it too. Though I wonder if the ambiguity between &-as-a-reference-taker and &-as-a-pattern is actually problematic. Consider two cases (assuming we use & here): #1: let foo = {1,2}; let {&a, &b} = foo; #2: let x = &1; let y = &2; let foo = {x, y}; let {&a, &b} = foo; It seems to me that in both cases you're introducing two variables, a and b, of type &int. In #1 they point into foo, using the & to take references to the record components; in #2 they point to x and y respectively, using the & to match against the existing &-types inside the record. But, as I understand it, in #2 they would actually copy out the values. So in #1, a : &int and b : &int, but in #2, a : int and b : int. cf: let x = 1; // x : int let y = 2; // y : int let foo = (@x, @y);// foo : (@int, @int) let (@a, @b) = foo;// a: int and b: int Patrick
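The two readings being contrasted here can be seen side by side in present-day Rust, which spells the reference-taking binding `ref` and uses `&` in a pattern to match through an existing reference. This is a sketch in today's syntax; the helper names borrow_fields and copy_out are hypothetical.

```rust
// `ref` in a pattern takes a reference to the matched component:
// a and b have type &i32 and point into the caller's pair.
fn borrow_fields(pair: &(i32, i32)) -> (&i32, &i32) {
    let (ref a, ref b) = *pair;
    (a, b)
}

// `&` in a pattern matches an existing reference and binds what it points
// to: a and b have type i32, so the values are copied out.
fn copy_out(pair: (&i32, &i32)) -> (i32, i32) {
    let (&a, &b) = pair;
    (a, b)
}

fn main() {
    let foo = (1, 2);
    let (a, b) = borrow_fields(&foo);
    println!("{} {}", a, b);

    let (x, y) = (3, 4);
    println!("{:?}", copy_out((&x, &y)));
}
```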
Re: [rust-dev] RFC: unifying patterns in alt/destructuring assignment
On 6/12/12 3:28 PM, Niko Matsakis wrote: It's a good point that *unsafe.T is rather noisy. I thought that was ok, but I wasn't thinking about the fact that it will appear often in FFIs. I am not sure how this would work out in practice. As you say, region ptrs in argument position are ok, but region ptrs in return position would yield inappropriate errors when calling the function. What if pointers in native function signatures were implicitly unsafe? C has no concept of regions anyhow... Patrick
Re: [rust-dev] RFC: unifying patterns in alt/destructuring assignment
On 6/13/12 6:37 PM, Niko Matsakis wrote: On 6/12/12 3:33 PM, Graydon Hoare wrote: Try sketching some code in a buffer, see how it looks. Might be possible to come up with an abbreviation (!T perhaps, or the old unused sigil ?T maybe?), might be possible for inference and a couple rules about the boundaries of extern functions to fill in the details. We could use ^T for unsafe ptrs, like Pascal. I've been thinking that the idea of an unsafe region is perhaps not the best thing ever. It's not clear where it fits into the region hierarchy. We'd need various weird special-case code to cope with it (it's kind of the longest lived region there is, but illegal to dereference, and illegal to use as the value of a function's region argument, unless you're in an unsafe block...). Probably cleaner to just keep unsafe pointers as their own thing. +1 for ^T. There's also precedent in Managed C++ (although ^ is a managed pointer there, while we'd be using it for the opposite). ^ is somewhat ugly, but unsafe pointers are by their very nature ugly and it's at least a lightweight-looking sigil. fn free(^void); vs. fn free(*unsafe.void); fn free(unsafe.void); fn free(void); Patrick
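For comparison, present-day Rust does keep unsafe pointers as their own thing: raw pointers (`*const T`, `*mut T`) carry no region, can be created in safe code, and can only be dereferenced inside an unsafe block. The following is a small sketch in today's syntax, not the ^T notation proposed above; the function name bump_through_raw is hypothetical.

```rust
// Raw pointers have no lifetime attached; creating one is safe,
// dereferencing or reclaiming it is not.
fn bump_through_raw(value: i32) -> i32 {
    // Safe: turn an owned box into a raw pointer.
    let raw: *mut i32 = Box::into_raw(Box::new(value));
    unsafe {
        // Unsafe: the compiler no longer tracks validity or aliasing.
        *raw += 1;
        // Take ownership back so the allocation is freed exactly once.
        let boxed = Box::from_raw(raw);
        *boxed
    }
}

fn main() {
    println!("{}", bump_through_raw(41));
}
```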
Re: [rust-dev] a vision for vectors
On 6/14/12 10:23 AM, Graydon Hoare wrote: What disappoints me about all this is that the language is now dependent on unsafe library code in order to do asymptotically fast aggregate types at all. The += optimization and vec doubling-allocation was the one primitive related to aggregate types that the compiler was filling in for us safely, before, for the sake of building more-efficient containers on top. Now pretty much every primitive container that has a doubling-store growth pattern is going to involve unsafe code. That's a bitter pill, but I guess it was true before too, it was just unsafe code with an interface curated by the compiler and runtime, not core library :( Actually, I think it'd be great if we moved that stuff out of the language itself. I think that, ceteris paribus, code in core is easier to debug and get right than code in trans. I think of it just like ports and channels; the implementation improved quite rapidly once we moved it to the library. But this general move feels like obscuring a pretty important bug, that I'm inclined to point out: we need to expand operator overloading to have op= forms, in general, such that a user could _define_ a version of += that does what the compiler currently does, when that's reasonable (overload arg-type symmetry aside). Operator overloading is simply incomplete here; likewise, fn [](...) needs to have a 3-arg lval form, not try to return a mutable ref or something. Yes, I've thought this for a while. D is good precedent here, with its opIndexAssign. Note that D got in trouble here, though; in a custom user-defined dictionary type, this can be made to work: foo[bar] = 3; But this doesn't: foo[bar]++; However, we nicely dodged that bullet by not supporting ++ and -- at all. Very prescient :) Patrick
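A user-defined += of the kind asked for here can be written in present-day Rust through the std::ops::AddAssign trait. The sketch below uses today's syntax rather than anything available in 2012, and the Buf type is hypothetical. Its += appends in place and leaves the growth strategy to Vec, so no unsafe code appears in user code.

```rust
use std::ops::AddAssign;

// Hypothetical growable buffer with a user-defined `+=`.
struct Buf {
    items: Vec<i32>,
}

impl AddAssign<Vec<i32>> for Buf {
    // `buf += other` appends in place instead of building a new buffer.
    fn add_assign(&mut self, mut rhs: Vec<i32>) {
        self.items.append(&mut rhs);
    }
}

fn main() {
    let mut buf = Buf { items: vec![1] };
    buf += vec![2, 3];
    println!("{:?}", buf.items);
}
```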
Re: [rust-dev] a vision for vectors
On 6/14/12 1:41 PM, Graydon Hoare wrote: However, we nicely dodged that bullet by not supporting ++ and -- at all. Very prescient :) Not really. Reoccurs with foo[bar] += 1; Well, there's a fairly straightforward desugaring: foo[bar] = foo[bar] + 1; For ++ it's less certain (you have to dummy up a 1 constant of... some type?) Patrick
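The desugaring above can be spelled out with explicit get and set operations on a dictionary type. This is a sketch in present-day Rust under stated assumptions: the Counts type and the method names index_get, index_set and index_add_assign are hypothetical, and treating a missing key as 0 is a choice made only for this example.

```rust
use std::collections::HashMap;

// Hypothetical dictionary with separate "index get" and "index set" hooks,
// in the spirit of a 3-argument lvalue form.
struct Counts {
    inner: HashMap<String, i32>,
}

impl Counts {
    fn new() -> Self {
        Counts { inner: HashMap::new() }
    }

    // foo[bar] as an rvalue (assumption: missing keys read as 0).
    fn index_get(&self, key: &str) -> i32 {
        self.inner.get(key).copied().unwrap_or(0)
    }

    // foo[bar] = value
    fn index_set(&mut self, key: &str, value: i32) {
        self.inner.insert(key.to_string(), value);
    }

    // foo[bar] += delta, desugared to foo[bar] = foo[bar] + delta.
    fn index_add_assign(&mut self, key: &str, delta: i32) {
        let old = self.index_get(key);
        self.index_set(key, old + delta);
    }
}

fn main() {
    let mut counts = Counts::new();
    counts.index_add_assign("a", 1);
    counts.index_add_assign("a", 1);
    println!("{}", counts.index_get("a"));
}
```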
Re: [rust-dev] RFC: unifying patterns in alt/destructuring assignment
On 6/14/12 4:01 PM, Graydon Hoare wrote: I actually think if you're going to go down that road you want * to be unsafe as it is now, and ^ to be your region pointer. That has both more precedent in other languages and, looking at the above examples, is a bit less visually noisy. Bonus intuitions: * evokes C, which is what it's used for, and region pointers always point up the stack to a root pinned in an earlier frame :) Well, I'm concerned that Rust code will be littered with ^ everywhere, which doesn't look like a pointer to most programmers. It'd be nice if, when a programmer glances at typical Rust code, the meaning of the pointer sigil were obvious. ^ will be significantly less common. Patrick
Re: [rust-dev] RFC: Block lambda syntax tweak
On 06/16/2012 07:30 AM, Niko Matsakis wrote: Jumping back to an old thread: Yes, except that continue/break/ret would be valid in the latter but not the former. It seems like we should allow continue or break but not both. And it seems mildly inconsistent to allow breaks in a `do` but not other blocks. Perhaps we should just allow `break` in any block? I assume you mean labeled break, right? Otherwise this won't do what you expect: for array.each |x| { if (x == 1) { break; } } Patrick
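Labeled break, as it exists in present-day Rust, names the loop to leave so there is no doubt about which construct a break exits. A minimal sketch in today's syntax follows; the function find_cell and the label 'rows are hypothetical names chosen for illustration.

```rust
// Search a grid and stop both loops as soon as the target is found.
fn find_cell(rows: &[Vec<i32>], target: i32) -> Option<(usize, usize)> {
    let mut hit = None;
    'rows: for (r, row) in rows.iter().enumerate() {
        for (c, &x) in row.iter().enumerate() {
            if x == target {
                hit = Some((r, c));
                break 'rows; // exits the outer loop, not just the inner one
            }
        }
    }
    hit
}

fn main() {
    let grid = vec![vec![0, 2], vec![1, 3]];
    println!("{:?}", find_cell(&grid, 1));
}
```

An unlabeled break in the inner loop would only end the scan of the current row and the outer loop would carry on.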