On Sun, 23 Dec 2012 12:20:07 -0500,
Patrick Walton <[email protected]> wrote:
> On 12/23/12 10:43 AM, Michael Neumann wrote:
> > Hi,
> >
> > I've spent the last days hacking in Rust and a few questions and
> > ideas have accumulated over that time.
> >
> > * If I use unique ~pointers, there is absolutely no runtime
> > overhead, so neither ref-counting nor GC is involved, right?
>
> Well, you have to malloc and free, but I assume you aren't counting
> that. There is no reference counting or GC, and the GC is totally
> unaware of such pointers.
>
> (There is one caveat: when we have tracing GC, it must scan ~
> pointers that contain @ pointers, just as it must scan the stack.
> Such pointers are generally uncommon though.)
What is the big advantage of having a tracing GC over ref counting?
With GC we'd get rid of the extra indirection and extra operations
during aliasing, so it's basically a performance issue, right?
> > * Heap-allocated pointers incur ref-counting. So when I pass a
> > @pointer, I will basically pass a
> >
> > struct heap_ptr {ptr: *byte, cnt: uint}
> >
> > around. Right?
>
> They currently do thread-unsafe reference counting, but we would like
> to eventually change that to tracing GC. However, the structure is
> different: we use intrusive reference counting, so it's actually a
> pointer to this structure:
>
> pointer --> [ ref count, type_info, next alloc, prev alloc, data... ]
Oh, I see, there is actually no double indirection, as [pointer+x]
always points to the data. Neat!
> You're only passing one word around, not two. The reference count is
> inside the object pointed to. This setup saves one allocation over
> C++ std::shared_ptr.
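As an illustration of that layout, here is a schematic in today's Rust. This is not rustc's actual box (the real header also carried type_info and next/prev allocation links, and `RcBox`/`new_rc_box` are invented names); it only shows the intrusive idea: the count lives inside the allocation, so a single word is passed around and `[pointer + offset]` reaches the data.

```rust
// Invented, simplified schematic of an intrusive refcounted box:
// the count is stored inline with the data, so one pointer suffices.
struct RcBox<T> {
    ref_count: usize,
    // (real 2012 boxes also kept type_info and allocation-list links here)
    data: T,
}

fn new_rc_box<T>(data: T) -> Box<RcBox<T>> {
    Box::new(RcBox { ref_count: 1, data })
}

fn main() {
    let b = new_rc_box(42);
    // Both the count and the payload are reachable through one pointer.
    assert_eq!((b.ref_count, b.data), (1, 42));
}
```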
>
> > * vec::build_sized() somehow seems to be pretty slow. When I use it,
> > instead of a for() loop, my rust-msgpack library slows down by
> > factor 2 for loading msgpack data.
> >
> > Also, I would have expected that vec::build_sized() will call my
> > supplied function "n" times. IMHO the name is a little bit
> > misleading here.
>
> You want vec::from_fn() instead. vec::build_sized() is not commonly
> used and could probably be renamed without too much trouble.
Actually I was thinking of something like this in Ruby:
Array.new(size=10) {|i| i % 2}
gives:
[0, 1, 0, 1, 0, 1, 0, 1...]
fn make_sized<T>(n: uint, f: fn(uint) -> T) -> ~[T] {
    let mut v: ~[T] = vec::with_capacity(n);
    let mut i: uint = 0;
    while (i < n) {
        v.push(f(i));
        i += 1;
    }
    v
}
do vec::make_sized(10) |i| {i % 2}
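(For comparison, today's Rust expresses the same constructor as an iterator chain; this is a hedged modern sketch of the 2012 code above, with `make_sized` kept as the hypothetical helper name.)

```rust
// Modern-Rust equivalent of the make_sized sketch above: build a
// Vec of length n by calling f on each index, like Ruby's Array.new.
fn make_sized<T>(n: usize, f: impl Fn(usize) -> T) -> Vec<T> {
    (0..n).map(f).collect()
}

fn main() {
    let v = make_sized(10, |i| i % 2);
    assert_eq!(v, vec![0, 1, 0, 1, 0, 1, 0, 1, 0, 1]);
}
```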
> I suspect the performance problem you're seeing with it is due to not
> supplying enough LLVM inline hints. LLVM's inline heuristics are not
> well tuned to Rust at the moment; we work around it by writing
> #[inline(always)] in a lot of places, but we should probably have the
> compiler insert those automatically for certain uses of higher-order
> functions. When LLVM inlines properly, the higher-order functions
> generally compile down into for loops.
Is this an issue the LLVM developers are working on?
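To illustrate the workaround Patrick describes, here is a small invented example (not rustc code) of the `#[inline(always)]` hint on a higher-order helper: with the closure call inlined, the body flattens into a plain counted loop.

```rust
// #[inline(always)] nudges LLVM to inline the closure call, so the
// higher-order helper compiles down to an ordinary loop.
#[inline(always)]
fn apply_n(n: u32, mut x: i32, f: impl Fn(i32) -> i32) -> i32 {
    for _ in 0..n {
        x = f(x);
    }
    x
}

fn main() {
    assert_eq!(apply_n(5, 0, |x| x + 2), 10);
}
```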
> > * I do not fully understand the warning of the following script:
> >
> > fn main() {
> > let bytes =
> > io::read_whole_file(&path::Path("/tmp/matching.msgpack")).get();
> > }
> >
> > t2.rs:2:14: 2:78 warning: instantiating copy type parameter with a not implicitly copyable type
> > t2.rs:2     let bytes = io::read_whole_file(&path::Path("/tmp/matching.msgpack")).get();
> >                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Does it mean that it will copy the ~str again? When I use pattern
> > matching instead of get(), I don't get this warning, but it
> > seems to be slower. Will it just silence the warning???
>
> Yes, it means it will copy the string again. To avoid this, you want
> result::unwrap() or option::unwrap() instead. I've been thinking for
> some time that .unwrap() should change to .get() and .get() should
> change to .copy_value() or something.
Yes, I think something with "copy" in the name would be less
surprising. Ok, unwrap makes sense. Or maybe get() and get_copy()?
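The distinction survived into today's Rust roughly as Patrick suggests: `unwrap()` moves the value out, while a copy has to be spelled `clone()`. A small sketch (the function names here are invented for illustration):

```rust
// unwrap() consumes the Result and moves the String out: no extra copy.
fn take_ownership(r: Result<String, String>) -> String {
    r.unwrap()
}

// The copying variant must say clone() out loud -- this is the hidden
// copy that the warning above points at.
fn copy_out(r: &Result<String, String>) -> String {
    r.as_ref().unwrap().clone()
}

fn main() {
    let r: Result<String, String> = Ok(String::from("abc"));
    let copied = copy_out(&r);     // r is still usable afterwards
    let moved = take_ownership(r); // r is consumed here
    assert_eq!(copied, moved);
}
```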
> >
> > * This is also strange to me:
> >
> > fn nowarn(bytes: &[u8]) {}
> >
> > fn main() {
> > let bytes = ~[1,2,3];
> > nowarn(bytes);
> > let br = io::BytesReader { bytes: bytes, pos: 0 }; // FAILS
> > }
> >
> > t.rs:6:36: 6:41 error: mismatched types: expected `&/[u8]` but found `~[u8]` ([] storage differs: expected & but found ~)
> > t.rs:6     let br = io::BytesReader { bytes: bytes, pos: 0 };
> >                                               ^~~~~
> >
> > It implicitly converts the ~pointer into a borrowed pointer when
> > calling the function, but the same does not work when using the
> > BytesReader struct. I think I should use a make_bytes_reader
> > function, but I didn't find one.
>
> This is a missing feature that should be in the language. Struct
> literals are basically just like functions; their fields should cause
> coercions as well.
Ok.
> > * String literals don't seem to be immutable. Is that right?
> > That means they are always "heap" allocated. I wish they were
> > immutable, so that ~"my string" could be stored in read-only
> > memory.
>
> ~"my string" isn't designed to be stored in read-only memory. You
> want `&static/str` instead; since it's a borrowed pointer, it cannot
> be mutated. `static` is the read-only memory region.
I understand. Makes sense.
> > Is there a way how a function which takes a ~str can state that
> > it will not modify the content?
>
> Take an `&str` (or an `&~str`) instead. Functions that take `~str`
> require that the caller give up its ownership of the string. If you,
> the caller, give up a string, then you give up your say in how it is
> used, including mutability. However, if you as the callee *borrow*
> the string via `&str` or `&~str`, then you are not allowed to change
> its mutability, since you are not the owner.
So a "const" function (in terms of C++ ;-) would always take a &str
pointer? Makes absolute sense to me.
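In today's terms that is exactly how it reads: a C++-const-style function borrows via `&str` and simply cannot mutate. A minimal sketch (`count_spaces` is an invented example):

```rust
// A "read-only" function in the C++ const sense: it borrows the
// string immutably, so mutation would not compile.
fn count_spaces(s: &str) -> usize {
    s.chars().filter(|&c| c == ' ').count()
}

fn main() {
    let owned = String::from("a b c");
    // &String coerces to &str at the call site; the caller keeps ownership.
    assert_eq!(count_spaces(&owned), 2);
}
```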
> > In this regard I very much like the way the D language handles
> > this. It uses "const" to state that it won't modify the value,
> > while the value itself may be mutable. Then there is "immutable",
> > and a value declared as such will not change during the whole
> > lifetime.
>
> We have a "const" qualifier as well, which means what "const" does in
> D. It has a good chance of becoming redundant and going away,
> however, with the changes suggested in the blog post "Imagine Never
> Hearing the Words 'Aliasable, Mutable' Again".
>
> Having the ability to declare that a data type is forever immutable
> is something we've talked about a lot, but we'd have to add a lot of
> type system machinery for it to be as flexible as we'd like. Being
> able to arbitrarily freeze and thaw deep data structures is a very
> powerful feature, and having data types specify that they must be
> immutable forever is at odds with that. (For that matter, the `mut`
> keyword on struct fields is at odds with that in the other direction,
> which is why I'd like to get rid of that too.)
>
> > Of course in Rust, thanks to unique pointers, there is less need
> > for immutability, as you cannot share a unique pointer between
> > threads.
>
> You can share unique pointers between threads with an ARC data type,
> actually (in `std::arc`). The ARC demands that the pointer be
> immutable and will not allow it to be mutated.
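The ARC described here became today's `std::sync::Arc`, and the rule survives: the shared data is immutable from every thread's point of view unless you add explicit synchronization. A hedged sketch (`sum_in_thread` is an invented helper):

```rust
use std::sync::Arc;
use std::thread;

// The spawned thread gets its own Arc handle to the same Vec; without
// a Mutex, mutating the shared Vec would not compile.
fn sum_in_thread(data: Arc<Vec<i32>>) -> i32 {
    thread::spawn(move || data.iter().sum::<i32>())
        .join()
        .unwrap()
}

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    let total = sum_in_thread(Arc::clone(&data));
    assert_eq!(total, 6);
    assert_eq!(data.len(), 3); // the original handle is still usable here
}
```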
>
> > * Appending to strings. It's easy to push an element to an array by
> > doing:
> >
> > let mut v: ~[int] = ~[1,2];
> > v.push(3);
> > v.push(4);
> >
> > But when I want to append to a string, I have to write:
> >
> > let mut s: ~str = ~"";
> > let mut s = str::append(s, "abc");
> > let mut s = str::append(s, "def");
> >
> > I found this a bit counter-intuitive. I know there exists "+=",
> > but this will always create a new string. A "<<" operator would be
> > really nice to append to strings (or to arrays).
>
> There should probably be a ".append()" method on strings with a "&mut
> self" argument. Then you could write:
>
> let mut s = ~"";
> s.append("abc");
> s.append("def");
>
> There are also plans to make "+=" separately overloadable. This would
> allow += to work in this case, I believe.
Ideally there would be an operator, as writing .append() all the time is
quite tedious.
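For what it's worth, later Rust grew exactly the method Patrick sketches: `String::push_str` takes `&mut self` and appends in place, and `+=` did become separately overloadable (`String` implements `AddAssign<&str>`). A small sketch:

```rust
// push_str appends in place; += works via the AddAssign<&str> impl.
fn build() -> String {
    let mut s = String::new();
    s.push_str("abc");
    s.push_str("def");
    s += "!";
    s
}

fn main() {
    assert_eq!(build(), "abcdef!");
}
```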
> > * Default initializers for structs. Would be nice to specify them
> > like:
> >
> > struct S {a: int = 4, b: int = 3};
> >
> > I know I can use the ".." notation, and this is very cool and
> > more flexible, but I will have to type in a lot of code if the
> > struct get pretty large.
> >
> > const DefaultS = S{a: 4, b: 3}; // imagine this has 100 fields :)
> > let s = S{a: 4, ..DefaultS};
>
> Perhaps. This might be a good job for a macro at first, then we can
> see about folding it into the language if it's widely used.
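As it happens, this is roughly how later Rust resolved it: the `Default` trait supplies the "100-field constant" and the struct-update (`..`) syntax overrides individual fields. A sketch reusing the `S {a: 4, b: 3}` values from the example above:

```rust
// Default plays the role of the DefaultS constant in the example.
#[derive(Debug, PartialEq)]
struct S {
    a: i32,
    b: i32,
}

impl Default for S {
    fn default() -> S {
        S { a: 4, b: 3 }
    }
}

fn main() {
    // Override one field, take the rest from the defaults.
    let s = S { a: 7, ..S::default() };
    assert_eq!(s, S { a: 7, b: 3 });
}
```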
>
> > * Metaprogramming
> >
> > Given an arbitrary struct S {...} with some fields, it would be
> > nice to somehow derive S.serialize and S.deserialize functions
> > automatically. Are there any ideas how to do that? In C++ I use
> > the preprocessor and templates for that. In D, thanks to
> > compile-time-code-evaluation, I can write code that will
> > introspect the struct during compile-time and then generate code.
>
> There are #[auto_encode] and #[auto_decode] syntax extensions that
> exist already, actually (although the documentation is almost
> nonexistent). These are polymorphic over the actual serialization
> method, so you can choose the actual serialization format. There is
> also a visitor you can use for reflection, although it will be slower
> than generating the code at compile time.
Hm, this is interesting. Is there a simple example somewhere of how to
use #[auto_encode], and what does my msgpack library need to implement
to work with it?
> We currently have syntax extensions written as compiler plugins.
> These allow you to write any code you want and have it executed at
> compile time. There are two main issues with them at the moment: (1)
> they have to be compiled as part of the compiler itself; (2) they
> expose too many internals of the `rustc` compiler, making your code
> likely to break when we change the compiler (or on alternative
> compilers implementing the Rust language, if they existed). The plan
> to fix (1) is to allow plugins to be written as separate crates and
> dynamically loaded; we've also talked about, longer-term, allowing
> them to be JIT'd, allowing you to execute any code you wish at
> compile time. The plan to fix (2) is to make the syntax extensions
> operate on token trees, not AST nodes, basically along the lines of
> Scheme syntax objects.
I see. So it would be possible to write a syntax extension called, for
example, iter_fields!(struct_Type), which could be used to generate
e.g. custom serializers. But that would probably be similar to
#[auto_encode], just more user-defined.
>
> > I guess I could write a macro like:
> >
> > define_ser_struct!(S, field1, int, field2, uint, ...)
> >
> > which would generate the struct S and two functions for
> > serialization. Would that be possible with macros?
>
> Yes, you should be able to do this with macros today, now that
> macros can expand to items.
Great. I will try that as an example to learn more about macros.
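As a starting point, such a macro can be sketched in today's macro_rules! form. The name `define_ser_struct!` comes from the example above, but the toy "field=value" serialization format and the `Point` usage are invented purely for illustration; a real msgpack serializer would emit bytes instead.

```rust
// Expands to a struct item plus a serialize method -- i.e. a macro
// that generates items, as discussed above.
macro_rules! define_ser_struct {
    ($name:ident, $($field:ident: $ty:ty),*) => {
        struct $name {
            $($field: $ty),*
        }
        impl $name {
            // Toy format: one "name=value" string per field.
            fn serialize(&self) -> Vec<String> {
                vec![$(format!("{}={}", stringify!($field), self.$field)),*]
            }
        }
    };
}

define_ser_struct!(Point, x: i32, y: i32);

fn main() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(p.serialize(), vec!["x=1", "y=2"]);
}
```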
Thanks!
Best,
Michael
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev