On 12/23/12 10:43 AM, Michael Neumann wrote:
> Hi,
>
> I've spent the last few days hacking in Rust, and a few questions and
> ideas have accumulated over that time.
>
> * If I use unique ~pointers, there is absolutely no runtime overhead,
>   so neither ref-counting nor GC is involved, right?
Well, you have to malloc and free, but I assume you aren't counting
that. There is no reference counting or GC, and the GC is totally
unaware of such pointers.
(There is one caveat: when we have tracing GC, it must scan ~ pointers
that contain @ pointers, just as it must scan the stack. Such pointers
are generally uncommon though.)
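A minimal sketch of that ownership model, in the `Box<T>` notation that `~T` later became (the helper name `consume` is mine, for illustration):

```rust
// Box<T> is the descendant of ~T. Moving a Box transfers ownership:
// no reference count is touched, and the allocation is freed exactly
// once, when its final owner goes out of scope.
fn consume(b: Box<i32>) -> i32 {
    *b // ownership moved in; the heap allocation is freed here
}

fn main() {
    let b = Box::new(41); // a single malloc, no count alongside it
    let n = consume(b);   // a move, not a copy; `b` is unusable now
    assert_eq!(n + 1, 42);
}
```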
> * Heap-allocated pointers incur ref-counting. So when I pass a
>   @pointer, I will basically pass a
>
>       struct heap_ptr { ptr: *byte, cnt: uint }
>
>   around. Right?
They currently do thread-unsafe reference counting, but we would like to
eventually change that to tracing GC. However, the structure is
different: we use intrusive reference counting, so it's actually a
pointer to this structure:
    pointer --> [ ref count, type_info, next alloc, prev alloc, data... ]
You're only passing one word around, not two. The reference count is
inside the object pointed to. This setup saves one allocation over C++
std::shared_ptr.
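A rough sketch of that intrusive layout as a plain struct (illustrative only; the real box header's field types, and the tydesc it points to, differ):

```rust
// Hand-rolled picture of an intrusive header: count and metadata live
// in the SAME allocation as the data, so the handle is one word.
struct BoxHeader {
    ref_count: usize,
    type_info: usize, // stand-in for the tydesc pointer
    // next alloc / prev alloc links omitted for brevity
}

struct GcBox<T> {
    header: BoxHeader,
    data: T, // payload shares the allocation with the header
}

fn main() {
    // The handle you pass around is a single machine word, unlike a
    // two-word { ptr, count-ptr } shared_ptr-style fat handle.
    assert_eq!(std::mem::size_of::<&GcBox<i32>>(),
               std::mem::size_of::<usize>());
}
```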
> * vec::build_sized() somehow seems to be pretty slow. When I use it
>   instead of a for() loop, my rust-msgpack library slows down by a
>   factor of 2 for loading msgpack data.
>
>   Also, I would have expected that vec::build_sized() would call my
>   supplied function "n" times. IMHO the name is a little bit
>   misleading here.
You want vec::from_fn() instead. vec::build_sized() is not commonly used
and could probably be renamed without too much trouble.
I suspect the performance problem you're seeing with it is due to not
supplying enough LLVM inline hints. LLVM's inline heuristics are not
well tuned to Rust at the moment; we work around it by writing
#[inline(always)] in a lot of places, but we should probably have the
compiler insert those automatically for certain uses of higher-order
functions. When LLVM inlines properly, the higher-order functions
generally compile down into for loops.
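For reference, the shape vec::from_fn() has — build a vector of length n by calling f(i) for each index — sketched here in later iterator notation (`from_fn_equiv` is a made-up name for illustration):

```rust
// Equivalent of vec::from_fn(n, f): call f(0), f(1), ..., f(n-1)
// and collect the results into a vector.
fn from_fn_equiv<T>(n: usize, f: impl Fn(usize) -> T) -> Vec<T> {
    (0..n).map(f).collect()
}

fn main() {
    let squares = from_fn_equiv(5, |i| i * i);
    assert_eq!(squares, vec![0, 1, 4, 9, 16]);
}
```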
> * I do not fully understand the warning of the following script:
>
>       fn main() {
>           let bytes =
>               io::read_whole_file(&path::Path("/tmp/matching.msgpack")).get();
>       }
>       t2.rs:2:14: 2:78 warning: instantiating copy type parameter with a not implicitly copyable type
>       t2.rs:2     let bytes = io::read_whole_file(&path::Path("/tmp/matching.msgpack")).get();
>                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   Does it mean that it will copy the ~str again? When I use pattern
>   matching instead of get(), I don't get this warning, but it seems to
>   be slower. Will it just silence the warning?
Yes, it means it will copy the string again. To avoid this, you want
result::unwrap() or option::unwrap() instead. I've been thinking for
some time that .unwrap() should change to .get() and .get() should
change to .copy_value() or something.
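The distinction, sketched in later Result notation with String standing in for ~str (an illustrative analogue, not the 2012 API):

```rust
fn main() {
    let r: Result<String, ()> = Ok(String::from("abc"));

    // Clone-out: what the warned-about .get() did for a type that is
    // not implicitly copyable -- a second heap allocation.
    let copied = r.as_ref().unwrap().clone();

    // Move-out: unwrap() consumes the Result and hands over the
    // original string, with no copy made.
    let moved = r.unwrap();

    assert_eq!(copied, moved);
}
```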
> * This is also strange to me:
>
>       fn nowarn(bytes: &[u8]) {}
>
>       fn main() {
>           let bytes = ~[1,2,3];
>           nowarn(bytes);
>           let br = io::BytesReader { bytes: bytes, pos: 0 }; // FAILS
>       }
>       t.rs:6:36: 6:41 error: mismatched types: expected `&/[u8]` but found `~[u8]` ([] storage differs: expected & but found ~)
>       t.rs:6     let br = io::BytesReader { bytes: bytes, pos: 0 };
>                                                    ^~~~~
>   It implicitly converts the ~pointer into a borrowed pointer when
>   calling the function, but the same does not work when using the
>   BytesReader struct. I think I should use a make_bytes_reader
>   function, but I didn't find one.
This is a missing feature that should be in the language. Struct
literals are basically just like functions; their fields should cause
coercions as well.
> * String literals seem to be not immutable. Is that right? That means
>   they are always "heap" allocated. I wish they were immutable, so
>   that ~"my string" could be stored in read-only memory.
~"my string" isn't designed to be stored in read-only memory. You want
`&static/str` instead; since it's a borrowed pointer, it cannot be
mutated. `static` is the read-only memory region.
>   Is there a way for a function which takes a ~str to state that it
>   will not modify the content?
Take an `&str` (or an `&~str`) instead. Functions that take `~str`
require that the caller give up its ownership of the string. If you, the
caller, give up a string, then you give up your say in how it is used,
including mutability. However, if you as the callee *borrow* the string
via `&str` or `&~str`, then you are not allowed to change its
mutability, since you are not the owner.
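A small sketch of the borrow-instead-of-own rule, in later notation where String plays the role of ~str (the function name is illustrative):

```rust
// Taking &str instead of an owned string: the callee gets read-only
// access and the caller keeps ownership.
fn len_no_mutation(s: &str) -> usize {
    s.len() // read access only; &str gives no way to mutate
}

fn main() {
    let owned = String::from("hello"); // the owned, heap-allocated string
    let n = len_no_mutation(&owned);   // a borrow, not a giveaway
    assert_eq!(n, 5);
    assert_eq!(owned, "hello");        // caller still owns it afterwards
}
```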
>   In this regard, I very much like the way the D language handles
>   this. It uses "const" to state that it won't modify the value,
>   while the value itself may be mutable. Then there is "immutable":
>   a value declared as such will not change during its whole lifetime.
We have a "const" qualifier as well, which means what "const" does in D.
It has a good chance of becoming redundant and going away, however, with
the changes suggested in the blog post "Imagine Never Hearing the Words
'Aliasable, Mutable' Again".
Having the ability to declare that a data type is forever immutable is
something we've talked about a lot, but we'd have to add a lot of type
system machinery for it to be as flexible as we'd like. Being able to
arbitrarily freeze and thaw deep data structures is a very powerful
feature, and having data types specify that they must be immutable
forever is at odds with that. (For that matter, the `mut` keyword on
struct fields is at odds with that in the other direction, which is why
I'd like to get rid of that too.)
>   Of course, in Rust, thanks to unique pointers, there is less need
>   for immutability, as you cannot share a unique pointer between
>   threads.
You can share unique pointers between threads with an ARC data type,
actually (in `std::arc`). The ARC demands that the pointer be immutable
and will not allow it to be mutated.
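A sketch of that sharing pattern, using the `std::sync::Arc` that `std::arc` evolved into:

```rust
use std::sync::Arc;
use std::thread;

// Arc shares one allocation across threads; the data behind it stays
// immutable unless extra synchronization (e.g. a Mutex) is layered on.
fn main() {
    let shared = Arc::new(vec![1, 2, 3]);
    let handle = {
        let shared = Arc::clone(&shared); // bump the count, same data
        thread::spawn(move || shared.iter().sum::<i32>())
    };
    assert_eq!(handle.join().unwrap(), 6);
    assert_eq!(shared.len(), 3); // the original handle is still valid
}
```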
> * Appending to strings. It's easy to push an element to an array:
>
>       let mut v: ~[int] = ~[1,2];
>       v.push(3);
>       v.push(4);
>
>   But when I want to append to a string, I have to write:
>
>       let mut s: ~str = ~"";
>       let mut s = str::append(s, "abc");
>       let mut s = str::append(s, "def");
>
>   I find this a bit counter-intuitive. I know "+=" exists, but it
>   will always create a new string. A "<<" operator would be really
>   nice for appending to strings (or to arrays).
There should probably be a ".append()" method on strings with a
"&mut self" argument. Then you could write:

    let mut s = ~"";
    s.append("abc");
    s.append("def");
There are also plans to make "+=" separately overloadable. This would
allow += to work in this case, I believe.
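String did in fact grow exactly such a method in later Rust, spelled push_str; a sketch:

```rust
// push_str takes &mut self and appends in place, reusing the buffer
// instead of allocating a fresh string for every concatenation.
fn main() {
    let mut s = String::new();
    s.push_str("abc");
    s.push_str("def");
    assert_eq!(s, "abcdef");
}
```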
> * Default initializers for structs. It would be nice to specify them
>   like:
>
>       struct S {a: int = 4, b: int = 3};
>
>   I know I can use the ".." notation, and this is very cool and more
>   flexible, but I will have to type a lot of code if the struct gets
>   pretty large:
>
>       const DefaultS = S{a: 4, b: 3}; // imagine this has 100 fields :)
>       let s = S{a: 4, ..DefaultS};
Perhaps. This might be a good job for a macro at first, then we can see
about folding it into the language if it's widely used.
> * Metaprogramming
>
>   Given an arbitrary struct S {...} with some fields, it would be
>   nice to somehow derive S.serialize and S.deserialize functions
>   automatically. Are there any ideas on how to do that? In C++ I use
>   the preprocessor and templates for that. In D, thanks to
>   compile-time code evaluation, I can write code that will introspect
>   the struct at compile time and then generate code.
There are #[auto_encode] and #[auto_decode] syntax extensions that exist
already, actually (although the documentation is almost nonexistent).
These are polymorphic over the actual serialization method, so you can
choose the actual serialization format. There is also a visitor you can
use for reflection, although it will be slower than generating the code
at compile time.
We currently have syntax extensions written as compiler plugins. These
allow you to write any code you want and have it executed at compile
time. There are two main issues with them at the moment: (1) they have
to be compiled as part of the compiler itself; (2) they expose too many
internals of the `rustc` compiler, making your code likely to break when
we change the compiler (or on alternative compilers implementing the
Rust language, if they existed). The plan to fix (1) is to allow plugins
to be written as separate crates and dynamically loaded; we've also
talked about, longer-term, allowing them to be JIT'd, allowing you to
execute any code you wish at compile time. The plan to fix (2) is to
make the syntax extensions operate on token trees, not AST nodes,
basically along the lines of Scheme syntax objects.
>   I guess I could write a macro like:
>
>       define_ser_struct!(S, field1, int, field2, uint, ...)
>
>   which would generate the struct S and two functions for
>   serialization. Would that be possible with macros?
Yes, you should be able to do this with macros today, now that macros
can expand to items.
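A sketch of such a macro in later macro_rules! notation. The name define_ser_struct comes from the question; the field handling is deliberately simplistic (everything rendered via to_string) so the example stays self-contained, and a real version would also emit a deserialize side:

```rust
// Expands to the struct definition plus a serialize method -- i.e. a
// macro that expands to items, as mentioned above.
macro_rules! define_ser_struct {
    ($name:ident, $($field:ident: $ty:ty),* $(,)?) => {
        struct $name { $($field: $ty),* }

        impl $name {
            fn serialize(&self) -> String {
                let mut out = String::new();
                $(
                    out.push_str(stringify!($field));
                    out.push('=');
                    out.push_str(&self.$field.to_string());
                    out.push(';');
                )*
                out
            }
        }
    };
}

define_ser_struct!(S, field1: i64, field2: u64);

fn main() {
    let s = S { field1: -1, field2: 2 };
    assert_eq!(s.serialize(), "field1=-1;field2=2;");
}
```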
Patrick
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev