Re: [rust-dev] UTF-8 strings versus "encoded ropes"
On 5/2/14, Malthe Borch wrote:
> On 2 May 2014 00:06, Tony Arcieri wrote:
>> This sounds like the exact same painful failure mode as Ruby (transcoding
>> blowing up at completely unexpected times) with even more complexity,
>> making it even harder to debug.
>
> Here is a concrete example of when this would blow up:
>
> 1. You have a rope with non-ASCII characters.
> 2. You attempt to "encode" into "ASCII".

I don't think this is the structure he is talking about. I think it is something heavier that contains the encoding type and some sort of reference to the next set of chars, which may have a different encoding. That works well with dependent types.

Ben

___
Rust-dev mailing list
Rust-dev@mozilla.org
https://mail.mozilla.org/listinfo/rust-dev
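[Editor's note: a minimal sketch, in modern Rust with hypothetical names, of the "heavier" structure Ben describes — each segment carries its own encoding tag plus a link to the next segment — and of why transcoding such a rope to ASCII can blow up on any segment, which is the failure mode Tony warns about.]

```rust
// Hypothetical per-segment-encoding rope. Names (Segment, Encoding,
// to_ascii) are illustrative, not any real library's API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Encoding { Utf8, Latin1, Ascii }

struct Segment {
    encoding: Encoding,
    bytes: Vec<u8>,
    next: Option<Box<Segment>>, // next run of chars, possibly differently encoded
}

impl Segment {
    // Transcoding to ASCII can fail on *any* segment in the chain, so the
    // error surfaces far from where the non-ASCII data was inserted.
    fn to_ascii(&self) -> Result<Vec<u8>, &'static str> {
        let mut out = Vec::new();
        let mut seg = Some(self);
        while let Some(s) = seg {
            if s.bytes.iter().any(|&b| b > 0x7f) {
                return Err("non-ASCII byte in segment");
            }
            out.extend_from_slice(&s.bytes);
            seg = s.next.as_deref();
        }
        Ok(out)
    }
}

fn main() {
    let rope = Segment {
        encoding: Encoding::Ascii,
        bytes: b"hi ".to_vec(),
        next: Some(Box::new(Segment {
            encoding: Encoding::Latin1,
            bytes: vec![0xe9], // 'e-acute' in Latin-1: not representable in ASCII
            next: None,
        })),
    };
    println!("head segment encoding: {:?}", rope.encoding);
    assert!(rope.to_ascii().is_err()); // blows up only when encode is attempted
}
```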
Re: [rust-dev] UTF-8 strings versus "encoded ropes"
The encoding / globalization code in .NET works well, but the string type's use of code points is a poor choice, and both C# and Java suffer heavily for it when doing IO. Ropes / chords / chains etc. belong at a higher level, not in the lowest-level type.

Ben

On Fri, May 2, 2014 at 8:03 AM, John Downey wrote:
> I have actually always been a fan of how .NET did this. The System.String
> type is opinionated in how it is stored internally and does not allow
> anyone to change that (unlike Ruby). The conversion from String to byte[]
> is done using explicit conversion methods like:
>
> - Encoding.UTF8.GetBytes(String)
> - Encoding.UTF8.GetString(byte[])
> - Encoding.UTF32.GetBytes(String)
> - Encoding.UTF32.GetString(byte[])
> - and so on
>
> That way if you end up with a bunch of bytes, you know exactly what those
> bytes represent.
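[Editor's note: modern Rust ended up taking the same explicit stance as the .NET API John describes — string-to-bytes conversions are named operations, and bytes-to-string is fallible, so invalid data fails at the conversion site rather than at some later use. A small sketch:]

```rust
fn main() {
    // String -> bytes is explicit (Rust's String is defined to be UTF-8),
    // analogous to Encoding.UTF8.GetBytes(String).
    let s = String::from("h\u{e9}llo");
    let bytes: Vec<u8> = s.clone().into_bytes();

    // bytes -> String is fallible, analogous to Encoding.UTF8.GetString(byte[]):
    // invalid UTF-8 is an error now, not a transcoding surprise later.
    let back = String::from_utf8(bytes).expect("valid UTF-8");
    assert_eq!(back, s);

    // Invalid bytes fail up front, at the conversion site.
    assert!(String::from_utf8(vec![0xff, 0xfe]).is_err());
}
```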
Re: [rust-dev] Removing some autoref magic
While I like auto-magic (e.g. you could let region analysis work out the pointer type, borrowed pointers, etc.), you need to be very careful. C# uses ref, but autoboxing is a major issue: you don't know when a struct is boxed (there are some places besides the obvious virtual call), and then you incur boxing costs, defeating the whole point of value types. The vast majority of C# developers only use structs for a few corner cases, e.g. a type holding a few values (many don't use value types at all). They could go further, but often it will get boxed somewhere and kill the benefits, so why go through the hassle.

Ben

On Wed, Nov 20, 2013 at 9:35 AM, Vadim wrote:
> If the function is not going to mutate it, I don't really care if it's
> by-ref or by-value. That's why I said "const T&", not "T&". Well,
> except for the case when a function caches a reference to something which I
> later mutate. But borrow checking will prevent that.
>
> I would also be totally fine with letting the compiler figure out how
> immutable arguments are passed (based on data type size, existence of
> destructors, and whatnot). This might not be the best idea for binary
> compatibility of public APIs, but I don't see why it can't be done with
> crate-private functions.
>
> On Tue, Nov 19, 2013 at 5:07 PM, Ziad Hatahet wrote:
>> > Personally I would prefer if & in Rust worked similar to const T& in c++
>>
>> In that case, you would not be able to tell whether a function argument
>> was passed by value or by reference. I actually like this feature about
>> Rust (C# has it too with the `ref` keyword).
>>
>> --
>> Ziad
>>
>> On Tue, Nov 19, 2013 at 4:54 PM, Vadim wrote:
>>> So why did Rust adopt auto-moving instead of explicit moving? If the
>>> second example had to be written as foo(move a) there would be no
>>> confusion. And the third example should arguably be sort(mut
>>> a.as_mut_slice()).
>>>
>>> Personally I would prefer if & in Rust worked similar to const T& in c++
>>> (i.e. for most intents and purposes you can treat a reference as a value),
>>> otherwise half of the arguments on each function call will have to be
>>> adorned with ampersands.
>>>
>>> Can we have it such that foo(a) would be guaranteed to not mutate or
>>> move a and require "mut" or "move" prefix otherwise?
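[Editor's note: for reference, the behaviour debated above settled out as follows in modern Rust — moves of non-Copy types are implicit at the call site, while borrows and mutation are explicit. A minimal sketch:]

```rust
fn consume(v: Vec<i32>) -> usize { v.len() }  // takes ownership (moves)
fn inspect(v: &Vec<i32>) -> usize { v.len() } // borrows immutably
fn grow(v: &mut Vec<i32>) { v.push(0); }      // mutation is visible at the call site

fn main() {
    let mut a = vec![1, 2, 3];
    inspect(&a);        // explicit & — the "adorned with ampersands" cost Vadim mentions
    grow(&mut a);       // explicit &mut — callers can see the mutation
    let n = consume(a); // implicit move — no `move` keyword, as discussed above
    // println!("{:?}", a); // would not compile: `a` was moved into consume()
    println!("{}", n);
}
```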
Re: [rust-dev] Please simplify the syntax for Great Justice
He said IMHO. An explicit "new" is a huge indicator of what is going on, and most people are familiar enough with stack allocation; early adopters are not 9-to-5 developers.

Ben

On Tue, Nov 19, 2013 at 11:41 AM, Kevin Ballard wrote:
> Is that really why, or are you just guessing? I'm assuming the real
> reason is that people are used to languages where heap allocation is common
> and stack allocation rare or nonexistent, and don't understand why boxing
> everything is a bad idea. In other words, it's a problem that a proper
> tutorial should be able to help with. I don't think changing syntax is
> going to make much of a difference.
>
> -Kevin
Re: [rust-dev] More on stack safety
>> In terms of the stack size check, as has been mentioned, you can do
>> more with some analysis: each method gives an indication of worst-case
>> stack growth, and these can then be added together to reduce checks
>> by 80% and hence significantly reduce the performance impact.
>
> Citation for the 80% number?

Actually it was a guess based on the algorithm in my head, but I did just find a paper which showed up to 89% of checks can be removed:

http://courses.cs.vt.edu/cs5204/fall05-gback/papers/capriccio-sosp-2003.pdf

Segmented stacks for Apache with a highly concurrent workload:

"At 0.1% of call sites, checkpoints caused a new stack chunk to be linked, at a cost of 27 instructions. At 0.4-0.5% of call sites, a large stack chunk was linked unconditionally in order to handle an external function, costing 20 instructions. At 10% of call sites, a checkpoint determined that a new chunk was not required, which cost 6 instructions. The remaining 89% of call sites were unaffected. Assuming all instructions are roughly equal in cost, the result is a 71-73% slowdown when considering function calls alone. Since call instructions make up only 5% of the program's instructions, the overall slowdown is approximately 3% to 4%."

This is also interesting for a compiler that implements it:

http://www.cs.technion.ac.il/~erez/Papers/sctack-scan-vee09.pdf

Note I'm in the "big stack and tune it down" camp; I was just trying to work out why segments were so bad.

Ben
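[Editor's note: a back-of-envelope check of the Capriccio numbers quoted above, taking the upper end of the 0.4-0.5% figure. It reproduces both the "71-73% on calls alone" and the "3% to 4% overall" conclusions.]

```rust
// Weighted extra instructions per call site, per the quoted Capriccio data:
// 0.1% of sites link a new chunk (27 instr), 0.5% link a large chunk
// unconditionally (20 instr), 10% run a check that passes (6 instr),
// and the remaining 89% are unaffected.
fn segmented_stack_overhead() -> (f64, f64) {
    let extra_per_call = 0.001 * 27.0 + 0.005 * 20.0 + 0.10 * 6.0; // ~0.727
    // Relative to a one-instruction call, ~73% slowdown on calls alone;
    // calls are ~5% of executed instructions, so ~3.6% overall.
    let overall = extra_per_call * 0.05;
    (extra_per_call, overall)
}

fn main() {
    let (per_call, overall) = segmented_stack_overhead();
    println!("~{:.0}% slowdown on calls alone, ~{:.1}% overall",
             per_call * 100.0, overall * 100.0);
}
```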
Re: [rust-dev] More on stack safety
> The big issue here then is really that rust tasks can be pre-empted, and
> therefore you can't guarantee that *any* function is free from blocking
> (which means you can't assign it a large stack, since it may go to sleep
> while holding it). IMO co-operative tasks (by default) is worth considering
> as a solution here.

Re cooperative tasks: I have bad memories of Windows 3.x apps which could lock the whole OS/GUI when in a tight loop. Certainly not a solution for single-core systems; even on a dual-core system it can be an issue. On quad+ there is some merit, but history has shown that every addin / lib will use as many resources as it can (e.g. Flash ad animations) and not yield willingly. It can be done, but you need to check tasks with a scheduler service task and, if they don't behave, pre-empt them / reduce their priority.

Ben
Re: [rust-dev] More on stack safety
On Tue, Oct 29, 2013 at 7:08 PM, Niko Matsakis wrote:
> If I understand correctly, what you are proposing is to offer fixed
> size stacks and to use a guard page to check for stack overflow (vs a
> stack preamble)?
>
> My two thoughts are:
>
> 1. I do think segmented stacks offer several tangible benefits:
>
> - Recursion limited only by available memory / address space
> - Avoid chewing up address space on 32 bit builds
>
> However, I can accept that on balance they are not worth the price,
> given that point #2 is probably not that important for 64-bit systems.
>
> It is sad to lose limitless recursion but typically one can rewrite
> routines that recurse arbitrarily deep to use a stack on the heap,
> though sometimes the code is very unnatural, and using a stack on the
> heap will not play as well with lifetimes, since the compiler doesn't
> know that you are obeying a stack discipline.

Fixed stacks can be expanded once you have precise collection (help will come from LLVM eventually...).

What I think hasn't been discussed here (though it has previously) is how much of a penalty segmented stacks represent and what it could be. In terms of memory allocation: faster expansion, or greater reuse from a segment pool. (Why does the scheduler segment pool start empty? I would have thought you would have 2 MB worth pre-allocated; one runs in the context of a microbench, the other is allocated beforehand, like C stacks.) A 4^N expansion of each stack would result in more contiguous stacks. And if a stack is unused for a long time, hand some segments back.

In terms of the stack size check, as has been mentioned, you can do more with some analysis: each method gives an indication of worst-case stack growth, and these can then be added together to reduce checks by 80% and hence significantly reduce the performance impact. Though obviously, for a production release soon, you will get there faster with fixed stacks.

Ben
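[Editor's note: a sketch, in modern Rust, of the rewrite Niko describes above — replacing arbitrarily deep recursion with an explicit stack on the heap, so depth is bounded by heap size rather than the thread stack. The tree type is hypothetical, purely for illustration.]

```rust
enum Tree {
    Leaf(i64),
    Node(Box<Tree>, Box<Tree>),
}

// Iterative walk: the Vec plays the role of the call stack, allocated on
// the heap, so deep trees cannot overflow the fixed thread stack.
fn sum(root: &Tree) -> i64 {
    let mut stack = vec![root];
    let mut total = 0;
    while let Some(t) = stack.pop() {
        match t {
            Tree::Leaf(v) => total += *v,
            Tree::Node(l, r) => {
                stack.push(l);
                stack.push(r);
            }
        }
    }
    total
}

fn main() {
    let t = Tree::Node(
        Box::new(Tree::Leaf(1)),
        Box::new(Tree::Node(Box::new(Tree::Leaf(2)), Box::new(Tree::Leaf(3)))),
    );
    println!("{}", sum(&t)); // prints 6
}
```

As Niko notes, the transformation is mechanical here but can be very unnatural for more complex recursion, and the borrow checker cannot see that the Vec is being used in stack discipline.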
[rust-dev] Fwd: Faster communication between tasks
Simon, one thing you may want to test is 10-20 senders to 1 receiver. Multiple senders have completely different behaviour and can create a lot of contention around locks / interlocked calls. Also check what happens to CPU when the receiver blocks for a 100ms disk access every 100ms. Disruptor as used by LMAX normally uses very few senders / receivers, and the main/busy threads do no IO.

Ben

On Wed, Oct 30, 2013 at 1:03 PM, Simon Ruggier wrote:
> See my first message, I tested the throughput of the pipes API, it is far
> slower. Synchronization between sender and receiver depends on which wait
> strategy is used. There is a strategy that blocks indefinitely if no new
> items are sent. To see how it works, look at this comment:
>
> https://github.com/sruggier/rust-disruptor/blob/7cbc2fababa087d0bc116a8a739cbb759354388b/disruptor.rs#L762
>
> Multiple senders are also on my roadmap.
>
> Some things just aren't testable, because the memory ordering guarantees
> depend on the hardware you're running on. For it to be truly correct and
> portable, the source code has to be simple enough for a reviewer to be able to
> verify correctness at compile time. The comment I link to above is a good
> example; I could never test that code thoroughly enough to be satisfied, a
> proof of correctness is the only way.
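[Editor's note: the many-senders-to-one-receiver shape Ben suggests benchmarking, sketched with modern Rust's std::sync::mpsc rather than the 2013 pipes API. A real benchmark would time this; the sketch only shows the topology.]

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // 10 sender threads hammering one receiver: this is where contention
    // around locks / interlocked operations shows up.
    let handles: Vec<_> = (0..10)
        .map(|id| {
            let tx = tx.clone(); // each sender gets its own handle
            thread::spawn(move || {
                for n in 0..1000 {
                    tx.send((id, n)).unwrap();
                }
            })
        })
        .collect();
    drop(tx); // drop the original so rx sees disconnect once senders finish

    let total = rx.iter().count(); // drains until all senders hang up
    for h in handles {
        h.join().unwrap();
    }
    println!("received {} messages", total); // prints 10000
}
```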
Re: [rust-dev] Faster communication between tasks
Note: prefetch and non-temporal instructions really help with this.

> I'm not deeply familiar with Disruptor, but I believe that it uses bounded
> queues. My general feeling thus far is that, as the general 'go-to' channel
> type, people should not be using bounded queues that block the sender when
> full because of the potential for unexpected deadlocks. I could be
> convinced otherwise though if it's just not possible to have reasonably
> fast unbounded channels. Note that I don't think it's critical for the
> general-purpose channel to be as fast as possible - it's more important to
> be convenient.

Rust tasks, however, are more than a GP channel: they are your parallelism, and saying they should be convenient is the same, in the parallel world, as saying your method calls should be convenient rather than fast. They need to be both, with little compromise; otherwise it will be a major bottleneck to using independent task-based components.

Note you don't need to stop the sender if the queue is full. What you can do is write the messages to a buffer in the sender's memory when the queue is full and continue running; you can then make a decision about how big that buffer should be based on memory pressure (you can even persist part of the queue, maybe force paging, causing a disk access which the thread will wait for), etc.

Lockless, non-blocking async IPC is possible (so it's possible with Rust tasks), though it's an area of little research. The key is that the sender must either slow down, have sufficient memory to handle all requests up until the point it stops running waiting for something else, or persist some of that. Expanding queues are possible and help reduce memory, but need careful scheduling (the obvious way is: stop both, create a new queue, and copy the data across). The worst case is probably a logger / printf service; we have all experienced apps slowing down to write to disk / console. In this case they would not, but it comes at the cost of memory.

Ben
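[Editor's note: a minimal sketch of the sender-side overflow buffer Ben describes — when the bounded channel is full, messages spill into a local VecDeque instead of blocking the sender, and the backlog is flushed opportunistically. Modern Rust, hypothetical names; disconnect handling is deliberately simplistic.]

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

struct BufferedSender<T> {
    tx: SyncSender<T>,
    backlog: VecDeque<T>, // sender-local memory, grows under memory pressure
}

impl<T> BufferedSender<T> {
    fn send(&mut self, msg: T) {
        self.flush();
        if !self.backlog.is_empty() {
            self.backlog.push_back(msg); // preserve order behind the backlog
            return;
        }
        // Channel full: buffer locally and keep running instead of blocking.
        // (A disconnected receiver silently drops the message in this sketch.)
        if let Err(TrySendError::Full(msg)) = self.tx.try_send(msg) {
            self.backlog.push_back(msg);
        }
    }

    fn flush(&mut self) {
        while let Some(msg) = self.backlog.pop_front() {
            match self.tx.try_send(msg) {
                Ok(()) => {}
                Err(TrySendError::Full(msg)) => {
                    self.backlog.push_front(msg); // still full: stop flushing
                    break;
                }
                Err(TrySendError::Disconnected(_)) => break,
            }
        }
    }
}

fn main() {
    let (tx, rx) = sync_channel(2); // bounded channel of capacity 2
    let mut s = BufferedSender { tx, backlog: VecDeque::new() };
    for i in 0..5 {
        s.send(i); // never blocks: 2 land in the channel, 3 in the backlog
    }
    assert_eq!(s.backlog.len(), 3);
    assert_eq!(rx.recv().unwrap(), 0); // ordering is preserved
    println!("ok");
}
```

A production version would also need the memory-pressure policy Ben mentions (cap the backlog, persist or drop beyond some size).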
Re: [rust-dev] Faster communication between tasks
Hi Simon,

You may want to test the throughput of tasks first to set a baseline. Disruptor is faster but normally runs in a tight loop (or a yield loop), so it pretty much ties up a core. (Even when yielding it will run again very soon; it's the fact that it's always running which makes it very fast, as there is no contention / communication.) You can't run 100 tasks like that with real work, not just micro-benches. While I like the Disruptor approach, it doesn't fit well with Rust tasks, which are more an actor model. (Well, Disruptor is like tasks; it's just that they are much bigger.)

Anyway, Disruptor really rests on the old ring buffer: take the ring buffer, think about scheduling / sequencing, and test so it doesn't set the CPU to 100%. (Note their catch-up extensions on the ring buffer may or may not be relevant.)

I'm sure the performance of tasks can be improved; have a look at IPC in the Barrelfish OS. It uses a ring buffer like Disruptor and plays around with cache prefetch to give either low latency, or higher latency with better throughput (it's simple). It can be improved further with non-temporal SIMD moves, but this is more tricky to do as it involves the compiler / marshalling the types.

Ben

On Tue, Oct 29, 2013 at 1:02 PM, Simon Ruggier wrote:
> Greetings fellow Rustians!
>
> First of all, thanks for working on such a great language. I really like
> the clean syntax, increased safety, separation of data from function
> definitions, and freedom from having to declare duplicate method prototypes
> in header files.
>
> I've been working on an alternate way to communicate between tasks in
> Rust, following the same approach as the LMAX Disruptor.[1] I'm hoping to
> eventually offer a superset of the functionality in the pipes API, and
> replace them as the default communication mechanism between tasks. Just as
> with concurrency in general, my main motivation in implementing this is to
> improve performance. For more information about the disruptor approach,
> there's a lot of information linked from their home page, in a variety of
> formats.
>
> This is my first major contribution of new functionality to an open-source
> project, so I didn't want to discuss it in advance until I had a working
> system to demonstrate. I currently have a very basic proof of concept that
> achieves almost two orders of magnitude better performance than the pipes
> API. On my hardware[2], I currently see throughput of about 27 million
> items per second when synchronizing with a double-checked wait condition
> protocol between sender and receivers, 80+ million items with no blocking
> (i.e. busy waiting), and anywhere from 240,000 to 600,000 when using pipes.
> The LMAX Disruptor library gets up to 110 million items per second on the
> same hardware (using busy waiting and yielding), so there's definitely
> still room for significant improvement.
>
> I've put the code up on GitHub (I'm using rustc from master).[3]
> Currently, single and multi-stage pipelines of receivers are supported,
> while many features are missing, like multiple concurrent senders, multiple
> concurrent receivers, or mutation of the items as they pass through the
> pipeline. However, given what I have so far, now is probably the right time
> to start soliciting feedback and advice. I'm looking for review,
> suggestions/constructive criticism, and guidance about contributing this to
> the Rust codebase.
>
> Thanks,
> Simon
>
> [1] http://lmax-exchange.github.io/disruptor/
> [2] A 2.66GHz Intel P8800 CPU running in a Thinkpad T500 on Linux x86_64
> [3] https://github.com/sruggier/rust-disruptor
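[Editor's note: a minimal single-threaded sketch of the ring-buffer-plus-sequence-counters idea at the heart of Disruptor-style queues. Names are illustrative, not the rust-disruptor API; the comments note where the real, concurrent version would use atomics and wait strategies. Capacity is a power of two so the slot index is just `seq & mask`.]

```rust
struct RingBuffer<T> {
    slots: Vec<Option<T>>,
    mask: usize,
    head: usize, // next sequence to publish (monotonically increasing)
    tail: usize, // next sequence to consume
}

impl<T> RingBuffer<T> {
    fn with_capacity_pow2(cap: usize) -> Self {
        assert!(cap.is_power_of_two());
        RingBuffer {
            slots: (0..cap).map(|_| None).collect(),
            mask: cap - 1,
            head: 0,
            tail: 0,
        }
    }

    fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.head - self.tail == self.slots.len() {
            return Err(item); // full: a real Disruptor spins, yields, or waits here
        }
        let idx = self.head & self.mask;
        self.slots[idx] = Some(item);
        self.head += 1; // concurrent version: atomic store with release ordering
        Ok(())
    }

    fn try_pop(&mut self) -> Option<T> {
        if self.tail == self.head {
            return None; // empty: this is where a wait strategy would kick in
        }
        let idx = self.tail & self.mask;
        let item = self.slots[idx].take();
        self.tail += 1;
        item
    }
}

fn main() {
    let mut rb = RingBuffer::with_capacity_pow2(4);
    for i in 0..4 {
        rb.try_push(i).unwrap();
    }
    assert!(rb.try_push(99).is_err()); // full
    assert_eq!(rb.try_pop(), Some(0)); // FIFO order
    println!("ok");
}
```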
Re: [rust-dev] On Stack Safety
Yes, it's a simple region, but region analysis is normally compile time while this is runtime; knowing the pointer points to the stack doesn't help. If you copy in a new stack, you need to know every pointer into that stack when you move it to a new location. Copying GCs do this all the time, but they scan the heap for the mark, so they know the references, and can use either a global pause or a write barrier.

If you had a precise GC (which can precisely identify all the references), then on needing to grow a stack you could stop the world, scan the whole heap for references to the stack, move the stack, and update the references. This is expensive, so you would want a warning to the programmer to increase the stack size. (Note conservative collectors can confirm pointers to heap objects; it's just expensive. There may be a way of confirming a stack object, maybe with a typecheck, or by wrapping such an object with some more data which allows precise determination.)

Why can't Rust do precise marking? Is it pointers to internal objects rather than offsets? All good modern GCs are precise/copying, so not having it is a big issue.

Ben Kloosterman

On Fri, Oct 25, 2013 at 2:39 AM, David Rajchenbach-Teller <dtel...@mozilla.com> wrote:
> I am curious. Isn't the stack a very simple region? Couldn't region
> inference be used to solve pointer-to-stack problems?
>
> Or do I completely misunderstand the definition of pointer-to-the-stack?