Re: Ropes (concatenation trees) for strings in D ?

2014-08-15 Thread Kagamin via Digitalmars-d-learn

Search gave the result "enough rope to hang yourself".


Re: extern (c++) std::function?

2014-08-15 Thread FreeSlave via Digitalmars-d-learn

On Friday, 15 August 2014 at 03:10:43 UTC, Etienne Cimon wrote:
I'm looking into making a binding for the C++ API called Botan, 
and the constructors in it take a std::function. I'm wondering 
if there's a D equivalent for this binding to work out, or if I 
have to make a C++ wrapper as well?


There are some restrictions on sharing complex types between 
C++ and D. Currently only POD structs and classes with virtual 
functions are supported for transparent interaction.


In this case things become even more complicated, since 
std::function is a template class and D can't instantiate C++ 
templates. You should stick with some predefined signatures and 
make wrappers on the C++ side, which will accept 'plain' functions 
and construct the std::function.
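
For illustration, a minimal sketch of the D side of such a shim 
(hedged: `botan_set_callback` and its signature are hypothetical 
placeholders, not Botan's actual API; the C++ side, not shown, would 
wrap the plain function pointer into a std::function):

extern (C)
{
    // A predefined callback signature with C linkage.
    alias Callback = int function(int);

    // Exported by the hand-written C++ wrapper; internally it would
    // construct a std::function<int(int)> from `cb`.
    void botan_set_callback(Callback cb);

    // A plain function we can pass through the shim.
    int myCallback(int x) { return x * 2; }
}

void main()
{
    botan_set_callback(&myCallback);
}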


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
 Quick test...

Ah, thanks a lot Jonathan. I kept telling myself I should probably
test it on a simple case.
OK, the good news is, Appender works in these cases (I mean, that's
good news for Phobos).
Now, I just have to find out why it's slower in my case :)



 
 import std.array;
 import std.datetime;
 import std.stdio;

 enum size = 1000;

 void test1()
 {
 auto arr = appender!(int[])();
 foreach(n; 0 .. size)
 arr.put(n);
 }


I used ~= to append to an Appender. The examples use .put, but ~= is
documented, so I assumed the operator was just syntactic sugar. I
didn't look at its implementation.


 void test2()
 {
 int[] arr;
 foreach(n; 0 .. size)
 arr ~= n;
 }

 void test3()
 {
 auto arr = new int[](size);
 foreach(n; 0 .. size)
 arr[n] = n;
 }

This one is simple and interesting. In my case I don't know the length
in advance (I'm doing some parsing, and predicting the size of the
parse forest based on the input length is somewhat tricky), but I can
pre-allocate some, based on a simple heuristic.
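
For instance, a minimal sketch of that kind of heuristic (the growth
factor and the names are illustrative, not my actual parser code):

enum growthFactor = 4;

int[] parseAll(string input)
{
    int[] forest;
    // guess: the forest grows roughly linearly with the input
    forest.reserve(input.length * growthFactor);
    foreach (c; input)
        forest ~= cast(int) c;   // stand-in for real parse work
    return forest;
}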

 void test4()
 {
 auto arr = uninitializedArray!(int[])(size);
 foreach(n; 0 .. size)
 arr[n] = n;
 }

Oh, I didn't know this one.


 With size being 1000, I get

 178 ms, 229 μs, and 4 hnsecs
 321 ms, 272 μs, and 8 hnsecs
 27 ms, 138 μs, and 7 hnsecs
 24 ms, 970 μs, and 9 hnsecs

Same here, good.
Using ~= n instead of .put(n) gives me consistently slightly slower
results for Appender (170-190 ms, repeatedly, even going back and
forth between the two possibilities).
I created a test1prime to test that.


 With size being 100,000, I get

 15 secs, 731 ms, 499 μs, and 1 hnsec
 29 secs, 339 ms, 553 μs, and 8 hnsecs
 2 secs, 602 ms, 385 μs, and 2 hnsecs
 2 secs, 409 ms, 448 μs, and 9 hnsecs

Ditto. OK, that's good. Also, it's still slower using Appender ~= n
instead of Appender.put (18 s instead of 15, 20% slower).
So, kids: don't do that.

 So, it looks like using Appender to create an array of ints is about twice
 as fast as appending to the array directly, and unsurprisingly, creating the
 array at the correct size up front and filling in the values is far faster
 than either, whether the array's elements are default-initialized or not.
 And the numbers are about the same if it's an array of char rather than an
 array of int.

Perfect. That also means my go-to method will still be using standard
arrays, but with pre-allocation.
I just feel stupid writing that, because it's obvious that reserving
things should give it better speed.


 Also, curiously, if I add a test which is the same as test2 (so it's just
 appending to the array) except that it calls reserve on the array with size,
 the result is almost the same as not reserving. It's a smidgeon faster but
 probably not enough to matter. So, it looks like reserve doesn't buy you
 much for some reason. Maybe the extra work for checking whether it needs to
 reallocate or whatever fancy logic it has to do in ~= dwarfs the extra cost
 of the reallocations? That certainly seems odd to me, but bizarrely, the
 evidence does seem to say that reserving doesn't help much. That should
 probably be looked into.

Yeah, I get a small boost of 5% compared to not reserving at size
1000, which completely disappears for longer arrays.
(No difference at size 100_000.)


 In any case, from what I can see, if all you're doing is creating an array
 and then throwing away the Appender, it's faster to use Appender than to
 directly append to the array.

Indeed.

 Maybe that changes with structs? I don't know.

I'm using classes, because what I'm pushing into the Appender are
graph nodes and I got fed up with tracking pointers to structs
everywhere. Maybe I should change back to structs and see what it
does.


 It would be interesting to know what's different about your code that's
 causing Appender to be slower, but from what I can see, it's definitely
 faster to use Appender unless you know the target size of the array up
 front.

Good conclusion. Thanks for your help. My takeaway is that I'll
use arrays, but 'new T[](size)' them beforehand. If that makes heavy
concatenation 10 times faster, it should have a positive effect (of
course, I'm not expecting anything near a 10x boost in my total
computation time).



Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn

On Thursday, 14 August 2014 at 18:52:00 UTC, Sean Kelly wrote:
On 64 bit, reserve a huge chunk of memory, set a SEGV handler 
and commit more as needed. Basically how kernel thread stacks 
work. I've been meaning to do this but haven't gotten around to 
it yet.


AFAIK, the OS already provides this transparently: when you allocate 
memory, the OS only reserves the memory range and commits pages when 
they are accessed.
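
For instance, a plain anonymous mmap on Posix already behaves this way
(a minimal sketch, assuming Linux/OSX semantics where physical pages
are committed lazily on first touch):

import core.sys.posix.sys.mman;

// Map a large range; the OS reserves the addresses but only commits
// physical pages when they are first accessed.
void* reserveRange(size_t size)
{
    auto p = mmap(null, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? null : p;
}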


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
 I wonder if using plain `Array` instead may result in better performance
 where immutability is not needed.

Hmm, no:

module appendertest;

import std.array;
import std.datetime;
import std.stdio;
import std.container;

enum size = 1_000;

void test1()
{
auto arr = appender!(int[])();
foreach(n; 0 .. size)
arr.put(n);
}

void test1prime()
{
auto arr = appender!(int[])();
foreach(n; 0 .. size)
arr ~= n;
}

void test2()
{
int[] arr;
foreach(n; 0 .. size)
arr ~= n;
}

void test2prime()
{
int[] arr;
arr.reserve(size);
foreach(n; 0 .. size)
arr ~= n;
}

void test3()
{
Array!int arr;
foreach(n; 0 .. size)
arr ~= n;
}

void test3prime()
{
Array!int arr;
arr.reserve(size);
foreach(n; 0 .. size)
arr ~= n;
}

void test4()
{
auto arr = new int[](size);
foreach(n; 0 .. size)
arr[n] = n;
}

void test5()
{
auto arr = uninitializedArray!(int[])(size);
foreach(n; 0 .. size)
arr[n] = n;
}

void main()
{
auto result = benchmark!(test1, test1prime,
 test2, test2prime,
 test3, test3prime,
 test4,
 test5
)(10_000);

writeln("Appender.put   :", cast(Duration)result[0]);
writeln("Appender ~=:", cast(Duration)result[1]);
writeln("Std array  :", cast(Duration)result[2]);
writeln("Std array.reserve  :", cast(Duration)result[3]);
writeln("Array  :", cast(Duration)result[4]);
writeln("Array.reserve  :", cast(Duration)result[5]);
writeln("new T[]()  :", cast(Duration)result[6]);
writeln("uninitializedArray :", cast(Duration)result[7]);
}

Times:

Appender.put   :157 ms, 602 μs, and 3 hnsecs
Appender ~=:182 ms, 807 μs, and 1 hnsec
Std array  :256 ms, 210 μs, and 7 hnsecs
Std array.reserve  :244 ms, 770 μs, and 4 hnsecs
Array  :336 ms, 207 μs, and 3 hnsecs
Array.reserve  :321 ms, 500 μs, and 6 hnsecs
new T[]()  :28 ms, 496 μs, and 6 hnsecs
uninitializedArray :26 ms and 620 μs



Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
On Thu, Aug 14, 2014 at 11:33 PM, Joseph Rushton Wakeling via
Digitalmars-d-learn digitalmars-d-learn@puremagic.com wrote:
 On 14/08/14 19:16, Philippe Sigaud via Digitalmars-d-learn wrote:

 Do people here get good results from Appender? And if yes, how are you
 using it?


 An example where it worked for me:
 http://braingam.es/2013/09/betweenness-centrality-in-dgraph/

 (You will have to scroll down a bit to get to the point where appenders get
 introduced.)

I remember reading this and loving it! Any news on this? Do you have
new algorithms?


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn

http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx
Allocates memory charges (from the overall size of memory and 
the paging files on disk) for the specified reserved memory 
pages. The function also guarantees that when the caller later 
initially accesses the memory, the contents will be zero. 
Actual physical pages are not allocated unless/until the 
virtual addresses are actually accessed.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant 
wrote:
The default size of the runtime stack for a Fiber is 4*PAGESIZE 
which is very small, and a quick test shows that a Fiber 
suffers a stack overflow that doesn't lead to a clean 
termination when this limit is exceeded.


Pass a bigger stack size to the Fiber constructor?
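
For instance (a minimal sketch; the 1 MiB figure is an arbitrary
choice):

import core.thread : Fiber;

void deeplyRecursive()
{
    // work that needs more stack than the 4*PAGESIZE default
}

void main()
{
    auto fib = new Fiber(&deeplyRecursive, 1024 * 1024); // 1 MiB stack
    fib.call();
}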


Re: extern (c++) std::function?

2014-08-15 Thread Rémy Mouëza via Digitalmars-d-learn
You'll certainly have to make a C++ wrapper. However, since a delegate is 
implemented as a struct containing a context pointer and a function, you 
can get some degree of interoperability between C++ and D 
(BUT note that this is an undocumented implementation detail subject to 
change without notice -- although it hasn't changed in many years):


/* === */
/// ddg.d
import std.stdio;
import std.string;

/// A C++ function that will take a D delegate.
extern (C) void callDg (immutable(char)* delegate (int, int));

/// A dummy class.
class X {
    /// This method can be used as a delegate.
    extern (C)
    immutable(char)* callMe (int i, int j) {
        return "%d, %d".format (i, j).toStringz;
    }
}

void main () {
    auto x = new X;
    callDg (&x.callMe);
}

/* === */
/// cpp_dg.cpp
#include <iostream>

using namespace std;

/// A D delegate representation in C++.
struct Dg {
    /// The context pointer.
    void * ctx;

    /// The function within the delegate: the first argument is the
    /// context pointer.
    const char *(*dg) (void * ctx, int i, int j);

    /// C++ sugar: calling a struct Dg as a function.
    const char * operator ()(int i, int j) {
        return dg (ctx, i, j);
    }
};

/// extern "C" allows D compatibility.
extern "C" {
    void callDg (Dg dg) {
        /// Call the extern (C) D delegate.
        cout << dg (42, 7) << endl;
    }
}
/* === */
$ g++ -c cpp_dg.cpp
$ dmd ddg.d cpp_dg.o -L-lstdc++
$ ./ddg
42, 7
/* === */

According to http://en.cppreference.com/w/cpp/utility/functional/function: 
 Class template std::function is a general-purpose polymorphic
 function wrapper. Instances of std::function can store, copy, and
 invoke any Callable target -- functions, lambda expressions, bind
 expressions, or other function objects, as well as pointers to member
 functions and pointers to data members.


Thus the struct Dg in the example above should be compatible with the 
Botan constructors.


Also, extern (C) delegates are not that convenient in D, especially for 
assignments of anonymous/inline ones. You may want to add a layer of 
abstraction to the API you expose in D, so that user D delegates are 
called from a second extern (C) delegate, itself used by the C++ wrapper:


class BotanStuff {
    protected void delegate (string) ddg;
    protected BotanWrapper wrapr;

    this (void delegate (string) dg) {
        ddg   = dg;
        wrapr = new BotanWrapper (&this.cppDg);
    }

    extern (C) void cppDg (immutable(char)* cStr) {
        import std.conv;
        ddg (cStr.to!string);
    }
}

If you are planning to use SWIG for your binding, this kind of wrapping 
may be conveniently done using custom typemaps.



On 08/15/2014 05:10 AM, Etienne Cimon wrote:

I'm looking into making a binding for the C++ API called Botan, and the
constructors in it take a std::function. I'm wondering if there's a D
equivalent for this binding to work out, or if I have to make a C++
wrapper as well?




Re: Appender is ... slow

2014-08-15 Thread Dicebot via Digitalmars-d-learn
On Friday, 15 August 2014 at 08:35:41 UTC, Philippe Sigaud via 
Digitalmars-d-learn wrote:
I wonder if using plain `Array` instead may result in better 
performance where immutability is not needed.


Hmm, no:

...


It is very different with a better compiler though:

$ ldmd2 -release -O a.d
$ ./appendertest
Appender.put   :378 ms, 794 μs, and 9 hnsecs
Appender ~=:378 ms, 416 μs, and 3 hnsecs
Std array  :2 secs, 222 ms, 256 μs, and 2 hnsecs
Std array.reserve  :2 secs, 199 ms, 64 μs, and 5 hnsecs
Array  :349 ms and 250 μs
Array.reserve  :347 ms, 694 μs, and 1 hnsec
new T[]()  :37 ms, 989 μs, and 8 hnsecs
uninitializedArray :39 ms, 333 μs, and 3 hnsecs

(size 10_000)

I am still somewhat disappointed though because I'd expect static 
Array to get close in performance to pre-allocated array of exact 
size over many iterations - which doesn't happen. Need to 
investigate.


Re: Appender is ... slow

2014-08-15 Thread Vladimir Panteleev via Digitalmars-d-learn

On Thursday, 14 August 2014 at 18:31:15 UTC, Dicebot wrote:
I don't know much about Phobos appender implementation details 
but the key thing with reusable buffer is avoid freeing them. 
AFAIR Appender.clear frees the allocated memory but 
`Appender.length = 0` does not, making it possible to just 
overwrite stuff again and again.


Won't promise you anything though!


Appender has no .length property, and .clear does rewind the 
write pointer without deallocating memory, thus allowing you to 
reuse the buffer.
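
A minimal sketch of that reuse pattern (assuming the data is copied
out before each rewind):

import std.array : appender;

void main()
{
    auto app = appender!(int[])();
    foreach (run; 0 .. 3)
    {
        foreach (n; 0 .. 1_000)
            app.put(n);
        auto result = app.data.dup; // copy out: clear() invalidates slices
        app.clear();                // rewind; the buffer is kept for reuse
    }
}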


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
 It is very different with a better compiler though:

 $ ldmd2 -release -O a.d
 $ ./appendertest
 Appender.put   :378 ms, 794 μs, and 9 hnsecs
 Appender ~=:378 ms, 416 μs, and 3 hnsecs
 Std array  :2 secs, 222 ms, 256 μs, and 2 hnsecs
 Std array.reserve  :2 secs, 199 ms, 64 μs, and 5 hnsecs
 Array  :349 ms and 250 μs
 Array.reserve  :347 ms, 694 μs, and 1 hnsec
 new T[]()  :37 ms, 989 μs, and 8 hnsecs
 uninitializedArray :39 ms, 333 μs, and 3 hnsecs

 (size 10_000)

OK, same here, except std array gives me only 1.4 s, while the other
timings are about the same (LDC 0.14 alpha2: ldmd2 -O -inline).
Drat, that means testing on many different compilers. Oh well, let's
start small: pre-allocating, better algorithms, then I'll do real
speed instrumentation.



Re: Appender is ... slow

2014-08-15 Thread Messenger via Digitalmars-d-learn

On Friday, 15 August 2014 at 10:31:59 UTC, Dicebot wrote:
On Friday, 15 August 2014 at 08:35:41 UTC, Philippe Sigaud via 
Digitalmars-d-learn wrote:
I wonder if using plain `Array` instead may result in better 
performance where immutability is not needed.


Hmm, no:

...


It is very different with a better compiler though:

$ ldmd2 -release -O a.d
$ ./appendertest
Appender.put   :378 ms, 794 μs, and 9 hnsecs
Appender ~=:378 ms, 416 μs, and 3 hnsecs
Std array  :2 secs, 222 ms, 256 μs, and 2 hnsecs
Std array.reserve  :2 secs, 199 ms, 64 μs, and 5 hnsecs
Array  :349 ms and 250 μs
Array.reserve  :347 ms, 694 μs, and 1 hnsec
new T[]()  :37 ms, 989 μs, and 8 hnsecs
uninitializedArray :39 ms, 333 μs, and 3 hnsecs

(size 10_000)


T[size] beats all of those on dmd head, though it is inarguably a
bit limiting.


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
On Fri, Aug 15, 2014 at 1:57 PM, Messenger via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:

 T[size] beats all of those on dmd head, though it is inarguably a
 bit limiting.

I confirm (even with 2.065). With ldc2 it's optimized out of the way,
so it gives 0 hnsecs :-)
Hmm, what about a sort of linked list of static arrays that allocates
a new one when necessary?
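
Something like this, perhaps (a rough sketch of the idea only, not a
worked-out design):

struct ChunkList(T, size_t chunkSize = 1024)
{
    static struct Chunk
    {
        T[chunkSize] data;
        size_t used;
        Chunk* next;
    }

    Chunk* head, tail;

    void put(T value)
    {
        if (tail is null || tail.used == chunkSize)
        {
            // current chunk is full (or none yet): allocate a new one
            auto c = new Chunk;
            if (tail) tail.next = c; else head = c;
            tail = c;
        }
        tail.data[tail.used++] = value;
    }
}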


Re: Appender is ... slow

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn

On Friday, 15 August 2014 at 11:57:30 UTC, Messenger wrote:
T[size] beats all of those on dmd head, though it is inarguably a 
bit limiting.


Hey guys, just a bit of background and my own understanding of
Appender, having worked on it a fair bit.

First of all, Appender was not designed as a neck-breaking,
mind-bending speed object. It is merely a tool to offset the
slow GC-based appending.

Indeed, when doing a raw GC-append, you first have to give the GC
a pointer to the start of your array. The GC will then look up
which block that pointer belongs to, then look up the info related
to that block, check if appending is possible, and then do the
append proper...
...And then it will do all that all over again on the next append.

Appender is simply a tool to cache the results of that once,
and then do quicker appends.

There are two other things to take into consideration with
Appender: For starters, it can append to an *existing* array it
is given. Second, you may destroy the Appender object at any
time, and the referenced array is still valid: Appender does not
*own* its buffer, and as such, is not allowed certain
optimizations.

Really, it's just designed for convenience and pretty good
speed.

Also, another thing to take into account when benchmarking, is
that Appender is a reference semantic object: It has a payload
which itself references an array. This creates a double
indirection. This usually doesn't have much impact, but with the
right optimizations, it can probably explain the 10x performance 
differences we are seeing, in our *synthetic* benchmarks. I have
some doubts about the validity of the results in a real
application.

So TL;DR: yeah, you can probably do faster. But Appender is
convenient, fast enough, and works with the GC.

If you *do* need super speeds, look into something a bit more
manual: Walter's ScopeBuffer would be a good choice.

I also did some work on something called ScopeAppender, but
didn't have time to finish it yet.
https://github.com/monarchdodra/phobos/compare/ScopeAppender
It provides better speeds and deterministic management, at the
cost of partial private buffer ownership.


Re: Appender is ... slow

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn
On Friday, 15 August 2014 at 12:08:58 UTC, Philippe Sigaud via 
Digitalmars-d-learn wrote:
Hmm, what about a sort of linked list of static arrays that 
allocates a new one when necessary?


Appender is not a container, and has no freedom on the data it 
manipulates. It has to be able to accept an array as input, and 
when it is finished, it needs to be able to return an actual 
array, so that's arguably out of the question.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn

On Friday, 15 August 2014 at 08:36:34 UTC, Kagamin wrote:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx
Allocates memory charges (from the overall size of memory and 
the paging files on disk) for the specified reserved memory 
pages. The function also guarantees that when the caller later 
initially accesses the memory, the contents will be zero. 
Actual physical pages are not allocated unless/until the 
virtual addresses are actually accessed.


Oh handy, so there's basically no work to be done on Windows.  
I'll have to check the behavior of mmap on Posix.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Dicebot via Digitalmars-d-learn

On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant 
wrote:
The default size of the runtime stack for a Fiber is 
4*PAGESIZE which is very small, and a quick test shows that a 
Fiber suffers a stack overflow that doesn't lead to a clean 
termination when this limit is exceeded.


Pass a bigger stack size to the Fiber constructor?


Won't that kind of kill the purpose of Fiber as a low-cost context 
abstraction? Stack size does add up for thousands of fibers.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:
Oh handy, so there's basically no work to be done on Windows.  
I'll have to check the behavior of mmap on Posix.


I heard calloc behaves this way on Linux (a COW blank page mapped 
to the entire range); it was discussed here some time ago.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:28:34 UTC, Kagamin wrote:

On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:
Oh handy, so there's basically no work to be done on Windows.  
I'll have to check the behavior of mmap on Posix.


I heard calloc behaves this way on Linux (a COW blank page 
mapped to the entire range); it was discussed here some time 
ago.


Another abstraction for core.vmm.


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
Well, I created a wrapper around a std.array.uninitializedArray 
call, to manage the interface I need (queue behavior: pushing at 
the end, popping at the beginning). When hitting the end of the 
current array, it either reuses the current buffer or creates a 
new one, depending on the remaining capacity.
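
Roughly this shape (an illustrative reconstruction, not the actual
code):

import std.array : uninitializedArray;

struct SimpleQueue(T)
{
    T[] buf;
    size_t head, tail;

    this(size_t capacity) { buf = uninitializedArray!(T[])(capacity); }

    void push(T v)
    {
        if (tail == buf.length)
        {
            immutable n = tail - head;
            // reuse the buffer if pops freed enough room at the front,
            // otherwise allocate a bigger one
            auto dest = (head > buf.length / 2)
                ? buf
                : uninitializedArray!(T[])(buf.length ? buf.length * 2 : 16);
            foreach (i; 0 .. n) dest[i] = buf[head + i];
            buf = dest; head = 0; tail = n;
        }
        buf[tail++] = v;
    }

    T pop() { return buf[head++]; }
    bool empty() const { return head == tail; }
}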


On the 'synthetic' benchmarks, it performs quite reasonably: half 
the time of Array or Appender (twice as fast), 5x faster than a 
standard array, and 3-4x slower than uninitializedArray.


And... It does not change the timings in my code; it even makes 
things slower when pre-allocating too much. Only by pre-allocating 
a few elements do I get back the original timings.


So, I guess I'm suffering from a bad case of premature 
optimization :)


I thought that, having lots of concatenation in my code, that'd 
be a bottleneck. But it appears that pre-allocation does not give 
me any speed-up.


Well, at least it got me thinking, testing LDC a bit more and 
learning things on Array and Appender ;)


Thanks for your help guys, it's now time for the -profile switch 
again...




Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:


Won't that kind of kill the purpose of Fiber as a low-cost 
context abstraction? Stack size does add up for thousands of 
fibers.


As long as allocation speed is fast for large allocs (which I 
have to test), I want to change the default size to be very 
large.  The virtual address space in a 64-bit app is enormous, 
and if the memory is committed on demand then physical memory use 
should only match what the user actually requires.  This should 
allow us to create millions of fibers and not overrun system 
memory, and also not worry about stack overruns, which has always 
been a concern with the default fiber stack size.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Kagamin via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:
Won't that kind of kill the purpose of Fiber as a low-cost 
context abstraction? Stack size does add up for thousands of 
fibers.


I didn't measure it.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:26:28 UTC, Sean Kelly wrote:

On Friday, 15 August 2014 at 08:36:34 UTC, Kagamin wrote:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887%28v=vs.85%29.aspx
Allocates memory charges (from the overall size of memory and 
the paging files on disk) for the specified reserved memory 
pages. The function also guarantees that when the caller 
later initially accesses the memory, the contents will be 
zero. Actual physical pages are not allocated unless/until 
the virtual addresses are actually accessed.


Oh handy, so there's basically no work to be done on Windows.  
I'll have to check the behavior of mmap on Posix.


It sounds like mmap (typically) works the same way on Linux, and 
that malloc generally does as well.  I'll have to test this to be 
sure.  If so, and if doing so is fast, I'm going to increase the 
default stack size on 64-bit systems to something reasonably 
large.  And here I thought I would have to do this all manually.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn
At least on OSX, it appears that mapping memory is constant time 
regardless of size, but there is some max total memory I'm 
allowed to map, presumably based on the size of a vmm lookup 
table.  The max block size I can allocate is 1 GB, and I can 
allocate roughly 131,000 of these blocks before getting an out of 
memory error.  If I reduce the block size to 4 MB I can allocate 
more than 10M blocks without error.  I think some default stack 
size around 4 MB seems about right.  Increasing the size to 16 MB 
failed after about 800,000 allocations, which isn't enough 
(potential) fibers.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Dicebot via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:45:02 UTC, Sean Kelly wrote:

On Friday, 15 August 2014 at 14:28:34 UTC, Dicebot wrote:


Won't that kind of kill the purpose of Fiber as a low-cost 
context abstraction? Stack size does add up for thousands of 
fibers.


As long as allocation speed is fast for large allocs (which I 
have to test), I want to change the default size to be very 
large.  The virtual address space in a 64-bit app is enormous, 
and if the memory is committed on demand then physical memory 
use should only match what the user actually requires.  This 
should allow us to create millions of fibers and not overrun 
system memory, and also not worry about stack overruns, which 
has always been a concern with the default fiber stack size.


No, I was referring to the proposal to supply bigger stack size 
to Fiber constructor - AFAIR it currently does allocate that 
memory eagerly (and does not use any OS CoW tools), doesn't it?


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn

On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:


No, I was referring to the proposal to supply bigger stack size 
to Fiber constructor - AFAIR it currently does allocate that 
memory eagerly (and does not use any OS CoW tools), doesn't it?


I thought it did, but apparently the behavior of VirtualAlloc and 
mmap (which Fiber uses to allocate the stack) simply reserves the 
range and then commits it lazily, even though what you've told it 
to do is allocate the memory.  This is really great news since it 
means that no code changes will be required to do the thing I 
wanted to do anyway.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Dicebot via Digitalmars-d-learn

On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:

On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:


No, I was referring to the proposal to supply bigger stack 
size to Fiber constructor - AFAIR it currently does allocate 
that memory eagerly (and does not use any OS CoW tools), 
doesn't it?


I thought it did, but apparently the behavior of VirtualAlloc 
and mmap (which Fiber uses to allocate the stack) simply 
reserves the range and then commits it lazily, even though what 
you've told it to do is allocate the memory.  This is really 
great news since it means that no code changes will be required 
to do the thing I wanted to do anyway.


Guess that means some experimenting this weekend for me too :)


Re: Appender is ... slow

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn

On Friday, 15 August 2014 at 14:40:36 UTC, Philippe Sigaud wrote:
Well, I created a wrapper around a std.array.uninitializedArray 
call, to manage the interface I need


Make sure you don't use that if your type has elaborate 
construction, or assumes a certain initial state (unless you are 
actually emplacing your objects of course).


I thought that, having lots of concatenation in my code, that'd 
be a bottleneck. But it appears that pre-allocation does not 
give me any speed-up.


If you are using raw GC arrays, then the raw append operation 
will outweigh the relocation cost on extension. So 
pre-allocation wouldn't really help in this situation (though the 
use of Appender *should*).


@safe, pure and nothrow at the beginning of a module

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn

So I'm trying to use @safe, pure and nothrow.

If I understand correctly Adam Ruppe's Cookbook, by putting

@safe:
pure:
nothrow:

at the beginning of a module, I distribute it on all definitions, 
right? Even methods, inner classes, and so on?


Because I did just that on half a dozen of modules and the 
compiler did not complain. Does that mean my code is clean(?) or 
that what I did has no effect?




Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn

On Friday, 15 August 2014 at 16:48:10 UTC, monarch_dodra wrote:
On Friday, 15 August 2014 at 14:40:36 UTC, Philippe Sigaud 
wrote:
Well, I created a wrapper around a 
std.array.uninitializedArray call, to manage the interface I 
need


Make sure you don't use that if your type has elaborate 
construction, or assumes a certain initial state (unless you 
are actually emplacing your objects of course).


Hmm, what's elaborate construction? They are classes and have 
constructors, of course. I assumed that this produced only null's 
in the array.



I thought that, having lots of concatenation in my code, 
that'd be a bottleneck. But it appears than pre-allocation 
does not give me any speed-up.


If you are using raw GC arrays, then the raw append 
operation will outweigh the relocation cost on extension. So 
pre-allocation wouldn't really help in this situation (though 
the use of Appender *should*)



OK.


Re: Max/Min values in an associative array

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn
On Wednesday, 6 August 2014 at 18:07:08 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:


import std.algorithm : reduce, max, min;

auto highest = reduce!((a,b) => max(a,b))(-double.max, bids.byValue());
auto lowest = reduce!((a,b) => min(a,b))(double.max, bids.byValue());



T


Take a look at Justin Whear's dpaste. Dual pred reduce FTW.
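
For reference, a sketch of that dual-predicate form (assuming `bids`
is a `double[string]` associative array as in the original question):

import std.algorithm : reduce, max, min;
import std.typecons : tuple;

void main()
{
    double[string] bids = ["a": 1.5, "b": 7.25, "c": 0.5];

    // one pass computes both extrema; the seed is a tuple of
    // (running minimum, running maximum)
    auto r = reduce!(min, max)(tuple(double.max, -double.max),
                               bids.byValue());
    assert(r[0] == 0.5 && r[1] == 7.25);
}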


Re: @safe, pure and nothrow at the beginning of a module

2014-08-15 Thread Jonathan M Davis via Digitalmars-d-learn

On Friday, 15 August 2014 at 16:54:54 UTC, Philippe Sigaud wrote:

So I'm trying to use @safe, pure and nothrow.

If I understand correctly Adam Ruppe's Cookbook, by putting

@safe:
pure:
nothrow:

at the beginning of a module, I distribute it on all 
definitions, right? Even methods, inner classes, and so on?


Because I did just that on half a dozen of modules and the 
compiler did not complain. Does that mean my code is clean(?) 
or that what I did has no effect?


Hmmm... It _should_ apply to everything, but maybe it only 
applies to the outer-level declarations. Certainly, in most 
cases, I'd be surprised if marking everything in a module with 
those attributes would work on the first go. It's _possible_, 
depending on what you're doing, but in my experience, odds are 
that you're doing _something_ that violates one or all of those 
in several places.


- Jonathan M Davis


Re: Max/Min values in an associative array

2014-08-15 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Aug 15, 2014 at 04:51:59PM +, monarch_dodra via Digitalmars-d-learn 
wrote:
 On Wednesday, 6 August 2014 at 18:07:08 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
 
  import std.algorithm : reduce, max, min;
 
 auto highest = reduce!((a,b) => max(a,b))(-double.max, bids.byValue());
 auto lowest = reduce!((a,b) => min(a,b))(double.max, bids.byValue());
 
 
 T
 
 Take a look at Justin Whear's dpaste. Dual pred reduce FTW.

Yeah I saw that. Learned something new. :-)


T

-- 
Bare foot: (n.) A device for locating thumb tacks on the floor.


Re: Appender is ... slow

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn

On Friday, 15 August 2014 at 16:51:20 UTC, Philippe Sigaud wrote:

On Friday, 15 August 2014 at 16:48:10 UTC, monarch_dodra wrote:
Make sure you don't use that if your type has elaborate 
construction, or assumes a certain initial state (unless you 
are actually emplacing your objects of course).


Hmm, what's elaborate construction? They are classes and have 
constructors, of course. I assumed that this produced only 
null's in the array.


Actually, my statement was inaccurate. More specifically, never 
use anything that wasn't first properly initialized. Note that in 
some cases, operator= is itself elaborate, meaning it will also 
read data, so that's not a valid method of initialization.


uninitializedArray simply creates an array with unspecified data 
in it.


You *could* just use new instead. It's not really any slower, 
and has the advantage of being certifiably safe.


Re: @safe, pure and nothrow at the beginning of a module

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
In another module where I put '@safe:' at the top, the compiler told
me that a class opEquals could not be @safe (because Object.opEquals
is @system).

So it seems that indeed a module-level '@safe:' affects everything,
since a class method was found lacking.

(I put a @trusted attribute on it).
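
For the record, the shape of that workaround (a minimal sketch,
assuming a simple node class):

@safe:

class Node
{
    int id;

    // Object.opEquals is @system, so under a module-level '@safe:'
    // this override needs an explicit @trusted
    override bool opEquals(Object o) @trusted
    {
        auto rhs = cast(Node) o;
        return rhs !is null && rhs.id == id;
    }
}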


Re: Appender is ... slow

2014-08-15 Thread John Colvin via Digitalmars-d-learn
On Thursday, 14 August 2014 at 17:16:42 UTC, Philippe Sigaud 
wrote:
From time to time, I try to speed up some array-heavy code by 
using std.array.Appender, reserving some capacity and so on.


It never works. Never. It gives me executables that are maybe 
30-50% slower than bog-standard array code.


I don't do anything fancy: I just replace the types, use 
clear() instead of = null...


Do people here get good results from Appender? And if yes, how 
are you using it?


compiler, version, OS, architecture, flags?

Have you looked at the assembly to check that all the Appender 
method calls are being inlined?


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Carl Sturtivant via Digitalmars-d-learn

On Thursday, 14 August 2014 at 18:52:00 UTC, Sean Kelly wrote:
On 64 bit, reserve a huge chunk of memory, set a SEGV handler 
and commit more as needed. Basically how kernel thread stacks 
work. I've been meaning to do this but haven't gotten around to 
it yet.


Very nice; the hardware VM machinery takes care of it and there's 
only overhead when overflow occurs. D's Fibers will really be 
something remarkable for user-space with this facility.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Carl Sturtivant via Digitalmars-d-learn

On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant 
wrote:
The default size of the runtime stack for a Fiber is 
4*PAGESIZE which is very small, and a quick test shows that a 
Fiber suffers a stack overflow that doesn't lead to a clean 
termination when this limit is exceeded.


Pass a bigger stack size to the Fiber constructor?


No good if the stack size needed depends dynamically on the 
computation in that Fiber.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Carl Sturtivant via Digitalmars-d-learn

On Friday, 15 August 2014 at 20:11:43 UTC, Carl Sturtivant wrote:

On Friday, 15 August 2014 at 08:41:30 UTC, Kagamin wrote:
On Thursday, 14 August 2014 at 07:46:29 UTC, Carl Sturtivant 
wrote:
The default size of the runtime stack for a Fiber is 
4*PAGESIZE which is very small, and a quick test shows that a 
Fiber suffers a stack overflow that doesn't lead to a clean 
termination when this limit is exceeded.


Pass a bigger stack size to the Fiber constructor?


No good if the stack size needed depends dynamically on the 
computation in that Fiber.


Should have read further down the thread --- you're right as the 
memory is in effect merely reserved virtual memory and isn't 
actually allocated.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Carl Sturtivant via Digitalmars-d-learn

On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:

On Friday, 15 August 2014 at 15:25:23 UTC, Dicebot wrote:


No, I was referring to the proposal to supply bigger stack 
size to Fiber constructor - AFAIR it currently does allocate 
that memory eagerly (and does not use any OS CoW tools), 
doesn't it?


I thought it did, but apparently the behavior of VirtualAlloc 
and mmap (which Fiber uses to allocate the stack) simply 
reserves the range and then commits it lazily, even though what 
you've told it to do is allocate the memory.  This is really 
great news since it means that no code changes will be required 
to do the thing I wanted to do anyway.


Just read this after posting earlier replies! Very exciting.

I'll be doing some experiments to see how this works out.

What about at 32-bits?


Re: Appender is ... slow

2014-08-15 Thread Philippe Sigaud via Digitalmars-d-learn
On Fri, Aug 15, 2014 at 10:04 PM, John Colvin via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:

 compiler, version, OS, architecture, flags?

Compiler: DMD 2.065 and LDC 0.14
OS: Linux 64bits (8 cores, but there is no parallelism here)
flags: -O -release -inline (and -noboundscheck for DMD)


 Have you looked at the assembly to check that all the Appender method calls
 are being inlined?

I do not know how to look at the assembly, nor do I know how to
see if Appender method calls are being inlined.

I did spend some time with -profile and gained a nice 10% increase in
speed with that, fighting bottlenecks in my code.


Re: Faster ways to redirect stdout of child process to file

2014-08-15 Thread Thomas Mader via Digitalmars-d-learn
I found out that the redirect was not responsible for the CPU 
time; some other part of the code, totally unrelated to the 
redirect, was responsible.


I also saw that a redirect in my case is much simpler by using 
spawnProcess:


auto logFile = File("errors.log", "w");
auto pid = spawnProcess(["dmd", "myprog.d"],
                        std.stdio.stdin,
                        std.stdio.stdout,
                        logFile);


Re: Appender is ... slow

2014-08-15 Thread Jonathan M Davis via Digitalmars-d-learn

On Friday, 15 August 2014 at 16:48:10 UTC, monarch_dodra wrote:
If you are using raw GC arrays, then the raw append 
operation will outweigh the relocation cost on extension. So 
pre-allocation wouldn't really help in this situation (though 
the use of Appender *should*)


Is that because it's able to grab memory from the GC without 
actually having to allocate anything? Normally, I would have 
expected allocations to far outweigh the cost on extension and 
that preallocating would help a lot. But that would be with 
malloc or C++'s new rather than the GC, which has already 
allocated memory to reuse after it collects garbage.


- Jonathan M Davis


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread Sean Kelly via Digitalmars-d-learn

On Friday, 15 August 2014 at 20:17:51 UTC, Carl Sturtivant wrote:

On Friday, 15 August 2014 at 15:40:35 UTC, Sean Kelly wrote:


I thought it did, but apparently the behavior of VirtualAlloc 
and mmap (which Fiber uses to allocate the stack) simply 
reserves the range and then commits it lazily, even though 
what you've told it to do is allocate the memory.  This is 
really great news since it means that no code changes will be 
required to do the thing I wanted to do anyway.


Just read this after posting earlier replies! Very exciting.

I'll be doing some experiments to see how this works out.

What about at 32-bits?


I'm sure it works the same, but reserving large chunks of memory
there would eat up the address space.  I think the default will
have to remain some reasonably low number on 32-bit.


Re: Appender is ... slow

2014-08-15 Thread monarch_dodra via Digitalmars-d-learn

On Friday, 15 August 2014 at 21:24:25 UTC, Jonathan M Davis wrote:

On Friday, 15 August 2014 at 16:48:10 UTC, monarch_dodra wrote:
If you are using raw GC arrays, then the raw append 
operation will outweigh the relocation cost on extension. So 
pre-allocation wouldn't really help in this situation (though 
the use of Appender *should*)


Is that because it's able to grab memory from the GC without 
actually having to allocate anything? Normally, I would have 
expected allocations to far outweigh the cost on extension and 
that preallocating would help a lot. But that would be with 
malloc or C++'s new rather than the GC, which has already 
allocated memory to reuse after it collects garbage.


- Jonathan M Davis


It's mostly just because GC-array appending is slow. A single 
operation is itself not that slow, but if you plan to append 
10_000 elements, then the total cost will add up. A lot.


Re: core.thread.Fiber --- runtime stack overflow unlike goroutines

2014-08-15 Thread ketmar via Digitalmars-d-learn
On Fri, 15 Aug 2014 20:19:18 +
Carl Sturtivant via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:

 Should have read further down the thread --- you're right as the 
 memory is in effect merely reserved virtual memory and isn't 
 actually allocated.
and we -- 32-bit addicts -- will run out of address space while 64-bit
happy people will laugh looking at us. ;-)




Re: @safe, pure and nothrow at the beginning of a module

2014-08-15 Thread Vlad Levenfeld via Digitalmars-d-learn

On Friday, 15 August 2014 at 16:54:54 UTC, Philippe Sigaud wrote:

So I'm trying to use @safe, pure and nothrow.

If I understand correctly Adam Ruppe's Cookbook, by putting

@safe:
pure:
nothrow:

at the beginning of a module, I distribute it on all 
definitions, right? Even methods, inner classes, and so on?


Because I did just that on half a dozen of modules and the 
compiler did not complain. Does that mean my code is clean(?) 
or that what I did has no effect?


I've noticed the same thing. If I want pure and nothrow to 
propagate to inner structs and classes I have to place another 
label inside the class definition. Otherwise only free functions 
are affected.
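
A small example of the behavior described above (as observed in the
thread; this may vary across compiler versions):

@safe: pure: nothrow:

int freeFunction() { return 1; }   // picks up all three attributes

class C
{
    // the module-level labels reportedly do not reach in here,
    // so the attributes must be repeated inside the aggregate
    int method() @safe pure nothrow { return 2; }
}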




Re: Ropes (concatenation trees) for strings in D ?

2014-08-15 Thread ketmar via Digitalmars-d-learn
On Fri, 15 Aug 2014 19:04:10 -0700
Timothee Cour via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:

sounds like my C library based on this article:
http://e98cuenc.free.fr/wordprocessor/piecetable.html

i'm slowly converting my C code to D (nothing fancy yet, still C-style).

it's not a 'real' rope -- it's designed for text editing tasks, and it
allocates like crazy now, but it's pretty usable for writing rich text
editors and renderers in various GUI toolkits.

don't know when i'll finish the first working version though, it's all
a little tedious and i have a lot of other things to do. but i'll
announce it when it's done. ;-)




Re: @safe, pure and nothrow at the beginning of a module

2014-08-15 Thread Philpax via Digitalmars-d-learn

On Friday, 15 August 2014 at 23:22:27 UTC, Vlad Levenfeld wrote:
On Friday, 15 August 2014 at 16:54:54 UTC, Philippe Sigaud 
wrote:

So I'm trying to use @safe, pure and nothrow.

If I understand correctly Adam Ruppe's Cookbook, by putting

@safe:
pure:
nothrow:

at the beginning of a module, I distribute it on all 
definitions, right? Even methods, inner classes, and so on?


Because I did just that on half a dozen of modules and the 
compiler did not complain. Does that mean my code is clean(?) 
or that what I did has no effect?


I've noticed the same thing. If I want pure and nothrow to 
propagate to inner structs and classes I have to place another 
label inside the class definition. Otherwise only free 
functions are affected.


I had a similar experience when trying to use @nogc. Having to 
insert @nogc into every struct I use is mildly annoying.