Re: Go's march to low-latency GC

2016-09-26 Thread Dejan Lekic via Digitalmars-d

On Saturday, 9 July 2016 at 23:14:38 UTC, ZombineDev wrote:


https://github.com/dlang/druntime/blob/master/src/gc/gcinterface.d
https://github.com/dlang/druntime/blob/master/src/gc/impl/manual/gc.d

What else do you need to start working on a new GC 
implementation?


That is actually the only case I know of where an interface was 
provided to be implemented by third parties... My reply was about 
Phobos in general. To repeat: Phobos should provide the APIs 
(interfaces) *and* reference implementations of those.


Re: Go's march to low-latency GC

2016-07-16 Thread The D dude via Digitalmars-d

On Saturday, 16 July 2016 at 11:02:00 UTC, thedeemon wrote:

On Thursday, 14 July 2016 at 10:58:47 UTC, Istvan Dobos wrote:
I was thinking, okay, so D's GC seems to have turned out not that 
great. But how about the idea of transplanting Rust's 
ownership system instead of trying to make the GC better?


This requires drastically changing 99% of the language, and it 
brings not just the benefits but also all the pain that comes 
with this ownership system. Productivity goes down, the learning 
curve goes up. And it will be a very different language in the 
end, so you might want to just use Rust instead of trying to 
make D another Rust.


Yes, that's the case for Rust, but no one has proven yet that an 
ownership system needs to be such a pain.


In fact someone recently proposed an idea for a readable 
ownership system:


http://forum.dlang.org/post/ensdiijttlpcwuhdf...@forum.dlang.org

and I believe that it's quite possible to improve over Rust while 
still having a productive language. In fact, the simple `scope` 
statements are an excellent first step on this journey ;-)
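For a concrete taste of that direction, here is a minimal sketch, assuming 
a compiler with the experimental DIP1000 `scope` checks enabled (-dip1000 
at the time of this thread, -preview=dip1000 in later releases): a `scope` 
parameter promises not to let the reference escape, and the compiler 
rejects code that tries to leak it. This is a small, local form of borrow 
checking rather than a full ownership system.

int* leaked;

void borrow(scope int* p) @safe
{
    int tmp = *p;       // reading through the borrowed pointer is fine
    // leaked = p;      // rejected under DIP1000: a scope pointer may not escape
}

void main() @safe
{
    int local = 42;
    borrow(&local);     // passing the address of a local is OK, because
                        // borrow() has promised (via scope) not to keep it
}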


Re: Go's march to low-latency GC

2016-07-16 Thread thedeemon via Digitalmars-d

On Thursday, 14 July 2016 at 10:58:47 UTC, Istvan Dobos wrote:
I was thinking, okay, so D's GC seems to have turned out not that 
great. But how about the idea of transplanting Rust's ownership 
system instead of trying to make the GC better?


This requires drastically changing 99% of the language, and it 
brings not just the benefits but also all the pain that comes with 
this ownership system. Productivity goes down, the learning curve 
goes up. And it will be a very different language in the end, so 
you might want to just use Rust instead of trying to make D 
another Rust.


Re: Go's march to low-latency GC

2016-07-14 Thread Istvan Dobos via Digitalmars-d
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu 
wrote:

On 7/7/16 6:36 PM, Enamex wrote:

https://news.ycombinator.com/item?id=12042198

^ reposting a link in the right place.


A very nice article and success story. We've had similar 
stories with several products at Facebook. There is of course 
the opposite view - an orders-of-magnitude improvement means 
there was quite a lot of waste just before that.


I wish we could amass the experts able to make similar things 
happen for us.



Andrei



Hello Andrei,

This may only be slightly related, but when you talked about D vs Go 
vs Rust in that Quora answer (here: 
https://www.quora.com/Which-language-has-the-brightest-future-in-replacement-of-C-between-D-Go-and-Rust-And-Why/answer/Andrei-Alexandrescu), I was thinking, okay, so D's GC seems to have turned out not that great. But how about the idea of transplanting Rust's ownership system instead of trying to make the GC better?


Disclaimer: I know very little about D's possibly similar 
mechanisms.


Thanks,
Istvan



Re: Go's march to low-latency GC

2016-07-12 Thread deadalnix via Digitalmars-d

On Tuesday, 12 July 2016 at 13:28:33 UTC, jmh530 wrote:
On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad 
wrote:


You can, but OSes usually give you randomized memory layout as 
a security measure.


What if the memory allocation scheme were something like: 
randomly pick memory locations below some threshold from the 
32bit segment and then above the threshold pick from elsewhere?


There is an mmap flag for this on Linux.
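For the curious, a rough sketch of what that flag does, assuming x86-64 
Linux; the flag values are spelled out inline in case the druntime module 
at hand doesn't expose them (they match <sys/mman.h>):

import core.sys.posix.sys.mman : mmap, MAP_FAILED, MAP_PRIVATE,
                                 PROT_READ, PROT_WRITE;

enum MAP_ANONYMOUS = 0x20;  // Linux: anonymous mapping, no backing file
enum MAP_32BIT     = 0x40;  // Linux/x86-64: place the mapping in the low 2 GiB

// Returns writable memory whose address fits in 32 bits, or null on failure.
void* lowAlloc(size_t bytes)
{
    auto p = mmap(null, bytes, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    return p is MAP_FAILED ? null : p;
}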


Re: Go's march to low-latency GC

2016-07-12 Thread Ola Fosheim Grøstad via Digitalmars-d

On Tuesday, 12 July 2016 at 13:28:33 UTC, jmh530 wrote:
On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad 
wrote:


You can, but OSes usually give you randomized memory layout as 
a security measure.


What if the memory allocation scheme were something like: 
randomly pick memory locations below some threshold from the 
32bit segment and then above the threshold pick from elsewhere?


One possible technique is to use contiguous "unmapped" memory 
areas that cover your worst-case number of elements with a 
specific base and just use indexes instead of absolute 
addressing. That way you can often get away with 16-bit typed 
addressing (assuming a maximum of 65535 objects of a given type, 
plus a null index).


The base address may then be injected (during linking or by using 
self-modifying code if the OS allows it) into the code segments. 
Or you could use TLS + indexing, or whatever the OS supports.


Using global 64-bit pointers is just for generality and to keep 
the language implementation simple. It is not strictly hardware 
related if you have an MMU, nor directly related to machine 
language as such. For a statically typed language you could 
probably get away with 16 or 32 bits for typed pointers most of 
the time, if the OS and language don't make it difficult (like 
the conservative D GC scan does).
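As a back-of-the-envelope sketch of the 16-bit typed-addressing idea 
(illustrative names only): keep every object of a given type in one 
growable pool with a fixed base and store indexes instead of absolute 
addresses, reserving index 0 as the null value.

struct Node
{
    int value;
    ushort next;   // 16-bit "pointer" to another Node in the same pool; 0 means null
}

struct NodePool
{
    Node[] slots;

    ushort alloc(int value, ushort next = 0)
    {
        if (slots.length == 0)
            slots ~= Node.init;                  // reserve slot 0 as the null index
        assert(slots.length < ushort.max, "pool full (max 65535 nodes)");
        slots ~= Node(value, next);
        return cast(ushort)(slots.length - 1);
    }

    ref Node deref(ushort idx) { return slots[idx]; }
}

unittest
{
    NodePool pool;
    auto tail = pool.alloc(2);            // gets index 1
    auto head = pool.alloc(1, tail);      // gets index 2 and "points" at tail
    assert(pool.deref(pool.deref(head).next).value == 2);
}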





Re: Go's march to low-latency GC

2016-07-12 Thread jmh530 via Digitalmars-d
On Monday, 11 July 2016 at 17:23:49 UTC, Ola Fosheim Grøstad 
wrote:


You can, but OSes usually give you randomized memory layout as 
a security measure.


What if the memory allocation scheme were something like: 
randomly pick memory locations below some threshold from the 
32bit segment and then above the threshold pick from elsewhere?


Re: Go's march to low-latency GC

2016-07-12 Thread Kagamin via Digitalmars-d

On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:
(it's more about PAE but the reasons why 64 bits is a good 
thing in general are the same: address space!)


And what's with address space?


Re: Go's march to low-latency GC

2016-07-11 Thread deadalnix via Digitalmars-d

On Monday, 11 July 2016 at 13:05:09 UTC, Russel Winder wrote:

Agreed. I don't know why the golang guys bother with it.


Because they have nothing else to propose than a massive goroutine 
orgy, so they kind of have to make it work.



Maybe because they are developing a language for the 1980s?

;-)


It's not like they are using the Plan9 toolchain...

Oh wait...



Re: Go's march to low-latency GC

2016-07-11 Thread Ola Fosheim Grøstad via Digitalmars-d

On Monday, 11 July 2016 at 17:14:17 UTC, jmh530 wrote:

On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:


Because of attitudes like the ones shown in that thread
https://forum.dlang.org/post/ilbmfvywzktilhskp...@forum.dlang.org
from people who do not really understand why 32-bit systems are 
really problematic even if the apps don't use more than 2 GiB 
of memory.


Here's Linus Torvalds' classic rant about 64-bit:
https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
(it's more about PAE but the reasons why 64 bits is a good thing 
in general are the same: address space!)


Why can't you use both 32bit and 64bit pointers when compiling 
for x86_64?


My guess would be that using 64bit registers precludes the use 
of 32bit registers.


You can, but OSes usually give you randomized memory layout as a 
security measure.




Re: Go's march to low-latency GC

2016-07-11 Thread jmh530 via Digitalmars-d

On Monday, 11 July 2016 at 13:13:02 UTC, Patrick Schluter wrote:


Because of attitudes like the ones shown in that thread
https://forum.dlang.org/post/ilbmfvywzktilhskp...@forum.dlang.org
from people who do not really understand why 32-bit systems are 
really problematic even if the apps don't use more than 2 GiB 
of memory.


Here's Linus Torvalds' classic rant about 64-bit:
https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
(it's more about PAE but the reasons why 64 bits is a good thing 
in general are the same: address space!)


Why can't you use both 32bit and 64bit pointers when compiling 
for x86_64?


My guess would be that using 64bit registers precludes the use of 
32bit registers.


Re: Go's march to low-latency GC

2016-07-11 Thread Ola Fosheim Grøstad via Digitalmars-d

On Monday, 11 July 2016 at 13:05:09 UTC, Russel Winder wrote:

Maybe because they are developing a language for the 1980s?

;-)


It is quite common for web services to run with less than 1GB. 
64bit would be very wasteful.




Re: Go's march to low-latency GC

2016-07-11 Thread Patrick Schluter via Digitalmars-d

On Monday, 11 July 2016 at 12:21:04 UTC, Sergey Podobry wrote:

On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:

On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:


Remember that virtual address space is limited on 32-bit 
platforms. Thus spawning 2000 threads with a 1 MB stack each will 
occupy all available VA space and you'll get an allocation 
failure (even if the real memory usage is low).


Sorry, but someone who tries to run highly concurrent server 
software with thousands of fibers on a 32-bit platform is quite 
unwise, and there is no point in taking such a use case into 
account. 32-bit has its own niche with different kinds of 
concerns.


Agreed. I don't know why the golang guys bother with it.


Because of attitudes like the ones shown in that thread
https://forum.dlang.org/post/ilbmfvywzktilhskp...@forum.dlang.org
from people who do not really understand why 32-bit systems are 
really problematic even if the apps don't use more than 2 GiB of 
memory.


Here's Linus Torvalds' classic rant about 64-bit:
https://cl4ssic4l.wordpress.com/2011/05/24/linus-torvalds-about-pae/
(it's more about PAE but the reasons why 64 bits is a good thing 
in general are the same: address space!)


Re: Go's march to low-latency GC

2016-07-11 Thread Russel Winder via Digitalmars-d
On Mon, 2016-07-11 at 12:21 +, Sergey Podobry via Digitalmars-d
wrote:
> On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:
> > On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:
> > > 
> > > Remember that virtual address space is limited on 32-bit 
> > > platforms. Thus spawning 2000 threads with a 1 MB stack each will 
> > > occupy all available VA space and you'll get an allocation 
> > > failure (even if the real memory usage is low).
> > 
> > Sorry, but someone who tries to run highly concurrent server 
> > software with thousands of fibers on a 32-bit platform is quite 
> > unwise, and there is no point in taking such a use case into 
> > account. 32-bit has its own niche with different kinds of 
> > concerns.
> 
> Agreed. I don't know why the golang guys bother with it.

Maybe because they are developing a language for the 1980s?

;-)

-- 

Russel.


Re: Go's march to low-latency GC

2016-07-11 Thread Sergey Podobry via Digitalmars-d

On Monday, 11 July 2016 at 11:23:26 UTC, Dicebot wrote:

On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:


Remember that virtual address space is limited on 32-bit 
platforms. Thus spawning 2000 threads with a 1 MB stack each will 
occupy all available VA space and you'll get an allocation 
failure (even if the real memory usage is low).


Sorry, but someone who tries to run highly concurrent server 
software with thousands of fibers on a 32-bit platform is quite 
unwise, and there is no point in taking such a use case into 
account. 32-bit has its own niche with different kinds of 
concerns.


Agreed. I don't know why the golang guys bother with it.


Re: Go's march to low-latency GC

2016-07-11 Thread Dicebot via Digitalmars-d

On Sunday, 10 July 2016 at 19:49:11 UTC, Sergey Podobry wrote:

On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:
Nope, this is exactly the point. You can demand a crazy 10 MB of 
stack for each fiber and only the actually used part will be 
allocated by the kernel.


Remember that virtual address space is limited on 32-bit 
platforms. Thus spawning 2000 threads with a 1 MB stack each will 
occupy all available VA space and you'll get an allocation 
failure (even if the real memory usage is low).


Sorry, but someone who tries to run highly concurrent server 
software with thousands of fibers on a 32-bit platform is quite 
unwise, and there is no point in taking such a use case into 
account. 32-bit has its own niche with different kinds of 
concerns.


Re: Go's march to low-latency GC

2016-07-10 Thread Martin Nowak via Digitalmars-d
On Saturday, 9 July 2016 at 23:12:10 UTC, Andrei Alexandrescu 
wrote:
Yah, I was thinking in a more general sense. Plenty of 
improvements of all kinds are within reach. -- Andrei


Yes, but hardly anything that would allow us to do partial 
collections.
And without that you always have to scan the full live heap; this 
can't scale to bigger heaps, and there is no way to scan a GB-sized 
heap fast. So either we make it easy to get by with a small GC 
heap, i.e. more deterministic MM, or we spend a lot of time making 
some partial collection algorithm work. Ideally we do both, but 
the former is the simpler goal.


The connectivity-based GC would be a realistic goal as well, only 
somewhat more complex than the precise GC. But it's unclear how 
well it will work for typical applications.


Re: Go's march to low-latency GC

2016-07-10 Thread Sergey Podobry via Digitalmars-d

On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:

On 07/09/2016 02:48 AM, ikod wrote:

If I make a wrong guess and ask for too small a stack then the 
program may crash. If I ask for too large a stack then I probably 
waste resources.


Nope, this is exactly the point. You can demand a crazy 10 MB of 
stack for each fiber and only the actually used part will be 
allocated by the kernel.


Remember that virtual address space is limited on 32-bit 
platforms. Thus spawning 2000 threads with a 1 MB stack each will 
occupy all available VA space and you'll get an allocation failure 
(even if the real memory usage is low).
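(For concreteness: 2000 fibers times 1 MB of reserved stack is about 
2 GB of address space, which already matches the entire user-mode 
portion of a typical 32-bit process, roughly 2 GB on Windows and 3 GB 
on Linux, even if only a few megabytes are ever resident.)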


Re: Go's march to low-latency GC

2016-07-09 Thread ZombineDev via Digitalmars-d

On Saturday, 9 July 2016 at 21:25:34 UTC, Dejan Lekic wrote:
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu 
wrote:
I wish we could amass the experts able to make similar things 
happen for us.


I humbly believe it is not just about amassing experts, but 
also about making it easy to do experiments. Phobos/druntime 
should provide a set of APIs for literally everything so people 
can do their own implementations of ANY standard library 
module(s). I wish D offered module interfaces the same way 
Modula-3 did...


To work on a new GC in D one needs to remove the old one and 
replace it with his/her new implementation, while with the 
competition it is more or less a matter of implementing a few 
interfaces and instructing the compiler to use the new GC...


https://github.com/dlang/druntime/blob/master/src/gc/gcinterface.d
https://github.com/dlang/druntime/blob/master/src/gc/impl/manual/gc.d

What else do you need to start working on a new GC implementation?


Re: Go's march to low-latency GC

2016-07-09 Thread Andrei Alexandrescu via Digitalmars-d

On 07/09/2016 03:42 PM, Martin Nowak wrote:

We sort of have an agreement that we don't want to pay 5% for write
barriers, so the common algorithmic GC improvements aren't available for
us.


Yah, I was thinking in a more general sense. Plenty of improvements of 
all kinds are within reach. -- Andrei


Re: Go's march to low-latency GC

2016-07-09 Thread Dejan Lekic via Digitalmars-d
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu 
wrote:
I wish we could amass the experts able to make similar things 
happen for us.


I humbly believe it is not just about amassing experts, but also 
about making it easy to do experiments. Phobos/druntime should 
provide a set of APIs for literally everything so people can do 
their own implementations of ANY standard library module(s). I 
wish D offered module interfaces the same way Modula-3 did...


To work on a new GC in D one needs to remove the old one and 
replace it with his/her new implementation, while with the 
competition it is more or less a matter of implementing a few 
interfaces and instructing the compiler to use the new GC...


Re: Go's march to low-latency GC

2016-07-09 Thread Martin Nowak via Digitalmars-d
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu 
wrote:

On 7/7/16 6:36 PM, Enamex wrote:

https://news.ycombinator.com/item?id=12042198

^ reposting a link in the right place.


A very nice article and success story. We've had similar 
stories with several products at Facebook. There is of course 
the opposite view - an orders-of-magnitude improvement means 
there was quite a lot of waste just before that.


Exactly; how someone can run a big site with 2-second pauses in 
the GC code is beyond me.


I wish we could amass the experts able to make similar things 
happen for us.


We sort of have an agreement that we don't want to pay 5% for 
write barriers, so the common algorithmic GC improvements aren't 
available to us.
There is still connectivity-based GC [¹], which is an interesting 
idea, but AFAIK it hasn't been widely tried.
Maybe someone has an idea for optional write barriers, i.e. zero 
cost if you don't use them. Or we agree that it's worth having 
different, incompatible binaries.


[¹]: https://www.cs.purdue.edu/homes/hosking/690M/cbgc.pdf

In any case, now that we have made the GC pluggable, we should 
port the forking GC. It has almost no latency, at the price of 
higher peak memory usage and reduced throughput, the same 
trade-offs you have with any concurrent mark phase.
Moving the sweeping to background GC threads is something we 
should be doing anyhow.
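For readers unfamiliar with the idea, here is a minimal POSIX-only sketch 
of the fork-based approach (an illustration, not the actual druntime port; 
markSnapshot() is a hypothetical placeholder): fork() hands the child a 
copy-on-write snapshot of the heap, so the child can mark that frozen 
image while the parent's mutator threads keep running.

import core.sys.posix.unistd : fork, _exit;
import core.sys.posix.sys.wait : waitpid;

void markSnapshot() { /* walk the roots and mark the frozen heap image */ }

void collectConcurrently()
{
    auto pid = fork();          // child sees a copy-on-write snapshot of the heap
    if (pid == 0)
    {
        markSnapshot();         // marking runs here without pausing the parent
        _exit(0);               // a real collector would ship the mark bits back first
    }
    // Parent: mutators keep running while the child marks. Every page the
    // parent writes to is duplicated by the kernel (CoW), which is where the
    // higher peak memory usage comes from.
    int status;
    waitpid(pid, &status, 0);   // eventually reap the child
}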


Overall I think we should focus more on good deterministic MM 
alternatives, rather than investing years of engineering into our 
GC, or hoping for silver bullets.




Re: Go's march to low-latency GC

2016-07-09 Thread bob belcher via Digitalmars-d
On Saturday, 9 July 2016 at 17:41:59 UTC, Andrei Alexandrescu 
wrote:

On 7/7/16 6:36 PM, Enamex wrote:

https://news.ycombinator.com/item?id=12042198

^ reposting a link in the right place.


A very nice article and success story. We've had similar 
stories with several products at Facebook. There is of course 
the opposite view - an orders-of-magnitude improvement means 
there was quite a lot of waste just before that.


I wish we could amass the experts able to make similar things 
happen for us.



Andrei


a kickstarter to improve the GC :)


Re: Go's march to low-latency GC

2016-07-09 Thread Chris Wright via Digitalmars-d
On Fri, 08 Jul 2016 22:35:05 +0200, Martin Nowak wrote:

> On 07/08/2016 07:45 AM, ikod wrote:
>> Correct me if I'm wrong, but in D fibers allocate stack statically, so
>> we have to preallocate large stacks.
>> 
>> If yes - can we allocate stack frames on demand from some non-GC area?
> 
> Fiber stacks are just mapped virtual memory pages that the kernel only
> backs with physical memory when they're actually used. So they already
> are allocated on demand.

The downside is that it's difficult to release that memory. On the other 
hand, Go had a lot of problems with its implementation in part because it 
released memory. At some point you start telling users: if you want a 
fiber that does a huge recursion, dispose of it when you're done. It's 
cheap enough to create another fiber later.


Re: Go's march to low-latency GC

2016-07-09 Thread Andrei Alexandrescu via Digitalmars-d

On 7/7/16 6:36 PM, Enamex wrote:

https://news.ycombinator.com/item?id=12042198

^ reposting a link in the right place.


A very nice article and success story. We've had similar stories with 
several products at Facebook. There is of course the opposite view - an 
orders-of-magnitude improvement means there was quite a lot of waste 
just before that.


I wish we could amass the experts able to make similar things happen for us.


Andrei



Re: Go's march to low-latency GC

2016-07-09 Thread ikod via Digitalmars-d

On Saturday, 9 July 2016 at 13:48:41 UTC, Dicebot wrote:

On 07/09/2016 02:48 AM, ikod wrote:

If I make a wrong guess and ask for too small a stack then the 
program may crash. If I ask for too large a stack then I probably 
waste resources.


Nope, this is exactly the point. You can demand a crazy 10 MB of 
stack for each fiber and only the actually used part will be 
allocated by the kernel.


Thanks, nice to know.


Re: Go's march to low-latency GC

2016-07-09 Thread Dicebot via Digitalmars-d
On 07/09/2016 02:48 AM, ikod wrote:
> If I make a wrong guess and ask for too small a stack then the program
> may crash. If I ask for too large a stack then I probably waste
> resources.

Nope, this is exactly the point. You can demand a crazy 10 MB of stack for
each fiber and only the actually used part will be allocated by the kernel.


Re: Go's march to low-latency GC

2016-07-08 Thread ikod via Digitalmars-d

On Friday, 8 July 2016 at 20:35:05 UTC, Martin Nowak wrote:

On 07/08/2016 07:45 AM, ikod wrote:
Correct me if I'm wrong, but in D fibers allocate stack 
statically, so we have to preallocate large stacks.


If yes - can we allocate stack frames on demand from some 
non-GC area?


Fiber stacks are just mapped virtual memory pages that the 
kernel only backs with physical memory when they're actually 
used. So they already are allocated on demand.


But the size of a fiber's stack is fixed? When we call the Fiber 
constructor, the second parameter is the stack size. If I make a 
wrong guess and ask for too small a stack then the program may 
crash. If I ask for too large a stack then I probably waste 
resources. So it would be nice if the programmer were not forced 
to make any wrong decisions about a fiber's stack size.


Or maybe I'm wrong and I shouldn't care about the stack size when 
I create a new fiber?
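For reference, a small sketch of the knob being discussed, using 
core.thread's Fiber; the 10 MB figure just echoes the example above, and 
the exact default stack size is a druntime implementation detail:

import core.thread : Fiber;

void task()
{
    // deep recursion or large stack frames would go here
}

void main()
{
    // Reserve a generous 10 MB stack for this fiber. The pages are only backed
    // by physical memory once the fiber actually touches them, so on a 64-bit
    // system over-reserving mostly costs address space rather than RAM.
    enum stackSize = 10 * 1024 * 1024;
    auto fib = new Fiber(&task, stackSize);
    fib.call();
}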


Re: Go's march to low-latency GC

2016-07-08 Thread Martin Nowak via Digitalmars-d
On 07/08/2016 07:45 AM, ikod wrote:
> Correct me if I'm wrong, but in D fibers allocate stack statically, so
> we have to preallocate large stacks.
> 
> If yes - can we allocate stack frames on demand from some non-GC area?

Fiber stacks are just mapped virtual memory pages that the kernel only
backs with physical memory when they're actually used. So they already
are allocated on demand.


Re: Go's march to low-latency GC

2016-07-07 Thread ikod via Digitalmars-d

On Thursday, 7 July 2016 at 22:36:29 UTC, Enamex wrote:

https://news.ycombinator.com/item?id=12042198

^ reposting a link in the right place.


While a program using 10,000 OS threads might perform poorly, 
that number of goroutines is nothing unusual. One difference is 
that a goroutine starts with a very small stack — only 
2kB — which grows as needed, contrasted with the large 
fixed-size stacks that are common elsewhere. Go’s function call 
preamble makes sure there’s enough stack space for the next 
call, and if not will move the goroutine’s stack to a larger 
memory area — rewriting pointers as needed — before allowing 
the call to continue.


Correct me if I'm wrong, but in D fibers allocate stack 
statically, so we have to preallocate large stacks.


If yes - can we allocate stack frames on demand from some non-GC 
area?