On Monday, 22 September 2025 at 11:14:17 UTC, Sönke Ludwig wrote:
On 22.09.25 at 09:49, Dmitry Olshansky wrote:
On Friday, 19 September 2025 at 17:37:36 UTC, Sönke Ludwig
wrote:
So you don't support timeouts when waiting for an event at
all? Otherwise I don't see why a separate API would be
required; this should be implementable with plain Posix APIs
within vibe-core-lite itself.
Photon's API is the syscall interface. So to wait on an event
you just call poll.
Behind the scenes it will just wait on the right fd to change
state.
Now vibe-core-light wants something like read(buffer, timeout),
which is not a syscall API but may be added. But since I'm going
to add new API, I'd rather have something consistent and sane,
not just a bunch of ad hoc functions to satisfy the vibe.d
interface.
Why can't you then use poll() to, for example, implement
`ManualEvent` with timeout and interrupt support? And shouldn't
recv() with timeout be implementable the same way: poll with a
timeout and only read when ready?
Yes, recv with timeout is basically poll+recv. The problem is
that I then need to support interrupts in poll. Nothing really
changes.
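For reference, the poll+recv combination is a small function on top of the Posix API; a minimal sketch (recvTimeout is my name for it, not an existing API in either library), where under Photon the poll() call suspends the fiber instead of blocking the thread:

```d
import core.sys.posix.poll : poll, pollfd, POLLIN;
import core.sys.posix.sys.socket : recv;

// recv with timeout as poll + recv; returns 0 on timeout, <0 on
// error, otherwise the number of bytes read.
ptrdiff_t recvTimeout(int fd, void[] buf, int timeoutMs)
{
    auto pfd = pollfd(fd, POLLIN, 0);
    auto r = poll(&pfd, 1, timeoutMs);
    if (r <= 0)
        return r;
    return recv(fd, buf.ptr, buf.length, 0);
}
```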
As far as ManualEvent goes, I've implemented that with a custom
condition variable and mutex. That mutex is not interruptible, as
it's backed by a semaphore on the slow path, in the form of an
eventfd.
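The slow path is the standard eventfd-as-semaphore pattern; a sketch of the idea (not Photon's actual internals):

```d
import core.sys.linux.sys.eventfd : eventfd, EFD_SEMAPHORE;
import core.sys.posix.unistd : read, write;

// In EFD_SEMAPHORE mode, read() blocks until the counter is nonzero
// and then decrements it by one; write() adds to the counter.
struct EventfdSemaphore
{
    int fd;
    void open() { fd = eventfd(0, EFD_SEMAPHORE); }
    void post() { ulong one = 1; write(fd, &one, one.sizeof); }
    void wait() { ulong v; read(fd, &v, v.sizeof); } // blocking read
}
```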
I might create a custom mutex that is interruptible, I guess, but
then the notion of interrupts would have to be introduced to
Photon. I do not really like it.
I think we have a misunderstanding of what vibe.d is supposed
to be. It seems like you are only focused on the web/server
role, while to me vibe-core is a general-purpose I/O and
concurrency system with no particular specialization in server
tasks. With that view, your statement to me sounds like
"Clearly D is not meant to do multi-threading, since main() is
only running in a single thread".
The defaults are what is important. Go defaults to
multi-threading for instance.
D defaults to multi-threading: TLS by default is certainly a mark
of a multi-threaded environment, and std.concurrency defaults to
a new thread per spawn, which again tells me it's about
multi-threading. I intend to support multi-threading by default.
I understand that we view this issue differently.
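The spawn default is easy to check; each spawn() call gets a fresh kernel thread:

```d
import std.concurrency : spawn;
import std.stdio : writeln;
import core.thread : Thread, thread_joinAll;

void worker()
{
    writeln("worker thread: ", Thread.getThis.id);
}

void main()
{
    writeln("main thread:   ", Thread.getThis.id);
    spawn(&worker);   // prints a different thread id than main's
    thread_joinAll(); // wait for the spawned thread before exiting
}
```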
Of course, there could be a high-level component on top of
vibe-d:web that makes some opinionated assumptions on how to
structure a web application to ensure it is scalable, but that
would go against the idea of being a toolkit with functional
building blocks, as opposed to a framework that dictates your
application structure.
Agreed.
Not everything is CPU bound, and using threads "just
because" doesn't make sense either. This is especially
true because of low-level race conditions that require
special care. D's shared/immutable helps with that, but
that also means that your whole application suddenly needs
to use shared/immutable when passing data between tasks.
I’m dying to know which application that is not CPU bound
still needs to pass data between tasks that are all running
on a single thread.
Anything client side involving a user interface has plenty of
opportunities for employing secondary tasks or long-running
sparsely updated state logic that are not CPU bound. Most of
the time is spent idle there. Specific computations on the
other hand can of course still be handed off to other threads.
Latency is still going to be better if multiple cores are
utilized.
And I'm still not sure what the example is.
We are comparing fiber switches and working on data with a
shared cache and no synchronization to synchronizing data
access and control flow between threads/cores. There is such a
broad spectrum of possibilities for one of those to be faster
than the other that it's just silly to make a general statement
like that.
The thing is that if you always share data between threads, you
have to pay for that for every single data access, regardless
of whether there is actual concurrency going on or not.
Obviously, we should strive to share responsibly. Photon has
Channels, much like vibe-core has Channel. Mine are MPSC though,
mostly to model input/output range concepts.
If you want a concrete example, take a simple download dialog
with a progress bar. There is no gain in off-loading anything
to a separate thread here, since this is fully I/O bound, but
it adds quite some communication complexity if you do. CPU
performance is simply not a concern here.
Channels tame the complexity. Yes, channels could get more
expensive in a multi-threaded scenario, but we already agreed
that it's not CPU bound.
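For the download dialog specifically, a channel keeps the two tasks decoupled, and the same code keeps working if the producer later moves to another thread. A minimal sketch, written against vibe-core's Channel API for concreteness (untested):

```d
import vibe.core.channel : createChannel;
import vibe.core.core : runTask, sleep;
import core.time : msecs;
import std.stdio : writefln;

void downloadDialog()
{
    auto progress = createChannel!size_t();

    runTask({
        foreach (size_t done; 1 .. 101) {
            sleep(10.msecs);    // stands in for a network read
            progress.put(done);
        }
        progress.close();
    });

    auto ui = runTask({
        size_t done;
        // tryConsumeOne blocks until data arrives and returns false
        // once the channel is closed and drained.
        while (progress.tryConsumeOne(done))
            writefln("%d%%", done); // stands in for the progress bar
    });

    ui.join();
}
```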
But TLS variables are always "globals" in the sense that
they outlive the scope that accesses them. A modification
in one thread would obviously not be visible in another
thread, meaning that you may or may not have a semantic
connection when you access such a library sequentially from
multiple tasks.
And then there are said libraries that are not thread-safe
at all, or are bound to the thread where you initialize
them. Or handles returned from a library may be bound to
the thread that created them. Dealing with all of this just
becomes needlessly complicated and error-prone, especially
if CPU cycles are not a concern.
TLS is fine for using a non-thread-safe library - just make
sure you initialize it for all threads. I do not switch or
otherwise play dirty tricks with TLS.
The problem is that, for example, you might have a handle that
was created in thread A and is not valid in thread B, or you
set a state in thread A and thread B doesn't see that state.
This would mean that you are limited to a single task for the
complete library interaction.
Or just initialize it lazily in all threads that happen to use
it, as sketched below.
Otherwise, this basically means sticking to one thread.
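Since module-level variables in D are thread-local by default, the lazy per-thread setup is tiny (LibHandle and libOpen are hypothetical stand-ins for the library's handle type and init call):

```d
// Hypothetical stand-ins for a non-thread-safe library's handle
// type and init call.
struct LibHandle {}
LibHandle* libOpen() { return new LibHandle; }

// Module-level variables are thread-local by default in D, so each
// thread that calls lib() lazily gets its own handle.
LibHandle* tlsHandle;

LibHandle* lib()
{
    if (tlsHandle is null)
        tlsHandle = libOpen();
    return tlsHandle;
}
```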
But then it's a different handle representing a different
object - that's not the same thing. I'm not just talking about
initializing the library as a whole. But even so, there are a
lot of libraries that don't use TLS and are simply not
thread-safe at all.
Something that is not thread-safe at all is a dying breed; we
have had multi-core machines for 20 years now. Most libraries can
be initialized once per thread, which is quite naturally modeled
with a TLS handle to said library. Communicating between fibers
via a shared TLS handle is not something I would recommend
regardless of the default spawn behavior.
By robbing the user of control over where a task spawns,
you are also forcing synchronization everywhere, which can
quickly become more expensive than any benefits you would
gain from using multiple threads.
Either default kind of robs the user of control over where the
task spawns. Which is sensible; a user shouldn't really care.
This doesn't make sense; in the original vibe-core, you can
simply choose between spawning in the same thread or in "any"
thread. `shared`/`immutable` is correctly enforced in the
latter case to avoid unintended data sharing.
I have go and goOnSameThread. Guess which is the encouraged
option.
Does go() enforce proper use of shared/immutable when passing
data to the scheduled "go routine"?
It goes with the same API as we have for threads - a delegate -
so sharing becomes the user's responsibility. I may add a
function + args overload for better handling of resources passed
to the lambda.
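A sketch of what that overload could look like (goChecked is a hypothetical name, the photon module path is assumed, and the check mirrors what std.concurrency uses for spawn):

```d
import std.traits : hasUnsharedAliasing;
import photon : go; // module path assumed

// Reject argument types that could smuggle unshared mutable state
// into the spawned task.
void goChecked(F, Args...)(F fn, Args args)
{
    static assert(!hasUnsharedAliasing!Args,
        "arguments must be immutable, shared, or free of aliasing");
    go({ fn(args); });
}
```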
Finally, in the case of web applications, in my opinion the
better approach for using multiple CPU cores is *usually*
by running multiple *processes* in parallel, as opposed to
multiple threads within a single process. Of course, every
application is different and there is no one-size-fits-all
approach.
There we differ: not only is load balancing simpler within a
single application, but processes are also more expensive.
The current D GC situation kind of sucks on multi-threaded
workloads, but that is the only reason to go multi-process
IMHO.
The GC/malloc is the main reason why this is mostly false in
practice, but it extends to any central contention source
within the process. Yes, often you can avoid that, but often
that takes a lot of extra work, and processes sidestep the
issue in the first place.
As is observable from looking at other languages and runtimes,
malloc is not the bottleneck it used to be. Our particular
version of the GC, which doesn't have thread caches, is a
bottleneck.
malloc() will also always be a bottleneck with the right load.
Just the n-times-larger amount of virtual address space
required may start to become an issue for memory-heavy
applications. But even if we ignore that, ruling out using the
existing GC doesn't sound like a good idea to me.
The existing GC is basically 20+ years old; of course we need a
better GC, and thread-cached allocation solves contention in
multi-threaded environments.
An alternative memory allocator is doing great on 320-core
machines. I cannot tell you which allocator that is or what
exactly these servers are. Though even jemalloc does okayish.
And the fact is that, even with relatively mild GC use, a web
application will not scale properly with many cores.
I only partially agree: Java's GC handles load just fine and runs
faster than vibe.d(-light), and it does allocations on its
serving code path.
Also, in the usual case where the threads don't have to
communicate with each other (apart from memory allocation
synchronization), a separate process per core isn't any
slower - except maybe when hyper-threading is in play, but
whether that helps or hurts performance always depends on the
concrete workload.
The fact that a cross-process context switch has to swap the
virtual address space (flushing TLB entries) does add a bit of
overhead. Though to be certain of anything, there had better be
a benchmark.
There is no context switch involved with each process running
on its own core.
Yeah, pinning down cores works, I stand corrected.
Separate processes also have the advantage of being more robust
and enabling seamless restarts and updates of the executable.
And they facilitate an application design that lends itself
to scaling across multiple machines.
Then give me the example code to run multiple vibe.d instances
in parallel processes (should be similar to runDist) and we can
compare approaches. For all I know, it could be faster than
multi-threaded vibe.d-light. Also, honestly, if vibe.d's target
is multiple processes, it should probably start like that by
default.
Again, the "default" is a high-level issue and none of
vibe-core's business. The simplest way to have that work is to
use `HTTPServerOption.reusePort` and then start as many
processes as desired.
So I did just that. To my surprise it indeed speeds up all of my
D server examples.
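For reference, the per-process setup amounts to something like this (a minimal sketch; port and handler are placeholders), started once per core and optionally pinned to a core with taskset:

```d
import vibe.core.core : runApplication;
import vibe.http.server;

void main()
{
    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    // All processes bind the same port; the kernel distributes
    // incoming connections among them via SO_REUSEPORT.
    settings.options |= HTTPServerOption.reusePort;

    listenHTTP(settings, (req, res) {
        res.writeBody("Hello, World!");
    });

    runApplication();
}
```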
The speed-ups from running one process per core are roughly:

cores   vibe-http-light   vibe-http-classic   photon-http
8       1.14              1.33                1.15
12      1.10              1.45                1.10
16      1.08              1.60                1.09
24      1.05              2.54                1.05
32      1.06              4.44                1.07
48      1.07              8.56                1.04
We should absolutely tweak the vibe.d TechEmpower benchmark to
run vibe.d as a process per core! As far as the Photon-powered
versions go, I see there is a point where per-process becomes
less of a gain with more cores, so I would think there are two
factors at play, one positive and one negative, with the
negative one tied to the number of processes.
Lastly, I have found opportunities to speed up vibe-http even
without switching to vibe-core-light. Will send PRs.