Re: Interrupts and hyperthreading.

2022-07-03 Thread Wojciech Kudla
as configured his/her > system for low noise/jitter/latency, this serialization at kernel mode > switch time among sibling threads should not occur, correct? > > On Sun, Jul 3, 2022, 1:57 AM Wojciech Kudla > wrote: > >> @Peter >> >> > Does the CPU only need

Re: Interrupts and hyperthreading.

2022-07-03 Thread Wojciech Kudla
Thanks Avi, that's interesting. I was under the impression that the Intel doc was referring to x86 specifically. I'm a bit confused about the discrepancy between how that document describes the mitigation and your comment. Could you point to a resource or kernel source file that sheds more

Re: Interrupts and hyperthreading.

2022-07-03 Thread Wojciech Kudla
@Peter > Does the CPU only need to serialize the transition, or does it need to serialize the interrupt/systemcall while it is in ring 0? Sadly yes, the kernel does need to temporarily "idle" the sibling thread while the other is in ring 0. This is described as transitioning from state 6a/6b to

Re: Interrupts and hyperthreading.

2022-07-02 Thread Wojciech Kudla
> If you are using a hyper-threading and there is an interrupt or a system call on one logical core, then the hyper-sibling will stall as well because both need access to the kernel (mode switch). This is kind of correct. That is, for cases where the kernel implements microarchitectural data

MMU gang wars: the TLB drive-by shootdown

2020-05-15 Thread Wojciech Kudla
Hi, A quick shameless plug. Just posted an article on TLB shootdowns and their impact on latency-sensitive applications. It's a result of a fragile balance between the subject's complexity and my own knowledge gaps in the matter. Hope it's

Where has PrintGCApplicationStoppedTime gone?

2020-05-14 Thread Wojciech Kudla
Hi, While evaluating different post-Java 8 JVMs I noticed that PrintGCApplicationStoppedTime is an unsupported option now. According to Chris Newland's awesome list of JVM options it was removed in Java 9, never to return. Was

Re: does call site polymorphism factor in method overrides?

2019-12-30 Thread Wojciech Kudla
Hi Brian, I think I can safely assume your question was dictated by (perfectly valid) concerns about method dispatch cost in extremely latency-sensitive sections of code. After all, we used to work together on the same problem space in the same institution only weeks ago. Vitaly provided a

Re: Re: JMeter and HdrHistogram Integration

2019-12-21 Thread Wojciech Kudla
> However, the messaging platform deals with transferring pub/sub messages that can vary from 1KB to 1MB in size – that is decidedly **not** in the microsecond realm, even if it were on an Infiniband network communicating via ibverbs I'm afraid I have to disagree with this statement. Contemporary

Re: Supermicro SYS-1029UX-LL1-S16 system clock running too fast

2019-09-03 Thread Wojciech Kudla
I have a few issues with the post you're referring to. It seems to completely ignore the existence of constant_tsc and invariant_tsc. The problems it talks about do not exist on modern day platforms; it's just an outdated post. http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
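A quick way to see whether a given box advertises these TSC properties is to look at the flags line in /proc/cpuinfo. The sketch below assumes an x86 Linux host and assumes the flag names as the kernel prints them there: constant_tsc, plus nonstop_tsc, which to my understanding is how the invariant TSC capability shows up. The function name tsc_flags is made up for illustration.

```python
def tsc_flags(cpuinfo_text):
    """Report which TSC-related flags the first CPU entry advertises.

    cpuinfo_text is the content of /proc/cpuinfo (x86 Linux assumed).
    The flag names below are assumptions about how the kernel labels them.
    """
    wanted = ("constant_tsc", "nonstop_tsc")
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme ... constant_tsc ..."
            flags = set(line.split(":", 1)[1].split())
            return {name: name in flags for name in wanted}
    # No flags line found (e.g. not an x86 host).
    return {name: False for name in wanted}


# Usage on a live system:
#     with open("/proc/cpuinfo") as f:
#         print(tsc_flags(f.read()))
```

If both come back True, the drift and frequency-scaling concerns described in older posts are unlikely to apply to that machine.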

Re: RSS and CPU selection

2019-03-15 Thread Wojciech Kudla
l be processed by a random > CPU' > > > > On Friday, March 15, 2019 at 5:08:53 PM UTC+2, Wojciech Kudla wrote: >> >> The rx-queue to CPU affinity you're referring to should remain fairly >> static if the packets are getting processed fast enough. >> If you are i

Re: RSS and CPU selection

2019-03-15 Thread Wojciech Kudla
The rx-queue to CPU affinity you're referring to should remain fairly static if the packets are getting processed fast enough. If you are interested in controlling this behavior you can manipulate receive flow hash indirection tables (ethtool -x), IRQ affinities
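For the IRQ affinity part, the kernel takes a hex CPU bitmask written to /proc/irq/<irq>/smp_affinity. Below is a minimal sketch of building that mask from a list of CPU ids; the helper name cpus_to_affinity_mask is hypothetical, and the IRQ number for a given rx-queue has to be looked up separately (e.g. in /proc/interrupts).

```python
def cpus_to_affinity_mask(cpus):
    """Build the hex bitmask string accepted by /proc/irq/<irq>/smp_affinity."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU ids must be non-negative")
        mask |= 1 << cpu
    digits = format(mask, "x")
    # Masks wider than 32 bits are written as comma-separated 32-bit groups,
    # most significant group first.
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


# Usage (needs root; <irq> is a placeholder, not a real number):
#     echo <mask> > /proc/irq/<irq>/smp_affinity
```

For example, pinning an rx-queue interrupt to CPUs 2 and 3 means writing the mask for bits 2 and 3.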

Re: Concurrent retrieval of statistics

2018-10-16 Thread Wojciech Kudla
I can only speak from my own experience. For data generated by latency critical threads you will probably want to have a simple SPSC buffer per thread meaning no competition between producers so fewer cycles lost on cache coherence. The consumer could just iterate over all known buffers and drain
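A rough sketch of that layout, under stated assumptions: each producer thread registers once and gets its own buffer, and one consumer walks all known buffers and drains them. collections.deque stands in here for a real bounded SPSC ring buffer, and the names StatsBuffers, register and drain are invented for illustration.

```python
from collections import deque


class StatsBuffers:
    """One buffer per producer thread, drained by a single consumer."""

    def __init__(self):
        self._buffers = {}

    def register(self, producer_name):
        # Call once per producer before it starts publishing.
        # Each producer then appends only to its own buffer,
        # so producers never compete with each other.
        buf = deque()
        self._buffers[producer_name] = buf
        return buf

    def drain(self):
        # Single consumer: visit every known buffer and take what is there.
        drained = []
        for name, buf in self._buffers.items():
            while True:
                try:
                    drained.append((name, buf.popleft()))
                except IndexError:
                    break  # this buffer is empty, move to the next one
        return drained
```

A producer would call buf.append(sample) on its hot path, and the statistics thread would call drain() periodically. In a latency-critical system the deque would be replaced by a preallocated ring buffer so that the producer never allocates.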

Re: Throughput test of OpenHFT networking

2018-05-12 Thread Wojciech Kudla
It's probably not the response that you were hoping to see but I'd avoid testing for performance using the loopback interface. There are whole parts of the network stack omitted by the Linux kernel in such scenarios. Not to mention that OpenHFT may employ socket mechanics different from what's

Re: Exclusive core for a process, is it reasonable?

2018-04-09 Thread Wojciech Kudla
ally, there's more to gain from shaving off latency on network >> paths than there is from affinitizing work to cores/dies. But that's >> digressing from the OP. >> > > @Wojciech Kudla, > > That's digressing but it is very interesting. Can you refer me somewhere > w

Re: Exclusive core for a process, is it reasonable?

2018-04-09 Thread Wojciech Kudla
Some of the stuff I had a chance to work on managed to handle market data in single digit micros and trading in low tens. That's Java/C++. With modern day hardware it would be extremely hard (and costly) to push it much further. I can easily imagine how going for ASIC and staying under 1

Re: allocation memory in Java

2017-11-20 Thread Wojciech Kudla
I don't think I had any issues with symbol resolution and stack unwinding with the default fastdebug build (provided that the frame pointers are preserved). Can anyone shed some light on what the benefits of --with-native-debug-symbols=internal are? On Mon, 20 Nov 2017, 21:56 John Hening,

Re: Linux futex_wait() bug... [Yes. You read that right. UPDATE to LATEST PATCHES NOW].

2017-02-15 Thread Wojciech Kudla
Just trying to eliminate the obvious. You should be stracing JVM threads by referring to their tids rather than the parent process pid. That guy will pretty much always show being blocked on a futex. On Wed, 15 Feb 2017, 15:45 Gil Tene, wrote: > Don't know if this is the same bug. RHEL
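To get at the individual tids, one option is to list /proc/<pid>/task, which on Linux holds one directory per thread. The sketch below does that and builds one strace command line per thread; the function names and the proc_root parameter are hypothetical, the latter added only so the helper can be pointed at a different directory.

```python
import os


def thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread ids (tids) found under <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def strace_commands(pid, proc_root="/proc"):
    """Build one 'strace -p <tid>' command line per thread of the process."""
    return [f"strace -p {tid}" for tid in thread_ids(pid, proc_root)]
```

Attaching with -p to a single tid, without -f, traces just that thread, which avoids staring at the main thread parked on a futex.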

Re: detecting "broken" TCP connections

2016-11-29 Thread Wojciech Kudla
Any chance that socket connection is handled by some sort of kernel bypass? All bets with blocking IO are off when running with onload/offload drivers. On Tue, 29 Nov 2016, 09:29 Alen Vrečko, wrote: > Got a situation where thread hanged on socket read (old school socket >