Congestion-control and other TCP parameter tuning shouldn't change the latency of send/write. They can affect read, for sure.
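(For reference, the knobs being tuned further down the thread — the congestion-control algorithm and the slow-start-after-idle behaviour — can be inspected without changing anything. A read-only sketch, assuming a Linux /proc layout; changing them needs root:)

```shell
# Read-only look at the TCP knobs discussed in this thread (Linux only).
if [ -d /proc/sys/net/ipv4 ]; then
    cat /proc/sys/net/ipv4/tcp_congestion_control            # algorithm in use
    cat /proc/sys/net/ipv4/tcp_available_congestion_control  # what this kernel offers
    cat /proc/sys/net/ipv4/tcp_slow_start_after_idle         # 1 = cwnd collapses after idle
else
    echo "no Linux /proc/sys tree on this box"
fi
# Changing them needs root, e.g.:
#   sysctl -w net.ipv4.tcp_congestion_control=westwood
#   sysctl -w net.ipv4.tcp_slow_start_after_idle=0
```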
I'd check L1 and LLC cache misses, branch-prediction stats, TLB misses, etc. (see 'perf list' for details). If that doesn't show a very sharp difference, I'd trace the implementation. If you have verified that the latency comes from the system call, I'd trace the syscall downwards (use perf probe or ftrace (tracefs) directly). With a uprobe, you can trace the code between write0 and sys_write.

On Thu, Apr 13, 2017 at 10:15 AM, J Crawford <latencyfigh...@mail.com> wrote:
> Very good idea, Mike. If only I knew C :) I'll try to hire a C coder on
> UpWork.com or Elance.com to do that. It shouldn't be hard for someone who
> knows C network programming. I hope...
>
> Thanks!
>
> -JC
>
> On Wednesday, April 12, 2017 at 11:37:28 PM UTC-5, mikeb01 wrote:
>>
>> Rewrite the test in C to eliminate the JVM as the cause of the slowdown?
>>
>> On 13 April 2017 at 16:31, J Crawford <latency...@mail.com> wrote:
>>>
>>> OK, this is a total mystery. I tried a bunch of strategies, all with no luck:
>>>
>>> 1. Checked the CPU frequency with i7z_64bit. No variance in the frequency.
>>>
>>> 2. Disabled all power management. No luck.
>>>
>>> 3. Changed the TCP congestion-control algorithm. No luck.
>>>
>>> 4. Set net.ipv4.tcp_slow_start_after_idle to false. No luck.
>>>
>>> 5. Tested with a UDP implementation. No luck.
>>>
>>> 6. Placed all the sockets in blocking mode just for the heck of it. No
>>> luck, same problem.
>>>
>>> I'm out of pointers now and don't know where to turn. This is an important
>>> latency problem that I must understand, as it affects my trading system.
>>>
>>> If anyone has any clue what might be going on, please throw some light.
>>> Also, if you run the provided Server and Client code in your own
>>> environment/machine (over localhost/loopback) you will see that it does
>>> happen.
>>>
>>> Thanks!
>>>
>>> -JC
>>>
>>> On Wednesday, April 12, 2017 at 10:23:17 PM UTC-5, Todd L. Montgomery wrote:
>>>>
>>>> The short answer is that no congestion-control algorithm is suited to
>>>> low-latency trading, and in all cases raw UDP will be better for latency.
>>>> Congestion control is about fairness. Latency in trading has nothing to
>>>> do with fairness.
>>>>
>>>> The long answer is that, to varying degrees, all congestion control must
>>>> operate at high or complete utilization in order to probe. Those based on
>>>> loss (all variants of CUBIC, Reno, etc.) must be operating in congestion
>>>> avoidance or be in slow start. Those based on RTT (Vegas) or
>>>> RTT/bottleneck bandwidth (BBR) must be probing for more bandwidth to
>>>> detect a change in RTT (as a "replacement" for loss).
>>>>
>>>> So the case of sending only periodically is somewhat antithetical to the
>>>> operating point that all congestion control must hold while probing. That
>>>> is why all the appropriate congestion-control algorithms I know of reset
>>>> when not operating at high utilization.
>>>>
>>>> You can think of it this way: the network can only sustain X msgs/sec,
>>>> but X is a (seemingly random) nonlinear function of time. How do you
>>>> determine X at any given time without operating at that point? As far as
>>>> I know, you cannot predict X without operating at X.
>>>>
>>>> On Wed, Apr 12, 2017 at 6:54 PM, J Crawford <latency...@mail.com> wrote:
>>>>>
>>>>> Hi Todd,
>>>>>
>>>>> I'm trying several TCP congestion algorithms here: westwood, highspeed,
>>>>> veno, etc.
>>>>>
>>>>> No luck so far, but there are many more I haven't tried. I'm using this
>>>>> answer to change the TCP congestion algorithm:
>>>>> http://unix.stackexchange.com/a/278217
>>>>>
>>>>> Does anyone know which TCP congestion algorithm is best for low latency?
>>>>> Or best for the single-message scenario I've described?
>>>>> This looks like an important configuration for trading, when a single
>>>>> order needs to go out after some idle time and you don't want it to go
>>>>> out at a slower speed.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -JC
>>>>>
>>>>> On Wednesday, April 12, 2017 at 5:38:40 PM UTC-5, Todd L. Montgomery
>>>>> wrote:
>>>>>>
>>>>>> Mike has the best point, I think. 30 seconds between sends will cause
>>>>>> the congestion window to close. Depending on what is in use (CUBIC vs.
>>>>>> Reno), this will change the behavior.
>>>>>>
>>>>>> -- Todd
>>>>>>
>>>>>> On Wed, Apr 12, 2017 at 3:27 PM, Greg Young <gregor...@gmail.com> wrote:
>>>>>>>
>>>>>>> You are likely measuring wrong and just haven't figured out how yet.
>>>>>>>
>>>>>>> On Wed, Apr 12, 2017 at 8:56 PM, J Crawford <latency...@mail.com> wrote:
>>>>>>> > The SO question has the source code of a simple server and client
>>>>>>> > that demonstrates and isolates the problem. Basically, I'm timing
>>>>>>> > the latency of a ping-pong (client-server-client) message. I start
>>>>>>> > by sending one message every 1 millisecond. I wait for 200k messages
>>>>>>> > to be sent so that HotSpot has a chance to optimize the code. Then I
>>>>>>> > change my pause time from 1 millisecond to 30 seconds. To my
>>>>>>> > surprise, my write and read operations become considerably slower.
>>>>>>> >
>>>>>>> > I don't think it is a JIT/HotSpot problem. I was able to pinpoint
>>>>>>> > the slower method to the native JNI calls to write (write0) and
>>>>>>> > read. Even if I change the pause from 1 millisecond to 1 second, the
>>>>>>> > problem persists.
>>>>>>> >
>>>>>>> > I was able to observe this on both MacOS and Linux.
>>>>>>> >
>>>>>>> > Does anyone here have a clue what could be happening?
>>>>>>> >
>>>>>>> > Note that I'm disabling Nagle's algorithm with setTcpNoDelay(true).
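[A self-contained sketch of that ping-pong measurement over loopback, with Nagle disabled on both sides as in the thread. The class name is hypothetical; the original Server/Client code lives in the linked SO question:]

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Sketch: one-byte echo server plus blocking client on loopback,
// timing a single ping-pong round trip with System.nanoTime().
public class PingPongDemo {

    public static long measureRttNanos() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            Thread echo = new Thread(() -> {
                try (SocketChannel peer = server.accept()) {
                    peer.setOption(StandardSocketOptions.TCP_NODELAY, true);
                    ByteBuffer buf = ByteBuffer.allocateDirect(1);
                    while (peer.read(buf) != -1) {   // echo each byte straight back
                        buf.flip();
                        peer.write(buf);
                        buf.clear();
                    }
                } catch (IOException ignored) {
                    // client closed the connection; nothing to do
                }
            });
            echo.setDaemon(true);
            echo.start();

            try (SocketChannel client = SocketChannel.open(
                    new InetSocketAddress("127.0.0.1", port))) {
                client.setOption(StandardSocketOptions.TCP_NODELAY, true); // disable Nagle
                ByteBuffer buf = ByteBuffer.allocateDirect(1);
                buf.put((byte) 'p').flip();

                long t0 = System.nanoTime();
                client.write(buf);                   // ping
                buf.clear();
                client.read(buf);                    // blocks until the pong arrives
                return System.nanoTime() - t0;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("one ping-pong RTT: " + measureRttNanos() + " ns");
    }
}
```

[The effect under discussion shows up when this round trip is timed once after a long pause versus in a tight loop; the sketch only shows the measurement mechanics.]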
>>>>>>> >
>>>>>>> > SO question with code and output:
>>>>>>> > http://stackoverflow.com/questions/43377600/socketchannel-why-if-i-write-msgs-quickly-the-latency-of-each-message-is-low-b
>>>>>>> >
>>>>>>> > Thanks!
>>>>>>> >
>>>>>>> > -JC
>>>>>>> >
>>>>>>> > --
>>>>>>> > You received this message because you are subscribed to the Google
>>>>>>> > Groups "mechanical-sympathy" group.
>>>>>>> > To unsubscribe from this group and stop receiving emails from it,
>>>>>>> > send an email to mechanical-sympathy+unsubscr...@googlegroups.com.
>>>>>>> > For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>> --
>>>>>>> Studying for the Turing test

--
Regards,
Janmejay
http://codehunk.wordpress.com
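[The counter and tracing checks suggested at the top of this reply could start out something like the following. A sketch only: the event names vary by CPU, the libc path and probe point vary by distro/kernel, and $PID stands for the pid of the running JVM:]

```shell
# Guarded so the sketch degrades gracefully where perf isn't installed.
if command -v perf >/dev/null 2>&1; then
    perf list 2>/dev/null | head -n 20   # which events this box supports
else
    echo "perf not installed"
fi

# Compare counters between the 1 ms-pause run and the 30 s-pause run
# (needs perf and a live pid, so shown as comments):
#   perf stat -e L1-dcache-load-misses,LLC-load-misses,branch-misses,dTLB-load-misses \
#       -p "$PID" -- sleep 10
#
# Trace from user space down into the kernel write path with a uprobe
# (libc path varies by distro):
#   perf probe -x /usr/lib/x86_64-linux-gnu/libc.so.6 --add write
#   perf record -e probe_libc:write -p "$PID" -- sleep 10
```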