Hans:

We (Netflix) run 35% of the internet in production with these very things you identify, with no lock at all. We *do* have some issues we are looking at, but so far I have *never* connected the dots the way you were claiming, in a way that would cause a crash. I can see where TCP would do incorrect retransmissions, but I did *not* see a crash. Now granted, my look at this was quick, but that was due to time constraints and the holidays. I am going to put myself full-time on this to see if I can understand both how you got to "there is a panic in TCP" and how you concluded that it must entirely be the callout subsystem, so that we need to re-write large parts of it.
You *may* be correct that a re-write is needed, you *may* be completely incorrect. In either case I plan to dig into this and find out.

R

> On Jan 22, 2015, at 3:39 AM, Hans Petter Selasky <h...@selasky.org> wrote:
>
> On 01/22/15 09:10, Konstantin Belousov wrote:
>> On Thu, Jan 22, 2015 at 08:14:26AM +0100, Hans Petter Selasky wrote:
>>> On 01/22/15 06:26, Warner Losh wrote:
>>>
>
>>>>> The code simply needs an update. It is not broken in any ways - right? If
>>>>> it is not broken, fixing it is not that urgent.
>>>>
>>>> Radically changing the performance characteristics is breaking the code.
>>>> Performance regression in the TCP stack is urgent to fix.
>>
>>> Not being able to enumerate what all the consumers are that use this and
>>> provide an analysis about why they aren't important to fix is a bug in
>>> your process, and in your interaction with the project. We simply do not
>>> operate that way.
>> Right, I completely agree with this statement.
>>
>>
>>> Hi,
>>>
>>> My plan is to work out a patch for the TCP stack today, which only
>>> change the callout_init() call or its function. This should not need any
>>> particular review. I'll let adrian test and review, because I think he
>>> is closer to me timezone wise and you're standing on my head saying its
>>> urgent. If he is still not happy, I can back my change out. Else it
>>> remains in -current AS-IS.
>> TCP regression was noted, so it is brought in front. There is nothing
>> else which makes TCP issue different from other (hidden) issues.
>>
>> ===========================
>>> MFC to 10-stable I can delay for sure until
>>> all issues you report to me are fixed.
>> ===========================
>>
>> Sigh, you still do not understand. It is your duty to identify all pieces
>> which break after your change. After that, we can argue whether each of
>> them is critical or not to allow the migration. But this must have been
>> done before the KPI change hit the tree.
>>
>
> Hi,
>
> Are you saying that pieces of code that runs completely unlocked using
> "volatile" as only synchronization mechanism is better than what I would call
> a temporary and hopefully short TCP stack performance loss?
>
> I don't understand? How frequently do you reboot your boxes? Maybe one every
> day? And you don't care?
>
> --HPS
>
>
>

-----
Randall Stewart
rand...@lakerest.net

_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"