Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147
On 2020-Jan-27, at 19:53, bob prohaska wrote:
> On Mon, Jan 27, 2020 at 06:22:20PM -0800, Mark Millard wrote:
>>
>> So far as I know, in the past progress was only made when someone
>> already knowledgeable got involved in isolating what was happening
>> and how to control it.
>>
> Indeed. One can only hope said knowledgeables are reading

Maybe I can suggest something that might kick-start evidence gathering a little: add 4 unconditional printf's to the kernel code, each just before one of the vm_pageout_oom(. . .) calls, with each message uniquely identifying which of the 4 call sites it is. The details of what I found that suggested this follow.

I found:

#define VM_OOM_MEM      1
#define VM_OOM_MEM_PF   2
#define VM_OOM_SWAPZ    3

In vm_fault(. . .):

. . .
	if (vm_pfault_oom_attempts < 0 ||
	    oom < vm_pfault_oom_attempts) {
		oom++;
		vm_waitpfault(dset, vm_pfault_oom_wait * hz);
		goto RetryFault_oom;
	}
	if (bootverbose)
		printf(
	"proc %d (%s) failed to alloc page on fault, starting OOM\n",
		    curproc->p_pid, curproc->p_comm);
	vm_pageout_oom(VM_OOM_MEM_PF);
. . .

(I'd not have guessed that bootverbose would control messages about OOM activity.) The above one looks to be blocked by the "-1" setting that we have been using.

In vm_pageout_mightbe_oom(. . .):

. . .
	if (starting_page_shortage <= 0 ||
	    starting_page_shortage != page_shortage)
		vmd->vmd_oom_seq = 0;
	else
		vmd->vmd_oom_seq++;
	if (vmd->vmd_oom_seq < vm_pageout_oom_seq) {
		if (vmd->vmd_oom) {
			vmd->vmd_oom = FALSE;
			atomic_subtract_int(&vm_pageout_oom_vote, 1);
		}
		return;
	}

	/*
	 * Do not follow the call sequence until OOM condition is
	 * cleared.
	 */
	vmd->vmd_oom_seq = 0;

	if (vmd->vmd_oom)
		return;

	vmd->vmd_oom = TRUE;
	old_vote = atomic_fetchadd_int(&vm_pageout_oom_vote, 1);
	if (old_vote != vm_ndomains - 1)
		return;

	/*
	 * The current pagedaemon thread is the last in the quorum to
	 * start OOM.  Initiate the selection and signaling of the
	 * victim.
	 */
	vm_pageout_oom(VM_OOM_MEM);

	/*
	 * After one round of OOM terror, recall our vote.  On the
	 * next pass, current pagedaemon would vote again if the low
	 * memory condition is still there, due to vmd_oom being
	 * false.
	 */
	vmd->vmd_oom = FALSE;
	atomic_subtract_int(&vm_pageout_oom_vote, 1);
. . .

The above is where the other setting we have been using (vm.pageout_oom_seq) extends the number of tries before doing the OOM kill. If the rate of attempts increased, less time would go by for the same figure. This case might still be happening, even for the > 4000 figure used on the 5 GiByte amd64 system with the i386 jail that was reported. No specific printf above as things are.

In swp_pager_meta_build(. . .):

. . .
	if (uma_zone_exhausted(swblk_zone)) {
		if (atomic_cmpset_int(&swblk_zone_exhausted, 0, 1))
			printf("swap blk zone exhausted, "
			    "increase kern.maxswzone\n");
		vm_pageout_oom(VM_OOM_SWAPZ);
		pause("swzonxb", 10);
	} else
		uma_zwait(swblk_zone);
. . .
	if (uma_zone_exhausted(swpctrie_zone)) {
		if (atomic_cmpset_int(&swpctrie_zone_exhausted, 0, 1))
			printf("swap pctrie zone exhausted, "
			    "increase kern.maxswzone\n");
		vm_pageout_oom(VM_OOM_SWAPZ);
		pause("swzonxp", 10);
	} else
		uma_zwait(swpctrie_zone);
. . .

The above we have not been controlling: uma zone exhaustion for swblk_zone and swpctrie_zone. (Not that I'm familiar with them or the rest of this material.) On a small memory machine, there may be nothing that can be directly done that does not have other, nasty tradeoffs. Of course, there might be reasons that one or both of these exhaust faster than they used to. There are the 2 printf messages, but they are conditional. Still, they give something else to look
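The vote logic in vm_pageout_mightbe_oom() quoted above can be condensed into a small userland model, which may make the quorum behavior easier to see. This is illustrative only, not kernel code: C11 stdatomic stands in for the kernel's atomic_fetchadd_int, and the post-OOM recall of the vote is omitted for brevity.

```c
/*
 * Userland sketch of the OOM vote quorum: a per-domain pagedaemon only
 * casts its vote after vm_pageout_oom_seq consecutive passes with an
 * uncleared page shortage, and only the last domain to vote
 * (old_vote == vm_ndomains - 1) actually starts the OOM kill.
 */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int vm_pageout_oom_vote;
static const int vm_ndomains = 2;         /* pretend two NUMA domains */
static const int vm_pageout_oom_seq = 12; /* the default mentioned in the thread */

/*
 * One pagedaemon pass for a domain that still sees a page shortage.
 * Returns true when this domain's vote completes the quorum.
 */
static bool
pageout_pass_with_shortage(int *vmd_oom_seq, bool *vmd_oom)
{
	(*vmd_oom_seq)++;
	if (*vmd_oom_seq < vm_pageout_oom_seq)
		return (false);	/* not enough consecutive shortages yet */
	*vmd_oom_seq = 0;
	if (*vmd_oom)
		return (false);	/* this domain already voted */
	*vmd_oom = true;
	int old_vote = atomic_fetch_add(&vm_pageout_oom_vote, 1);
	return (old_vote == vm_ndomains - 1);	/* last voter starts OOM */
}
```

This is why raising vm.pageout_oom_seq delays the kill: every domain has to accumulate that many back-to-back shortage passes before the quorum can complete.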
Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147
On Mon, Jan 27, 2020 at 06:22:20PM -0800, Mark Millard wrote:
>
> So far as I know, in the past progress was only made when someone
> already knowledgeable got involved in isolating what was happening
> and how to control it.
>
Indeed. One can only hope said knowledgeables are reading

Thanks for reading!

bob prohaska

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: how to use the ktls
On 1/9/20 2:53 PM, Rick Macklem wrote:

John Baldwin wrote:

On 1/7/20 3:02 PM, Rick Macklem wrote:

Someone once told me they were working on a netgraph node that did TLS encapsulation of a stream. I can not remember who it was, but I do remember they were dubious about being allowed to give it back. :-( I only mention this as it MAY be an architectural avenue to investigate. Julian

Hi, Now that I've completed NFSv4.2 I'm on to the next project, which is making NFS work over TLS. Of course, I know absolutely nothing about TLS, which will make this an interesting exercise for me. I did find simple server code in the OpenSSL doc. which at least gives me a starting point for the initialization stuff. As I understand it, this initialization must be done in userspace? Then somehow, the ktls takes over and does the encryption of the data being sent on the socket via sosend_generic(). Does that sound right? So, how does the kernel know the stuff that the initialization phase (handshake) figures out, or is it magic I don't have to worry about? Don't waste much time replying to this. A few quick hints will keep me going for now. (From what I've seen so far, this TLS stuff isn't simple. And I thought Kerberos was a pain. ;-) Thanks in advance for any hints, rick

Hmmm, this might be a fair bit of work indeed.

If it was easy, it wouldn't be fun ;-) FreeBSD13 is a ways off and if it doesn't make that, oh well..

Right now KTLS only works for transmit (though I have some WIP for receive).

Hopefully your WIP will make progress someday, or I might be able to work on it.

KTLS does assume that the initial handshake and key negotiation is handled by OpenSSL. OpenSSL uses custom setsockopt() calls to tell the kernel which session keys to use.

Yea, I figured I'd need a daemon like the gssd for this. The krpc makes it a little more fun, since it handles TCP connections in the kernel.
I think what you would want to do is use something like OpenSSL_connect() in userspace, and then check to see if KTLS "worked".

Thanks (and for the code below). I found the simple server code in the OpenSSL doc, but the client code gets a web page and is quite involved.

If it did, you can tell the kernel it can write to the socket directly, otherwise you will have to bounce data back out to userspace to run it through SSL_write() and have userspace do SSL_read() and then feed data into the kernel.

I don't think bouncing the data up/down to/from userland would work well. I'd say "if it can't be done in the kernel, too bad". The above could be used for a NULL RPC to see if it is working, for the client.

The pseudo-code might look something like:

SSL *s;

s = SSL_new(...);

/* fd is the existing TCP socket */
SSL_set_fd(s, fd);
OpenSSL_connect(s);
if (BIO_get_ktls_send(SSL_get_wbio(s))) {
	/* Can use KTLS for transmit. */
}
if (BIO_get_ktls_recv(SSL_get_rbio(s))) {
	/* Can use KTLS for receive. */
}

Thanks John, rick

--
John Baldwin
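The bounce path John describes for the no-KTLS case could be sketched in the same rough pseudo-code style. Note that only SSL_write() and SSL_read() are real OpenSSL calls here; the upcall/downcall helpers are hypothetical placeholders for whatever kernel-to-daemon mechanism the krpc would end up using:

```
/* Transmit, when BIO_get_ktls_send() said KTLS is unavailable:
 * the kernel hands outbound RPC bytes to the daemon, which encrypts. */
upcall_to_daemon(record, len);     /* hypothetical: kernel -> userspace */
SSL_write(s, record, len);         /* daemon encrypts and sends */

/* Receive, mirrored: the daemon decrypts and feeds plaintext back. */
n = SSL_read(s, buf, sizeof(buf));
downcall_to_kernel(fd, buf, n);    /* hypothetical: userspace -> kernel */
```

Rick's "too bad" position above amounts to rejecting exactly this path because of the extra copies and context switches per RPC.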
Re: how to use the ktls
John Baldwin wrote:
> On 1/26/20 8:08 PM, Rick Macklem wrote:
>> John Baldwin wrote:
>> [stuff snipped]
>>> Hmmm, this might be a fair bit of work indeed.
>>>
>>> Right now KTLS only works for transmit (though I have some WIP for receive).
>>>
>>> KTLS does assume that the initial handshake and key negotiation is handled by
>>> OpenSSL. OpenSSL uses custom setsockopt() calls to tell the kernel which
>>> session keys to use.
>>>
>>> I think what you would want to do is use something like OpenSSL_connect() in
>>> userspace, and then check to see if KTLS "worked". If it did, you can tell
>>> the kernel it can write to the socket directly, otherwise you will have to
>>> bounce data back out to userspace to run it through SSL_write() and have
>>> userspace do SSL_read() and then feed data into the kernel.
>>>
>>> The pseudo-code might look something like:
>>>
>>> SSL *s;
>>>
>>> s = SSL_new(...);
>>>
>>> /* fd is the existing TCP socket */
>>> SSL_set_fd(s, fd);
>>> OpenSSL_connect(s);
>>> if (BIO_get_ktls_send(SSL_get_wbio(s))) {
>>>    /* Can use KTLS for transmit. */
>>> }
>>> if (BIO_get_ktls_recv(SSL_get_rbio(s))) {
>>>    /* Can use KTLS for receive. */
>>> }
>>
>> So, I've been making some progress. The first stab at the daemons that do the
>> handshake are now on svn in base/projects/nfs-over-tls/usr.sbin/rpctlscd and
>> rpctlssd.
>>
>> A couple of questions...
>> 1 - I haven't found BIO_get_ktls_send() or BIO_get_ktls_recv(). Are they in
>>     some different library?
>
> They only exist currently in OpenSSL master (which will be OpenSSL 3.0.0 when it
> is released). I have some not-yet-tested WIP changes to backport those changes into
> the base OpenSSL, but it will also add overhead to future OpenSSL imports perhaps,
> so it is something I need to work with secteam@ on to decide if it's viable once I
> have a tested PoC.
>
> I will try to at least provide a patch to the security/openssl port to add a KTLS
> option "soon" that you could use for testing.
John, I wouldn't worry much about this. The calls are currently #ifdef notnow in the daemon and I'm fine with that. SSL_connect() has returned 1, so the daemon knows that the handshake is complete and the kernel code that did the upcall to the daemon can check for KERN_TLS support.

>> 2 - After a successful SSL_connect(), the receive queue for the socket has
>>     478 bytes of stuff in it. SSL_read() seems to know how to skip over it,
>>     but I haven't figured out a good way to do this. (I currently just do a
>>     recv(..478,0) on the socket.)
>>     Any idea what to do with this? (Or will the receive side of the ktls
>>     figure out how to skip over it?)
>
> I don't know yet. :-/ With the TOE-based TLS I had been testing with, this
> doesn't happen because the NIC blocks the data until it gets the key and then
> it's always available via KTLS. With software-based KTLS for RX (which I'm
> going to start working on soon), this won't be the case and you will
> potentially have some data already read by OpenSSL that needs to be drained
> from OpenSSL before you can depend on KTLS. It's probably only the first few
> messages, but I will need to figure out a way that you can tell how much
> pending data in userland you need to read via SSL_read() and then pass back
> into the kernel before relying on KTLS (it would just be a single chunk of
> data after SSL_connect you would have to do this for).

Well, SSL_read() doesn't return these bytes. I think it just throws them away. I have a simple test client/server where the client sends "HELLO THERE" to the server and the server replies "GOODBYE" after the SSL_connect()/SSL_accept() has been done.
--> If the "HELLO THERE"/"GOODBYE" is done with SSL_write()/SSL_read() it works.
however
--> If the above is done with send()/recv(), the server gets the "HELLO THERE",
    but the client gets 485 bytes of data, where the last 7 are "GOODBYE".
--> If I do a recv( ..475..) in the client right after SSL_connect() it works ok.
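John's proposed interim approach (drain whatever OpenSSL has already buffered before trusting KTLS receive) might look roughly like this in pseudo-code. The amount-pending query is exactly the piece he says does not exist yet, so it appears here as a hypothetical helper, not a real OpenSSL API:

```
/* After SSL_connect() returns 1, before enabling KTLS receive: */
n = pending_pre_ktls_bytes(s);       /* hypothetical query */
while (n > 0) {
	r = SSL_read(s, buf, sizeof(buf));   /* drain via OpenSSL */
	feed_plaintext_into_kernel(fd, buf, r); /* hypothetical downcall */
	n -= r;
}
/* From here on, the kernel reads decrypted records itself via KTLS. */
```

Rick's observation below (that SSL_read() appears to consume and discard the 478 bytes rather than return them) suggests they are post-handshake protocol messages rather than application data, which is consistent with only the drain step being needed once.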
I do this for testing, since it can then do the NFS mount (unencrypted). Looking inside SSL_read() I found:

	/*
	 * If we are a client and haven't received the ServerHello etc then we
	 * better do that
	 */
	ossl_statem_check_finish_init(s, 0);

but all ossl_statem_check_finish_init(s, 0) seems to do is set a variable "in_init = 1". Then it calls the method->read() function, which I'd guess is in the crypto code and it seems to get rid of these bytes. (I hate trying to figure out what calls what for object oriented code ;-)

Btw, SSL_is_init_finished() returns 1 right after the SSL_connect(), so it seems to think the handshake is done, although it hasn't read these 478 bytes from the server. Anyhow, I'd guess the TOE engine knows how to get rid of this stuff like SSL_read() does?

>> I'm currently testing with a kernel that doesn't have options KERN_TLS and (so long as I
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
On 2020-Jan-27, at 12:48, Cy Schubert wrote:
> In message , Mark Millard writes:
>>
>> On 2020-Jan-27, at 10:20, Cy Schubert wrote:
>>
>>> On January 27, 2020 5:09:06 AM PST, Cy Schubert wrote:
>>>> . . .
>>>>
>>>> Setting a lower arc_max at boot is unlikely to help. Rust was building
>>>> on the 8 GB and 5 GB 4 core machines last night. It completed
>>>> successfully on the 8 GB machine, while using 12 MB of swap. ARC was
>>>> at 1307 MB.
>>>>
>>>> On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>>>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>>>
>>> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core machine. ARC
>>> is at 534 MB with 12 MB swap used.
>>
>> If you increase vm.pageout_oom_seq to, say, 10 times what you now use,
>> does MAKE_JOBS_NUMBER=4 complete --or at least go notably longer before
>> getting OOM behavior from the system? (The default is 12 last I checked.
>> So that might be what you are now using.)
>
> It's already 4096 (default is 12).

Wow. Then the count of tries to get free RAM above the threshold does not seem likely to be the source of the OOM kills.

>> Have you tried also having: vm.pfault_oom_attempts="-1" (Presuming
>> you are not worried about actually running out of swap/page space,
>> or can tolerate a deadlock if it does run out.) This setting presumes
>> head, not release or stable. (Last I checked anyway.)
>
> Already there.

Then page-out delay does not seem likely to be the source of the OOM kills.

> The box is a sandbox with remote serial console access so deadlocks are ok.
>
>> It would be interesting to know what difference those two settings
>> together might make for your context: it seems to be a good context
>> for testing in this area. (But you might already have set them.
>> If so, it would be good to report the figures in use.)
>>
>> Of course, my experiment ideas need not be your actions.
>
> It's a sandbox machine. We already know 8 GB works with 4 threads on as
> many cores. And, 5 GB works with 3 threads on 4 cores.

It would be nice to find out what category of issue in the kernel is driving the OOM kills for your 5 GB context with MAKE_JOBS_NUMBER=4. Too bad the first kill does not report a backtrace spanning the code choosing to do the kill (or otherwise report the type of issue leading to the kill).

Yours is consistent with the small arm board folks reporting that recently contexts that were doing buildworld and the like fine under somewhat older kernels have started getting OOM kills, despite the two settings. At the moment I'm not sure how to find the category(s) of issue(s) that is(are) driving these OOM kills.

Thanks for reporting what settings you were using.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
In message , Mark Millard writes:
>
> On 2020-Jan-27, at 10:20, Cy Schubert wrote:
>
>> On January 27, 2020 5:09:06 AM PST, Cy Schubert wrote:
>>> . . .
>>>
>>> Setting a lower arc_max at boot is unlikely to help. Rust was building
>>> on the 8 GB and 5 GB 4 core machines last night. It completed
>>> successfully on the 8 GB machine, while using 12 MB of swap. ARC was at
>>> 1307 MB.
>>>
>>> On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>>
>> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core machine. ARC is
>> at 534 MB with 12 MB swap used.
>
> If you increase vm.pageout_oom_seq to, say, 10 times what you now use,
> does MAKE_JOBS_NUMBER=4 complete --or at least go notably longer before
> getting OOM behavior from the system? (The default is 12 last I checked.
> So that might be what you are now using.)

It's already 4096 (default is 12).

> Have you tried also having: vm.pfault_oom_attempts="-1" (Presuming
> you are not worried about actually running out of swap/page space,
> or can tolerate a deadlock if it does run out.) This setting presumes
> head, not release or stable. (Last I checked anyway.)

Already there. The box is a sandbox with remote serial console access so deadlocks are ok.

> It would be interesting to know what difference those two settings
> together might make for your context: it seems to be a good context
> for testing in this area. (But you might already have set them.
> If so, it would be good to report the figures in use.)
>
> Of course, my experiment ideas need not be your actions.

It's a sandbox machine. We already know 8 GB works with 4 threads on as many cores. And, 5 GB works with 3 threads on 4 cores.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org

The need of the many outweighs the greed of the few.
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
On 2020-Jan-27, at 10:20, Cy Schubert wrote:
> On January 27, 2020 5:09:06 AM PST, Cy Schubert wrote:
>> . . .
>>
>> Setting a lower arc_max at boot is unlikely to help. Rust was building
>> on the 8 GB and 5 GB 4 core machines last night. It completed
>> successfully on the 8 GB machine, while using 12 MB of swap. ARC was at
>> 1307 MB.
>>
>> On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>
> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core machine. ARC is
> at 534 MB with 12 MB swap used.

If you increase vm.pageout_oom_seq to, say, 10 times what you now use, does MAKE_JOBS_NUMBER=4 complete --or at least go notably longer before getting OOM behavior from the system? (The default is 12 last I checked. So that might be what you are now using.)

Have you tried also having: vm.pfault_oom_attempts="-1" (Presuming you are not worried about actually running out of swap/page space, or can tolerate a deadlock if it does run out.) This setting presumes head, not release or stable. (Last I checked anyway.)

It would be interesting to know what difference those two settings together might make for your context: it seems to be a good context for testing in this area. (But you might already have set them. If so, it would be good to report the figures in use.)

Of course, my experiment ideas need not be your actions.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
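For reference, the two tunables discussed in this exchange can be made persistent. A sketch using the values reported in the thread (whether each belongs in /boot/loader.conf or /etc/sysctl.conf depends on the FreeBSD version in use; both were head-only behaviors at the time):

```
# /etc/sysctl.conf (or /boot/loader.conf) -- values from this thread.
# Number of back-to-back page-shortage passes before an OOM kill is started:
vm.pageout_oom_seq=4096
# -1 disables the page-fault OOM path entirely
# (risk: a deadlock if swap/page space truly runs out):
vm.pfault_oom_attempts=-1
```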
FreeBSD Quarterly Status Report - Fourth Quarter 2019
FreeBSD Project Quarterly Status Report - Fourth Quarter 2019

Here is the last quarterly status report for 2019. As you might remember from the last report, we changed our timeline: now we collect reports in the last month of each quarter and we edit and publish the full document the next month. Thus, we cover here the period October 2019 - December 2019.

If you thought that the FreeBSD community was less active in the Christmas quarter you will be glad to be proven wrong: a quick glance at the summary will be sufficient to see that much work has been done in the last months.

Have a nice read!

-- Lorenzo Salvadore

__

FreeBSD Team Reports
* FreeBSD Core Team
* FreeBSD Foundation
* FreeBSD Release Engineering Team
* Cluster Administration Team
* Continuous Integration

Projects
* IPSec Extended Sequence Number (ESN) support
* NFS Version 4.2 implementation
* DTS Update
* RockChip Support
* Creating virtual FreeBSD appliances from RE VMDK images

Kernel
* SoC audio framework and RK3399 audio drivers
* FreeBSD on Microsoft HyperV and Azure
* FreeBSD on EC2 ARM64
* ENA FreeBSD Driver Update

Architectures
* PowerPC on Clang
* NXP ARM64 SoC support

Userland Programs
* Linux compatibility layer update

Ports
* Ports Collection
* KDE on FreeBSD
* Java on FreeBSD
* Electron and VSCode
* Bastille
* Universal Packaging Tool (upt)
* Wine on FreeBSD

Third-Party Projects
* sysctlbyname-improved
* pot and the nomad pot driver
* 7 Days Challenge
* NomadBSD

__

FreeBSD Team Reports

Entries from the various official and semi-official teams, as found in the Administration Page.

FreeBSD Core Team

Contact: FreeBSD Core Team

The FreeBSD Core Team is the governing body of FreeBSD.

* Julie Saravanos, the sister of Bruce D. Evans (bde), mailed core with the sad news that Bruce passed away on 2019-12-18 at the age of 68 years. Bruce was a deeply respected member of the community, served on the Core team, and made over 5,000 commits.
Bruce's impact on our culture was so profound that new terminology was spawned. This is an excerpt of a message from Poul-Henning Kamp to Julie.

I don't know precisely when I first communicated with Bruce, it was in the late 1980'ies via "UseNET", but I can say with certainty that few people have inspired me more, or improved my programming more, than Bruce did over the next half of my life.

All large projects invent their own vocabulary and in FreeBSD two of the neologisms are "Brucification" and "Brucified".

A "brucification" meant receiving a short, courteous note pointing out a sometimes subtle deficiency, or an overlooked detail in a source code change. Not necessarily a serious problem, possibly not even a problem to anybody at all, but nonetheless something which was wrong and ought to be fixed. It was not uncommon for the critique to be considerably longer than the change in question.

If one ignored brucifications one ran the risk of being "brucified", which meant receiving a long and painstakingly detailed list of every single one of the errors, mistakes, typos, shortcomings, bad decisions, questionable choices, style transgressions and general sloppiness of thinking, often expressed with deadpan humor sharpened to a near-fatal point. The most frustrating thing was that Bruce would be perfectly justified and correct.

I can only recall one or two cases where I was able to respond "Sorry Bruce, but you're wrong there..." - and I confess that on those rare days I felt like I should cut a notch in my keyboard.

The last email we received from Bruce is a good example of the depth of knowledge and insight he provided for the project:
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=1163414+0+archive/2019/svn-src-all/20191027.svn-src-all

* The 12.1 release was dedicated to another FreeBSD developer who passed away in the fourth quarter of 2019, Kurt Lidl. The FreeBSD Foundation has a memorial page to Kurt.
https://www.freebsdfoundation.org/blog/in-memory-of-kurt-lidl/

We send our condolences to both the families of Bruce and Kurt.

* Core has documented the Project's policy on support tiers.
  https://www.freebsd.org/doc/en_US.ISO8859-1/articles/committers-guide/archs.html

* Core approved a source commit bit for James Clarke. Brooks Davis (brooks) will mentor James and John Baldwin (jhb) will co-mentor.

* The Project's first Season of Docs ended with a negative result. The work was not completed and contact could not be
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
On January 27, 2020 10:19:50 AM PST, "Rodney W. Grimes" wrote:
>> In message <202001261745.00qhjkuw044...@gndrsh.dnsmgr.net>, "Rodney W.
>> Grimes" writes:
>> > > In message <20200125233116.ga49...@troutmask.apl.washington.edu>, Steve
>> > > Kargl writes:
>> > > > On Sat, Jan 25, 2020 at 02:09:29PM -0800, Cy Schubert wrote:
>> > > > > On January 25, 2020 1:52:03 PM PST, Steve Kargl wrote:
>> > > > > >On Sat, Jan 25, 2020 at 01:41:16PM -0800, Cy Schubert wrote:
>> > > > > >>
>> > > > > >> It's not just poudriere. Standard port builds of chromium, rust
>> > > > > >> and thunderbird also fail on my machines with less than 8 GB.
>> > > > > >>
>> > > > > >
>> > > > > >Interesting. I routinely build chromium, rust, firefox,
>> > > > > >llvm and few other resource-hungry ports on a i386-freebsd
>> > > > > >laptop with 3.4 GB available memory. This is done with
>> > > > > >chrome running with a few tabs swallowing a 1-1.5 GB of
>> > > > > >memory. No issues.
>> > > > >
>> > > > > Number of threads makes a difference too. How many core/threads
>> > > > > does your laptop have?
>> > > >
>> > > > 2 cores.
>> > >
>> > > This is why.
>> > >
>> > > > > Reducing number of concurrent threads allowed my builds to complete
>> > > > > on the 5 GB machine. My build machines have 4 cores, 1 thread per
>> > > > > core. Reducing concurrent threads circumvented the issue.
>> > > >
>> > > > I use portmaster, and AFAICT, it uses 'make -j 2' for the build.
>> > > > Laptop isn't doing too much, but an update and browsing. It does
>> > > > take a long time especially if building llvm is required.
>> > >
>> > > I use portmaster as well (for quick incidental builds). It uses
>> > > MAKE_JOBS_NUMBER=4 (which is equivalent to make -j 4). I suppose
>> > > machines with not enough memory to support their cores with certain
>> > > builds might have a better chance of having this problem.
>> > >
>> > > MAKE_JOBS_NUMBER_LIMIT to limit a 4 core machine with less than 2 GB
>> > > per core might be an option. Looking at it this way, instead of an
>> > > extra 3 GB, the extra 60% more memory in the other machine makes a big
>> > > difference. A rule of thumb would probably be, have ~ 2 GB RAM for
>> > > every core or thread when doing large parallel builds.
>> >
>> > Perhaps we need to redo some boot time calculations, for one the
>> > ZFS arc cache, IMHO, is just silly at a fixed percent of total
>> > memory. A high percentage at that.
>> >
>> > One idea based on what you just said might be:
>> >
>> > percore_memory_reserve = 2G (Your number, I personally would use 1G here)
>> > arc_max = MAX(memory size - (Cores * percore_memory_reserve), 512mb)
>> >
>> > I think that simple change would go a long ways to cutting down the
>> > number of OOM reports we see. ALSO IMHO there should be a way for
>> > sub systems to easily tell zfs they are memory pigs too and need to
>> > share the space. Ie, bhyve is horrible if you do not tune zfs arc
>> > based on how much memory you're using up for VM's.
>> >
>> > Another formulation might be
>> > percore_memory_reserve = alpha * memory_size / cores
>> >
>> > Alpha most likely falling in the 0.25 to 0.5 range, I think this one
>> > would have better scalability, would need to run some numbers.
>> > Probably needs to become non linear above some core count.
>>
>> Setting a lower arc_max at boot is unlikely to help. Rust was building on
>> the 8 GB and 5 GB 4 core machines last night. It completed successfully on
>> the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB.
>>
>> On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>
>What is arc_max?
>
>> Cy Schubert

3.8 GB. It never exceeds 1.5 to 2 GB when doing a NO_CLEAN buildworld, where it gets a 95-99% hit ratio with 8 threads.

There are a couple of things going on here. First, four large multithreaded rust compiles in memory simultaneously. Secondly, a reluctance to use swap. My guess is the working set for each of the four compiles was large enough to trigger the OOM. I haven't had time to seriously look at this though but I'm guessing that the locality of reference was large enough to keep much of the memory in RAM, so here we are.

--
Pardon the typos and autocorrect, small keyboard in use.
Cy Schubert
FreeBSD UNIX: Web: https://www.FreeBSD.org

The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.
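Rodney's first arc_max formulation quoted in this exchange is easy to sanity-check with concrete numbers. A small userland sketch, using the constants proposed in the thread (the 512 MB floor and the 1-2 GB per-core reserve are his suggestions, not anything ZFS actually does at boot):

```c
/*
 * Sketch of the proposed sizing rule:
 *   arc_max = MAX(memory_size - cores * percore_memory_reserve, 512 MB)
 */
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)
#define GIB (1024ULL * MIB)

static uint64_t
proposed_arc_max(uint64_t memory_size, unsigned cores,
    uint64_t percore_memory_reserve)
{
	uint64_t floor = 512 * MIB;
	uint64_t reserved = (uint64_t)cores * percore_memory_reserve;

	/*
	 * Equivalent to MAX(memory_size - reserved, floor), written to
	 * avoid unsigned underflow when the reserve exceeds memory.
	 */
	if (memory_size <= reserved + floor)
		return (floor);
	return (memory_size - reserved);
}
```

With the thread's machines: 5 GB / 4 cores / 1 GB reserve caps ARC at 1 GB, while a 2 GB per-core reserve drops both the 5 GB and the 8 GB boxes to the 512 MB floor, which illustrates why the choice of reserve dominates the formula on small-memory, many-core machines.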
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
On January 27, 2020 5:09:06 AM PST, Cy Schubert wrote:
>In message <202001261745.00qhjkuw044...@gndrsh.dnsmgr.net>, "Rodney W.
>Grimes" writes:
>> > In message <20200125233116.ga49...@troutmask.apl.washington.edu>, Steve
>> > Kargl writes:
>> > > On Sat, Jan 25, 2020 at 02:09:29PM -0800, Cy Schubert wrote:
>> > > > On January 25, 2020 1:52:03 PM PST, Steve Kargl wrote:
>> > > > >On Sat, Jan 25, 2020 at 01:41:16PM -0800, Cy Schubert wrote:
>> > > > >>
>> > > > >> It's not just poudriere. Standard port builds of chromium, rust
>> > > > >> and thunderbird also fail on my machines with less than 8 GB.
>> > > > >>
>> > > > >
>> > > > >Interesting. I routinely build chromium, rust, firefox,
>> > > > >llvm and few other resource-hungry ports on a i386-freebsd
>> > > > >laptop with 3.4 GB available memory. This is done with
>> > > > >chrome running with a few tabs swallowing a 1-1.5 GB of
>> > > > >memory. No issues.
>> > > >
>> > > > Number of threads makes a difference too. How many core/threads
>> > > > does your laptop have?
>> > >
>> > > 2 cores.
>> >
>> > This is why.
>> >
>> > > > Reducing number of concurrent threads allowed my builds to complete
>> > > > on the 5 GB machine. My build machines have 4 cores, 1 thread per
>> > > > core. Reducing concurrent threads circumvented the issue.
>> > >
>> > > I use portmaster, and AFAICT, it uses 'make -j 2' for the build.
>> > > Laptop isn't doing too much, but an update and browsing. It does
>> > > take a long time especially if building llvm is required.
>> >
>> > I use portmaster as well (for quick incidental builds). It uses
>> > MAKE_JOBS_NUMBER=4 (which is equivalent to make -j 4). I suppose machines
>> > with not enough memory to support their cores with certain builds might
>> > have a better chance of having this problem.
>> >
>> > MAKE_JOBS_NUMBER_LIMIT to limit a 4 core machine with less than 2 GB per
>> > core might be an option. Looking at it this way, instead of an extra 3 GB,
>> > the extra 60% more memory in the other machine makes a big difference. A
>> > rule of thumb would probably be, have ~ 2 GB RAM for every core or thread
>> > when doing large parallel builds.
>>
>> Perhaps we need to redo some boot time calculations, for one the
>> ZFS arc cache, IMHO, is just silly at a fixed percent of total
>> memory. A high percentage at that.
>>
>> One idea based on what you just said might be:
>>
>> percore_memory_reserve = 2G (Your number, I personally would use 1G here)
>> arc_max = MAX(memory size - (Cores * percore_memory_reserve), 512mb)
>>
>> I think that simple change would go a long ways to cutting down the
>> number of OOM reports we see. ALSO IMHO there should be a way for
>> sub systems to easily tell zfs they are memory pigs too and need to
>> share the space. Ie, bhyve is horrible if you do not tune zfs arc
>> based on how much memory you're using up for VM's.
>>
>> Another formulation might be
>> percore_memory_reserve = alpha * memory_size / cores
>>
>> Alpha most likely falling in the 0.25 to 0.5 range, I think this one
>> would have better scalability, would need to run some numbers.
>> Probably needs to become non linear above some core count.
>
>Setting a lower arc_max at boot is unlikely to help. Rust was building on
>the 8 GB and 5 GB 4 core machines last night. It completed successfully on
>the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB.
>
>On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was
>used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.

MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4 core machine. ARC is at 534 MB with 12 MB swap used.

--
Pardon the typos and autocorrect, small keyboard in use.
Cy Schubert
FreeBSD UNIX: Web: https://www.FreeBSD.org

The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.
___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
> In message <202001261745.00qhjkuw044...@gndrsh.dnsmgr.net>, "Rodney W. > Grimes" > writes: > > > In message <20200125233116.ga49...@troutmask.apl.washington.edu>, Steve > > > Kargl w > > > rites: > > > > On Sat, Jan 25, 2020 at 02:09:29PM -0800, Cy Schubert wrote: > > > > > On January 25, 2020 1:52:03 PM PST, Steve Kargl > > > > > > ingt > > > > on.edu> wrote: > > > > > >On Sat, Jan 25, 2020 at 01:41:16PM -0800, Cy Schubert wrote: > > > > > >> > > > > > >> It's not just poudeiere. Standard port builds of chromium, rust > > > > > >> and thunderbird also fail on my machines with less than 8 GB. > > > > > >> > > > > > > > > > > > >Interesting. I routinely build chromium, rust, firefox, > > > > > >llvm and few other resource-hunger ports on a i386-freebsd > > > > > >laptop with 3.4 GB available memory. This is done with > > > > > >chrome running with a few tabs swallowing a 1-1.5 GB of > > > > > >memory. No issues. > > > > > > > > > > Number of threads makes a difference too. How many core/threads does > > > > > yo > > ur l > > > > aptop have? > > > > > > > > 2 cores. > > > > > > This is why. > > > > > > > > > > > > Reducing number of concurrent threads allowed my builds to complete > > > > > on the 5 GB machine. My build machines have 4 cores, 1 thread per > > > > > core. Reducing concurrent threads circumvented the issue. > > > > > > > > I use portmaster, and AFIACT, it uses 'make -j 2' for the build. > > > > Laptop isn't doing too much, but an update and browsing. It does > > > > take a long time especially if building llvm is required. > > > > > > I use portmaster as well (for quick incidental builds). It uses > > > MAKE_JOBS_NUMBER=4 (which is equivalent to make -j 4). I suppose machines > > > with not enough memory to support their cores with certain builds might > > > have a better chance of having this problem. > > > > > > MAKE_JOBS_NUMBER_LIMIT to limit a 4 core machine with less than 2 GB per > > > core might be an option. 
Looking at it this way, instead of an extra 3 > > > GB, > > > the extra 60% more memory in the other machine makes a big difference. A > > > rule of thumb would probably be, have ~ 2 GB RAM for every core or thread > > > when doing large parallel builds. > > > > Perhaps we need to redo some boot time calculations, for one the > > ZFS arch cache, IMHO, is just silly at a fixed percent of total > > memory. A high percentage at that. > > > > One idea based on what you just said might be: > > > > percore_memory_reserve = 2G (Your number, I personally would use 1G here) > > arc_max = MAX(memory size - (Cores * percore_memory_reserve), 512mb) > > > > I think that simple change would go a long ways to cutting down the > > number of OOM reports we see. ALSO IMHO there should be a way for > > sub systems to easily tell zfs they are memory pigs too and need to > > share the space. Ie, bhyve is horrible if you do not tune zfs arc > > based on how much memory your using up for VM's. > > > > Another formulation might be > > percore_memory_reserve = alpha * memory_zire / cores > > > > Alpha most likely falling in the 0.25 to 0.5 range, I think this one > > would have better scalability, would need to run some numbers. > > Probably needs to become non linear above some core count. > > Setting a lower arc_max at boot is unlikely to help. Rust was building on > the 8 GB and 5 GB 4 core machines last night. It completed successfully on > the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB. > > On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was > used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB. What is arc_max? > Cy Schubert -- Rod Grimes rgri...@freebsd.org
Re: r356776 breaks kernel for powerpc64 users
No, it's ok now. I didn't report it earlier, because I wasn't sure which revision fixed it and jeff didn't reply. On 20-01-26 16:27:35, Mark Millard wrote: > Piotr Kubaj listy at anongoth.pl wrote on > Thu Jan 16 19:56:11 UTC 2020 : > > > revision 356776 breaks booting for powerpc64 users. It reportedly works > > fine on POWER8, but I get kernel panic on POWER9 (Talos II) right after the > > usual warning: WITNESS enabled. Kernel panic is uploaded to > > https://pastebin.com/s8ZaUNS2. > > > > > > @jeff > > Since you commited this patch, can you fix this issue or revert this commit? > > > > Is this still a problem for powerpc64? I've not seen > anything looking like a direct response or like a > status update for this. > > I do see arm report(s) of problems that they also > attributed to head -r356776 . But I've no clue how > good the evidence is generally. An example message > is: > > https://lists.freebsd.org/pipermail/freebsd-arm/2020-January/021069.html > > But one part of that is for specifically for going > from -r356767 to the next check-in to head: -r356776 . > That problem likely has good evidence for the > attribution to -r356776 . > > > === > Mark Millard > marklmi at yahoo.com > ( dsl-only.net went > away in early 2018-Mar)
Re: how to use the ktls
On Mon, Jan 27, 2020 at 8:40 AM Freddie Cash wrote: > On Sun, Jan 26, 2020 at 12:08 PM Rick Macklem > wrote: > >> Oh, and for anyone out there... >> What is the easiest freebie way to test signed certificates? >> (I currently am using a self-signed certificate, but I need to test the >> "real" version >> at some point soon.) >> > > Let's Encrypt is what you are looking for. Create real, signed, > certificates, for free. They're only good for 90 days, but they are easy > to renew. There's various script and programs out there for managing Let's > Encrypt certificates (certbot, acme.sh, dehydrated, etc). There's a bunch > of different bits available in the ports tree. > > We use dehydrated at work, using DNS for authenticating the cert requests, > and have it full automated via cron, managing certs for 50-odd domains > (school servers and firewalls). Works great. > Forgot the link: https://letsencrypt.org -- Freddie Cash fjwc...@gmail.com
Re: how to use the ktls
On Sun, Jan 26, 2020 at 12:08 PM Rick Macklem wrote: > Oh, and for anyone out there... > What is the easiest freebie way to test signed certificates? > (I currently am using a self-signed certificate, but I need to test the > "real" version > at some point soon.) > Let's Encrypt is what you are looking for. Create real, signed certificates, for free. They're only good for 90 days, but they are easy to renew. There are various scripts and programs out there for managing Let's Encrypt certificates (certbot, acme.sh, dehydrated, etc). There's a bunch of different bits available in the ports tree. We use dehydrated at work, using DNS for authenticating the cert requests, and have it fully automated via cron, managing certs for 50-odd domains (school servers and firewalls). Works great. -- Freddie Cash fjwc...@gmail.com
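[Editor's note: for readers wanting to try the Let's Encrypt route described above, a minimal command sketch using certbot (one of the clients named in the mail) might look like the following; example.org is a placeholder, and the standalone challenge needs port 80 reachable.]

```sh
# Obtain a real, signed certificate (standalone HTTP-01 challenge):
certbot certonly --standalone -d example.org
# Renew anything close to expiry; safe to run daily from cron:
certbot renew --quiet
```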
Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
In message <202001261745.00qhjkuw044...@gndrsh.dnsmgr.net>, "Rodney W. Grimes" writes: > > In message <20200125233116.ga49...@troutmask.apl.washington.edu>, Steve > > Kargl w > > rites: > > > On Sat, Jan 25, 2020 at 02:09:29PM -0800, Cy Schubert wrote: > > > > On January 25, 2020 1:52:03 PM PST, Steve Kargl ingt > > > on.edu> wrote: > > > > >On Sat, Jan 25, 2020 at 01:41:16PM -0800, Cy Schubert wrote: > > > > >> > > > > >> It's not just poudeiere. Standard port builds of chromium, rust > > > > >> and thunderbird also fail on my machines with less than 8 GB. > > > > >> > > > > > > > > > >Interesting. I routinely build chromium, rust, firefox, > > > > >llvm and few other resource-hunger ports on a i386-freebsd > > > > >laptop with 3.4 GB available memory. This is done with > > > > >chrome running with a few tabs swallowing a 1-1.5 GB of > > > > >memory. No issues. > > > > > > > > Number of threads makes a difference too. How many core/threads does yo > ur l > > > aptop have? > > > > > > 2 cores. > > > > This is why. > > > > > > > > > Reducing number of concurrent threads allowed my builds to complete > > > > on the 5 GB machine. My build machines have 4 cores, 1 thread per > > > > core. Reducing concurrent threads circumvented the issue. > > > > > > I use portmaster, and AFIACT, it uses 'make -j 2' for the build. > > > Laptop isn't doing too much, but an update and browsing. It does > > > take a long time especially if building llvm is required. > > > > I use portmaster as well (for quick incidental builds). It uses > > MAKE_JOBS_NUMBER=4 (which is equivalent to make -j 4). I suppose machines > > with not enough memory to support their cores with certain builds might > > have a better chance of having this problem. > > > > MAKE_JOBS_NUMBER_LIMIT to limit a 4 core machine with less than 2 GB per > > core might be an option. Looking at it this way, instead of an extra 3 GB, > > the extra 60% more memory in the other machine makes a big difference. 
A > > rule of thumb would probably be, have ~ 2 GB RAM for every core or thread > > when doing large parallel builds. > > Perhaps we need to redo some boot time calculations, for one the > ZFS arch cache, IMHO, is just silly at a fixed percent of total > memory. A high percentage at that. > > One idea based on what you just said might be: > > percore_memory_reserve = 2G (Your number, I personally would use 1G here) > arc_max = MAX(memory size - (Cores * percore_memory_reserve), 512mb) > > I think that simple change would go a long ways to cutting down the > number of OOM reports we see. ALSO IMHO there should be a way for > sub systems to easily tell zfs they are memory pigs too and need to > share the space. Ie, bhyve is horrible if you do not tune zfs arc > based on how much memory your using up for VM's. > > Another formulation might be > percore_memory_reserve = alpha * memory_zire / cores > > Alpha most likely falling in the 0.25 to 0.5 range, I think this one > would have better scalability, would need to run some numbers. > Probably needs to become non linear above some core count. Setting a lower arc_max at boot is unlikely to help. Rust was building on the 8 GB and 5 GB 4 core machines last night. It completed successfully on the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB. On the 5 GB 4 core machine the rust build died of OOM. 328 KB swap was used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few.
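[Editor's note: the job-count workaround discussed in this thread can be applied globally to port builds via make.conf; a minimal sketch with illustrative values, see ports(7) for the knobs:]

```make
# /etc/make.conf -- illustrative values, tune per machine.
# Pin every port build to 3 jobs (what worked on the 5 GB, 4-core box):
MAKE_JOBS_NUMBER=3
# Alternatively, keep default parallelism but cap it:
# MAKE_JOBS_NUMBER_LIMIT=3
```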
Re: how to use the ktls
On 1/26/20 8:08 PM, Rick Macklem wrote: John Baldwin wrote: [stuff snipped] Hmmm, this might be a fair bit of work indeed. Right now KTLS only works for transmit (though I have some WIP for receive). KTLS does assume that the initial handshake and key negotiation is handled by OpenSSL. OpenSSL uses custom setsockopt() calls to tell the kernel which session keys to use. I think what you would want to do is use something like OpenSSL_connect() in userspace, and then check to see if KTLS "worked". If it did, you can tell the kernel it can write to the socket directly, otherwise you will have to bounce data back out to userspace to run it through SSL_write() and have userspace do SSL_read() and then feed data into the kernel. The pseudo-code might look something like: SSL *s; s = SSL_new(...); /* fd is the existing TCP socket */ SSL_set_fd(s, fd); OpenSSL_connect(s); if (BIO_get_ktls_send(SSL_get_wbio(s))) { /* Can use KTLS for transmit. */ } if (BIO_get_ktls_recv(SSL_get_rbio(s))) { /* Can use KTLS for receive. */ } So, I've been making some progress. The first stab at the daemons that do the handshake are now on svn in base/projects/nfs-over-tls/usr.sbin/rpctlscd and rpctlssd. A couple of questions... 1 - I haven't found BIO_get_ktls_send() or BIO_get_ktls_recv(). Are they in some different library? They only exist currently in OpenSSL master (which will be OpenSSL 3.0.0 when it is released). I have some not-yet-tested WIP changes to backport those changes into the base OpenSSL, but it will also add overhead to future OpenSSL imports perhaps, so it is something I need to work with secteam@ on to decide if it's viable once I have a tested PoC. I will try to at least provide a patch to the security/openssl port to add a KTLS option "soon" that you could use for testing. 2 - After a successful SSL_connect(), the receive queue for the socket has 478 bytes of stuff in it. SSL_read() seems to know how to skip over it, but I haven't figured out a good way to do this. 
(I currently just do a recv(..478,0) on the socket.) Any idea what to do with this? (Or will the receive side of the ktls figure out how to skip over it?) I don't know yet. :-/ With the TOE-based TLS I had been testing with, this doesn't happen because the NIC blocks the data until it gets the key and then it's always available via KTLS. With software-based KTLS for RX (which I'm going to start working on soon), this won't be the case and you will potentially have some data already read by OpenSSL that needs to be drained from OpenSSL before you can depend on KTLS. It's probably only the first few messages, but I will need to figure out a way that you can tell how much pending data in userland you need to read via SSL_read() and then pass back into the kernel before relying on KTLS (it would just be a single chunk of data after SSL_connect you would have to do this for). I'm currently testing with a kernel that doesn't have options KERN_TLS and (so long as I get rid of the 478 bytes), it then just does unencrypted RPCs. So, I guess the big question is can I get access to your WIP code for KTLS receive? (I have no idea if I can make progress on it, but I can't do a lot more before I have that.) The WIP only works right now if you have a Chelsio T6 NIC as it uses the T6's TCP offload engine to do TLS. If you don't have that gear, ping me off-list. It would also let you not worry about the SSL_read case for now for initial testing. -- John Baldwin
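[Editor's note: John's inline pseudo-code, written out as a C sketch for readability. This assumes the OpenSSL master/3.0 KTLS API; ctx and fd are defined elsewhere, OpenSSL_connect() in the mail is presumably SSL_connect(), and error handling is omitted.]

```c
/* Sketch only: requires OpenSSL built with KTLS support (master/3.0). */
SSL *s;

s = SSL_new(ctx);	/* ctx: an existing SSL_CTX * */
SSL_set_fd(s, fd);	/* fd: the existing TCP socket */
SSL_connect(s);		/* error handling omitted */

if (BIO_get_ktls_send(SSL_get_wbio(s))) {
	/* KTLS usable for transmit: kernel can encrypt writes directly. */
}
if (BIO_get_ktls_recv(SSL_get_rbio(s))) {
	/* KTLS usable for receive. */
}
```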