On 12/2/22 11:36, Maxime Coquelin wrote:
>
>
> On 12/2/22 11:09, David Marchand wrote:
>> On Wed, Nov 30, 2022 at 9:30 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>>>>>>> Shouldn't this be 0x7f instead?
>>>>>>> 0x3f doesn't enable bit #6, which is responsible for dumping
>>>>>>> shared huge pages.  Or am I missing something?
>>>>>>
>>>>>> That's a good point, the hugepage may or may not be private. I'll send
>>>>>> in a new one.
>>>>>
>>>>> OK.  One thing to think about though is that we'll grab
>>>>> VM memory, I guess, in case we have vhost-user ports.
>>>>> So, the core dump size can become insanely huge.
>>>>>
>>>>> The downside of not having them is inability to inspect
>>>>> virtqueues and stuff in the dump.
>>>>
>>>> Did you consider madvise()?
>>>>
>>>>     MADV_DONTDUMP (since Linux 3.4)
>>>>         Exclude from a core dump those pages in the range
>>>>         specified by addr and length.  This is useful in
>>>>         applications that have large areas of memory that are
>>>>         known not to be useful in a core dump.  The effect of
>>>>         MADV_DONTDUMP takes precedence over the bit mask that is
>>>>         set via the /proc/[pid]/coredump_filter file (see core(5)).
>>>>
>>>>     MADV_DODUMP (since Linux 3.4)
>>>>         Undo the effect of an earlier MADV_DONTDUMP.
>>>
>>> I don't think OVS actually knows location of particular VM memory
>>> pages that we do not need.  And dumping virtqueues and stuff is,
>>> probably, the point of this patch (?).
>>>
>>> vhost-user library might have a better idea on which particular parts
>>> of the memory guest may use for virtqueues and buffers, but I'm not
>>> 100% sure.
>>
>> Yes, distinguishing hugepages of interest is a problem.
>>
>> Since v20.05, DPDK mem allocator takes care of excluding (unused)
>> hugepages from dump.
>> So with this OVS patch, if we catch private and shared hugepages,
>> "interesting" DPDK hugepages will get dumped, which is useful for
>> debugging post mortem.
>>
>> Adding Maxime, who will have a better idea of what is possible for the
>> guest mapping part.
>>
>>
>
> I wonder if we could do a MADV_DONTDUMP on all the guest memory at mmap
> time, then there are two cases:
>   a. vIOMMU = OFF. In this case we could do MADV_DODUMP on virtqueues
> memory. Doing so, we would have the rings memory, but not their buffers
> (except if they are located on same hugepages).
>   b. vIOMMU = ON. In this case we could do MADV_DODUMP on IOTLB_UPDATE
> new entries and MADV_DONTDUMP on invalidated entries. Doing so we will
> get both vrings and their buffers the backend is allowed to access.
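To make the proposal quoted above a bit more concrete, here is a minimal
sketch of the two madvise() calls involved.  The helper names and the
'align' parameter are hypothetical and are not part of the vhost library.
The sketch assumes the caller already knows the mapping base and length,
and that a sub-range has to be widened to page boundaries (for hugepage
backed guest memory that would presumably be the huge page size).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: exclude a whole guest mapping from core dumps,
 * e.g. right after it has been mmap()ed.  'base' must be page aligned. */
static int
guest_region_exclude_from_dump(void *base, size_t len)
{
    return madvise(base, len, MADV_DONTDUMP);
}

/* Hypothetical helper: re-include the range [addr, addr + len), e.g. a
 * virtqueue ring or a newly added IOTLB entry.  The range is widened to
 * 'align' boundaries; align == 0 means "use the base page size".  For
 * hugepage backed mappings the caller is assumed to pass the huge page
 * size instead. */
static int
guest_range_include_in_dump(void *addr, size_t len, size_t align)
{
    if (align == 0) {
        align = (size_t) sysconf(_SC_PAGESIZE);
    }
    /* The rounding below only works for power-of-two alignments. */
    assert((align & (align - 1)) == 0);

    uintptr_t mask = (uintptr_t) align - 1;
    uintptr_t start = (uintptr_t) addr & ~mask;
    uintptr_t end = ((uintptr_t) addr + len + mask) & ~mask;

    return madvise((void *) start, (size_t) (end - start), MADV_DODUMP);
}
```

Because the range is widened to whole pages, anything else that happens to
live on the same page gets dumped as well, which matches the "except if
they are located on same hugepages" remark for case a.  For case b, the
invalidation path would presumably call madvise() with MADV_DONTDUMP on
the same widened range.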
I guess, while DONTDUMP calls are mainly harmless, the explicit DODUMP
will override whatever the user had in their global configuration.
That means every DPDK application with vhost ports will start dumping
some of the guest pages, with no actual ability to turn that off.

Can the behavior be configurable?

>
> I can prepare a PoC quickly if someone is willing to experiment.
>
> Regards,
> Maxime
>
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev