Hi Romain,

On Mon, 2023-06-19 at 19:56 +0000, Romain GEISSLER via Elfutils-devel wrote:
> Thanks ! And sorry that Laurent had pinged you directly on Slack, I
> wanted to reach you via this mailing list instead of through the Red
> Hat customer network ;)

Slack isn't a very effective way to reach me. Most elfutils hackers do
hang out on the Libera.Chat irc channel #elfutils.

> I don’t know if you read the Red Hat case too. There you can find
> things a bit more clarified, and splitted into what I think are potentially
> 3 distinct "problems" which 3 distinct possible fix. Since there is nothing
> private, I can write on this here as well on this public mailing list.

I haven't looked if I have access to the customer case since you
provided such a great reproducer.

> So in the end I see 3 points (in addition to not understanding why
> finding the elf header returns NULL while it should not and which I
> guess you are currently looking at):
> - the idea that systemd developers should invert their logic: first
> try to parse elf/program headers from the (maybe partial) core dump
> PT_LOAD program headers

yes, that could in theory also be done through a custom
callbacks->find_elf.

> - This special "if" condition that I have added in the original systemd
> code:
>
> + /* This PT_LOAD section doesn't contain the start address, so it can't be the module we are looking for. */
> + if (start < program_header->p_vaddr || start >= program_header->p_vaddr + program_header->p_memsz)
> +         continue;
>
> to be added near this line:
> https://github.com/systemd/systemd/blob/72e7bfe02d7814fff15602726c7218b389324159/src/shared/elf-util.c#L540
>
> on which I would like to ask you if indeed it seems like a "right" fix with
> your knowledge of how core dump and elf files are shaped.

Yes, that does make sense.

> - The idea that maybe this commit
> https://sourceware.org/git/?p=elfutils.git;a=commitdiff;h=8db849976f07046d27b4217e9ebd08d5623acc4f
> which assumed that normally the order of magnitude of program headers
> is 10 for a "normal" elf file, so a linked list would be enough might be
> wrong in the special case of core dump which may have much more
> program headers.
> And if indeed it makes sense to elf_getdata_rawchunk
> for each and every program header of a core, in that case should this
> linked list be changed into some set/hashmap indexed by start
> address/size ?

Interesting. Yeah, a linked list is not the ideal data structure here.
But I don't think it is causing the really long delay. That clearly
comes from the (negative) inode/dentry file search cache. But we could
look at this after we solve the other issues, if we then still want to
speed things up a bit more.

Cheers,

Mark